Evaluating the model

Once the trained model is available, you can score it and evaluate its performance metrics.

  • Score Model: view statistics and histograms for individual columns of the dataset.

    To score the model and check its performance on unseen data, you must feed the test dataset from the Split Data activity, together with the trained model, into the Score Model activity.

  • Evaluate Model: view metrics, confusion matrices, error distributions, and the ROC curve graph.

    To evaluate the model, you must feed the Score Model output into the Evaluate Model activity.

    Metrics for both classification and regression machine learning problems are displayed in the output port of the Evaluate Model activity. A short sketch showing how these metrics are typically computed follows this list.

    For classification problems, the following metrics are presented:

    • Accuracy: the ratio of correctly predicted observations to the total number of observations.
    • Precision: the proportion of samples that the classifier labeled as positive that are actually positive.
    • Recall: the number of correct positive predictions divided by the number of all relevant samples, that is, all samples that should have been identified as positive.
    • Confusion matrix: a matrix summarizing model performance in terms of true positives, true negatives, false positives, and false negatives, that is, how many samples from the test set were classified correctly and how many were misclassified.
    • AUC – Area Under the Curve: the probability that the classifier ranks a randomly chosen positive example higher than a randomly chosen negative example.
    • F1 score: the harmonic mean of precision and recall, balancing how precise the classifier is (how many of its positive predictions are correct) with how robust it is (whether it misses a significant number of positive instances).

    For regression problems, the following metrics are reported:

    • Coefficient of determination (R2): represents the predictive power of the model. Values range between 0 and 1, where 1 means the model is a perfect fit.
    • Mean absolute error (MAE): measures how close the model predictions are to the actual outcomes. A lower MAE indicates better results.
    • Root mean squared error (RMSE): the square root of the average squared difference between the actual test values and the predicted values. An RMSE closer to 0 indicates better model performance.
    • Relative absolute error (RAE): the total absolute error of the predictions, normalized by the total absolute error of a simple predictor that always returns the mean of the actual values. Values range from 0 to infinity, with 0 representing the best performance.
    • Relative squared error (RSE): the total squared error of the predictions, normalized in the same way against a simple mean predictor. Values range from 0 to infinity, with 0 representing the best performance.
  • Compare Models: select up to four similar trained models from parallel flows. The scoring and evaluation results of the connected trained models are presented on a single screen.
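The classification and regression metrics described above are standard quantities. The following is a minimal sketch of how they can be computed with scikit-learn and NumPy, assuming you already have the actual values and the predictions from a scored test set; the variable names and sample values are illustrative only and are not part of the product's activities.

```python
# Minimal sketch: computing the evaluation metrics from actual values and
# predictions of a scored test set (sample values are illustrative only).
import numpy as np
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score,
    confusion_matrix, roc_auc_score,
    r2_score, mean_absolute_error, mean_squared_error,
)

# --- Classification metrics ---
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])            # actual labels
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])            # predicted labels
y_score = np.array([.9, .2, .8, .4, .3, .7, .6, .1])   # predicted probabilities

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("F1 score :", f1_score(y_true, y_pred))
print("Confusion matrix:\n", confusion_matrix(y_true, y_pred))
print("AUC      :", roc_auc_score(y_true, y_score))

# --- Regression metrics ---
actual = np.array([3.0, -0.5, 2.0, 7.0])
predicted = np.array([2.5, 0.0, 2.0, 8.0])

r2 = r2_score(actual, predicted)
mae = mean_absolute_error(actual, predicted)
rmse = np.sqrt(mean_squared_error(actual, predicted))
# RAE and RSE are not built into scikit-learn; both normalize the model's
# error against a naive predictor that always returns the mean of the actuals.
rae = np.abs(actual - predicted).sum() / np.abs(actual - actual.mean()).sum()
rse = ((actual - predicted) ** 2).sum() / ((actual - actual.mean()) ** 2).sum()

print("R2  :", r2, " MAE :", mae, " RMSE:", rmse)
print("RAE :", rae, " RSE :", rse)
```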
To add one of these evaluation activities to your flow:

  1. Select a quest.
  2. Drag and drop the Evaluate Model activity box onto the canvas. The Evaluate Model Activity Catalog is displayed.
  3. Select the evaluation activity you want to use:
    • Score Model
    • Evaluate Model
    • Compare Models
  4. Specify the required parameters.
  5. Click Save.
  6. Click Run.
  7. Click the output port to view the evaluation of the trained model. Depending on the algorithm that was used, the corresponding evaluation contents are presented.
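The graphical steps above mirror the split, train, score, and evaluate pattern common to most machine learning workflows. Below is a minimal sketch of the same pattern in plain scikit-learn; it is only an illustration of the underlying flow and does not use the product's activities or API. The dataset, estimator, and parameter choices here are assumptions made for the example.

```python
# Minimal sketch of the split -> train -> score -> evaluate pattern in plain
# scikit-learn; an illustration only, not the product's activities or API.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score

X, y = load_breast_cancer(return_X_y=True)

# "Split Data": hold out a test set so the model is scored on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# "Train Model": fit the estimator on the training portion only.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# "Score Model": generate predictions and probabilities for the test set.
y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)[:, 1]

# "Evaluate Model": report the metrics discussed in this section.
print(classification_report(y_test, y_pred))
print("Confusion matrix:\n", confusion_matrix(y_test, y_pred))
print("AUC:", roc_auc_score(y_test, y_prob))
```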