12 OML4Py Metrics

OML4Py provides a metrics module, oml.metrics, that contains functions to compute metrics for model evaluation in the database.

For information on the arguments of a metric function, call help(oml.metrics.metric_function), where metric_function is the name of the function, or see Oracle Machine Learning for Python API Reference. For example, to learn more about confusion_matrix, call help(oml.metrics.confusion_matrix).

The following table lists the functions that the oml.metrics module provides. Each function computes its metric on a prediction data frame.

All functions in the table require pred_df (prediction data frame), y_true (true values), and y_pred (predicted values) as input arguments, where:
  • pred_df (prediction data frame): Data frame that contains the ground truth (actual) values, the predicted values, and optionally the sample weights. If the data frame does not contain the required columns, the function raises an exception. The data frame can contain any number of columns; however, only the columns specified in the parameters are used for computation. There are no restrictions on the column names.

    For both classification and regression metrics, the target and prediction columns can be of any type except oml.Bytes. Additionally, for regression metrics, neither the prediction column nor the true value column can be of type oml.String. To compute a weighted metric, the data frame must contain the weight column and the metric function call must name it. The weight column must be of type oml.Float or oml.Integer, and its sum must be greater than zero.

  • y_true (true values): The name of the column that contains the target (actual) values.
  • y_pred (predicted values): The name of the column that contains the predicted values.
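The column rules above can be mirrored in a small local pre-check before data is sent to the database. The following is only a sketch under stated assumptions: check_prediction_columns is a hypothetical helper that is not part of OML4Py, and a plain dictionary of lists stands in for the prediction data frame.

```python
from numbers import Real


def check_prediction_columns(columns, y_true, y_pred, sample_weight=None):
    """Hypothetical local pre-check (not an OML4Py function).

    `columns` maps a column name to its list of values and stands in
    for the prediction data frame. Returns the column names that a
    metric computation would use.
    """
    required = [y_true, y_pred]
    if sample_weight is not None:
        required.append(sample_weight)

    # The required columns must exist; extra columns are simply ignored.
    missing = [name for name in required if name not in columns]
    if missing:
        raise KeyError(f"missing required columns: {missing}")

    if sample_weight is not None:
        weights = columns[sample_weight]
        # Weights must be numeric and must sum to more than zero.
        if not all(isinstance(w, Real) and not isinstance(w, bool) for w in weights):
            raise TypeError("the weight column must be numeric")
        if sum(weights) <= 0:
            raise ValueError("the sum of the weights must be greater than zero")

    return required


columns = {
    "actual": [1, 0, 1],
    "predicted": [1, 1, 0],
    "w": [1, 2, 1],
    "extra": ["a", "b", "c"],  # unused column; allowed
}
used = check_prediction_columns(columns, "actual", "predicted", sample_weight="w")
print(used)
```

The column names here are arbitrary, which reflects the statement that there are no restrictions on column names.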

Table 12-1 Metrics Module

Metric Function            Description
-------------------------  ------------------------------------------------------------------------------
confusion_matrix           Computes the confusion matrix.
precision_score            Computes the precision.
recall_score               Computes the recall.
f1_score                   Computes the F1 score.
accuracy_score             Computes the accuracy.
balanced_accuracy_score    Computes the balanced accuracy.
roc_auc_score              Computes the Area Under the Receiver Operating Characteristic Curve (ROC AUC).
mean_squared_error         Computes the mean squared error.
mean_squared_log_error     Computes the mean squared log error.
mean_absolute_error        Computes the mean absolute error.
median_absolute_error      Computes the median absolute error.
r2_score                   Computes the coefficient of determination.

To learn more about the metric functions, see Oracle Machine Learning for Python API Reference.
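To make the regression entries in the table concrete, the following sketch computes them locally in plain Python using their conventional, unweighted textbook definitions. It is an illustration only: it does not call OML4Py, the sample values are made up, and the exact options and defaults of the oml.metrics functions are not assumed here.

```python
import math
from statistics import mean, median


def regression_metrics(y_true, y_pred):
    """Conventional, unweighted regression metrics (local illustration only)."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    abs_errors = [abs(e) for e in errors]
    # Squared differences of log(1 + value); requires non-negative values.
    log_sq = [(math.log1p(t) - math.log1p(p)) ** 2 for t, p in zip(y_true, y_pred)]

    y_mean = mean(y_true)
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    ss_tot = sum((t - y_mean) ** 2 for t in y_true)  # total sum of squares

    return {
        "mean_squared_error": ss_res / len(errors),
        "mean_squared_log_error": mean(log_sq),
        "mean_absolute_error": mean(abs_errors),
        "median_absolute_error": median(abs_errors),
        "r2_score": 1 - ss_res / ss_tot,
    }


# Made-up sample values, for illustration only.
y_true = [3.0, 0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

metrics = regression_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```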

Example 12-1 Using the confusion_matrix Function

This example uses the confusion_matrix function to compute the weighted confusion matrix for the provided DataFrame.

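import oml
import pandas as pd
from oml.metrics import confusion_matrix

# Actual values, predicted values, and per-row sample weights.
y_true = [1, 0, 1, 0, 1, 1, 0, 1, 1, 0]
y_pred = [1, 1, 1, 0, 0, 1, 1, 0, 0, 0]
sample_weight = [1, 2, 1, 1, 1, 1, 1, 3, 1, 1]

# Build a pandas DataFrame and create a database table from it.
# oml.create requires an active OML4Py database connection.
pdf = pd.DataFrame({'y_true': y_true, 'y_pred': y_pred, 'sample_weight': sample_weight})
df = oml.create(pdf, 'test_class')

# Compute the weighted confusion matrix from the named columns.
confusion_matrix(df, 'y_true', 'y_pred', sample_weight='sample_weight')

The output is:

array([[2, 3], [5, 3]])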
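The result of Example 12-1 can be checked without a database connection. The following sketch sums the sample weights for each pair of actual and predicted labels in plain Python, then derives several of the classification metrics from the matrix. It rests on assumptions: weighted_confusion_matrix is a hypothetical local helper, not an OML4Py function; rows are taken to be actual labels and columns predicted labels, which is consistent with the output above; and label 1 is treated as the positive class. The defaults of the corresponding oml.metrics functions may differ.

```python
def weighted_confusion_matrix(y_true, y_pred, sample_weight=None):
    """Hypothetical local helper: rows are actual labels, columns are predicted labels."""
    if sample_weight is None:
        sample_weight = [1] * len(y_true)
    labels = sorted(set(y_true) | set(y_pred))
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    # Each row of data adds its weight to the (actual, predicted) cell.
    for actual, predicted, weight in zip(y_true, y_pred, sample_weight):
        matrix[index[actual]][index[predicted]] += weight
    return matrix


# Same data as Example 12-1.
y_true = [1, 0, 1, 0, 1, 1, 0, 1, 1, 0]
y_pred = [1, 1, 1, 0, 0, 1, 1, 0, 0, 0]
sample_weight = [1, 2, 1, 1, 1, 1, 1, 3, 1, 1]

cm = weighted_confusion_matrix(y_true, y_pred, sample_weight)
print(cm)  # [[2, 3], [5, 3]]

# Unpack the binary matrix, treating label 1 as the positive class.
(tn, fp), (fn, tp) = cm

precision = tp / (tp + fp)                                  # 3 / 6
recall = tp / (tp + fn)                                     # 3 / 8
f1 = 2 * precision * recall / (precision + recall)          # 3 / 7
accuracy = (tp + tn) / (tn + fp + fn + tp)                  # 5 / 13
balanced_accuracy = (tp / (tp + fn) + tn / (tn + fp)) / 2   # mean of per-class recall

print(precision, recall, f1, accuracy, balanced_accuracy)
```

Because the weights are applied cell by cell, the row with weight 3 (actual 1, predicted 0) contributes three of the five weighted false negatives.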