mlm_insights.core.metrics.regression_metrics package¶
Submodules¶
mlm_insights.core.metrics.regression_metrics.max_error module¶
- class mlm_insights.core.metrics.regression_metrics.max_error.MaxError(config: ~typing.Dict[str, ~mlm_insights.constants.definitions.ConfigParameter] = <factory>, target_column: str = 'y_true', prediction_column: str = 'y_predict', max_of_residual: float = 0.0)¶
Bases:
DatasetMetricBase
MaxError metric computes the maximum residual error. This is a dataset-level metric. It is an accurate metric which can process any column type, but only numerical (int, float) data types. This metric falls under the regression category and is used for predictive modeling problems that involve predicting a numeric value, or to measure error/performance for regression models. Both the ground truth and prediction target columns must not contain any NaN values; otherwise InvalidTargetPredictionException is thrown.
Configuration¶
None
Parameters¶
- y_true: array-like of shape (n_samples,)
Ground truth (correct) target values.
- y_pred: array-like of shape (n_samples,)
Estimated target values.
Returns¶
- max_error: float
A positive floating point value (the best value is 0.0).
Exceptions¶
MissingRequiredParameterException
InvalidTargetPredictionException
Examples
from mlm_insights.builder.builder_component import MetricDetail, EngineDetail
from mlm_insights.builder.insights_builder import InsightsBuilder
from mlm_insights.constants.types import FeatureType, DataType, VariableType, ColumnType
from mlm_insights.core.metrics.regression_metrics.max_error import MaxError
from mlm_insights.core.metrics.metric_metadata import MetricMetadata
import pandas as pd

def main():
    input_schema = {
        'square_feet': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.INPUT),
        'house_price_prediction': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.PREDICTION),
        'house_price_target': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.TARGET)
    }
    data_frame = pd.DataFrame({
        'square_feet': [11.23, 23.45, 11.23, 45.56, 11.23],
        'house_price_target': [1, 2, 3, 4, 5],
        'house_price_prediction': [1.1, 2.5, 3.8, 5.1, 4.9]
    })
    metric_details = MetricDetail(
        univariate_metric={},
        dataset_metrics=[MetricMetadata(klass=MaxError)])
    runner = InsightsBuilder(). \
        with_input_schema(input_schema). \
        with_data_frame(data_frame=data_frame). \
        with_metrics(metrics=metric_details). \
        with_engine(engine=EngineDetail(engine_name="native")). \
        build()
    profile_json = runner.run().profile.to_json()
    dataset_metrics = profile_json['dataset_metrics']
    print(dataset_metrics["MaxError"])  # {'value': 1.1}

if __name__ == "__main__":
    main()

Returns the standard metric result as:

{
    'metric_name': 'MaxError',
    'metric_description': 'MaxError metric computes the maximum residual error',
    'variable_count': 1,
    'variable_names': ['max_error'],
    'variable_types': [CONTINUOUS],
    'variable_dtypes': [FLOAT],
    'variable_dimensions': [0],
    'metric_data': [1.1],
    'metadata': {},
    'error': None
}
- compute(dataset: DataFrame, **kwargs: Any) None ¶
Computes the maximum residual error for the passed-in dataset
Parameters¶
- dataset: pd.DataFrame
DataFrame object for either the entire dataset or a partition on which the Metric is being computed
- classmethod create(config: Dict[str, ConfigParameter] | None = None, **kwargs: Any) MaxError ¶
Create a MaxError metric using the configuration and kwargs
Parameters¶
- config: Optional[Dict[str, ConfigParameter]]
Metric configuration
- get_result(**kwargs: Any) Dict[str, Any] ¶
Returns the computed value of the metric
Returns¶
Dict[str, Any]: Dictionary with key as string and value as any metric property.
- get_standard_metric_result(**kwargs: Any) StandardMetricResult ¶
This method returns metric output in standard format.
Returns¶
StandardMetricResult
- max_of_residual: float = 0.0¶
- merge(other_metric: MaxError, **kwargs: Any) MaxError ¶
Merge two MaxError metrics into one, without mutating either.
Parameters¶
- other_metric: MaxError
The other MaxError to be merged.
Returns¶
- MaxError: MaxError
A new instance of MaxError
- prediction_column: str = 'y_predict'¶
- target_column: str = 'y_true'¶
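The single documented state field, max_of_residual, hints at how this dataset-level metric can be computed over partitions and then merged: each partition tracks its own maximum absolute residual, and merging two partial states keeps the larger one. A minimal sketch of that idea, using hypothetical helper names (not the library's implementation):

```python
# Illustrative sketch of a mergeable max-residual state. The field name
# max_of_residual mirrors the documented attribute; compute_partition and
# merge_states are hypothetical helpers, not library API.

def compute_partition(y_true, y_pred, max_of_residual=0.0):
    """Fold one partition's rows into the running maximum |residual|."""
    for t, p in zip(y_true, y_pred):
        max_of_residual = max(max_of_residual, abs(t - p))
    return max_of_residual

def merge_states(a, b):
    """Merging two partial states keeps the larger maximum."""
    return max(a, b)

# The example data, split across two partitions:
s1 = compute_partition([1, 2, 3], [1.1, 2.5, 3.8])
s2 = compute_partition([4, 5], [5.1, 4.9])
max_error = merge_states(s1, s2)  # ~1.1, from |4 - 5.1|
```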
mlm_insights.core.metrics.regression_metrics.mean_absolute_error module¶
- class mlm_insights.core.metrics.regression_metrics.mean_absolute_error.MeanAbsoluteError(config: ~typing.Dict[str, ~mlm_insights.constants.definitions.ConfigParameter] = <factory>, target_column: str = 'y_true', prediction_column: str = 'y_predict', total_count: int = 0, sum_of_residuals: float = 0.0)¶
Bases:
DatasetMetricBase
Computes Mean Absolute Error regression loss. This is a dataset-level metric. It is an accurate metric which can process any column type, but only numerical (int, float) data types. This metric falls under the regression category and is used for predictive modeling problems that involve predicting a numeric value, or to measure error/performance for regression models. Both the ground truth and prediction target columns must not contain any NaN values; otherwise InvalidTargetPredictionException is thrown.
Configuration¶
None
Parameters¶
- y_true: array-like of shape (n_samples,) or (n_samples, n_outputs)
Ground truth (correct) target values.
- y_pred: array-like of shape (n_samples,) or (n_samples, n_outputs)
Estimated target values.
Returns¶
float: Mean Absolute Error
Exceptions¶
MissingRequiredParameterException
InvalidTargetPredictionException
Examples
from mlm_insights.builder.builder_component import MetricDetail, EngineDetail
from mlm_insights.builder.insights_builder import InsightsBuilder
from mlm_insights.constants.types import FeatureType, DataType, VariableType, ColumnType
from mlm_insights.core.metrics.regression_metrics.mean_absolute_error import MeanAbsoluteError
from mlm_insights.core.metrics.metric_metadata import MetricMetadata
import pandas as pd

def main():
    input_schema = {
        'square_feet': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.INPUT),
        'house_price_prediction': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.PREDICTION),
        'house_price_target': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.TARGET)
    }
    data_frame = pd.DataFrame({
        'square_feet': [11.23, 23.45, 11.23, 45.56, 11.23],
        'house_price_target': [1, 2, 3, 4, 5],
        'house_price_prediction': [1.1, 2.5, 3.8, 5.1, 4.9]
    })
    metric_details = MetricDetail(
        univariate_metric={},
        dataset_metrics=[MetricMetadata(klass=MeanAbsoluteError)])
    runner = InsightsBuilder(). \
        with_input_schema(input_schema). \
        with_data_frame(data_frame=data_frame). \
        with_metrics(metrics=metric_details). \
        with_engine(engine=EngineDetail(engine_name="native")). \
        build()
    profile_json = runner.run().profile.to_json()
    dataset_metrics = profile_json['dataset_metrics']
    print(dataset_metrics["MeanAbsoluteError"])

if __name__ == "__main__":
    main()

Returns the standard metric result as:

{
    'metric_name': 'MeanAbsoluteError',
    'metric_description': 'Computes Mean Absolute Error regression loss',
    'variable_count': 1,
    'variable_names': ['mean_absolute_error'],
    'variable_types': [CONTINUOUS],
    'variable_dtypes': [FLOAT],
    'variable_dimensions': [0],
    'metric_data': [1.1],
    'metadata': {},
    'error': None
}
- compute(dataset: DataFrame, **kwargs: Any) None ¶
Computes the numerator of the Mean Absolute Error for the passed-in dataset
Parameters¶
- dataset: pd.DataFrame
DataFrame object for either the entire dataset or a partition on which the Metric is being computed
- classmethod create(config: Dict[str, ConfigParameter] | None = None, **kwargs: Any) MeanAbsoluteError ¶
Create a MeanAbsoluteError metric using the configuration and kwargs
Parameters¶
- config: Optional[Dict[str, ConfigParameter]]
Metric configuration
- features_metadata: FeatureMetadata
Contains input schema for each feature, supplied as a keyword argument
- get_result(**kwargs: Any) Dict[str, Any] ¶
Returns the computed value of the metric
Returns¶
Dict[str, Any]: Dictionary with key as string and value as any metric property.
- get_standard_metric_result(**kwargs: Any) StandardMetricResult ¶
This method returns metric output in standard format.
Returns¶
StandardMetricResult
- merge(other_metric: MeanAbsoluteError, **kwargs: Any) MeanAbsoluteError ¶
Merge two MeanAbsoluteError metrics into one, without mutating either.
Parameters¶
- other_metric: MeanAbsoluteError
The other MeanAbsoluteError to be merged.
Returns¶
- MeanAbsoluteError: MeanAbsoluteError
A new instance of MeanAbsoluteError
- prediction_column: str = 'y_predict'¶
- sum_of_residuals: float = 0.0¶
- target_column: str = 'y_true'¶
- total_count: int = 0¶
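The attributes above (total_count, sum_of_residuals) hint at how the metric stays mergeable across dataset partitions: both components are additive, and the final value is their ratio. A minimal sketch of that idea, with hypothetical helper names (not the library's implementation):

```python
# Illustrative sketch of MAE's mergeable state (total_count, sum_of_residuals).
# compute_partition/merge_states/result are hypothetical helpers, not library API.

def compute_partition(y_true, y_pred, total_count=0, sum_of_residuals=0.0):
    """Accumulate count and absolute residuals for one partition."""
    for t, p in zip(y_true, y_pred):
        total_count += 1
        sum_of_residuals += abs(t - p)
    return total_count, sum_of_residuals

def merge_states(a, b):
    # Counts and residual sums are both additive across partitions.
    return a[0] + b[0], a[1] + b[1]

def result(state):
    total_count, sum_of_residuals = state
    return sum_of_residuals / total_count

state = merge_states(compute_partition([1, 2, 3], [1.1, 2.5, 3.8]),
                     compute_partition([4, 5], [5.1, 4.9]))
mae = result(state)  # ~0.52 on the example data
```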
mlm_insights.core.metrics.regression_metrics.mean_absolute_percentage_error module¶
- class mlm_insights.core.metrics.regression_metrics.mean_absolute_percentage_error.MeanAbsolutePercentageError(config: ~typing.Dict[str, ~mlm_insights.constants.definitions.ConfigParameter] = <factory>, target_column: str = 'y_true', prediction_column: str = 'y_predict', total_count: int = 0, sum_of_relative_error: float = 0.0)¶
Bases:
DatasetMetricBase
Mean absolute percentage error (MAPE) regression loss. This is a dataset-level metric. It is an accurate metric which can process any column type, but only numerical (int, float) data types. This metric falls under the regression category and is used for predictive modeling problems that involve predicting a numeric value, or to measure error/performance for regression models. Both the ground truth and prediction target columns must not contain any NaN values; otherwise InvalidTargetPredictionException is thrown.
Configuration¶
None
Parameters¶
- y_true: array-like of shape (n_samples,) or (n_samples, n_outputs)
Ground truth (correct) target values.
- y_pred: array-like of shape (n_samples,) or (n_samples, n_outputs)
Estimated target values.
Returns¶
float: Mean absolute percentage error (MAPE) regression loss
Exceptions¶
MissingRequiredParameterException
InvalidTargetPredictionException
Examples
from mlm_insights.builder.builder_component import MetricDetail, EngineDetail
from mlm_insights.builder.insights_builder import InsightsBuilder
from mlm_insights.constants.types import FeatureType, DataType, VariableType, ColumnType
from mlm_insights.core.metrics.regression_metrics.mean_absolute_percentage_error import MeanAbsolutePercentageError
from mlm_insights.core.metrics.metric_metadata import MetricMetadata
import pandas as pd

def main():
    input_schema = {
        'square_feet': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.INPUT),
        'house_price_prediction': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.PREDICTION),
        'house_price_target': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.TARGET)
    }
    data_frame = pd.DataFrame({
        'square_feet': [11.23, 23.45, 11.23, 45.56, 11.23],
        'house_price_target': [1, 2, 3, 4, 5],
        'house_price_prediction': [1.1, 2.5, 3.8, 5.1, 4.9]
    })
    metric_details = MetricDetail(
        univariate_metric={},
        dataset_metrics=[MetricMetadata(klass=MeanAbsolutePercentageError)])
    runner = InsightsBuilder(). \
        with_input_schema(input_schema). \
        with_data_frame(data_frame=data_frame). \
        with_metrics(metrics=metric_details). \
        with_engine(engine=EngineDetail(engine_name="native")). \
        build()
    profile_json = runner.run().profile.to_json()
    dataset_metrics = profile_json['dataset_metrics']
    print(dataset_metrics["MeanAbsolutePercentageError"])

if __name__ == "__main__":
    main()

Returns the standard metric result as:

{
    'metric_name': 'MeanAbsolutePercentageError',
    'metric_description': 'Mean absolute percentage error (MAPE) regression loss',
    'variable_count': 1,
    'variable_names': ['mean_absolute_percentage_error'],
    'variable_types': [CONTINUOUS],
    'variable_dtypes': [FLOAT],
    'variable_dimensions': [0],
    'metric_data': [1.1],
    'metadata': {},
    'error': None
}
- compute(dataset: DataFrame, **kwargs: Any) None ¶
Computes the sum of relative errors for the Mean Absolute Percentage Error for the passed-in dataset
Parameters¶
- dataset: pd.DataFrame
DataFrame object for either the entire dataset or a partition on which the Metric is being computed
- classmethod create(config: Dict[str, ConfigParameter] | None = None, **kwargs: Any) MeanAbsolutePercentageError ¶
Create a MeanAbsolutePercentageError metric using the configuration and kwargs
Parameters¶
- config: Optional[Dict[str, ConfigParameter]]
Metric configuration
- get_result(**kwargs: Any) Dict[str, Any] ¶
Returns the computed value of the metric
Returns¶
Dict[str, Any]: Dictionary with key as string and value as any metric property.
- get_standard_metric_result(**kwargs: Any) StandardMetricResult ¶
This method returns metric output in standard format.
Returns¶
StandardMetricResult
- merge(other_metric: MeanAbsolutePercentageError, **kwargs: Any) MeanAbsolutePercentageError ¶
Merge two MeanAbsolutePercentageError metrics into one, without mutating either.
Parameters¶
- other_metric: MeanAbsolutePercentageError
The other MeanAbsolutePercentageError to be merged.
Returns¶
- MeanAbsolutePercentageError: MeanAbsolutePercentageError
A new instance of MeanAbsolutePercentageError
- prediction_column: str = 'y_predict'¶
- sum_of_relative_error: float = 0.0¶
- target_column: str = 'y_true'¶
- total_count: int = 0¶
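The attributes above (total_count, sum_of_relative_error) suggest the same additive-state pattern as the other regression metrics: each partition accumulates a count and a sum of |y_true - y_pred| / |y_true| terms, merging adds the pieces, and the final value is the ratio. A hedged sketch with hypothetical helper names (not the library's implementation); the result here is a fraction, and whether the library scales it by 100 is not stated:

```python
# Illustrative sketch of MAPE's mergeable state
# (total_count, sum_of_relative_error); hypothetical helpers, not library API.
# The result is a fraction; scaling to a percentage would multiply by 100.

def compute_partition(y_true, y_pred, total_count=0, sum_of_relative_error=0.0):
    """Accumulate count and relative errors for one partition."""
    for t, p in zip(y_true, y_pred):
        total_count += 1
        sum_of_relative_error += abs(t - p) / abs(t)
    return total_count, sum_of_relative_error

def merge_states(a, b):
    # Both components are additive across partitions.
    return a[0] + b[0], a[1] + b[1]

def result(state):
    total_count, sum_of_relative_error = state
    return sum_of_relative_error / total_count

state = merge_states(compute_partition([1, 2, 3], [1.1, 2.5, 3.8]),
                     compute_partition([4, 5], [5.1, 4.9]))
mape = result(state)  # ~0.182 on the example data
```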
mlm_insights.core.metrics.regression_metrics.mean_squared_error module¶
- class mlm_insights.core.metrics.regression_metrics.mean_squared_error.MeanSquaredError(config: ~typing.Dict[str, ~mlm_insights.constants.definitions.ConfigParameter] = <factory>, target_column: str = 'y_true', prediction_column: str = 'y_predict', total_count: int = 0, sum_of_squared_residuals: float = 0.0)¶
Bases:
DatasetMetricBase
Computes Mean Squared Error regression loss. This is a dataset-level metric. It is an accurate metric which can process any column type, but only numerical (int, float) data types. This metric falls under the regression category and is used for predictive modeling problems that involve predicting a numeric value, or to measure error/performance for regression models. Both the ground truth and prediction target columns must not contain any NaN values; otherwise InvalidTargetPredictionException is thrown.
Configuration¶
None
Parameters¶
- y_true: array-like of shape (n_samples,) or (n_samples, n_outputs)
Ground truth (correct) target values.
- y_pred: array-like of shape (n_samples,) or (n_samples, n_outputs)
Estimated target values.
Returns¶
float: Mean Squared Error
Exceptions¶
MissingRequiredParameterException
InvalidTargetPredictionException
Examples
from mlm_insights.builder.builder_component import MetricDetail, EngineDetail
from mlm_insights.builder.insights_builder import InsightsBuilder
from mlm_insights.constants.types import FeatureType, DataType, VariableType, ColumnType
from mlm_insights.core.metrics.regression_metrics.mean_squared_error import MeanSquaredError
from mlm_insights.core.metrics.metric_metadata import MetricMetadata
import pandas as pd

def main():
    input_schema = {
        'square_feet': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.INPUT),
        'house_price_prediction': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.PREDICTION),
        'house_price_target': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.TARGET)
    }
    data_frame = pd.DataFrame({
        'square_feet': [11.23, 23.45, 11.23, 45.56, 11.23],
        'house_price_target': [1, 2, 3, 4, 5],
        'house_price_prediction': [1.1, 2.5, 3.8, 5.1, 4.9]
    })
    metric_details = MetricDetail(
        univariate_metric={},
        dataset_metrics=[MetricMetadata(klass=MeanSquaredError)])
    runner = InsightsBuilder(). \
        with_input_schema(input_schema). \
        with_data_frame(data_frame=data_frame). \
        with_metrics(metrics=metric_details). \
        with_engine(engine=EngineDetail(engine_name="native")). \
        build()
    profile_json = runner.run().profile.to_json()
    dataset_metrics = profile_json['dataset_metrics']
    print(dataset_metrics["MeanSquaredError"])

if __name__ == "__main__":
    main()

Returns the standard metric result as:

{
    'metric_name': 'MeanSquaredError',
    'metric_description': 'Computes Mean Squared Error regression loss',
    'variable_count': 1,
    'variable_names': ['mean_squared_error'],
    'variable_types': [CONTINUOUS],
    'variable_dtypes': [FLOAT],
    'variable_dimensions': [0],
    'metric_data': [1.1],
    'metadata': {},
    'error': None
}
- compute(dataset: DataFrame, **kwargs: Any) None ¶
Computes the numerator of the Mean Squared Error for the passed-in dataset
Parameters¶
- dataset: pd.DataFrame
DataFrame object for either the entire dataset or a partition on which the Metric is being computed
- classmethod create(config: Dict[str, ConfigParameter] | None = None, **kwargs: Any) MeanSquaredError ¶
Create a MeanSquaredError metric using the configuration and kwargs
Parameters¶
- config: Optional[Dict[str, ConfigParameter]]
Metric configuration
- get_result(**kwargs: Any) Dict[str, Any] ¶
Returns the computed value of the metric
Returns¶
Dict[str, Any]: Dictionary with key as string and value as any metric property.
- get_standard_metric_result(**kwargs: Any) StandardMetricResult ¶
This method returns metric output in standard format.
Returns¶
StandardMetricResult
- merge(other_metric: MeanSquaredError, **kwargs: Any) MeanSquaredError ¶
Merge two MeanSquaredError metrics into one, without mutating either.
Parameters¶
- other_metric: MeanSquaredError
The other MeanSquaredError to be merged.
Returns¶
- MeanSquaredError: MeanSquaredError
A new instance of MeanSquaredError
- prediction_column: str = 'y_predict'¶
- sum_of_squared_residuals: float = 0.0¶
- target_column: str = 'y_true'¶
- total_count: int = 0¶
mlm_insights.core.metrics.regression_metrics.mean_squared_log_error module¶
- class mlm_insights.core.metrics.regression_metrics.mean_squared_log_error.MeanSquaredLogError(config: ~typing.Dict[str, ~mlm_insights.constants.definitions.ConfigParameter] = <factory>, target_column: str = 'y_true', prediction_column: str = 'y_predict', total_count: int = 0, sum_of_squared_log: float = 0.0)¶
Bases:
DatasetMetricBase
Computes Mean Squared Log Error regression loss. This metric must be used only when both target and prediction values are positive. This is a dataset-level metric. It is an accurate metric which can process any column type, but only numerical (int, float) data types. This metric falls under the regression category and is used for predictive modeling problems that involve predicting a numeric value, or to measure error/performance for regression models. Both the ground truth and prediction target columns must not contain any NaN values; otherwise InvalidTargetPredictionException is thrown.
Configuration¶
None
Parameters¶
- y_true: array-like of shape (n_samples,) or (n_samples, n_outputs)
Ground truth (correct) target values.
- y_pred: array-like of shape (n_samples,) or (n_samples, n_outputs)
Estimated target values.
Returns¶
float: Mean Squared Log Error
Exceptions¶
MissingRequiredParameterException
InvalidTargetPredictionException
Examples
from mlm_insights.builder.builder_component import MetricDetail, EngineDetail
from mlm_insights.builder.insights_builder import InsightsBuilder
from mlm_insights.constants.types import FeatureType, DataType, VariableType, ColumnType
from mlm_insights.core.metrics.regression_metrics.mean_squared_log_error import MeanSquaredLogError
from mlm_insights.core.metrics.metric_metadata import MetricMetadata
import pandas as pd

def main():
    input_schema = {
        'square_feet': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.INPUT),
        'house_price_prediction': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.PREDICTION),
        'house_price_target': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.TARGET)
    }
    data_frame = pd.DataFrame({
        'square_feet': [11.23, 23.45, 11.23, 45.56, 11.23],
        'house_price_target': [1, 2, 3, 4, 5],
        'house_price_prediction': [1.1, 2.5, 3.8, 5.1, 4.9]
    })
    metric_details = MetricDetail(
        univariate_metric={},
        dataset_metrics=[MetricMetadata(klass=MeanSquaredLogError)])
    runner = InsightsBuilder(). \
        with_input_schema(input_schema). \
        with_data_frame(data_frame=data_frame). \
        with_metrics(metrics=metric_details). \
        with_engine(engine=EngineDetail(engine_name="native")). \
        build()
    profile_json = runner.run().profile.to_json()
    dataset_metrics = profile_json['dataset_metrics']
    print(dataset_metrics["MeanSquaredLogError"])

if __name__ == "__main__":
    main()

Returns the standard metric result as:

{
    'metric_name': 'MeanSquaredLogError',
    'metric_description': 'Computes Mean Squared Log Error regression loss',
    'variable_count': 1,
    'variable_names': ['mean_squared_log_error'],
    'variable_types': [CONTINUOUS],
    'variable_dtypes': [FLOAT],
    'variable_dimensions': [0],
    'metric_data': [1.1],
    'metadata': {},
    'error': None
}
- compute(dataset: DataFrame, **kwargs: Any) None ¶
Computes the numerator of the Mean Squared Log Error for the passed-in dataset
Parameters¶
- dataset: pd.DataFrame
DataFrame object for either the entire dataset or a partition on which the Metric is being computed
- classmethod create(config: Dict[str, ConfigParameter] | None = None, **kwargs: Any) MeanSquaredLogError ¶
Create a MeanSquaredLogError metric using the configuration and kwargs
Parameters¶
- config: Optional[Dict[str, ConfigParameter]]
Metric configuration
- get_result(**kwargs: Any) Dict[str, Any] ¶
Returns the computed value of the metric
Returns¶
Dict[str, Any]: Dictionary with key as string and value as any metric property.
- get_standard_metric_result(**kwargs: Any) StandardMetricResult ¶
This method returns metric output in standard format.
Returns¶
StandardMetricResult
- merge(other_metric: MeanSquaredLogError, **kwargs: Any) MeanSquaredLogError ¶
Merge two MeanSquaredLogError metrics into one, without mutating either.
Parameters¶
- other_metric: MeanSquaredLogError
The other MeanSquaredLogError to be merged.
Returns¶
- MeanSquaredLogError: MeanSquaredLogError
A new instance of MeanSquaredLogError
- prediction_column: str = 'y_predict'¶
- sum_of_squared_log: float = 0.0¶
- target_column: str = 'y_true'¶
- total_count: int = 0¶
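The attributes above (total_count, sum_of_squared_log) suggest the same additive pattern, with squared differences of log-transformed values. A hedged sketch with hypothetical helper names (not the library's implementation), using log1p as scikit-learn's MSLE does; whether this library uses log or log1p is not stated here:

```python
# Illustrative sketch of MSLE's mergeable state
# (total_count, sum_of_squared_log); hypothetical helpers, not library API.
# Uses log1p, i.e. log(1 + x); the metric itself requires positive
# targets and predictions.
import math

def compute_partition(y_true, y_pred, total_count=0, sum_of_squared_log=0.0):
    """Accumulate count and squared log-differences for one partition."""
    for t, p in zip(y_true, y_pred):
        total_count += 1
        sum_of_squared_log += (math.log1p(t) - math.log1p(p)) ** 2
    return total_count, sum_of_squared_log

def merge_states(a, b):
    # Both components are additive across partitions.
    return a[0] + b[0], a[1] + b[1]

def result(state):
    total_count, sum_of_squared_log = state
    return sum_of_squared_log / total_count

state = merge_states(compute_partition([1, 2, 3], [1.1, 2.5, 3.8]),
                     compute_partition([4, 5], [5.1, 4.9]))
msle = result(state)  # ~0.0198 on the example data
```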
mlm_insights.core.metrics.regression_metrics.r2_score module¶
- class mlm_insights.core.metrics.regression_metrics.r2_score.R2Score(config: ~typing.Dict[str, ~mlm_insights.constants.definitions.ConfigParameter] = <factory>, target_column: str = 'y_true', prediction_column: str = 'y_predict', total_count: int = 0, sum_of_squared_residuals: float = 0.0)¶
Bases:
DatasetMetricBase
Computes R2-Score between target and prediction columns. In the particular case when the target is constant, the R2 score is not finite: it is either NaN (perfect predictions) or -Inf (imperfect predictions). By default, these cases are replaced with 1.0 (perfect predictions) or 0.0 (imperfect predictions) respectively. This is a dataset-level and accurate metric which can process any column type, but only numerical (int, float) data types. This metric falls under the regression category and is used for predictive modeling problems that involve predicting a numeric value, or to measure error/performance for regression models. Both the ground truth and prediction target columns must not contain any NaN values; otherwise InvalidTargetPredictionException is thrown.
Configuration¶
None
Parameters¶
- y_true: array-like of shape (n_samples,) or (n_samples, n_outputs)
Ground truth (correct) target values.
- y_pred: array-like of shape (n_samples,) or (n_samples, n_outputs)
Estimated target values.
Returns¶
float: R2 score
Exceptions¶
MissingRequiredParameterException
InvalidTargetPredictionException
Examples
from mlm_insights.builder.builder_component import MetricDetail, EngineDetail
from mlm_insights.builder.insights_builder import InsightsBuilder
from mlm_insights.constants.types import FeatureType, DataType, VariableType, ColumnType
from mlm_insights.core.metrics.regression_metrics.r2_score import R2Score
from mlm_insights.core.metrics.metric_metadata import MetricMetadata
import pandas as pd

def main():
    input_schema = {
        'square_feet': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.INPUT),
        'house_price_prediction': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.PREDICTION),
        'house_price_target': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.TARGET)
    }
    data_frame = pd.DataFrame({
        'square_feet': [11.23, 23.45, 11.23, 45.56, 11.23],
        'house_price_target': [1, 2, 3, 4, 5],
        'house_price_prediction': [1.1, 2.5, 3.8, 5.1, 4.9]
    })
    metric_details = MetricDetail(
        univariate_metric={},
        dataset_metrics=[MetricMetadata(klass=R2Score)])
    runner = InsightsBuilder(). \
        with_input_schema(input_schema). \
        with_data_frame(data_frame=data_frame). \
        with_metrics(metrics=metric_details). \
        with_engine(engine=EngineDetail(engine_name="native")). \
        build()
    profile_json = runner.run().profile.to_json()
    dataset_metrics = profile_json['dataset_metrics']
    print(dataset_metrics["R2Score"])

if __name__ == "__main__":
    main()

Returns the standard metric result as:

{
    'metric_name': 'R2Score',
    'metric_description': 'Computes R2-Score between target and prediction columns',
    'variable_count': 1,
    'variable_names': ['r2_score'],
    'variable_types': [CONTINUOUS],
    'variable_dtypes': [FLOAT],
    'variable_dimensions': [0],
    'metric_data': [1.1],
    'metadata': {},
    'error': None
}
- compute(dataset: DataFrame, **kwargs: Any) None ¶
Computes the sum of squared residuals for the R2 Score for the passed-in dataset
Parameters¶
- dataset: pd.DataFrame
DataFrame object for either the entire dataset or a partition on which the Metric is being computed
- classmethod create(config: Dict[str, ConfigParameter] | None = None, **kwargs: Any) R2Score ¶
Factory Method to create an object. The configuration will be available in config.
Returns¶
- MetricBase
An Instance of MetricBase.
Returns the Shareable Feature Components that a Metric requires to compute its state and values. Metrics which do not require SFCs need not override this property.
Returns¶
Dict with feature_name as key and a List of SFCMetadata as value. Each SFCMetadata must contain the klass attribute, which points to the SFC class.
- get_result(**kwargs: Any) Dict[str, Any] ¶
Returns the computed value of the metric
Returns¶
Dict[str, Any]: Dictionary with key as string and value as any metric property.
- get_standard_metric_result(**kwargs: Any) StandardMetricResult ¶
This method returns metric output in standard format.
Returns¶
StandardMetricResult
- merge(other_metric: R2Score, **kwargs: Any) R2Score ¶
Merge two R2Score metrics into one, without mutating either.
Parameters¶
- other_metric: R2Score
The other R2Score to be merged.
Returns¶
- R2Score: R2Score
A new instance of R2Score
- prediction_column: str = 'y_predict'¶
- sum_of_squared_residuals: float = 0.0¶
- target_column: str = 'y_true'¶
- total_count: int = 0¶
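The documented fields (total_count, sum_of_squared_residuals) cover only the numerator of R2; the total sum of squares about the target mean has to come from additional state (plausibly the Shareable Feature Components mentioned above). A hedged sketch that instead carries all the needed sums explicitly, with hypothetical helper names (not the library's implementation), including the documented constant-target fallback to 1.0 or 0.0:

```python
# Illustrative sketch of a mergeable R2 state. The library documents only
# total_count and sum_of_squared_residuals; sum_t and sum_t2 are extra
# state this sketch carries so SST can be derived. Hypothetical helpers.

def compute_partition(y_true, y_pred, state=(0, 0.0, 0.0, 0.0)):
    n, sum_t, sum_t2, ssr = state
    for t, p in zip(y_true, y_pred):
        n += 1
        sum_t += t
        sum_t2 += t * t
        ssr += (t - p) ** 2
    return n, sum_t, sum_t2, ssr

def merge_states(a, b):
    # Every component is additive across partitions.
    return tuple(x + y for x, y in zip(a, b))

def result(state):
    n, sum_t, sum_t2, ssr = state
    sst = sum_t2 - sum_t * sum_t / n  # total sum of squares about the mean
    if sst == 0.0:  # constant target: documented fallback values
        return 1.0 if ssr == 0.0 else 0.0
    return 1.0 - ssr / sst

state = merge_states(compute_partition([1, 2, 3], [1.1, 2.5, 3.8]),
                     compute_partition([4, 5], [5.1, 4.9]))
r2 = result(state)  # ~0.788 on the example data
```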
mlm_insights.core.metrics.regression_metrics.root_mean_squared_error module¶
- class mlm_insights.core.metrics.regression_metrics.root_mean_squared_error.RootMeanSquaredError(config: ~typing.Dict[str, ~mlm_insights.constants.definitions.ConfigParameter] = <factory>, target_column: str = 'y_true', prediction_column: str = 'y_predict', prediction_score_column: str = 'y_score', total_count: int = 0, sum_of_squared_residuals: float = 0.0)¶
Bases:
DatasetMetricBase
Computes Root Mean Squared Error regression loss. This is a dataset-level metric. It is an accurate metric which can process any column type, but only numerical (int, float) data types. This metric falls under the regression category and is used for predictive modeling problems that involve predicting a numeric value, or to measure error/performance for regression models. Both the ground truth and prediction target columns must not contain any NaN values; otherwise InvalidTargetPredictionException is thrown.
Configuration¶
None
Parameters¶
- y_true: array-like of shape (n_samples,) or (n_samples, n_outputs)
Ground truth (correct) target values.
- y_pred: array-like of shape (n_samples,) or (n_samples, n_outputs)
Estimated target values.
Returns¶
float: Root Mean Squared Error
Exceptions¶
MissingRequiredParameterException
InvalidTargetPredictionException
Examples
from mlm_insights.builder.builder_component import MetricDetail, EngineDetail
from mlm_insights.builder.insights_builder import InsightsBuilder
from mlm_insights.constants.types import FeatureType, DataType, VariableType, ColumnType
from mlm_insights.core.metrics.regression_metrics.root_mean_squared_error import RootMeanSquaredError
from mlm_insights.core.metrics.metric_metadata import MetricMetadata
import pandas as pd

def main():
    input_schema = {
        'square_feet': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.INPUT),
        'house_price_prediction': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.PREDICTION),
        'house_price_target': FeatureType(
            data_type=DataType.FLOAT,
            variable_type=VariableType.CONTINUOUS,
            column_type=ColumnType.TARGET)
    }
    data_frame = pd.DataFrame({
        'square_feet': [11.23, 23.45, 11.23, 45.56, 11.23],
        'house_price_target': [1, 2, 3, 4, 5],
        'house_price_prediction': [1.1, 2.5, 3.8, 5.1, 4.9]
    })
    metric_details = MetricDetail(
        univariate_metric={},
        dataset_metrics=[MetricMetadata(klass=RootMeanSquaredError)])
    runner = InsightsBuilder(). \
        with_input_schema(input_schema). \
        with_data_frame(data_frame=data_frame). \
        with_metrics(metrics=metric_details). \
        with_engine(engine=EngineDetail(engine_name="native")). \
        build()
    profile_json = runner.run().profile.to_json()
    dataset_metrics = profile_json['dataset_metrics']
    print(dataset_metrics["RootMeanSquaredError"])

if __name__ == "__main__":
    main()

Returns the standard metric result as:

{
    'metric_name': 'RootMeanSquaredError',
    'metric_description': 'Computes Root Mean Squared Error regression loss',
    'variable_count': 1,
    'variable_names': ['root_mean_squared_error'],
    'variable_types': [CONTINUOUS],
    'variable_dtypes': [FLOAT],
    'variable_dimensions': [0],
    'metric_data': [1.1],
    'metadata': {},
    'error': None
}
- compute(dataset: DataFrame, **kwargs: Any) None ¶
Computes the numerator of the Root Mean Squared Error for the passed-in dataset
Parameters¶
- dataset: pd.DataFrame
DataFrame object for either the entire dataset or a partition on which the Metric is being computed
- classmethod create(config: Dict[str, ConfigParameter] | None = None, **kwargs: Any) RootMeanSquaredError ¶
Create a RootMeanSquaredError metric using the configuration and kwargs
Parameters¶
- config: Optional[Dict[str, ConfigParameter]]
Metric configuration
- get_result(**kwargs: Any) Dict[str, Any] ¶
Returns the computed value of the metric
Returns¶
Dict[str, Any]: Dictionary with key as string and value as any metric property.
- get_standard_metric_result(**kwargs: Any) StandardMetricResult ¶
This method returns metric output in standard format.
Returns¶
StandardMetricResult
- merge(other_metric: RootMeanSquaredError, **kwargs: Any) RootMeanSquaredError ¶
Merge two RootMeanSquaredError metrics into one, without mutating either.
Parameters¶
- other_metric: RootMeanSquaredError
The other RootMeanSquaredError to be merged.
Returns¶
- RootMeanSquaredError: RootMeanSquaredError
A new instance of RootMeanSquaredError
- prediction_column: str = 'y_predict'¶
- prediction_score_column: str = 'y_score'¶
- sum_of_squared_residuals: float = 0.0¶
- target_column: str = 'y_true'¶
- total_count: int = 0¶
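The attributes above show RMSE shares MSE's mergeable state (total_count, sum_of_squared_residuals); only the final step differs, taking a square root of the mean. A minimal sketch with hypothetical helper names (not the library's implementation):

```python
# Illustrative sketch of RMSE's mergeable state: identical to MSE's
# (total_count, sum_of_squared_residuals), with a square root at the end.
# Hypothetical helpers, not library API.
import math

def compute_partition(y_true, y_pred, total_count=0, sum_of_squared_residuals=0.0):
    """Accumulate count and squared residuals for one partition."""
    for t, p in zip(y_true, y_pred):
        total_count += 1
        sum_of_squared_residuals += (t - p) ** 2
    return total_count, sum_of_squared_residuals

def merge_states(a, b):
    # Both components are additive across partitions.
    return a[0] + b[0], a[1] + b[1]

def result(state):
    total_count, sum_of_squared_residuals = state
    return math.sqrt(sum_of_squared_residuals / total_count)

state = merge_states(compute_partition([1, 2, 3], [1.1, 2.5, 3.8]),
                     compute_partition([4, 5], [5.1, 4.9]))
rmse = result(state)  # ~0.651 on the example data
```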