# Release Notes

## 23.2.3

#### Possibly breaking changes

* The automlx package has been renamed to “oracle-automlx”. You can still import the package with `import automl`; however, you now need to install it with `pip install oracle-automlx`.

## 23.2.2

#### Bug fixes

* Fixed a bug that caused logging messages to be written to stderr rather than stdout by default.

## 23.2.1

#### Features and Improvements

* Added the install options automlx[forecasting], automlx[onnx], and automlx[deep-learning] alongside automlx[viz]. Install options create minimal-sized wheels for the associated task. Install options can be combined if combined functionality is desired, e.g., automlx[forecasting,viz].

#### Bug fixes

* Fixed a bug where ETSForecaster could fail the entire pipeline when it fails to converge.
* Fixed a bug that caused the pipeline to set the forecast horizon to zero when forecasting short time series (fewer than 8 data points).
* Fixed a bug that could cause model fit failures for some seasonal decomposition models (e.g., STL) on short series (shorter than 3 times the seasonality period).
* Fixed a bug where the BoxCox transformer could produce NaNs as the result of the inverse transformation.
* Fixed a bug that caused the advanced feature importance sampling strategies to raise an exception.

#### Possibly breaking changes

* Deep-learning models for classification (TorchMLPClassifier, CatboostClassifier, TabNetClassifier), regression (TorchMLPRegressor) and anomaly detection (AutoEncoderOD) now require the install option automlx[deep-learning].
* Changed the initialization of the logging module to:
    * no longer log to file by default;
    * not overwrite the global logging configuration if it was already set up.

## 23.2.0

#### Features and Improvements

* Added support for the TabNet classifier.
    * Training TabNet with CPUs is slow, so it is disabled by default until GPU support is added.
    * To enable TabNet, add 'TabNetClassifier' to the `model_list` when initializing the AutoML Pipeline (see the sketch after this release's deprecation notes).
* New counterfactual explainer (ACE)
    * Added the AutoMLx Counterfactual Explainer (ACE) for classification and anomaly detection tasks.
    * ACE is faster and finds more valid counterfactuals than DiCE.
    * It is guaranteed to find a counterfactual for each query instance if the reference dataset contains an example with the desired class.
* Fairness Feature Importance is now available for tabular datasets! `MLExplainer` has a new `explain_model_fairness()` function to compute global feature importance attributions for fairness metrics.
* Added threshold tuning for binary and multi-class classification tasks. Threshold tuning can be enabled by passing `threshold_tuning=True` to the Pipeline object when it is created.
* Added support for Python 3.10.

#### Deprecations

* Removed support for the Uber Orbit forecaster due to instability in its built-in Bayesian inference engine.
* Added deprecation warnings to objects that will be removed or replaced in 23.3.0. Deprecations include:
    * Internal (never-documented) attributes of the AutoML pipeline.
    * The dask and spark execution engines and related options.
    * The ModelTune interface.
    * All Pipeline attributes matching `*_trials_`, which contain information about the trials performed by the AutoML pipeline. These will be replaced by two new dataframe attributes, `completed_trials_summary_` and `completed_trials_detailed_`.
    * AutoML optimization levels 1 and 2.
    * The Pipeline attribute `selected_features_`. Instead, users should use `selected_features_names_` or `selected_features_names_raw_` to access the names of the selected engineered or raw features, respectively.
* Deprecation warnings can be suppressed using `from automl import init; init(check_deprecation_warnings=False)`.
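The sketch below ties the 23.2.0 options above together. `model_list`, `threshold_tuning`, `check_deprecation_warnings` and `selected_features_names_` are taken directly from these notes; the `task` argument, the toy data and the scikit-learn-style `fit` call are assumptions about the Pipeline API, not documented behavior.

```python
import pandas as pd
from sklearn.datasets import make_classification
import automl
from automl import init

# Optionally silence the deprecation warnings described above.
init(check_deprecation_warnings=False)

# Toy training data (stand-in for a real tabular dataset).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X = pd.DataFrame(X, columns=[f"f{i}" for i in range(5)])
y = pd.Series(y)

pipeline = automl.Pipeline(
    task="classification",            # assumed task selector
    model_list=["TabNetClassifier"],  # opt in to TabNet (restricted to one model here for brevity)
    threshold_tuning=True,            # enable decision-threshold tuning
)
pipeline.fit(X, y)                        # assumed scikit-learn-style fit
print(pipeline.selected_features_names_)  # replaces the deprecated selected_features_
```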
#### Miscellaneous

* Bumped packages:
    * fbprophet==0.7.1 to prophet==1.1.2
    * torch to 1.13.1
    * onnx to 1.12.0
    * onnxruntime to 1.12.1

#### Possibly breaking changes

* `score_metric` is no longer accepted in the `MLExplainer` factory function. It is now an optional argument to the `TabularExplainer`'s `explain_model` and `explain_model_fairness` methods.

## 23.1.1

#### Features and Improvements

* Unsupervised anomaly detection
    * Implemented N-1 experts for hyperparameter tuning.
    * Added N-1 experts-based contamination factor identification.
* Overhauled the package documentation.

#### Bug fixes

* Fixed a bug in feature importance explainers that occurred when an AutoML pipeline is being explained and the dataset contains feature names that are numpy integers.

## 23.1.0

#### Features and Improvements

* Fairness metrics are now available to measure bias in both datasets and trained models. Fairness metrics can be imported from `automl.fairness.metrics`.
* Explanations can now be computed with custom, user-defined metrics.
* Introduced the `max_tuning_trials` option, which controls the maximum number of HPO trials per algorithm (see the sketch at the end of this release's notes).
* New explainer (Counterfactual)
    * Added a model-agnostic counterfactual explainer for classification, regression, and anomaly detection tasks.
    * The explainer can find diverse counterfactuals for the desired prediction, while letting the user choose which features may vary and their permitted range.
    * Counterfactual explanations can be visualized either with the What-If explainer or as a dataframe.
* Added support for a surrogate explainer for local text explanations.
* Updated the code to comply with Python Bandit security checks.
* Added catboost as a new classification model.

#### Bug fixes

* Fixed a bug in LIME's explanation bar chart where annotations were misplaced for datasets whose feature names are stringified integers.
* Fixed a bug where features would be placed incorrectly on a plot's axis when visualizing explanations for categorical features.
* Deleted internal state to reduce the memory consumption of explanations.
* Fixed a bug where dataset downcasting to `int32` and `float32` was only applied during training, but not for the final fit or when collecting predictions.
* Preprocessing of `datetime` columns is now much faster.
* Fixed a bug where dependencies of automl would, on import, initialize a root logger, preventing subsequent applications from using `logging.basicConfig()`.
* Fixed a bug where the AutoTune step would override the default parameters even if it did not find any better parameters than the defaults.
* Propagated dataset downcasting to all relevant pipeline stages, potentially reducing memory consumption for very large datasets.
* Changed AutoTune to consider the default hyperparameters, scored at the end of the feature selection step, if they performed better than those AutoTune tried within the time budget.

#### Deprecations

* Added deprecation warnings for the following:
    * Some attributes of the pipeline that are not publicly documented.
    * Attributes of the pipeline containing trial information, which were renamed to `completed_trials_summary_` and `completed_trials_detailed_`. The `stage` column is renamed to `step`.
    * Optimization levels 1 and 2.
    * The dask and spark engines and engine options.
    * The ModelTune class.
* To disable the warnings, set the argument `check_deprecation_warnings` to False in the initialization.
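A minimal sketch of the 23.1.0 tuning and fairness additions. The `max_tuning_trials` name and the `automl.fairness.metrics` module path come from these notes; passing `max_tuning_trials` to the Pipeline constructor and the `task` argument are assumptions about the API.

```python
import automl
import automl.fairness.metrics as fairness_metrics  # module path quoted in the notes above

# Cap hyperparameter optimization at 25 trials per algorithm.
# Passing the option to the Pipeline constructor is an assumption about the API.
pipeline = automl.Pipeline(task="classification", max_tuning_trials=25)

# The individual fairness metric names are not listed in these notes; inspect the
# module of the installed version to see what is available.
print([name for name in dir(fairness_metrics) if not name.startswith("_")])
```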
## 22.4.2

#### Features and Improvements

* Added support for explaining selected features in local and global permutation importance, as well as automatically detecting which features were selected by an AutoML model.

#### Bug fixes

* Fixed a bug in local perturbation-based feature attribution explainers where the `n_iter='auto'` option caused the number of iterations to be set too high.
* Enhanced the performance of local feature importance explainers to improve running times by batching inference calls together.

## 22.4.1

#### Features and Improvements

* Pipeline now accepts a `min_class_instances` input argument to manually specify the minimum number of examples every class must have for classification. The value of `min_class_instances` must be at least 2 (see the sketch after the 22.3.0 notes below).

#### Bug fixes

* Fixed a bug where IPython and ipywidgets were not properly guarded as optional dependencies, which made them required.
* Fixed a bug, introduced by the last dependency update, that caused fbprophet to produce forecasts with an incorrect index type when fbprophet was installed manually.

## 22.4.0

#### Features and Improvements

* New feature dependence explainers
    * Added an Accumulated Local Effects (ALE) explainer.
    * ALE explanations can be computed for up to two features if at least one is not categorical.
* New explainer (What-If)
    * Added a What-If explainer for classification and regression tasks.
    * What-If explanations include exploring the behavior of an ML model on a single sample as well as on the entire dataset.
    * Sample exploration (edit a sample value and see how the model's prediction changes) and relationship visualization (how a feature is related to predictions or to other features) are supported.
* New feature importance aggregators
    * Added ALFI (Aggregate Local Feature Importance), which gives a visual summary of multiple local explanations.
* New local feature importance explainer
    * Added support for surrogate-based (LIME+) local feature importance explainers.

#### Bug fixes

* Import failure due to CUDA: the package no longer crashes when imported on a machine with CUDA installed.
* Fixed a bug where `TorchMLPClassifier` would fail when trying to predict a single instance.
* Fixed a bug where `OracleAutoMLx_Forecasting.ipynb` would fail if visualization packages were not already installed.
* Fixed a bug that caused `pipeline.transform` to raise an exception if a single row was passed.
* Explanation documentation
    * Our documentation website (http://automl.oraclecorp.com/) now includes documentation for the explanation objects returned by our explainers.
* Enhanced the performance of local feature importance explainers to address long running times.
* Improved the visualization of facets for columns with cardinality equal to 1 by selecting the bars' width and padding properly.

## 22.3.0

#### Features and Improvements

* New explainer
    * Added support for KernelSHAP, a new feature importance explainer that provides fast approximations of the Shapley feature importance method.
* Support for the ARM architecture (`aarch64`)
    * Released a platform-specific wheel file for ARM machines.

#### Miscellaneous

* Clarified the documentation on the accepted data formats for input datasets and added a more meaningful corresponding error message.
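Referring back to the 22.4.1 `min_class_instances` argument above, here is a minimal sketch. The argument name and its minimum value come from these notes; the `task` argument, the toy data and the fit call are assumptions.

```python
import pandas as pd
from sklearn.datasets import make_classification
import automl

# Imbalanced toy dataset (stand-in for real classification data).
X, y = make_classification(n_samples=200, n_features=4, weights=[0.9], random_state=0)
X, y = pd.DataFrame(X), pd.Series(y)

# Require that every class has at least 5 examples; the value must be >= 2.
pipeline = automl.Pipeline(task="classification", min_class_instances=5)
pipeline.fit(X, y)
```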
## 22.2.0

#### Features and Improvements

* New profiler
    * The profiler tracks CPU and memory utilization.
* Timeseries forecasting pipeline
    * Added support for multivariate datasets.
    * Added support for exogenous variables.
    * Enhanced the heteroskedasticity detection technique.
    * Applied the Box-Cox transform and its inverse, with parameters determined via MLE, to handle heteroskedasticity.
* Explainers / MLX integration
    * New global text explainer
        * Added support for global text explanations.
    * New feature importance attribution explainers
        * Added several local and global feature importance explainers, including permutation importance, exact Shapley, and SHAP-PI.
        * The explainers support classification, regression and anomaly detection tasks.
        * The explainers can also be configured to explain the importance of features to any model (explanation_type='observational') as well as for a particular model (explanation_type='interventional'); a short sketch appears at the end of these notes.
        * Observational explanations are supported for all tasks; interventional explanations are only supported for classification and regression.
    * New feature dependence explainers
        * Added a partial dependence plot (PDP) and individual conditional expectation (ICE) explainer.
        * PDP explanations include visualization support for up to 4 dimensions. PDPs in higher dimensions can be returned as dataframes.
* Unsupervised anomaly detection
    * Added N-1 Experts: a new experimental metric for UAD model selection.
* Documentation
    * Added a description of the automl package's `init` function to the documentation.
    * Cleaned up the documentation for more consistency among sections and added cross-references.

#### Bug fixes

* Timeseries forecasting pipeline
    * Statsmodels raised an exception for some frequencies; users are now able to pass the time period in as a parameter.
* Preprocessing
    * Datetime preprocessor
        * Fixed a bug regarding column expansion and None/Null/NaN values.
    * Standard preprocessor refitting
        * The standard preprocessor used to first be fit on a subsample of the training set, and then re-fit at the very end of the pipeline using the full training set. This occasionally resulted in a different number of engineered features being produced, so that features identified during the model selection module could no longer exist. The standard preprocessor is now fit only once.
* ONNX prediction inconsistency
    * Changed the ONNX conversion function to reduce the difference between the predictions of the dumped ONNX model and those of the original pipeline object.
* Improved ONNX conversion runtime
    * ONNX conversion now only requires a sample from the training or test set as input. This sample is used to infer the final types and shapes.

#### Possibly breaking changes

* Removed matplotlib as a dependency of the AutoMLx package.
    * Forecasting predictions can now be visualized only with plotly, using the same interface as before, automl.utils.plot_forecast. The alternate plotly visualizations that were provided by automl.utils.plot_forecast_interactive have been removed.
* Updated the AutoMLx package dependencies.
    * All dependency versions have been reviewed and updated to address all known CVEs.
    * A few unneeded dependencies have also been removed.
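To make the observational vs. interventional distinction above concrete, here is a hedged sketch. `MLExplainer`, `explain_model` and the `explanation_type` values appear in these notes; the constructor arguments and the exact place where `explanation_type` is passed are assumptions, as is the use of a plain scikit-learn model as the model to explain.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
import automl

# Small fitted model to explain (the explainers are described as model-agnostic).
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
X = pd.DataFrame(X, columns=["f0", "f1", "f2", "f3"])
model = LogisticRegression().fit(X, y)

# Constructor arguments are assumptions about the MLExplainer factory function.
explainer = automl.MLExplainer(model, X, y, task="classification")

# Importance of the features to any model trained on this data ...
observational = explainer.explain_model(explanation_type="observational")
# ... versus importance to this particular model (classification/regression only).
interventional = explainer.explain_model(explanation_type="interventional")
```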