mlm_insights.workflow package

Subpackages

Submodules

mlm_insights.workflow.component_config module

class mlm_insights.workflow.component_config.FeatureConfig(feature_meta: mlm_insights.core.features.feature.FeatureMetadata, metrics: List[mlm_insights.core.metrics.metric_metadata.MetricMetadata])

Bases: object

feature_meta: FeatureMetadata
metrics: List[MetricMetadata]

class mlm_insights.workflow.component_config.ReaderConfig(klass: Type[mlm_insights.core.readers.interfaces.data_reader.DataReader], config: Dict[str, Any] = <factory>)

Bases: object

config: Dict[str, Any]
klass: Type[DataReader]

class mlm_insights.workflow.component_config.TransformerConfig(klass: Type[mlm_insights.core.transformers.interfaces.transformer.Transformer], config: Dict[str, Any] = <factory>)

Bases: object

config: Dict[str, Any]
klass: Type[Transformer]
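
The `<factory>` default shown for `config` in the ReaderConfig and TransformerConfig signatures is how Python renders a dataclass field declared with `default_factory`. The sketch below illustrates that pattern with hypothetical stand-in classes (`StubReader` and `ReaderConfigSketch` are not part of mlm_insights, and the `file_path` key is invented for illustration). It assumes the real classes follow the same shape, which the signatures suggest but do not confirm.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Type


class StubReader:
    """Hypothetical stand-in for a DataReader subclass."""


@dataclass
class ReaderConfigSketch:
    """Illustrative mirror of the documented (klass, config) shape."""

    # The class to instantiate later, not an instance of it.
    klass: Type[StubReader]
    # default_factory gives every instance its own empty dict.
    config: Dict[str, Any] = field(default_factory=dict)


# config may be omitted; each instance then gets a fresh dict.
default_cfg = ReaderConfigSketch(klass=StubReader)

# Or supplied explicitly (the key name here is an assumption).
explicit_cfg = ReaderConfigSketch(klass=StubReader, config={"file_path": "input.csv"})
```

Because the default comes from a factory rather than a shared literal, mutating `config` on one instance does not affect another instance created without an explicit `config`.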

mlm_insights.workflow.insights_fugue_workflow module

class mlm_insights.workflow.insights_fugue_workflow.InsightFugueWorkflow(workflow_request: WorkflowRequest)

Bases: WorkflowBase

This class implements the workflow using Fugue and constructs the DAG through the Fugue API.

1. All Fugue-related APIs must be used only here and must not leak out.
2. Do not store any state that should not, or cannot, be available on a worker. Doing so leads to pickling errors. For example, when using Spark as the execution engine (EE), do not pass or store SparkSession or SparkContext.
3. Implement each DAG node using Fugue constructs such as Creator, Transformer, etc.
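
The pickling constraint above can be checked with the standard library alone. The following is a minimal sketch, not part of mlm_insights: `is_picklable`, the dictionary keys, and the use of `threading.Lock` as a stand-in for a driver-only handle (such as a session object) are all assumptions made for illustration.

```python
import pickle
import threading


def is_picklable(state):
    """Return True if `state` can be serialized with pickle."""
    try:
        pickle.dumps(state)
    except (pickle.PicklingError, TypeError, AttributeError):
        return False
    return True


# Plain configuration values can be shipped to a worker.
worker_safe_state = {"reader_config": {"path": "data.csv"}, "batch_size": 1000}

# A lock stands in for a driver-only resource; it cannot be pickled.
driver_only_state = dict(worker_safe_state, session=threading.Lock())

print(is_picklable(worker_safe_state))  # True
print(is_picklable(driver_only_state))  # False
```

A check of this kind, run before submitting work to a distributed engine, would surface the offending attribute on the driver instead of as a serialization failure at execution time.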

execute_workflow(engine: Any | None = None, engine_conf: Dict[str, Any] | None = None, data_frame: Any | None = None) → Profile

mlm_insights.workflow.workflow_request module

class mlm_insights.workflow.workflow_request.WorkflowRequest(input_schema: Dict[str, FeatureType], transformers: List[Transformer], features: List[FeatureConfig], execution_engine: ExecutionEngine, dataset_metric: List[MetricMetadata] | None = None, reader: DataReader | None = None, data_frame: Any | None = None, reference_profile: Profile | None = None, tags: Tags | None = None, py_arrow_schema: Schema | None = None)

Bases: object