Running Pipelines

Execute end-to-end experiments using the Hydra configuration system.

Overview

Pipelines automate the complete workflow:

1. Load dataset
2. Train/load models
3. Generate counterfactuals
4. Compute metrics
5. Log results
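The five stages above can be sketched as a chain of plain functions. This is a toy illustration only: every function name, the one-dimensional dataset, and the threshold "classifier" are hypothetical stand-ins, not the library's actual API. The real pipelines use the models and methods selected in the Hydra config.

```python
from statistics import mean


def load_dataset():
    # Stage 1: toy stand-in for a real dataset loader (hypothetical).
    return [0.0, 1.0, 2.0, 3.0]


def train_model(data):
    # Stage 2: a threshold "classifier" placed at the data mean.
    threshold = mean(data)
    return (lambda x: int(x > threshold)), threshold


def generate_counterfactuals(data, threshold, margin=0.1):
    # Stage 3: move each point just across the decision threshold.
    return [
        threshold - margin if x > threshold else threshold + margin
        for x in data
    ]


def compute_metrics(data, counterfactuals, predict):
    # Stage 4: validity is the share of counterfactuals whose predicted
    # class differs from the original point's predicted class.
    flipped = [predict(x) != predict(cf) for x, cf in zip(data, counterfactuals)]
    distance = mean(abs(x - cf) for x, cf in zip(data, counterfactuals))
    return {"validity": sum(flipped) / len(flipped), "mean_l1_distance": distance}


def run_pipeline():
    data = load_dataset()
    predict, threshold = train_model(data)
    counterfactuals = generate_counterfactuals(data, threshold)
    metrics = compute_metrics(data, counterfactuals, predict)
    # Stage 5: a real pipeline would log these values; here they are returned.
    return metrics


if __name__ == "__main__":
    print(run_pipeline())
```

Keeping each stage as a separate function makes it straightforward to swap one stage (for example, the counterfactual method) while reusing the others.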

Running a Pipeline

# Run PPCEF pipeline
python -m counterfactuals.pipelines.run_ppcef_pipeline

# Override configuration
python -m counterfactuals.pipelines.run_ppcef_pipeline \
    dataset.config_path=config/datasets/compas.yaml \
    counterfactuals_params.epochs=200
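To make the override syntax concrete, the sketch below shows how a `dotted.key=value` argument maps onto a nested configuration. It is a simplified stand-in written with the standard library, not Hydra's implementation: Hydra has its own override grammar with more value types and extra syntax. The helper names `parse_value` and `apply_override` are hypothetical.

```python
import copy


def parse_value(raw):
    # Try numeric types first; fall back to the raw string (simplified rule).
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            continue
    return raw


def apply_override(config, override):
    """Return a copy of `config` with one `dotted.key=value` override applied."""
    dotted_key, _, raw_value = override.partition("=")
    updated = copy.deepcopy(config)
    *parents, leaf = dotted_key.split(".")
    node = updated
    for name in parents:
        # Walk down the nesting, creating intermediate sections if missing.
        node = node.setdefault(name, {})
    node[leaf] = parse_value(raw_value)
    return updated


base = {
    "dataset": {"config_path": "config/datasets/adult.yaml"},
    "counterfactuals_params": {"epochs": 100, "lr": 0.01},
}

overrides = [
    "dataset.config_path=config/datasets/compas.yaml",
    "counterfactuals_params.epochs=200",
]

config = base
for override in overrides:
    config = apply_override(config, override)

if __name__ == "__main__":
    print(config)
```

Keys that are not overridden, such as `counterfactuals_params.lr`, keep their values from the base configuration.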

Configuration Structure

# pipelines/conf/config.yaml
defaults:
  - gen_model: large_maf
  - disc_model: mlp
  - metrics: default

dataset:
  _target_: counterfactuals.datasets.FileDataset
  config_path: config/datasets/adult.yaml

gen_model:
  train_model: true
  epochs: 200
  lr: 0.0001

disc_model:
  train_model: true
  epochs: 100
  lr: 0.001

counterfactuals_params:
  epochs: 100
  lr: 0.01
  alpha: 1.0
  beta: 0.5
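The `_target_` key in the `dataset` section follows Hydra's convention for object instantiation: the dotted path names a class or callable, and the remaining keys become keyword arguments. The sketch below imitates that idea with the standard library so it runs without Hydra installed. It is an assumption that the pipeline resolves the `dataset` node this way (for example via `hydra.utils.instantiate`), and the example uses `datetime.timedelta` as a stand-in target rather than `counterfactuals.datasets.FileDataset`.

```python
import importlib


def instantiate(node):
    """Build the object named by `_target_`, passing other keys as kwargs."""
    kwargs = {key: value for key, value in node.items() if key != "_target_"}
    module_path, _, attr_name = node["_target_"].rpartition(".")
    # Import the module, then look up the class or callable by name.
    target = getattr(importlib.import_module(module_path), attr_name)
    return target(**kwargs)


# Stand-in config node; a real one would point at a dataset class.
node = {"_target_": "datetime.timedelta", "hours": 1, "minutes": 30}
duration = instantiate(node)

if __name__ == "__main__":
    print(duration.total_seconds())
```

This is why changing `_target_` or `config_path` in the config can swap the dataset without editing pipeline code.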

Available Pipelines

| Pipeline              | Method   |
|-----------------------|----------|
| run_ppcef_pipeline    | PPCEF    |
| run_dice_pipeline     | DICE     |
| run_globe_ce_pipeline | GLOBE-CE |
| run_rppcef_pipeline   | ReViCE   |
| ...                   | ...      |

MLflow Logging

Results are automatically logged to MLflow:

import mlflow

# View logged runs; search_runs() returns a pandas DataFrame, one row per run
runs = mlflow.search_runs()
print(runs.head())

Creating Custom Pipelines

To create a custom pipeline, use the existing modules in counterfactuals/pipelines/ as templates.