Description
Context
I am conducting a scientific experiment on AutoML reproducibility across five TPOT versions (0.11.6, 0.11.7, 0.12.1, 0.12.2, and 1.0.0). I am running these on a local machine (no cloud infrastructure) using Docker containers (Linux).
The Issue
I have observed a significant runtime regression in v1.0.0 compared to all previous versions, despite using the same dataset and hyperparameter configuration.
Dataset: Regression task, 638 samples, 13 features.
Dependencies: xgboost is installed and available in all environments.
Hardware: Local PC (Docker), identical resources allocated for all runs.
Observed Results
TPOT v0.11.6 / v0.11.7 (Python 3.8): Runtime ~4-5 minutes.
TPOT v0.12.1 / v0.12.2 (Python 3.10): Runtime ~4-5 minutes.
TPOT v1.0.0 (Python 3.10): Runtime ~30 minutes.
Configuration
The code logic is identical across versions (with the v1.0.0 run wrapped in `if __name__ == '__main__':` to support the Dask backend).
```python
# Standard configuration used in all tests
pipeline_optimizer = TPOTRegressor(
    generations=20,
    population_size=20,
    cv=5,
    random_state=seed,
    verbose=0
)

# In v1.0.0, this call is placed inside `if __name__ == "__main__":`
pipeline_optimizer.fit(X_train, y_train)
```
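For context on the guard: process-based backends such as Dask (like Python's own spawn-based multiprocessing) re-import the main module in each worker, so top-level code must be protected. A minimal stand-in illustration using only the standard library (not TPOT or Dask themselves):

```python
# Why the __main__ guard matters for process-based backends.
# Stand-in: stdlib ProcessPoolExecutor instead of TPOT's Dask backend.
from concurrent.futures import ProcessPoolExecutor

def square(x):
    return x * x

if __name__ == "__main__":
    # Without this guard, spawn-based workers would re-execute the
    # pool-creation code on import and recursively start new pools.
    with ProcessPoolExecutor(max_workers=2) as pool:
        results = list(pool.map(square, range(5)))
    print(results)  # [0, 1, 4, 9, 16]
```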
Questions
Is this drastic slowdown expected for small datasets due to the overhead of the Dask backend (introduced in v1.0.0) compared to the older multiprocessing backend?
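To make the overhead hypothesis concrete: when individual tasks are cheap (as with 5-fold CV on 638 samples), per-task scheduling and serialization cost can dominate total runtime. A stand-in measurement using the stdlib `ProcessPoolExecutor` in place of Dask (the assumption being that any process-based scheduler pays an analogous per-task cost):

```python
# Compare serial execution vs. a process pool on deliberately tiny tasks,
# where dispatch overhead can exceed the useful work per task.
import time
from concurrent.futures import ProcessPoolExecutor

def tiny_task(x):
    # Cheap work, so scheduling overhead dominates.
    return sum(i * i for i in range(100)) + x

if __name__ == "__main__":
    n = 200

    t0 = time.perf_counter()
    serial = [tiny_task(i) for i in range(n)]
    serial_s = time.perf_counter() - t0

    t0 = time.perf_counter()
    with ProcessPoolExecutor(max_workers=4) as pool:
        parallel = list(pool.map(tiny_task, range(n)))
    pool_s = time.perf_counter() - t0

    assert serial == parallel
    # On workloads this small, the pool is frequently slower than serial.
    print(f"serial: {serial_s:.4f}s  pool: {pool_s:.4f}s")
```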
Has the default config_dict changed in v1.0.0 to prioritize significantly heavier estimators (e.g., more aggressive use of XGBoost or stacking) compared to v0.12.x, even though XGBoost was present in the older environments as well?
Is there a recommended configuration to restore the runtime profile of the older versions for benchmarking purposes?
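In the meantime, one hedged starting point is to keep the search budget identical and pin down the parallelism. This is a sketch only: every parameter other than those already shown above (`generations`, `population_size`, `cv`, `random_state`, `verbose`) is an assumption about the v1.0.0 API and should be verified against the current docs.

```python
from tpot import TPOTRegressor

pipeline_optimizer = TPOTRegressor(
    generations=20,
    population_size=20,
    cv=5,
    random_state=seed,  # same seed as the runs on the older versions
    verbose=0,
    n_jobs=1,           # assumption: a single local worker, to minimize
                        # scheduler/worker fan-out overhead for benchmarking
)

if __name__ == "__main__":
    pipeline_optimizer.fit(X_train, y_train)
```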