configs

Config JSON Schema

Configure benchmarks by editing the config.json file. The file sets algorithm parameters, datasets, the list of frameworks to use, and selected environment variables. The tables below describe every field in the configuration file.

Root Config Object

Field Name	Type	Description
common	Common Object	REQUIRED. Settings shared by all benchmarks: frameworks and input data settings.
cases	List[Case Object]	REQUIRED. List of algorithms with their parameters and training data.
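As a rough sketch of the root object described above, the snippet below builds a config as a Python dict and checks that both required top-level fields are present. The helper `missing_root_fields` and all values are illustrative assumptions, not part of the benchmark tooling.

```python
import json

# Minimal root object with the two required top-level fields.
# Values are illustrative placeholders, not defaults taken from the benchmarks.
config = {
    "common": {
        "data-format": "pandas",
        "data-order": "F",
        "dtype": "float64",
    },
    "cases": [],  # to be filled with Case Objects
}


def missing_root_fields(cfg):
    """Return the required root fields that are absent from cfg (hypothetical helper)."""
    return [key for key in ("common", "cases") if key not in cfg]


# Serialize to the JSON text that would go into config.json.
print(json.dumps(config, indent=4))
print(missing_root_fields(config))
```

Writing the dict out with `json.dumps` gives the text to place in config.json.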

Common Object

Field Name	Type	Description
data-format	Union[str, List[str]]	REQUIRED. Input data format: numpy, pandas, or cudf.
data-order	Union[str, List[str]]	REQUIRED. Input data order: C (row-major, default) or F (column-major).
dtype	Union[str, List[str]]	REQUIRED. Input data type: float64 (default) or float32.
check-finitness	List[]	Check finiteness during the scikit-learn input check (disabled by default).
device	array[string]	For scikit-learn only. The list of devices to run the benchmarks on. Each entry is None (default; runs on CPU without a SYCL context) or a SYCL device type: cpu, gpu, or host. Refer to the SYCL specification for details.
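Several Common Object fields accept either a single string or a list of strings. The sketch below assumes that a list means one run per listed value, so several list-valued fields multiply into combinations; the table does not state this explicitly, and the helpers `as_list` and `expand` are hypothetical.

```python
from itertools import product

# Common Object where two fields are lists and one is a plain string.
common = {
    "data-format": ["numpy", "pandas"],
    "data-order": "C",
    "dtype": ["float32", "float64"],
}


def as_list(value):
    """Wrap a single value in a list so str and List[str] are handled alike."""
    return value if isinstance(value, list) else [value]


def expand(settings):
    """Enumerate every combination of the listed values (assumed semantics)."""
    keys = list(settings)
    value_lists = [as_list(settings[key]) for key in keys]
    return [dict(zip(keys, combo)) for combo in product(*value_lists)]


combos = expand(common)
for combo in combos:
    print(combo)
```

With two formats, one order, and two dtypes, this yields four combinations.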

Case Object

Field Name	Type	Description
lib	Union[str, List[str]]	REQUIRED. The framework to benchmark, or a list of frameworks. Each must be one of sklearn, daal4py, cuml, or xgboost.
algorithm	string	REQUIRED. Benchmark file name.
dataset	List[Dataset Object]	REQUIRED. Input data specifications.
specific algorithm parameters	Union[int, float, str, List[int], List[float], List[str]]	Other algorithm-specific parameters.

Important: You can move any parameter from "cases" to "common" if it applies to all cases.
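To illustrate moving a shared parameter into "common", the sketch below keeps `lib` in the common settings and merges it into each case. The merge helper `effective_case`, the parameter name `max-iter`, and the rule that a case-level value wins on conflict are all assumptions made for this example; the text above does not specify conflict handling.

```python
# "lib" is the same for every case, so it lives in "common".
common = {
    "lib": "sklearn",
    "data-format": "numpy",
    "data-order": "C",
    "dtype": "float64",
}

cases = [
    {
        "algorithm": "kmeans",
        "dataset": [
            {
                "source": "synthetic",
                "type": "blobs",
                "n_clusters": 10,
                "n_features": 50,
                "training": {"n_samples": 1000},
            }
        ],
        "max-iter": 100,  # hypothetical algorithm-specific parameter
    },
]


def effective_case(common_settings, case):
    """Combine common settings with one case; case values win on conflict (assumption)."""
    return {**common_settings, **case}


merged = effective_case(common, cases[0])
print(merged["lib"], merged["algorithm"])
```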

Dataset Object

Field Name	Type	Description
source	string	REQUIRED. Data source: synthetic, csv, or npy.
type	string	REQUIRED for synthetic data. The type of task for which the dataset is generated: classification, blobs, or regression.
n_classes	int	For synthetic data of the classification type only. The number of classes (or labels) in the classification problem.
n_clusters	int	For synthetic data of the blobs type only. The number of centers to generate.
n_features	int	REQUIRED for synthetic data. The number of features to generate.
name	string	Name of the dataset.
training	Training Object	REQUIRED. An object with the paths to the training datasets.
testing	Testing Object	An object with the paths to the testing datasets. If not provided, the training datasets are used.
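The conditional rules in the Dataset Object table can be checked mechanically. The following is a minimal sketch of such a check; the function `dataset_problems` is hypothetical and covers only the dataset-level fields. The example omits file paths under "training" on the assumption that generated data does not need them.

```python
def dataset_problems(dataset):
    """Return a list of rule violations for one Dataset Object (hypothetical helper)."""
    problems = []
    source = dataset.get("source")
    if source not in ("synthetic", "csv", "npy"):
        problems.append("source must be synthetic, csv, or npy")
    if "training" not in dataset:
        problems.append("training is required")
    if source == "synthetic":
        # "type" and "n_features" are required for synthetic data only.
        for field in ("type", "n_features"):
            if field not in dataset:
                problems.append(f"{field} is required for synthetic data")
        kind = dataset.get("type")
        if "n_classes" in dataset and kind != "classification":
            problems.append("n_classes applies to the classification type only")
        if "n_clusters" in dataset and kind != "blobs":
            problems.append("n_clusters applies to the blobs type only")
    return problems


good = {
    "source": "synthetic",
    "type": "classification",
    "n_classes": 2,
    "n_features": 20,
    "training": {"n_samples": 10000},  # assumption: no paths for generated data
}
bad = {
    "source": "synthetic",
    "type": "blobs",
    "n_classes": 2,  # wrong: n_classes with the blobs type
    "training": {"n_samples": 10000},
}

print(dataset_problems(good))
print(dataset_problems(bad))
```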

Training Object

Field Name	Type	Description
n_samples	int	REQUIRED. The total number of training samples.
x	str	REQUIRED. The path to the training samples.
y	str	REQUIRED. The path to the training labels.

Testing Object

Field Name	Type	Description
n_samples	int	REQUIRED. The total number of testing samples.
x	str	REQUIRED. The path to the testing samples.
y	str	REQUIRED. The path to the testing labels.
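The Training and Testing Objects share the same three fields, and a dataset without a "testing" entry falls back to its training data. A small sketch of that fallback follows; the file paths, the dataset name, and the helper `resolve_testing` are hypothetical.

```python
# File-based dataset with a Training Object but no Testing Object.
dataset = {
    "source": "csv",
    "name": "example",
    "training": {
        "n_samples": 1000,
        "x": "data/example_x_train.csv",  # hypothetical path
        "y": "data/example_y_train.csv",  # hypothetical path
    },
}


def resolve_testing(ds):
    """Return the Testing Object, or the Training Object when none is given."""
    return ds.get("testing", ds["training"])


testing = resolve_testing(dataset)
print(testing["x"])

# With an explicit Testing Object, that object is used instead.
with_testing = dict(
    dataset,
    testing={
        "n_samples": 200,
        "x": "data/example_x_test.csv",  # hypothetical path
        "y": "data/example_y_test.csv",  # hypothetical path
    },
)
print(resolve_testing(with_testing)["x"])
```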
