euroeval.benchmark_config_factory
source module euroeval.benchmark_config_factory
Factory class for creating dataset configurations.
Functions
-
build_benchmark_config — Create a benchmark configuration.
-
get_correct_language_codes — Get correct language code(s).
-
prepare_languages — Prepare language(s) for benchmarking.
-
prepare_tasks_and_datasets — Prepare task(s) and dataset(s) for benchmarking.
-
prepare_device — Prepare device for benchmarking.
source build_benchmark_config(benchmark_config_params: BenchmarkConfigParams) → BenchmarkConfig
Create a benchmark configuration.
Parameters
-
benchmark_config_params : BenchmarkConfigParams — The parameters for creating the benchmark configuration.
Returns
-
BenchmarkConfig — The benchmark configuration.
source get_correct_language_codes(language_codes: str | list[str]) → list[str]
Get correct language code(s).
Parameters
-
language_codes : str | list[str] — The language codes of the languages to include, both for models and datasets. Here 'no' means both Bokmål (nb) and Nynorsk (nn). Set this to 'all' if all languages should be considered.
Returns
-
list[str] — The correct language codes.
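The expansion rules described for language_codes can be illustrated with a minimal sketch. This is not EuroEval's implementation: the KNOWN_CODES list and the name normalise_language_codes are hypothetical, and the real function may validate or order codes differently.

```python
from __future__ import annotations

# Hypothetical list of supported codes, for illustration only.
KNOWN_CODES = ["da", "de", "en", "nb", "nn", "no", "sv"]


def normalise_language_codes(language_codes: str | list[str]) -> list[str]:
    """Expand 'all' and 'no' into concrete language codes (illustrative)."""
    # 'all' selects every known language.
    if language_codes == "all":
        return list(KNOWN_CODES)

    # Accept either a single code or a list of codes.
    if isinstance(language_codes, str):
        codes = [language_codes]
    else:
        codes = list(language_codes)

    # 'no' means both Bokmål (nb) and Nynorsk (nn), so add them alongside it.
    if "no" in codes:
        for extra in ("nb", "nn"):
            if extra not in codes:
                codes.append(extra)
    return codes
```

With these assumptions, passing 'no' yields the codes no, nb and nn, while a list such as ['da', 'sv'] is returned unchanged.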
source prepare_languages(language_codes: str | list[str] | None, default_language_codes: list[str]) → list['Language']
Prepare language(s) for benchmarking.
Parameters
-
language_codes : str | list[str] | None — The language codes of the languages to include for models or datasets. If specified then this overrides the language parameter for model or dataset languages.
-
default_language_codes : list[str] — The default language codes of the languages to include.
Returns
-
list['Language'] — The prepared dataset languages.
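The fallback from language_codes to default_language_codes might look roughly like the sketch below. The Language dataclass, the LANGUAGES registry and the name pick_languages are stand-ins invented for this example; they are not taken from EuroEval.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Language:
    """Hypothetical stand-in for the library's Language type."""

    code: str
    name: str


# Hypothetical registry mapping codes to Language objects.
LANGUAGES = {
    "da": Language("da", "Danish"),
    "en": Language("en", "English"),
    "sv": Language("sv", "Swedish"),
}


def pick_languages(
    language_codes: str | list[str] | None,
    default_language_codes: list[str],
) -> list[Language]:
    """Use the explicit codes if given, otherwise fall back to the defaults."""
    if language_codes is None:
        codes = default_language_codes
    elif isinstance(language_codes, str):
        codes = [language_codes]
    else:
        codes = language_codes

    # dict.fromkeys removes duplicates while preserving order.
    # An unknown code raises KeyError in this sketch.
    return [LANGUAGES[code] for code in dict.fromkeys(codes)]
```

In this sketch an explicit value always wins over the defaults, which matches the override behaviour described for language_codes.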
source prepare_tasks_and_datasets(task: str | list[str] | None, dataset_languages: list['Language'], dataset: str | list[str] | None) → tuple[list['Task'], list[str]]
Prepare task(s) and dataset(s) for benchmarking.
Parameters
-
task : str | list[str] | None — The task(s) whose datasets should be included. If None then datasets will not be filtered based on their task.
-
dataset_languages : list['Language'] — The languages of the datasets in the benchmark.
-
dataset : str | list[str] | None — The datasets to include for task. If None then all datasets will be included, limited by the task and dataset_languages parameters.
Returns
-
tuple[list['Task'], list[str]] — The prepared tasks and datasets.
Raises
-
InvalidBenchmark — If the task or dataset is not found in the benchmark tasks or datasets.
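The filtering described by the three parameters can be sketched as follows. Everything here is an assumption made for illustration: the DATASETS registry, the dataset and task names, the local InvalidBenchmark class, and the use of plain language-code strings in place of Language objects.

```python
from __future__ import annotations


class InvalidBenchmark(Exception):
    """Local stand-in for the exception named in the Raises section."""


# Hypothetical registry: dataset name -> (task name, language codes).
DATASETS = {
    "sentiment-da": ("sentiment-classification", {"da"}),
    "ner-sv": ("named-entity-recognition", {"sv"}),
    "ner-da": ("named-entity-recognition", {"da"}),
}


def _as_list(value: str | list[str] | None) -> list[str] | None:
    """Wrap a single string in a list; pass None through unchanged."""
    if value is None:
        return None
    return [value] if isinstance(value, str) else list(value)


def select_tasks_and_datasets(
    task: str | list[str] | None,
    dataset_languages: list[str],
    dataset: str | list[str] | None,
) -> tuple[list[str], list[str]]:
    """Filter the registry by task, language and dataset name."""
    tasks = _as_list(task)
    datasets = _as_list(dataset)
    known_tasks = sorted({task_name for task_name, _ in DATASETS.values()})

    # Reject names that do not exist in the registry.
    for name in tasks or []:
        if name not in known_tasks:
            raise InvalidBenchmark(f"Task {name!r} not found.")
    for name in datasets or []:
        if name not in DATASETS:
            raise InvalidBenchmark(f"Dataset {name!r} not found.")

    wanted_languages = set(dataset_languages)
    selected = [
        name
        for name, (task_name, languages) in DATASETS.items()
        if (tasks is None or task_name in tasks)
        and (datasets is None or name in datasets)
        and languages & wanted_languages
    ]
    # With no task filter, every known task remains in play.
    return (tasks if tasks is not None else known_tasks), selected
```

A None value disables the corresponding filter, so the result is always limited by whichever of task, dataset and dataset_languages are actually set.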
source prepare_device(device: Device | None) → torch.device
Prepare device for benchmarking.
Parameters
-
device : Device | None — The device to use for running the models. If None then the device will be set automatically.
Returns
-
torch.device — The prepared device.
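Automatic device selection usually follows a simple preference order. The sketch below assumes a CUDA, then MPS, then CPU order and takes the availability flags as plain arguments so that it runs without PyTorch installed; the name pick_device_name and the preference order are assumptions, not EuroEval's documented behaviour.

```python
from __future__ import annotations


def pick_device_name(
    requested: str | None,
    cuda_available: bool,
    mps_available: bool,
) -> str:
    """Return the requested device name, or pick one automatically."""
    # An explicit request is always honoured.
    if requested is not None:
        return requested
    # Assumed preference order: CUDA GPU, then Apple MPS, then CPU.
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"
```

In a PyTorch setting the two flags would typically come from torch.cuda.is_available() and torch.backends.mps.is_available(), and the resulting name could be passed to torch.device.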