euroeval.custom_dataset_configs
source module euroeval.custom_dataset_configs
Load custom dataset configs.
This module provides the main entry point for loading dataset configurations from
Hugging Face repositories, including Python-based configs. YAML-specific loading
logic lives in the yaml_config module.
Functions
-
load_custom_datasets_module — Load the custom datasets module if it exists.
-
try_get_dataset_config_from_repo — Try to get a dataset config from a Hugging Face dataset repository.
-
load_python_config — Load a dataset config from a euroeval_config.py file in a Hugging Face repo.
source load_custom_datasets_module(custom_datasets_file: Path) → ModuleType | None
Load the custom datasets module if it exists.
Parameters
-
custom_datasets_file : Path — The path to the custom datasets module.
Returns
-
ModuleType | None — The custom datasets module, or None if it does not exist.
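The "load if it exists" behaviour can be illustrated with the standard library. This is a minimal sketch and not EuroEval's actual implementation: the helper name `load_module_if_exists` and the module name `custom_datasets_sketch` are hypothetical, and it assumes the file is an ordinary `.py` file.

```python
import importlib.util
from pathlib import Path
from types import ModuleType
from typing import Optional


def load_module_if_exists(module_path: Path) -> Optional[ModuleType]:
    """Hypothetical stand-in: import a module from a file path, or return None."""
    # A missing file is not an error: the caller simply gets None back.
    if not module_path.exists():
        return None

    # Build an import spec from the file location ("custom_datasets_sketch" is
    # an arbitrary, assumed module name).
    spec = importlib.util.spec_from_file_location(
        "custom_datasets_sketch", module_path
    )
    if spec is None or spec.loader is None:
        return None

    # Create the module object and execute the file's code inside it.
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module
```

Note that loading a module this way executes the file's top-level code, so it should only be pointed at files you trust.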
source try_get_dataset_config_from_repo(dataset_id: str, api_key: str | None, cache_dir: Path, trust_remote_code: bool, run_with_cli: bool) → DatasetConfig | None
Try to get a dataset config from a Hugging Face dataset repository.
The function first looks for a YAML config file (eval.yaml), which can be
loaded without executing any remote code. If no YAML file is present, the
function falls back to euroeval_config.py, which requires
trust_remote_code=True.
Parameters
-
dataset_id : str — The ID of the dataset to get the config for.
-
api_key : str | None — The Hugging Face API key used to check whether the repository has a custom dataset config.
-
cache_dir : Path — The directory to store the cache in.
-
trust_remote_code : bool — Whether to trust remote code. Only required when loading a Python config (euroeval_config.py). YAML configs never require this flag.
-
run_with_cli : bool — Whether the code is being run with the CLI.
Returns
-
DatasetConfig | None — The dataset config if it exists, otherwise None.
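The lookup order described for try_get_dataset_config_from_repo (YAML first, then the Python config only when remote code is trusted) can be sketched as a small pure function. This is an illustration under assumptions, not the library's code: `pick_config_source` is a hypothetical helper, the repository is modelled as a plain list of file names, and returning None when euroeval_config.py is present but untrusted is an assumed outcome (the real function may log or raise instead).

```python
from typing import Iterable, Optional

YAML_CONFIG = "eval.yaml"
PYTHON_CONFIG = "euroeval_config.py"


def pick_config_source(
    repo_files: Iterable[str], trust_remote_code: bool
) -> Optional[str]:
    """Hypothetical helper: decide which config file would be used, if any."""
    files = set(repo_files)

    # The YAML config is preferred because loading it executes no remote code,
    # so the trust_remote_code flag is irrelevant here.
    if YAML_CONFIG in files:
        return YAML_CONFIG

    # Fall back to the Python config only when remote code is trusted.
    if PYTHON_CONFIG in files and trust_remote_code:
        return PYTHON_CONFIG

    # Assumption: no usable config is reported as None.
    return None
```

For example, a repository containing both files resolves to eval.yaml regardless of the flag, while a repository containing only euroeval_config.py resolves to it only when trust_remote_code is True.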
source load_python_config(hf_api: HfApi, dataset_id: str, cache_dir: Path, trust_remote_code: bool, run_with_cli: bool) → DatasetConfig | None
Load a dataset config from a euroeval_config.py file in a Hugging Face repo.
Parameters
-
hf_api : HfApi — The Hugging Face API object.
-
dataset_id : str — The ID of the dataset to get the config for.
-
cache_dir : Path — The directory to store the cache in.
-
trust_remote_code : bool — Whether to trust remote code.
-
run_with_cli : bool — Whether the code is being run with the CLI.
Returns
-
DatasetConfig | None — The dataset config if it exists, otherwise None.