euroeval.data_loading
module euroeval.data_loading
Functions related to the loading of the data.
Functions
- load_data: Load the raw bootstrapped datasets.
- load_raw_data: Load the raw dataset.
load_data(rng: Generator, dataset_config: DatasetConfig, benchmark_config: BenchmarkConfig) → list[DatasetDict]
Load the raw bootstrapped datasets.
Parameters
- rng (Generator): The random number generator to use.
- dataset_config (DatasetConfig): The configuration for the dataset.
- benchmark_config (BenchmarkConfig): The configuration for the benchmark.
Returns
- list[DatasetDict]: A list of bootstrapped datasets, one for each iteration.
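The return value is described as one bootstrapped dataset per iteration. As a rough illustration of that idea only, the sketch below resamples a small list of examples with replacement, once per iteration. It is not the library's implementation: `bootstrap_splits` is a hypothetical helper, the standard-library `random.Random` stands in for the `Generator` argument, and which splits are resampled (and at what size) is not stated on this page.

```python
import random


def bootstrap_splits(
    examples: list[dict], num_iterations: int, rng: random.Random
) -> list[list[dict]]:
    """Return `num_iterations` resampled copies of `examples`.

    Hypothetical helper for illustration. Each copy has the same length as
    `examples` and is drawn with replacement, so some examples may repeat
    and others may be absent.
    """
    n = len(examples)
    if n == 0:
        # Nothing to sample from: return one empty copy per iteration.
        return [[] for _ in range(num_iterations)]
    return [
        [examples[rng.randrange(n)] for _ in range(n)]
        for _ in range(num_iterations)
    ]


# A seeded generator makes the resampling reproducible across runs.
rng = random.Random(4242)
test_split = [{"text": f"sample {i}", "label": i % 2} for i in range(5)]
bootstrapped = bootstrap_splits(test_split, num_iterations=3, rng=rng)
```

Passing the random number generator in explicitly, as `load_data` does with `rng`, keeps the resampling reproducible for a given seed.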
Raises
- InvalidBenchmark: If the dataset cannot be loaded.
- HuggingFaceHubDown: If the Hugging Face Hub is down.
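The two documented exceptions suggest different handling: a Hub outage may be transient, while an unloadable dataset is unlikely to recover on retry. The following is a minimal sketch of one way a caller could treat them, under stated assumptions. The exception classes here are local stand-ins that only share names with the documented ones, `load_with_retries` is a hypothetical wrapper, and retrying on an outage is a design choice of this example rather than documented library behaviour.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class InvalidBenchmark(Exception):
    """Local stand-in for the documented exception of the same name."""


class HuggingFaceHubDown(Exception):
    """Local stand-in for the documented exception of the same name."""


def load_with_retries(
    loader: Callable[[], T], max_attempts: int = 3, wait_seconds: float = 0.0
) -> T:
    """Call `loader`, retrying only when the Hub is reported as down.

    Hypothetical wrapper. `InvalidBenchmark` is not caught, so it propagates
    to the caller immediately.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return loader()
        except HuggingFaceHubDown:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the outage to the caller.
            time.sleep(wait_seconds)
    raise AssertionError("unreachable")


# Simulated loader that fails twice with an outage, then succeeds.
calls = {"count": 0}


def flaky_loader() -> list[str]:
    calls["count"] += 1
    if calls["count"] < 3:
        raise HuggingFaceHubDown("simulated outage")
    return ["iteration-0", "iteration-1"]


result = load_with_retries(flaky_loader, max_attempts=3)
```

In real use the loader would be a zero-argument callable wrapping `load_data`, and the exception classes would be imported from the library rather than redefined.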
load_raw_data(dataset_config: DatasetConfig, cache_dir: str) → DatasetDict
Load the raw dataset.
Parameters
- dataset_config (DatasetConfig): The configuration for the dataset.
- cache_dir (str): The directory to cache the dataset.
Returns
- DatasetDict: The dataset.
Raises