euroeval.finetuning
module euroeval.finetuning
Functions related to the finetuning of models.
Functions
- finetune — Evaluate a model on a dataset through finetuning.
- finetune_single_iteration — Run a single iteration of a benchmark.
- get_training_args — Get the training arguments for the current iteration.
finetune(model: BenchmarkModule, datasets: list[DatasetDict], model_config: ModelConfig, dataset_config: DatasetConfig, benchmark_config: BenchmarkConfig) → list[dict[str, float]]
Evaluate a model on a dataset through finetuning.
Parameters
- model : BenchmarkModule — The model to evaluate.
- datasets : list[DatasetDict] — The datasets to use for training and evaluation.
- model_config : ModelConfig — The model configuration.
- dataset_config : DatasetConfig — The dataset configuration.
- benchmark_config : BenchmarkConfig — The benchmark configuration.
Returns
- list[dict[str, float]] — A list of dicts, one per iteration, each containing the score for every metric.
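Because the return value holds one score dictionary per iteration, a caller will usually want to aggregate it. The following is a minimal sketch of one way to do that, assuming every iteration reports the same metric keys. The helper name `summarise_scores` and the metric names in the sample data are made up for illustration and are not part of euroeval.

```python
from __future__ import annotations

from statistics import mean, stdev


def summarise_scores(scores: list[dict[str, float]]) -> dict[str, dict[str, float]]:
    """Compute the mean and sample standard deviation of each metric.

    Hypothetical helper, not part of euroeval.
    """
    if not scores:
        return {}
    summary: dict[str, dict[str, float]] = {}
    # Assumption: the first iteration lists every metric of interest.
    for metric in scores[0]:
        values = [iteration_scores[metric] for iteration_scores in scores]
        summary[metric] = {
            "mean": mean(values),
            # stdev needs at least two data points, so fall back to 0.0.
            "std": stdev(values) if len(values) > 1 else 0.0,
        }
    return summary


# Illustrative data in the shape returned by `finetune` (metric names invented).
example_scores = [
    {"test_mcc": 0.50, "test_macro_f1": 0.60},
    {"test_mcc": 0.54, "test_macro_f1": 0.64},
    {"test_mcc": 0.52, "test_macro_f1": 0.62},
]
summary = summarise_scores(example_scores)
```

Keeping the per-iteration list intact and aggregating afterwards leaves the caller free to choose a different statistic, such as a confidence interval.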
Raises
finetune_single_iteration(model: BenchmarkModule | None, dataset: DatasetDict, iteration_idx: int, training_args: TrainingArguments, model_config: ModelConfig, dataset_config: DatasetConfig, benchmark_config: BenchmarkConfig) → dict[str, float]
Run a single iteration of a benchmark.
Parameters
- model : BenchmarkModule | None — The model to use in the benchmark. If None, a new model is loaded.
- dataset : DatasetDict — The dataset to use for training and evaluation.
- iteration_idx : int — The index of the iteration.
- training_args : TrainingArguments — The training arguments.
- model_config : ModelConfig — The model configuration.
- dataset_config : DatasetConfig — The dataset configuration.
- benchmark_config : BenchmarkConfig — The benchmark configuration.
Returns
- dict[str, float] — The scores for the test dataset.
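The `model` parameter follows a common pattern: reuse the caller's instance when one is given, otherwise load a fresh one. The sketch below illustrates only that pattern, using stand-ins. `DummyModel`, `load_model`, and `resolve_model` are hypothetical names, and how euroeval actually loads a model is not covered by this reference.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class DummyModel:
    """Stand-in for a BenchmarkModule (hypothetical)."""

    name: str


def load_model(name: str) -> DummyModel:
    """Hypothetical loader; the real loading logic is not shown in this reference."""
    return DummyModel(name=name)


def resolve_model(model: DummyModel | None, model_name: str) -> DummyModel:
    """Return the given model, or load a new one if `model` is None."""
    if model is None:
        model = load_model(model_name)
    return model


# Reusing an existing instance versus loading a new one.
existing = DummyModel(name="already-loaded")
reused = resolve_model(existing, model_name="some-model")
fresh = resolve_model(None, model_name="some-model")
```

Accepting `None` lets a caller hand over an already-loaded model for one iteration and request a clean one for the next without changing the call site.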
Raises
get_training_args(benchmark_config: BenchmarkConfig, model_config: ModelConfig, iteration_idx: int, dtype: DataType, batch_size: int | None = None) → TrainingArguments
Get the training arguments for the current iteration.
Parameters
- benchmark_config : BenchmarkConfig — The benchmark configuration.
- model_config : ModelConfig — The model configuration.
- iteration_idx : int — The index of the current iteration. This is only used to generate a unique random seed for the current iteration.
- dtype : DataType — The data type to use for the model weights.
- batch_size : int | None — The batch size to use for the current iteration, or None if the batch size in the benchmark config should be used.
Returns
- TrainingArguments — The training arguments for the current iteration.
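Two behaviours described above lend themselves to a short illustration: deriving a distinct seed from `iteration_idx`, and falling back to the configured batch size when `batch_size` is None. The sketch below is a simplified stand-in under stated assumptions. `SketchConfig`, `SketchTrainingArgs`, `sketch_training_args`, and `BASE_SEED` are hypothetical, and the seed formula is only one plausible choice, not necessarily the one euroeval uses.

```python
from __future__ import annotations

from dataclasses import dataclass

# Assumption: an arbitrary base value; the real seed derivation is not documented here.
BASE_SEED = 4242


@dataclass
class SketchConfig:
    """Stand-in for the batch size held by a benchmark config (hypothetical)."""

    batch_size: int


@dataclass
class SketchTrainingArgs:
    """Stand-in for the two TrainingArguments fields discussed here (hypothetical)."""

    seed: int
    batch_size: int


def sketch_training_args(
    config: SketchConfig, iteration_idx: int, batch_size: int | None = None
) -> SketchTrainingArgs:
    """Build per-iteration arguments with a distinct seed and a batch size fallback."""
    # Use the override when given, otherwise the value from the config.
    effective_batch_size = config.batch_size if batch_size is None else batch_size
    # Offsetting a fixed base by the index gives each iteration its own seed.
    return SketchTrainingArgs(
        seed=BASE_SEED + iteration_idx, batch_size=effective_batch_size
    )


config = SketchConfig(batch_size=32)
default_args = sketch_training_args(config, iteration_idx=0)
override_args = sketch_training_args(config, iteration_idx=1, batch_size=8)
```

Deriving the seed deterministically from the index keeps iterations reproducible while still letting them differ from one another.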