euroeval.finetuning

Functions related to the finetuning of models.

Functions

finetune — Evaluate a model on a dataset through finetuning.
finetune_single_iteration — Run a single iteration of a benchmark.
get_training_args — Get the training arguments for the current iteration.
remove_extra_tensors_from_logits — If the logits are a tuple, return only the first element.

finetune(model: BenchmarkModule, datasets: list['DatasetDict'], model_config: ModelConfig, dataset_config: DatasetConfig, benchmark_config: BenchmarkConfig) → list[dict[str, float]]

Evaluate a model on a dataset through finetuning.

Parameters

model : BenchmarkModule — The model to evaluate.
datasets : list['DatasetDict'] — The datasets to use for training and evaluation.
model_config : ModelConfig — The model configuration.
dataset_config : DatasetConfig — The dataset configuration.
benchmark_config : BenchmarkConfig — The benchmark configuration.

Returns

list[dict[str, float]] — A list with one dict per iteration, each containing the score for every metric.

Raises

InvalidBenchmark
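
Because the return value holds one score dict per iteration, calling code usually needs to aggregate it. The following is a minimal sketch of one way to do so. The metric names and values are made-up placeholders, and `mean_scores` is a hypothetical helper, not part of `euroeval`.

```python
from statistics import mean


def mean_scores(per_iteration: list[dict[str, float]]) -> dict[str, float]:
    """Average each metric over all iterations (hypothetical helper)."""
    if not per_iteration:
        return {}
    # Only aggregate metrics that are present in every iteration.
    shared = set(per_iteration[0])
    for scores in per_iteration[1:]:
        shared &= set(scores)
    return {name: mean(s[name] for s in per_iteration) for name in sorted(shared)}


# Placeholder data shaped like the documented return type of `finetune`.
example = [
    {"metric_a": 0.5, "metric_b": 0.75},
    {"metric_a": 0.25, "metric_b": 0.25},
]
averaged = mean_scores(example)
```

Restricting the result to metrics shared by every iteration is a design choice of this sketch; it avoids a `KeyError` if one iteration reports fewer metrics than another.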

finetune_single_iteration(model: BenchmarkModule | None, dataset: DatasetDict, training_args: TrainingArguments, model_config: ModelConfig, dataset_config: DatasetConfig, benchmark_config: BenchmarkConfig) → dict[str, float]

Run a single iteration of a benchmark.

Parameters

model : BenchmarkModule | None — The model to use in the benchmark. If None, a new model is loaded.
dataset : DatasetDict — The dataset to use for training and evaluation.
training_args : TrainingArguments — The training arguments.
model_config : ModelConfig — The model configuration.
dataset_config : DatasetConfig — The dataset configuration.
benchmark_config : BenchmarkConfig — The benchmark configuration.

Returns

dict[str, float] — The scores for the test dataset.

Raises

InvalidBenchmark

get_training_args(benchmark_config: BenchmarkConfig, model_config: ModelConfig, iteration_idx: int, dtype: DataType, batch_size: int | None = None) → TrainingArguments

Get the training arguments for the current iteration.

Parameters

benchmark_config : BenchmarkConfig — The benchmark configuration.
model_config : ModelConfig — The model configuration.
iteration_idx : int — The index of the current iteration. This is only used to generate a unique random seed for the current iteration.
dtype : DataType — The data type to use for the model weights.
batch_size : int | None — The batch size to use for the current iteration, or None if the batch size in the benchmark config should be used.

Returns

TrainingArguments — The training arguments for the current iteration.
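
The two documented behaviours, a per-iteration seed derived from `iteration_idx` and a `batch_size` of None falling back to the benchmark config, can be illustrated with a small sketch. It does not build a real `TrainingArguments` object. `SketchBenchmarkConfig`, `BASE_SEED`, and the `BASE_SEED + iteration_idx` formula are assumptions made for illustration only, and the actual seed derivation may differ.

```python
from __future__ import annotations

from dataclasses import dataclass

BASE_SEED = 4242  # assumed value, chosen only for this sketch


@dataclass
class SketchBenchmarkConfig:
    """Hypothetical stand-in for BenchmarkConfig; only holds a batch size."""

    batch_size: int


def sketch_training_settings(
    benchmark_config: SketchBenchmarkConfig,
    iteration_idx: int,
    batch_size: int | None = None,
) -> dict[str, int]:
    """Return the seed and batch size one iteration would use in this sketch."""
    # Fall back to the config's batch size when none is passed explicitly.
    if batch_size is None:
        batch_size = benchmark_config.batch_size
    # Assumed formula: offset a base seed by the iteration index.
    return {"seed": BASE_SEED + iteration_idx, "batch_size": batch_size}


config = SketchBenchmarkConfig(batch_size=32)
first = sketch_training_settings(config, iteration_idx=0)
second = sketch_training_settings(config, iteration_idx=1, batch_size=8)
```

Here `first` uses the config's batch size, while `second` uses the explicit override and a different seed.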

remove_extra_tensors_from_logits(logits: torch.Tensor | tuple[torch.Tensor, ...], labels: torch.Tensor) → torch.Tensor | tuple[torch.Tensor, ...]

If the logits are a tuple, return only the first element.

Parameters

logits : torch.Tensor | tuple[torch.Tensor, ...] — The logits to process.
labels : torch.Tensor — The labels to use for the processing.

Returns

torch.Tensor | tuple[torch.Tensor, ...] — The processed logits.
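
The documented behaviour ("if the logits are a tuple, return only the first element") can be sketched without `torch` as follows. `first_if_tuple` is a hypothetical stand-in that uses plain Python lists in place of tensors; it is not the library's implementation, and it assumes that non-tuple inputs are returned unchanged.

```python
from typing import Any


def first_if_tuple(logits: Any, labels: Any = None) -> Any:
    """Return logits[0] when logits is a tuple, otherwise logits unchanged."""
    # `labels` is accepted only to mirror the documented two-argument signature;
    # this sketch does not use it.
    if isinstance(logits, tuple):
        return logits[0]
    return logits


# Lists stand in for tensors in this sketch.
main_logits = [0.1, 0.9]
extra_output = [1.0, 2.0, 3.0]
from_tuple = first_if_tuple((main_logits, extra_output), labels=[1])
from_single = first_if_tuple(main_logits, labels=[1])
```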