euroeval.task_group_utils.text_to_text
source module euroeval.task_group_utils.text_to_text
Utility functions related to the text-to-text task group.
Functions
- compute_metrics — Compute the metrics needed for evaluation.
- extract_labels_from_generation — Extract the predicted labels from the generated output.
source compute_metrics(model_outputs_and_labels: tuple[Predictions, Labels] | EvalPrediction, dataset_config: DatasetConfig, benchmark_config: BenchmarkConfig, dataset: Dataset) → dict[str, float]
Compute the metrics needed for evaluation.
Parameters
- model_outputs_and_labels : tuple[Predictions, Labels] | EvalPrediction — The first sequence contains the model outputs and the second sequence contains the true labels.
- dataset_config : DatasetConfig — The configuration of the dataset.
- benchmark_config : BenchmarkConfig — The configuration of the benchmark.
- dataset : Dataset — The dataset used for evaluation. This is only used if additional metadata is needed to compute the metrics.
Returns
- dict[str, float] — A dictionary with the names of the metrics as keys and the metric values as values.
Raises
- InvalidBenchmark — If the metric computation fails.
source extract_labels_from_generation(input_batch: dict[str, list], model_output: GenerativeModelOutput) → c.Sequence[t.Any]
Extract the predicted labels from the generated output.
Parameters
- input_batch : dict[str, list] — The input batch, where the keys are the feature names and the values are lists with the feature values.
- model_output : GenerativeModelOutput — The raw generated output of the model.
Returns
- c.Sequence[t.Any] — The predicted labels.
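
A hypothetical call sketch follows; `model_output` is assumed to be a GenerativeModelOutput produced earlier by a generative model wrapper, and its construction is not shown:

```python
# Sketch of the call shape only: `model_output` is assumed to already exist as
# a GenerativeModelOutput (e.g. returned by a model's generation step); only
# the arguments documented above are shown.
from euroeval.task_group_utils.text_to_text import extract_labels_from_generation

input_batch = {
    "text": ["first input document ...", "second input document ..."],
}

predicted_labels = extract_labels_from_generation(
    input_batch=input_batch,    # feature name -> list of feature values
    model_output=model_output,  # assumed: an existing GenerativeModelOutput
)
# `predicted_labels` is a sequence with one predicted label per input example.
```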