euroeval.task_group_utils.token_classification

source module euroeval.task_group_utils.token_classification

Utility functions related to the token-classification task group.

Functions

source compute_metrics(model_outputs_and_labels: tuple[Predictions, Labels] | EvalPrediction, has_misc_tags: bool, dataset_config: DatasetConfig, benchmark_config: BenchmarkConfig) → dict[str, float]

Compute the metrics needed for evaluation.

Parameters

  • model_outputs_and_labels : tuple[Predictions, Labels] | EvalPrediction The model outputs and labels: the first element contains the probability predictions and the second contains the true labels.

  • has_misc_tags : bool Whether the dataset has MISC tags.

  • dataset_config : DatasetConfig The configuration of the dataset.

  • benchmark_config : BenchmarkConfig The configuration of the benchmark.

Returns

  • dict[str, float] A dictionary with the names of the metrics as keys and the metric values as values.
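
As a hedged illustration only: EuroEval's token-classification datasets are typically scored with entity-level micro-averaged F1 in the style of seqeval, and the exact metric objects come from dataset_config, so the sketch below uses the Hugging Face evaluate library's seqeval metric on made-up label sequences rather than the actual implementation.

    # Hedged sketch of the kind of scoring compute_metrics performs; the
    # real metrics are configured in dataset_config.
    import evaluate

    seqeval = evaluate.load("seqeval")

    predictions = [["B-PER", "I-PER", "O", "B-MISC"]]
    labels = [["B-PER", "I-PER", "O", "B-ORG"]]

    # Entity-level micro-averaged scores over the whole batch.
    scores = seqeval.compute(predictions=predictions, references=labels)
    print(scores["overall_f1"])

    # When the dataset has no MISC tags (has_misc_tags=False), predicted
    # MISC tags can never be correct, so a MISC-free variant converts them
    # to "O" before scoring.
    no_misc = [["O" if t.endswith("MISC") else t for t in seq] for seq in predictions]
    print(seqeval.compute(predictions=no_misc, references=labels)["overall_f1"])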

source extract_labels_from_generation(input_batch: dict[str, list], model_output: GenerativeModelOutput, dataset_config: DatasetConfig) → list[t.Any]

Extract the predicted labels from the generated output.

Parameters

  • input_batch : dict[str, list] The input batch, where the keys are the feature names and the values are lists with the feature values.

  • model_output : GenerativeModelOutput The raw generated output of the model.

  • dataset_config : DatasetConfig The configuration of the dataset.

Returns

  • list[t.Any] The predicted labels.
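
To make the task concrete, here is a hypothetical sketch of this kind of extraction: it parses a JSON answer from a generative model and aligns the named entities with the input tokens to produce BIO labels. The helper name, the tag mapping, and the assumed output format are all invented for the example; EuroEval's actual parsing is more involved.

    import json

    def parse_ner_output(tokens: list[str], raw_output: str) -> list[str]:
        """Hypothetical: map a JSON answer such as
        {"person": ["Alice"], "location": ["Paris"]} onto BIO labels."""
        labels = ["O"] * len(tokens)
        try:
            entities = json.loads(raw_output)
        except json.JSONDecodeError:
            return labels  # unparsable output scores as all "O"
        tag_map = {"person": "PER", "location": "LOC"}  # assumed mapping
        for key, names in entities.items():
            tag = tag_map.get(key.lower())
            if tag is None:
                continue
            for name in names:
                words = [w.lower() for w in name.split()]
                for i in range(len(tokens) - len(words) + 1):
                    if [t.lower() for t in tokens[i : i + len(words)]] == words:
                        labels[i] = f"B-{tag}"
                        for j in range(i + 1, i + len(words)):
                            labels[j] = f"I-{tag}"
                        break
        return labels

    tokens = ["Alice", "visited", "Paris"]
    print(parse_ner_output(tokens, '{"person": ["Alice"], "location": ["Paris"]}'))
    # ['B-PER', 'O', 'B-LOC']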

source tokenize_and_align_labels(examples: dict, tokenizer: PreTrainedTokenizer, label2id: dict[str, int]) → BatchEncoding

Tokenise all texts and align the labels with them.

Parameters

  • examples : dict The examples to be tokenised.

  • tokenizer : PreTrainedTokenizer A pretrained tokenizer.

  • label2id : dict[str, int] A dictionary that converts NER tags to IDs.

Returns

  • BatchEncoding A dictionary containing the tokenized data as well as labels.
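
The alignment follows the standard Hugging Face recipe for token classification: tokenise the pre-split words, then use word_ids() to copy each word's label onto its first sub-token and mask all other positions with -100 so the loss ignores them. Below is a minimal sketch of that recipe (model name and label set chosen for illustration), not the exact EuroEval implementation.

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
    label2id = {"O": 0, "B-PER": 1, "I-PER": 2}

    examples = {
        "tokens": [["Alice", "lives", "here"]],
        "labels": [["B-PER", "O", "O"]],
    }

    encoded = tokenizer(examples["tokens"], is_split_into_words=True, truncation=True)

    all_labels = []
    for i, word_labels in enumerate(examples["labels"]):
        word_ids = encoded.word_ids(batch_index=i)
        previous, ids = None, []
        for word_id in word_ids:
            if word_id is None:
                ids.append(-100)  # special tokens carry no label
            elif word_id != previous:
                ids.append(label2id[word_labels[word_id]])
            else:
                ids.append(-100)  # only the first sub-token of a word is labelled
            previous = word_id
        all_labels.append(ids)

    encoded["labels"] = all_labels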

source handle_unk_tokens(tokenizer: PreTrainedTokenizer, tokens: list[str], words: list[str]) → list[str]

Replace unknown tokens in the token list with the corresponding words.

Parameters

  • tokenizer : PreTrainedTokenizer The tokenizer used to tokenize the words.

  • tokens : list[str] The list of tokens.

  • words : list[str] The list of words.

Returns

  • list[str] The list of tokens with unknown tokens replaced by the corresponding word.
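
A hedged sketch of the idea: find which words the tokeniser maps to its unknown token, then substitute those words back into the token list from left to right. The sketch assumes each such word produces exactly one unknown token; the real implementation has to handle the alignment more carefully.

    from transformers import AutoTokenizer, PreTrainedTokenizer

    def replace_unk_tokens(
        tokenizer: PreTrainedTokenizer, tokens: list[str], words: list[str]
    ) -> list[str]:
        # Words whose tokenisation contains the unknown token, in order.
        unk = tokenizer.unk_token
        unk_words = [w for w in words if unk in tokenizer.tokenize(w)]
        replaced, idx = [], 0
        for token in tokens:
            if token == unk and idx < len(unk_words):
                replaced.append(unk_words[idx])
                idx += 1
            else:
                replaced.append(token)
        return replaced

    tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
    words = ["Alice", "loves", "🦆"]  # the emoji falls outside BERT's vocabulary
    tokens = tokenizer.tokenize(" ".join(words))  # ['Alice', 'loves', '[UNK]']
    print(replace_unk_tokens(tokenizer, tokens, words))  # ['Alice', 'loves', '🦆']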