euroeval.benchmark_modules.vllm

source module euroeval.benchmark_modules.vllm

Generative models using the vLLM inference framework.

Classes

  • VLLMModel A generative model using the vLLM inference framework.

Functions

  • load_model_and_tokeniser Load the model and tokeniser.

  • load_tokeniser Load the tokeniser.

  • clear_vllm Clear the GPU memory used by the vLLM model, enabling re-initialisation.

  • get_end_of_reasoning_token Get the end-of-reasoning token for a generative model.

  • get_custom_stop_tokens Get the stop tokens for a generative model.

  • get_vllm_tokenisation_params Get the tokenisation parameters for vLLM.

source class VLLMModel(model_config: ModelConfig, dataset_config: DatasetConfig, benchmark_config: BenchmarkConfig, log_metadata: bool = True)

Bases : HuggingFaceEncoderModel

A generative model using the vLLM inference framework.

Initialise the vLLM model.

Parameters

  • model_config : ModelConfig The model configuration.

  • dataset_config : DatasetConfig The dataset configuration.

  • benchmark_config : BenchmarkConfig The benchmark configuration.

  • log_metadata : bool Whether to log the model and dataset metadata.

Attributes

  • num_params : int The number of parameters in the model.

  • generative_type : GenerativeType | None Get the generative type of the model.

  • vocab_size : int The vocabulary size of the model.

  • model_max_length : int The maximum context length of the model.

  • data_collator : c.Callable[[list[t.Any]], dict[str, t.Any]] The data collator used to prepare samples during finetuning.

  • compute_metrics : ComputeMetricsFunction The function used to compute the metrics.

  • extract_labels_from_generation : ExtractLabelsFunction The function used to extract the labels from the generated output.

  • trainer_class : t.Type['Trainer'] The Trainer class to use for finetuning.
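
Below is a minimal usage sketch, assuming the model_config, dataset_config and benchmark_config objects have already been created elsewhere in EuroEval (their construction is not part of this module); only the constructor signature and the attributes documented above are used.

```python
from euroeval.benchmark_modules.vllm import VLLMModel

# model_config, dataset_config and benchmark_config are assumed to have been
# created elsewhere (e.g. via EuroEval's configuration machinery).
model = VLLMModel(
    model_config=model_config,
    dataset_config=dataset_config,
    benchmark_config=benchmark_config,
    log_metadata=True,
)

# Inspect the attributes documented above.
print(model.num_params)        # number of parameters in the model
print(model.vocab_size)        # vocabulary size of the model
print(model.model_max_length)  # maximum context length of the model
print(model.generative_type)   # GenerativeType of the model, or None
```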

Methods

source property VLLMModel.generative_type: GenerativeType | None

Get the generative type of the model.

Returns

  • GenerativeType | None The generative type of the model, or None if it has not been set yet.

source property VLLMModel.extract_labels_from_generation: ExtractLabelsFunction

The function used to extract the labels from the generated output.

Returns

  • ExtractLabelsFunction The function used to extract the labels from the generated output.

source method VLLMModel.prepare_dataset(dataset: DatasetDict, task: Task, itr_idx: int) → DatasetDict

Prepare the dataset for the model.

This includes things like tokenisation.

Parameters

  • dataset : DatasetDict The dataset to prepare.

  • task : Task The task to prepare the dataset for.

  • itr_idx : int The index of the dataset in the iterator.

Returns

  • DatasetDict The prepared dataset.
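
A brief sketch of preparing a dataset, assuming a DatasetDict and a Task are already available from the surrounding benchmark setup; only the signature above is relied on.

```python
# dataset is a DatasetDict and task a Task, both assumed to come from the
# surrounding benchmark setup.
prepared_dataset = model.prepare_dataset(
    dataset=dataset,
    task=task,
    itr_idx=0,  # index of the dataset in the iterator
)
```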

source method VLLMModel.generate(inputs: dict) → GenerativeModelOutput

Generate outputs from the model.

Parameters

  • inputs : dict A batch of inputs to pass through the model.

Returns

  • GenerativeModelOutput The generated model outputs.

Raises

  • InvalidBenchmark If the dataset requires logprobs, but we could not get the first token of each label in the dataset.
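
A hedged sketch of generating outputs from a prepared batch; the exact keys of the input dictionary depend on the task and are not specified on this page.

```python
# batch is a dict of model inputs, e.g. a single batch drawn from the
# prepared dataset; its exact contents depend on the task.
output = model.generate(inputs=batch)  # returns a GenerativeModelOutput
```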

source classmethod VLLMModel.model_exists(model_id: str, benchmark_config: BenchmarkConfig) → bool | NeedsExtraInstalled | NeedsEnvironmentVariable

Check if a model exists.

Parameters

  • model_id : str The model ID.

  • benchmark_config : BenchmarkConfig The benchmark configuration.

Returns

  • bool | NeedsExtraInstalled | NeedsEnvironmentVariable Whether the model exists, or an indicator that an extra package or environment variable is required to perform the check.
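
As a sketch, this class method can be used as a cheap existence check before any model weights are loaded; the model ID below is only an example, and benchmark_config is assumed to exist already.

```python
from euroeval.benchmark_modules.vllm import VLLMModel

exists = VLLMModel.model_exists(
    model_id="mistralai/Mistral-7B-v0.1",  # example model ID
    benchmark_config=benchmark_config,     # assumed to exist already
)

if exists is True:
    print("The model can be benchmarked with the vLLM backend.")
else:
    # The return value may instead be a NeedsExtraInstalled or
    # NeedsEnvironmentVariable marker rather than a plain False.
    print(f"Model not usable as-is: {exists!r}")
```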

source classmethod VLLMModel.get_model_config(model_id: str, benchmark_config: BenchmarkConfig) → ModelConfig

Fetch the model configuration.

Parameters

  • model_id : str The model ID.

  • benchmark_config : BenchmarkConfig The benchmark configuration.

Returns

  • ModelConfig The model configuration.

Raises
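
A short sketch combining the two class methods above; it assumes benchmark_config is already available and uses only the signatures shown on this page.

```python
from euroeval.benchmark_modules.vllm import VLLMModel

model_id = "mistralai/Mistral-7B-v0.1"  # example model ID

if VLLMModel.model_exists(model_id=model_id, benchmark_config=benchmark_config) is True:
    model_config = VLLMModel.get_model_config(
        model_id=model_id,
        benchmark_config=benchmark_config,
    )
```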

source property VLLMModel.data_collator: c.Callable[[list[t.Any]], dict[str, t.Any]]

The data collator used to prepare samples during finetuning.

Returns

  • c.Callable[[list[t.Any]], dict[str, t.Any]] The data collator.
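
A minimal sketch of the collator in isolation, assuming samples is a list of already-tokenised examples (for instance drawn from the prepared dataset); the only documented contract is the callable type above.

```python
# samples: list of tokenised examples, e.g. taken from the prepared dataset.
batch = model.data_collator(samples)  # returns a dict[str, Any] batch
```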

source property VLLMModel.trainer_class: t.Type['Trainer']

The Trainer class to use for finetuning.

Returns

  • t.Type['Trainer'] The Trainer class.

source load_model_and_tokeniser(model_config: ModelConfig, benchmark_config: BenchmarkConfig) → tuple['LLM', 'PreTrainedTokenizer']

Load the model and tokeniser.

Parameters

  • model_config : ModelConfig The model configuration.

  • benchmark_config : BenchmarkConfig The benchmark configuration.

Returns

  • tuple['LLM', 'PreTrainedTokenizer'] A pair (model, tokeniser) containing the loaded model and tokeniser.

Raises
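
A sketch of loading the underlying vLLM engine and tokeniser directly, assuming the two configuration objects already exist; only the function signature above is used.

```python
from euroeval.benchmark_modules.vllm import load_model_and_tokeniser

# model_config and benchmark_config are assumed to exist already.
model, tokeniser = load_model_and_tokeniser(
    model_config=model_config,
    benchmark_config=benchmark_config,
)
# model is a vLLM LLM instance and tokeniser a Hugging Face PreTrainedTokenizer.
```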

source load_tokeniser(model_id: str, revision: str, adapter_base_model_id: str | None, trust_remote_code: bool, model_max_length: int, model_config: ModelConfig, token: str | bool) → PreTrainedTokenizer

Load the tokeniser.

Parameters

  • model_id : str The model identifier.

  • revision : str The revision of the model.

  • adapter_base_model_id : str | None The base model ID for the adapter model. Can be None if the model is not an adapter model.

  • trust_remote_code : bool Whether to trust remote code.

  • model_max_length : int The maximum context length of the model.

  • model_config : ModelConfig The model configuration.

  • token : str | bool The Hugging Face API token.

Returns

  • PreTrainedTokenizer The loaded tokeniser.

Raises
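
A sketch of loading only the tokeniser; the model ID, revision and maximum length below are placeholder example values, and model_config is assumed to exist already.

```python
from euroeval.benchmark_modules.vllm import load_tokeniser

tokeniser = load_tokeniser(
    model_id="mistralai/Mistral-7B-v0.1",  # example model ID
    revision="main",
    adapter_base_model_id=None,   # not an adapter model
    trust_remote_code=False,
    model_max_length=4096,        # example maximum context length
    model_config=model_config,    # assumed to exist already
    token=True,                   # or a Hugging Face API token string
)
```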

source clear_vllm() → None

Clear the GPU memory used by the vLLM model, enabling re-initialisation.
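
For example, when benchmarking several models in sequence, the previous model can be cleared before the next one is initialised:

```python
from euroeval.benchmark_modules.vllm import clear_vllm

# Release the GPU memory held by the current vLLM model so that another
# model can be initialised afterwards.
clear_vllm()
```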

source get_end_of_reasoning_token(model: LLM, tokeniser: PreTrainedTokenizer, model_config: ModelConfig) → str | None

Get the end-of-reasoning token for a generative model.

Parameters

  • model : LLM The vLLM model.

  • tokeniser : PreTrainedTokenizer The tokeniser.

  • model_config : ModelConfig The model configuration.

Returns

  • str | None The end-of-reasoning token, or None if it could not be found.
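
A sketch of detecting the end-of-reasoning token, assuming model, tokeniser and model_config were obtained as in load_model_and_tokeniser above.

```python
from euroeval.benchmark_modules.vllm import get_end_of_reasoning_token

end_of_reasoning = get_end_of_reasoning_token(
    model=model,
    tokeniser=tokeniser,
    model_config=model_config,
)
if end_of_reasoning is None:
    print("No end-of-reasoning token was found for this model.")
```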

source get_custom_stop_tokens(model: LLM, tokeniser: PreTrainedTokenizer, model_id: str, generative_type: GenerativeType | None) → list[str]

Get the stop tokens for a generative model.

Parameters

  • model : LLM The vLLM model.

  • tokeniser : PreTrainedTokenizer The tokeniser.

  • model_id : str The model ID.

  • generative_type : GenerativeType | None The generative type of the model.

Returns

  • list[str] A list of stop tokens.
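
A sketch of collecting custom stop tokens, assuming model and tokeniser were loaded as above; the model ID string and the generative type value are placeholders.

```python
from euroeval.benchmark_modules.vllm import get_custom_stop_tokens

stop_tokens = get_custom_stop_tokens(
    model=model,
    tokeniser=tokeniser,
    model_id="mistralai/Mistral-7B-v0.1",  # example model ID
    generative_type=None,  # or the model's GenerativeType, if known
)
print(stop_tokens)  # list of stop token strings
```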

source get_vllm_tokenisation_params(tokeniser: PreTrainedTokenizer, model_config: ModelConfig) → dict[str, t.Any]

Get the tokenisation parameters for vLLM.

Parameters

  • tokeniser : PreTrainedTokenizer The tokeniser.

  • model_config : ModelConfig The model configuration.

Returns

  • dict[str, t.Any] A dictionary of tokenisation parameters to pass to vLLM.
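
Finally, a sketch of fetching the tokenisation parameters; how the resulting dictionary is fed into vLLM is an assumption here, since this page only documents that a parameter dictionary is returned.

```python
from euroeval.benchmark_modules.vllm import get_vllm_tokenisation_params

tokenisation_params = get_vllm_tokenisation_params(
    tokeniser=tokeniser,
    model_config=model_config,
)
# tokenisation_params is a dict[str, Any]; presumably it is unpacked into the
# vLLM engine setup, but the exact call site is not documented on this page.
```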