
euroeval.eee_utils

module euroeval.eee_utils

Utility functions for the Every Eval Ever (EEE) output format.

Functions

benchmark_result_to_eee_dict(result: BenchmarkResult) -> dict

Convert a BenchmarkResult to the Every Eval Ever (EEE) format.

Produces a dictionary conforming to the Every Eval Ever JSON schema v0.2.1 (https://github.com/evaleval/every_eval_ever/blob/main/eval.schema.json). The resulting dict can be written directly to euroeval_benchmark_results.jsonl and later reconstructed without loss via benchmark_result_from_eee_dict.
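
Writing the dict to a JSON Lines file can be sketched with the standard library alone. The helpers append_jsonl and read_jsonl below are hypothetical names used for illustration and are not part of euroeval; the record is a placeholder rather than a real EEE dict, and it is an assumption here that the results file holds one JSON object per line.

```python
from __future__ import annotations

import json
from pathlib import Path


def append_jsonl(path: Path, record: dict) -> None:
    """Append one JSON object as a single line (hypothetical helper)."""
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def read_jsonl(path: Path) -> list[dict]:
    """Read every non-empty line back as a dict (hypothetical helper)."""
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

In real use the record would be the output of benchmark_result_to_eee_dict, and each dict read back could be passed to benchmark_result_from_eee_dict.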

The mapping is as follows:

  • Top-level fields: schema_version, evaluation_id, evaluation_timestamp, retrieved_timestamp, source_metadata.
  • model_info: model id/name plus EuroEval-specific details (num_model_parameters, max_sequence_length, vocabulary_size, merge, generative, generative_type) in additional_details.
  • eval_library: name="euroeval", library version, and evaluation context (languages, task, shot config, library versions, raw per-iteration scores) in additional_details.
  • evaluation_results: one entry per metric. The 95 % confidence interval half-width stored in the _se keys is exposed as a confidence_interval with confidence_level: 0.95. Speed metrics (test_speed, test_speed_short) do not include score_type, min_score, or max_score because tokens-per-second has no fixed upper bound.
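
The evaluation_results rule above can be illustrated with a small stand-alone sketch. It is not the library's implementation: the function name metric_entry_sketch, the keys metric_name, score, lower and upper, and the placeholder score_type value and bounds are all assumptions made for illustration. Only confidence_interval, confidence_level, score_type, min_score, max_score and the two speed metric names come from the description above.

```python
from __future__ import annotations

from typing import Any

# Speed metrics named in the mapping description.
SPEED_METRICS = {"test_speed", "test_speed_short"}


def metric_entry_sketch(name: str, score: float, half_width: float) -> dict[str, Any]:
    """Build one illustrative result entry from a score and its 95% CI half-width."""
    entry: dict[str, Any] = {
        "metric_name": name,  # hypothetical key
        "score": score,  # hypothetical key
        "confidence_interval": {
            "lower": score - half_width,  # hypothetical key
            "upper": score + half_width,  # hypothetical key
            "confidence_level": 0.95,
        },
    }
    if name not in SPEED_METRICS:
        # Placeholder values; the real values depend on the metric.
        entry["score_type"] = "continuous"
        entry["min_score"] = 0.0
        entry["max_score"] = 100.0
    return entry
```

Under these assumptions, an entry for test_speed carries a confidence interval but no score_type, min_score, or max_score, matching the rule that tokens-per-second has no fixed upper bound.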

Parameters

  • result : BenchmarkResult The benchmark result to convert.

Returns

  • dict A dictionary matching the EEE JSON schema v0.2.1.

benchmark_result_from_eee_dict(config: dict) -> BenchmarkResult

Create a BenchmarkResult from an Every Eval Ever format dictionary.

Reconstructs a full BenchmarkResult from a dictionary conforming to the Every Eval Ever (EEE) JSON schema v0.2.1. This function is the inverse of benchmark_result_to_eee_dict and enables lossless round-trips.

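Parameters

  • config : dict A dictionary conforming to the EEE JSON schema v0.2.1.

Returns

  • BenchmarkResult The reconstructed benchmark result.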

parse_optional_str(value: str | None) -> str | None

Parse a string-encoded optional string value.

Parameters

  • value : str | None The string to parse. None maps to None.

Returns

  • str | None None if value is None, otherwise the original string.

parse_optional_bool(value: str | None) -> bool | None

Parse a string-encoded optional boolean value.

Parameters

  • value : str | None The string to parse. None maps to None; any other value is compared case-insensitively to "true".

Returns

  • bool | None None if value is None, otherwise a boolean.
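
The documented behavior of the two parsing helpers can be sketched as follows. These are stand-alone re-implementations written from the descriptions above, with a _sketch suffix to mark them as hypothetical; the library's actual code may differ, for example in how it treats surrounding whitespace, which this sketch does not strip.

```python
from __future__ import annotations


def parse_optional_str_sketch(value: str | None) -> str | None:
    """Return None for None, otherwise the original string unchanged."""
    return None if value is None else value


def parse_optional_bool_sketch(value: str | None) -> bool | None:
    """Return None for None, otherwise whether the value equals "true" ignoring case."""
    if value is None:
        return None
    return value.lower() == "true"
```

Under this reading, any string other than a case variant of "true" (including "yes" or "1") parses to False rather than raising an error.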