euroeval.metrics.pipeline
source module euroeval.metrics.pipeline
Metrics based on a scikit-learn Pipeline.
Classes
-
PreprocessingFunction — A protocol for a preprocessing function.
-
PipelineMetric — Load a scikit-learn pipeline and use it to get scores from the predictions.
Functions
-
european_values_preprocessing_fn — Preprocess the model predictions for the European Values metric.
-
european_values_scoring_function — Scoring function for the European Values metric.
source class PreprocessingFunction()
Bases : t.Protocol
A protocol for a preprocessing function.
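The call signature of the protocol is not listed here. As a minimal sketch, assuming it matches european_values_preprocessing_fn (predictions and dataset in, predictions out), a compatible protocol and function could look like the following. The names PreprocessingFunctionSketch and identity_preprocessing are hypothetical, and t.Any stands in for the Dataset type.

```python
import collections.abc as c
import typing as t


class PreprocessingFunctionSketch(t.Protocol):
    """Hypothetical stand-in for PreprocessingFunction (call signature assumed)."""

    def __call__(
        self, predictions: c.Sequence[int], dataset: t.Any
    ) -> c.Sequence[int]:
        ...


def identity_preprocessing(
    predictions: c.Sequence[int], dataset: t.Any
) -> c.Sequence[int]:
    """Return the predictions unchanged, ignoring the dataset."""
    return predictions


# A static type checker accepts this assignment because the signatures match.
fn: PreprocessingFunctionSketch = identity_preprocessing
result = fn([1, 2, 3], dataset=None)
```

Because t.Protocol relies on structural typing, any callable with a matching signature satisfies the protocol without subclassing it.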
source class PipelineMetric(name: str, pretty_name: str, pipeline_repo: str, pipeline_scoring_function: c.Callable[['Pipeline', c.Sequence], float], pipeline_file_name: str = 'pipeline.pkl', preprocessing_fn: PreprocessingFunction | None = None, postprocessing_fn: c.Callable[[float], tuple[float, str]] | None = None)
Bases : Metric
Load a scikit-learn pipeline and use it to get scores from the predictions.
Initialise the pipeline transform metric.
Parameters
-
name : str — The name of the metric in snake_case.
-
pretty_name : str — The pretty name of the metric, used for display purposes.
-
pipeline_repo : str — The Hugging Face repository ID of the scikit-learn pipeline to load.
-
pipeline_scoring_function : c.Callable[['Pipeline', c.Sequence], float] — The function used to score the predictions with the pipeline. Takes the loaded pipeline and a 1D sequence of predictions and returns a float score.
-
pipeline_file_name : str, optional — The name of the file to download from the Hugging Face repository. Defaults to "pipeline.pkl".
-
preprocessing_fn : optional — A function to apply to the predictions before they are passed to the pipeline. This is useful for preprocessing the predictions to match the expected input format of the pipeline. Defaults to a no-op function that returns the input unchanged.
-
postprocessing_fn : optional — A function applied to the computed metric score, mapping it to a tuple of the postprocessed score and its string representation. Defaults to x -> (100 * x, f"{x:.2%}").
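To illustrate how the scoring and postprocessing callables fit together, here is a minimal sketch. The default postprocessing lambda is taken from the parameter description above. StubPipeline, its agreement method, and stub_scoring_function are hypothetical stand-ins, used so the example runs without scikit-learn or a downloaded pipeline file.

```python
import collections.abc as c


def default_postprocessing(score: float) -> tuple[float, str]:
    """Mirror the documented default: x -> (100 * x, f"{x:.2%}")."""
    return 100 * score, f"{score:.2%}"


class StubPipeline:
    """Hypothetical stand-in for a fitted scikit-learn Pipeline."""

    def agreement(self, predictions: c.Sequence[int]) -> float:
        # Toy scoring rule (an assumption): the share of predictions equal to 1.
        return sum(1 for p in predictions if p == 1) / len(predictions)


def stub_scoring_function(
    pipeline: StubPipeline, predictions: c.Sequence[int]
) -> float:
    """Same shape as pipeline_scoring_function: (pipeline, predictions) -> float."""
    return pipeline.agreement(predictions)


raw_score = stub_scoring_function(StubPipeline(), [1, 0, 1, 1])
score, score_str = default_postprocessing(raw_score)
```

Here the raw score of 0.75 becomes the pair (75.0, "75.00%"): the first element is the numeric value on a 0 to 100 scale, the second is its display string.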
source european_values_preprocessing_fn(predictions: c.Sequence[int], dataset: Dataset) → c.Sequence[int]
Preprocess the model predictions for the European Values metric.
Parameters
-
predictions : c.Sequence[int] — The model predictions, a sequence of integers representing the predicted choices for each question.
-
dataset : Dataset — The dataset used for evaluation. It is only needed when additional metadata is required to compute the metric.
Returns
-
c.Sequence[int] — The preprocessed model predictions, a sequence of integers representing the final predicted choices for each question after any necessary aggregation and mapping.
Raises
-
AssertionError — If the number of predictions is not a multiple of 53, which is required for the European Values metric.
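The length requirement can be sketched on its own as follows. This covers only the documented multiple-of-53 check, not the aggregation and mapping steps, whose details are not given here. The names REQUIRED_MULTIPLE and check_prediction_count are hypothetical.

```python
import collections.abc as c

# Taken from the documented AssertionError condition.
REQUIRED_MULTIPLE = 53


def check_prediction_count(predictions: c.Sequence[int]) -> int:
    """Return the number of complete groups of 53 predictions.

    Raises AssertionError if the length is not a multiple of 53.
    """
    assert len(predictions) % REQUIRED_MULTIPLE == 0, (
        f"Expected a multiple of {REQUIRED_MULTIPLE} predictions, "
        f"got {len(predictions)}."
    )
    return len(predictions) // REQUIRED_MULTIPLE


num_groups = check_prediction_count([0] * 106)
```

Note that Python skips assert statements when run with the -O flag, so a check like this is only active in non-optimised mode.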
source european_values_scoring_function(pipeline: Pipeline, predictions: c.Sequence[int]) → float
Scoring function for the European Values metric.