driftbench
This is the documentation of driftbench, a framework to synthetically generate process curves
and to benchmark process drift detectors.
Please consider citing the following article if you use driftbench in your research:
@article{wolf_method_2025,
  title = {A method to benchmark high-dimensional process drift detection},
  issn = {1572-8145},
  url = {https://doi.org/10.1007/s10845-025-02590-9},
  doi = {10.1007/s10845-025-02590-9},
  journal = {Journal of Intelligent Manufacturing},
  author = {Wolf, Edgar and Windisch, Tobias},
  year = {2025},
}
About
driftbench is a benchmarking framework for detecting drifts in high-dimensional process curve
data, commonly found in industrial manufacturing. Process curves are multivariate time series
generated by repeated executions of manufacturing steps (e.g., pressing, screwing).
The key contributions of this package are the following:
Synthetic Data Generation Framework
A statistical generative model to create synthetic process curves with controllable drift behavior. Each process curve is modeled as a function \(f(w(t), x)\), where \(w(t)\) are latent parameters evolving over time. Drifts are injected by moving “support points” (e.g., maxima, inflection points) in the curves over time. Parameters \(w(t)\) are optimized to satisfy these support point constraints, allowing for highly controlled and realistic drift patterns.
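To make the support-point idea concrete, here is a minimal sketch. It is not driftbench's implementation: it assumes a cubic polynomial for \(f\), for which value and derivative constraints are linear in \(w\), so ordinary least squares suffices, whereas driftbench optimizes \(w\) for a general \(f\). The helper names `design_rows`, `fit_w`, and `support_at`, as well as the drift schedule, are hypothetical.

```python
import numpy as np

def design_rows(x, order):
    """Linear-system rows for the `order`-th derivative of
    f(w, x) = w[0]*x**3 + w[1]*x**2 + w[2]*x + w[3] at positions x."""
    x = np.asarray(x, dtype=float)
    one, zero = np.ones_like(x), np.zeros_like(x)
    if order == 0:
        return np.stack([x**3, x**2, x, one], axis=1)
    if order == 1:
        return np.stack([3 * x**2, 2 * x, one, zero], axis=1)
    if order == 2:
        return np.stack([6 * x, 2 * one, zero, zero], axis=1)
    raise ValueError("only derivative orders 0, 1 and 2 are supported")

def fit_w(support):
    """Least-squares fit of w to support points given as {order: (x, y)}."""
    A = np.vstack([design_rows(x, order) for order, (x, _) in support.items()])
    b = np.concatenate([np.asarray(y, dtype=float) for _, y in support.values()])
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w

def support_at(t, shift=0.05):
    """Support points at time step t: the maximum starts at x = -1 and
    moves right by `shift` per step; minimum and inflection stay fixed."""
    x_max = -1.0 + shift * t
    return {
        0: ([x_max, 0.0, 1.0], [2.0, 0.0, -2.0]),  # function values
        1: ([x_max, 1.0], [0.0, 0.0]),             # zero slope at the extrema
        2: ([0.0], [0.0]),                         # inflection point at x = 0
    }

# One coefficient vector w(t) per time step. For t > 0 the constraints
# can no longer all hold exactly, so w(t) is the least-squares compromise.
w_over_time = np.array([fit_w(support_at(t)) for t in range(5)])
```

At `t = 0` the support points are those of \(x^3 - 3x\), so the fit recovers `w = [1, 0, -3, 0]`; later steps yield slightly different coefficients as the maximum moves.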
A novel Temporal Evaluation Metric
We introduce a metric called Temporal Area Under the Curve (TAUC) that evaluates how well drift detectors capture drift segments in their temporal context. Unlike standard AUC, TAUC rewards detectors that correctly identify the onset, duration, and position of drifts, rather than just individual anomalous points. A soft version (sTAUC) is also introduced, accounting for partial overlaps between detected and true drift segments.
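For reference, the point-wise AUC that TAUC is contrasted with can be sketched as follows. This is the standard pairwise (Mann-Whitney) formulation, not TAUC or sTAUC, and `pointwise_auc` is a hypothetical helper rather than part of driftbench. It scores every time step independently, so it cannot tell whether detections form one contiguous segment or are scattered.

```python
import numpy as np

def pointwise_auc(labels, scores):
    """Standard AUC: probability that a randomly chosen drift point gets
    a higher score than a randomly chosen non-drift point (ties count 0.5)."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one drift and one non-drift point")
    diff = pos[:, None] - neg[None, :]  # all drift/non-drift score pairs
    return float(((diff > 0) + 0.5 * (diff == 0)).mean())

# Toy series: one drift segment at time steps 3..5.
labels = np.array([0, 0, 0, 1, 1, 1, 0, 0])
scores = np.array([0.1, 0.2, 0.1, 0.9, 0.3, 0.8, 0.2, 0.4])
auc = pointwise_auc(labels, scores)
```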
Benchmarking functionality
This package provides several common drift detection techniques (autoencoder-based, clustering, statistical tests like KS or MMD) which can be evaluated on synthetic datasets with different types and numbers of drift segments.
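As an illustration of the statistical-test family, the following sketch scores each time step with a two-sample Kolmogorov-Smirnov statistic between a reference window and a sliding window. It is not driftbench's detector API: the function names, the window sizes, and the reduction of each curve to a single scalar feature are assumptions made for brevity.

```python
import numpy as np

def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))

def sliding_ks_scores(feature, n_ref=50, window=20):
    """Drift score per time step: KS statistic of the latest `window`
    values against the first `n_ref` values. Steps without a full
    window after the reference period keep a score of 0."""
    feature = np.asarray(feature, dtype=float)
    scores = np.zeros(feature.size)
    ref = feature[:n_ref]
    for t in range(n_ref + window, feature.size + 1):
        scores[t - 1] = ks_statistic(ref, feature[t - window:t])
    return scores

# Synthetic scalar feature with one drift segment at steps 120..159.
rng = np.random.default_rng(0)
feature = rng.normal(0.0, 1.0, size=200)
feature[120:160] += 3.0
scores = sliding_ks_scores(feature)
```

Scores inside the drifted segment should be clearly higher than in the undisturbed stretch before it. A threshold on `scores` would turn this into a binary detector.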
Getting started
This is a minimal example to generate N=10 curves from a cubic function:
import numpy as np
from driftbench.data_generation.loaders import load_dataset_specification_from_yaml
from driftbench.data_generation.sample import sample_curves

input = """
example:
  N: 10
  dimensions: 10
  latent_information:
    !LatentInformation
    y0: [0, 8, 64]
    x0: [0, 2, 4]
    y1: [3, 27]
    x1: [1, 3]
    y2: [12]
    x2: [2]
  drifts:
    !DriftSequence
      - !LinearDrift
        start: 3
        end: 5
        feature: x0
        dimension: 1
        m: 0.1
"""

def f(w, x):
    return w[0] * x ** 3 + w[1] * x ** 2 + w[2] * x + w[3]

w0 = np.zeros(4)

dataset = load_dataset_specification_from_yaml(input)
coefficients, latent_information, curves = sample_curves(dataset["example"], w0=w0, f=f)
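The numbers in the latent information above are consistent with the cubic \(x^3\), if `y0`, `y1`, and `y2` are read as the values of the function, its first derivative, and its second derivative at `x0`, `x1`, and `x2`. That reading is inferred from the numbers and is not stated explicitly here. It can be checked with plain NumPy, without driftbench:

```python
import numpy as np

# Support points copied from the YAML specification in the example.
x0, y0 = np.array([0, 2, 4]), np.array([0, 8, 64])
x1, y1 = np.array([1, 3]), np.array([3, 27])
x2, y2 = np.array([2]), np.array([12])

# Assumed interpretation: value, first and second derivative of x**3.
checks = {
    "value x**3": np.array_equal(x0 ** 3, y0),
    "first derivative 3*x**2": np.array_equal(3 * x1 ** 2, y1),
    "second derivative 6*x": np.array_equal(6 * x2, y2),
}
for name, ok in checks.items():
    print(f"{name}: {'consistent' if ok else 'inconsistent'}")
```

Under this reading, the undrifted curves correspond to `w = [1, 0, 0, 0]` in the function `f(w, x)` of the example.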