
Novelty Experimentalist

The novelty experimentalist selects the candidate experimental conditions \(\vec{x}' \in X'\) that are most novel, as measured by a pairwise distance metric between each candidate and the existing experimental conditions \(\vec{x} \in X\):

\[ \underset{\vec{x}'}{\arg\max}~f(d(\vec{x}, \vec{x}')) \]

where \(f\) is an integration function applied to all pairwise distances.
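
The scoring rule above can be sketched in plain NumPy. This is only an illustration of the formula, not the library's implementation: the helper names `novelty_scores` and `select_novel`, the choice of Euclidean distance for \(d\), and the 1-D data are all assumptions made for this example.

```python
import numpy as np


def novelty_scores(candidates, reference, integration=np.min):
    """Score each candidate by integrating its distances to all reference conditions."""
    candidates = np.asarray(candidates, dtype=float)
    reference = np.asarray(reference, dtype=float)
    # Pairwise Euclidean distances d(x, x'), shape (n_candidates, n_reference).
    diffs = candidates[:, None, :] - reference[None, :, :]
    distances = np.sqrt((diffs ** 2).sum(axis=-1))
    # The integration function f collapses each candidate's distances to one score.
    return integration(distances, axis=1)


def select_novel(candidates, reference, num_samples=1, integration=np.min):
    """Return the num_samples candidates with the highest novelty score."""
    scores = novelty_scores(candidates, reference, integration)
    # Stable sort on negated scores: highest score first, ties keep input order.
    order = np.argsort(-scores, kind="stable")[:num_samples]
    return np.asarray(candidates, dtype=float)[order], scores[order]


# Hypothetical 1-D data, chosen so that the integration function matters.
reference = [[0.0], [10.0]]
candidates = [[5.0], [11.0], [14.0]]

picked_min, _ = select_novel(candidates, reference, integration=np.min)
picked_sum, _ = select_novel(candidates, reference, integration=np.sum)
```

With \(f = \min\), the candidate at 5 wins because it is far from both existing conditions. With \(f = \sum\), the candidate at 14 wins because its large distance to 0 outweighs its closeness to 10. The choice of \(f\) therefore changes which conditions count as novel.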

Example

For instance, the integration function \(f(x)=\min(x)\) combined with the distance function \(d(x, x')=|x-x'|\) identifies the condition \(\vec{x}'\) with the greatest minimal Euclidean distance to the existing conditions \(\vec{x} \in X\):

\[ \underset{\vec{x}'}{\arg\max}~\min_i \sqrt{\sum_{j=1}^n (x_{i,j} - x_j')^2} \]

To illustrate this sampling strategy, consider the following four experimental conditions that were already probed:

\(x_{i,0}\) \(x_{i,1}\) \(x_{i,2}\)
0 0 0
1 0 0
0 1 0
0 0 1

Furthermore, consider the following three candidate conditions \(X'\):

\(x_{i,0}'\) \(x_{i,1}'\) \(x_{i,2}'\)
1 1 1
2 2 2
3 3 3

If the novelty experimentalist is tasked with identifying two novel conditions, it will select the last two candidate conditions, \((2, 2, 2)\) and \((3, 3, 3)\) (rows \(x'_{1,j}\) and \(x'_{2,j}\), counting rows from zero), because they have the greatest minimal distance to the existing conditions \(x_{i,j}\).
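
The minimal distances behind this selection can be checked by hand, assuming the Euclidean distance from the example above. For every candidate, the closest existing conditions are the three unit vectors, which by symmetry all lie at the same distance:

\[ \min_i d(\vec{x}_i, (1,1,1)) = \sqrt{0^2 + 1^2 + 1^2} = \sqrt{2} \approx 1.41 \]

\[ \min_i d(\vec{x}_i, (2,2,2)) = \sqrt{1^2 + 2^2 + 2^2} = 3 \]

\[ \min_i d(\vec{x}_i, (3,3,3)) = \sqrt{2^2 + 3^2 + 3^2} = \sqrt{22} \approx 4.69 \]

The origin is farther away in each case (\(\sqrt{3}\), \(\sqrt{12}\), and \(\sqrt{27}\), respectively), so it does not determine the minimum. Ranking the candidates by these minimal distances places \((3, 3, 3)\) first and \((2, 2, 2)\) second.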

Example Code

import numpy as np
from autora.experimentalist.novelty import novelty_sample, novelty_score_sample

# Specify the existing conditions X and the candidate conditions X' from the example above
X = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]])
X_prime = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]])

# Here, we choose to identify two novel conditions
n = 2
X_sampled = novelty_sample(conditions=X_prime, reference_conditions=X, num_samples=n)

# We may also obtain the samples along with their z-scored novelty scores
X_sampled, scores = novelty_score_sample(conditions=X_prime, reference_conditions=X, num_samples=n)