Contribute An Experimentalist

AutoRA experimentalists are meant to return novel experimental conditions based on prior experimental conditions, prior observations, and/or prior models. Such conditions may serve as a basis for new, informative experiments conducted by an experiment runner. Experimentalists are generally implemented as functions that can be integrated into an Experimentalist Pipeline.

Experimentalist Module

Repository Setup

We recommend using the cookiecutter template to set up a repository for your experimentalist. Alternatively, you can use the unguided template. If you choose the cookiecutter template, you can set up your repository using

cookiecutter https://github.com/AutoResearch/autora-template-cookiecutter

Make sure to select the experimentalist option when prompted. You can skip all other prompts pertaining to other modules (e.g., experiment runners) by pressing enter.

Implementation

For an experimentalist, you should implement a function that returns a set of experimental conditions. This set may be a pandas data frame, numpy array, iterator variable or other data format.

Hint

We generally recommend using pandas data frames as outputs in which

columns correspond to the independent variables of an experiment.

Once you've created your repository, you can implement your experimentalist by editing the __init__.py file in src/autora/experimentalist/name_of_your_experimentalist/. You may also add additional files to this directory if needed. It is important that the __init__.py file contains a function called name_of_your_experimentalist which returns a set of experimental conditions (e.g., as a numpy array).

The following example __init__.py illustrates the implementation of a simple experimentalist that uniformly samples without replacement from a pool of candidate conditions.

"""
Example Experimentalist
"""

import random
import pandas as pd
import numpy as np
from typing import Iterable, Union

def random_sample(conditions: Union[pd.DataFrame, np.ndarray], 
              num_samples: int = 1) -> pd.DataFrame:
    """
    Uniform random sampling without replacement from a pool of conditions.
    Args:
        conditions: Pool of conditions
        num_samples: number of samples to collect

    Returns: Sampled pool of conditions

    """

    if isinstance(conditions, pd.DataFrame):
        # Randomly sample N rows from DataFrame
        sampled_data = conditions.sample(n=num_samples)
        return sampled_data

    elif isinstance(conditions, np.ndarray):
        # Randomly sample N rows from NumPy array
        if num_samples > conditions.shape[0]:
            raise ValueError("num_samples cannot be greater than the number of rows in the array.")
        indices = np.random.choice(conditions.shape[0], size=num_samples, replace=False)
        sampled_conditions = conditions[indices]
        return sampled_conditions

Next Steps: Testing, Documentation, Publishing

For more information on how to test, document, and publish your experimentalist, please refer to the general guideline for module contributions .