Basic Tutorial Ib: Theorists
Theorists are classes designed to automate the construction of interpretable models from data. AutoRA theorists are implemented as sklearn regressors and can be used with the `fit` and `predict` methods.
In order to use a theorist, you must first install the corresponding theorist package. Some theorists are installed by default when you install `autora`. Once a theorist is installed, you can instantiate it and use it as you would any other sklearn regressor. That is, you can call the `fit` function of the theorist by passing in experimental conditions and corresponding observations, and then call the `predict` function to generate predicted observations for novel experimental conditions using the discovered model.
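For example, the basic usage pattern looks as follows (a minimal sketch using the `BMSRegressor` introduced below, applied to hypothetical toy data; the variable names here are placeholders and not part of the original tutorial):
import numpy as np
from autora.theorist.bms import BMSRegressor
# hypothetical toy data: one condition column, logarithmic observations
conditions_train = np.linspace(1, 10, 50).reshape(-1, 1)
observations_train = np.log(conditions_train).ravel()
theorist = BMSRegressor(epochs=100)                 # instantiate the theorist
theorist.fit(conditions_train, observations_train)  # fit to conditions and observations
predictions = theorist.predict(conditions_train)    # predict observations for conditions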
The following tutorial demonstrates how to use the `BMSRegressor` (Guimerà et al., 2020, Sci. Adv.), a theorist that can discover an interpretable equation relating the independent variables of an experiment (experiment conditions) to predicted dependent variables (observations).
We will compare the performance of the `BMSRegressor` with two other methods: a polynomial regressor and a neural network regressor. The polynomial regressor is a simple model that can only fit polynomial functions, while the neural network regressor is a more complex model that can fit a wider range of functions. The `BMSRegressor` is a hybrid model that can fit a wide range of functions while also providing an interpretable, potentially non-linear equation.
Note: this tutorial requires Python 3.10 or higher to run successfully.
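If you are unsure which Python version your environment uses, you can check it before running the rest of the notebook (a small sanity check added here; the plotting helpers below rely on match statements, which require Python 3.10 or newer):
import sys
# match statements used in the plotting helpers require Python 3.10+
assert sys.version_info >= (3, 10), f'Python 3.10+ required, found {sys.version.split()[0]}'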
Installing and Importing Relevant Packages
Install relevant subpackages from AutoRA
!pip install -q "autora[theorist-bms]"
Import relevant modules from AutoRA
from autora.theorist.bms import BMSRegressor
from autora.experiment_runner.synthetic.psychophysics.weber_fechner_law import weber_fechner_law
import numpy as np
from sklearn.base import BaseEstimator
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
Functions for Plotting Results
Before we begin, we also define some functions to plot the results of our models. Simply execute the following code block to define the plotting functions.
def present_results(model, x_test, y_test, arg='default', model_name=None, variable_names=None, select_indices=None, figsize=None, *args):
compare_results(models=[model], x_test=x_test, y_test=y_test, arg=arg, model_names=[model_name], variable_names=variable_names, select_indices=select_indices, figsize=figsize, *args)
def compare_results(models, x_test, y_test, arg='default', model_names=None, variable_names=None, observation_name=None, select_indices=None, figsize=None, *args):
if model_names is None or model_names == [None]:
names = ['Model '+str(i+1) for i in range(len(models))]
else:
names = model_names
if len(x_test.shape) == 1:
x_test = x_test.reshape(1, -1)
num_var = x_test.shape[1]
if variable_names is None:
var_names = ['Variable '+str(i+1) for i in range(num_var)]
else:
var_names = variable_names
if observation_name is None:
obs_label = 'Observations'
else:
obs_label = observation_name
match arg:
case 'default':
for i, model in enumerate(models):
print(model)
synthetic_runner.plotter(model)
case '2d':
if figsize is None:
size = (8,3)
else:
assert len(figsize) == 2 and isinstance(figsize, tuple), 'incorrect format for figure shape\nshould be tuple of form (i,j)'
size = figsize
for i, model in enumerate(models):
fig = plt.figure(figsize=size)
axes = []
y_predict = model.predict(x_test)
for j in range(num_var):
axes.append(fig.add_subplot(1, num_var, j+1))
axes[j].set_xlabel(var_names[j])
axes[j].set_ylabel(obs_label)
axes[j].set_title(names[i]+' fit on '+var_names[j])
axes[j].scatter(x_test[:,j], y_test, label='Ground Truth', alpha=0.5)
axes[j].scatter(x_test[:,j], y_predict, label='Predicted', alpha=0.5)
axes[j].legend()
for arg in args:
assert isinstance(arg, str), 'arguments must be in the form of a string'
try:
exec('axes[j].'+arg)
except:
raise RuntimeError(f'argument "{arg}" could not be executed')
fig.tight_layout()
plt.show()
case '3d':
if figsize is None:
size = (15,5)
else:
assert len(figsize) == 2 and isinstance(figsize, tuple), 'incorrect format for figure shape\nshould be tuple of form (i,j)'
size = figsize
axes = []
fig = plt.figure(figsize=size)
if select_indices is None:
idx = (0,1)
else:
                assert len(select_indices) == 2 and isinstance(select_indices, tuple), 'incorrect format for select_indices\nshould be tuple of form (i,j)'
idx = select_indices
for i, model in enumerate(models):
y_predict = model.predict(x_test)
ax = fig.add_subplot(1, 3, i+1, projection='3d')
                ax.xaxis.set_pane_color((1.0, 1.0, 1.0, 0.0))
                ax.yaxis.set_pane_color((1.0, 1.0, 1.0, 0.0))
                ax.zaxis.set_pane_color((1.0, 1.0, 1.0, 0.0))
axes.append(ax)
axes[i].set_xlabel(var_names[idx[0]])
axes[i].set_ylabel(var_names[idx[1]])
axes[i].set_zlabel(obs_label)
axes[i].scatter(x_test[:, idx[0]], x_test[:, idx[1]], y_test, s=1, label='Ground Truth')
axes[i].scatter(x_test[:, idx[0]], x_test[:, idx[1]], y_predict, s=1, label='Predicted')
axes[i].set_title(names[i])
axes[i].legend()
axes[i].set_facecolor('white')
for arg in args:
assert isinstance(arg, str), 'arguments must be in the form of a string'
try:
                        exec('axes[i].'+arg)
except:
raise RuntimeError(f'argument "{arg}" could not be executed')
fig.tight_layout()
plt.show()
case 'choice':
for model in models:
y_pred = np.where(model.predict(x_test) > 0.5, 1, 0)
cm = confusion_matrix(y_true=y_test, y_pred=y_pred)
cmd = ConfusionMatrixDisplay(cm)
cmd.plot()
plt.show()
Our Study Object: The Weber-Fechner Law
We will evaluate whether our models can recover the Weber-Fechner law. The Weber-Fechner law quantifies the minimum change in a stimulus required for that change to be noticeable. Similar to Stevens' power law, the greater the intensity of a stimulus, the larger the change needed to be perceivable. This relationship is hypothesized to be proportional to the logarithm of the ratio between the two stimuli:
$y = c \log\left(\dfrac{x_1}{x_2}\right)$
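To make the functional form concrete, here is a direct NumPy implementation of the equation above (a hypothetical helper for illustration only; the tutorial itself generates data with the synthetic experiment runner below):
import numpy as np
def weber_fechner(x1, x2, c=3.0):
    # perceived intensity y = c * log(x1 / x2)
    return c * np.log(x1 / x2)
# example: a stimulus twice as intense as the baseline yields y = c * log(2), i.e. about 2.08 for c = 3
print(weber_fechner(x1=2.0, x2=1.0))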
Generating Data From a Synthetic Psychophysics Experiment
Here, we leverage a synthetic experiment to generate data from this equation. It is parameterized by the constant $c$. The independent variables are $x_1$ and $x_2$, corresponding to the intensity of a stimulus and the baseline stimulus intensity, respectively. The dependent variable $y$ is the perceived stimulus intensity.
constant = 3.0
# synthetic experiment from autora inventory
synthetic_runner = weber_fechner_law(constant=constant)
# experiment meta data:
synthetic_runner.variables
First, let's take a look at the variables in the synthetic experiment.
# independent variables
ivs = [iv.name for iv in synthetic_runner.variables.independent_variables]
# dependent variable
dvs = [dv.name for dv in synthetic_runner.variables.dependent_variables]
ivs, dvs
We can obtain the experimental data by running the synthetic experiment. The `conditions` contain values for the independent variables. Once we have the conditions, we can run the experiment to obtain the `experiment_data`, containing both the conditions and the observations from the synthetic experiment.
# experimental data
conditions = synthetic_runner.domain()
experiment_data = synthetic_runner.run(conditions, added_noise=0.01)
# observations
observations = experiment_data[dvs]
experiment_data
Next, we split the data into training and testing datasets.
# split into train and test datasets
conditions_train, conditions_test, observations_train, observations_test = train_test_split(conditions, observations)
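Note that train_test_split shuffles the data at random, so the exact split will differ between runs. If you want the split (and the fits below) to be reproducible, you can optionally pass a fixed random_state (an optional variation, not required for the tutorial):
# optional: reproducible split with a fixed random seed
conditions_train, conditions_test, observations_train, observations_test = train_test_split(
    conditions, observations, test_size=0.25, random_state=42
)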
Fitting Models to the Data
In this section, we will fit the data with three techniques:
- Polynomial Regressor
- Neural Network Regressor
- Bayesian Machine Scientist
The last technique is an equation discovery algorithm implemented in the AutoRA framework.
We will repeat the following steps for each method:
- Initialize Model
- Fit Model to the Data
- Plot the Results
The polynomial regressor has high interpretability, but it is limited in expressivity: it can only fit polynomial functions.
We first initialize the polynomial regressor. The polynomial regressor fits a polynomial function to the data. Below, we set the degree of the polynomial to 3.
Note that the PolynomialRegressor class is a simple implementation of a polynomial regressor using sklearn's PolynomialFeatures and LinearRegression classes.
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
class PolynomialRegressor:
    """
    This theorist fits a polynomial function to the data.
    """
    def __init__(self, degree: int = 3):
        self.poly = PolynomialFeatures(degree=degree, include_bias=False)
        self.model = LinearRegression()

    def fit(self, x, y):
        # expand the conditions into polynomial features, then fit a linear model
        features = self.poly.fit_transform(x)
        self.model.fit(features, y)
        return self

    def predict(self, x):
        # reuse the already-fitted feature expansion; do not refit on test data
        features = self.poly.transform(x)
        return self.model.predict(features)
Let's initialize the model:
np.random.seed(0)
poly_model = PolynomialRegressor(degree=3)
And fit it to the training data, consisting of the experimental conditions and corresponding observations of our synthetic experiment.
poly_model.fit(conditions_train, observations_train)
Finally, we can plot the results of the polynomial regressor to evaluate the model's performance.
present_results(model=poly_model, x_test=conditions_test, y_test=observations_test)
present_results(model=poly_model, x_test=conditions_test, y_test=observations_test, arg='2d')
present_results(model=poly_model, x_test=conditions_test, y_test=observations_test, arg='3d')
In summary, the polynomial regressor is a simple model. Its high interpretability allows us to quantify the law underlying the data in terms of a polynomial. However, it has trouble fitting the data generated by the logarithmic psychophysics law due to its limited expressivity.
Neural networks are known for their high expressivity, allowing them to fit virtually any function. However, they are often considered black-box models due to their complex structure, which limits their interpretability for the user.
For this section, we are using torch, an open-source machine learning library. It provides a flexible and dynamic computational graph, allowing for complex neural network architectures.
!pip install -q torch
We can now initialize a simple multi-layer perceptron (MLP) regressor using the `MLPRegressor` class from the `sklearn.neural_network` module. We will train it for a maximum of 500 iterations.
from sklearn.neural_network import MLPRegressor
import torch
nn_model = MLPRegressor(random_state=1, max_iter=500)
Similar to the polynomial regressor above, we can fit the neural network regressor to the training data.
np.random.seed(0)
torch.manual_seed(0)
nn_model.fit(conditions_train, observations_train)
And plot the results of the neural network regressor.
synthetic_runner.plotter(nn_model)
We may observe that the neural network regressor does a better job of fitting the data, but it is less interpretable than the polynomial regressor. The neural network is a more complex model that can fit a wider range of functions, but it is also a black-box model that does not provide an interpretable equation, limiting its utility for scientific discovery.
Bayesian Machine Scientist
- Expressivity: Medium
- Interpretability: High
The Bayesian Machine Scientist (BMS) is one of the theorists that come with the autora package. It is an equation discovery method that can fit a wide range of functions while providing an interpretable equation. It uses MCMC (Markov chain Monte Carlo) sampling to explore the space of possible equations and find the best-fitting equation for the data while minimizing the number of parameters in the equation.
We initialize the BMS regressor to run 1500 epochs.
bms_model = BMSRegressor(epochs=1500)
We can also fit the model to the training data...
bms_model.fit(conditions_train, observations_train, seed=0)
...and plot the results.
synthetic_runner.plotter(bms_model)
In addition, we can print the discovered equation.
print(bms_model)
Note that BMS not only produces a good fit but is also capable of recovering the underlying equation of the synthetic data: the Weber-Fechner law.
In summary, the BMS regressor is a powerful tool for fitting a wide range of functions while providing an interpretable equation. It strikes a balance between expressivity and interpretability, making it a valuable tool for scientific discovery. In our case, it is capable of re-discovering the Weber-Fechner law from the synthetic data.
Comparing the Models
Finally, we can compare all three fitted models in terms of their fit to the hold-out data using the `compare_results` function. This function plots the ground truth and predicted values for each model, allowing us to visually compare their performance.
models = [poly_model, nn_model, bms_model]
names = ['poly_model', 'nn_model', 'bms_model']
compare_results(models=models, x_test=conditions_test, y_test=observations_test, model_names=names, arg='2d')
We can also compare the models in 3D space.
compare_results(models=models, x_test=conditions_test, y_test=observations_test, model_names=names, arg='3d')
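Beyond the visual comparison, we can also quantify each model's fit to the held-out data, for example with the mean squared error from sklearn.metrics (a short sketch; the exact scores depend on the random split and seeds above):
from sklearn.metrics import mean_squared_error
# lower is better: mean squared error of each model's predictions on the test conditions
for name, model in zip(names, models):
    mse = mean_squared_error(observations_test, model.predict(conditions_test))
    print(f'{name}: MSE = {mse:.4f}')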
Summary
In this tutorial, we compared the performance of three different theorists: a polynomial regressor, a neural network regressor, and the Bayesian Machine Scientist (BMS) regressor. The polynomial regressor is a simple model that can only fit polynomial functions, while the neural network regressor is a more complex model that can fit a wider range of functions. The BMS regressor is a hybrid model that can fit a wide range of functions while also providing an interpretable, potentially non-linear equation.
AutoRA provides interfaces for using theorists as sklearn regressors. This allows you to easily fit and evaluate theorists using the `fit` and `predict` methods. Note that such theorists need not be limited to fitting functions; they may also discover complex computational models or algorithms describing the data.