How It Works
Bayesian Symbolic Regression (BSR) has the following features:
- It models equations as expression trees, with root and intermediate tree nodes representing operators (e.g. `*` for a binary node and `sin` for a unary node) and leaf nodes representing features in the data. BSR then defines the search space as the union of the following three parts:
    - Tree structure (T): this represents the structure of the expression tree (e.g. how to recursively construct the tree and when to stop by using leaf nodes), and also specifies the assignment of operators to non-leaf nodes.
    - Leaf nodes (M): this assigns features to leaf nodes that are already defined from part T.
    - Operator parameters (\(\Theta\)): this uses a vector \(\Theta\) to collect additional parameters for certain operators which require them (e.g. a linear operator `ln` with intercept and slope params).
- It specifies priors for each of the three parts above. `AutoRA`'s implementation of BSR allows users to either specify custom priors for part `T` or choose among a pre-specified set.
- It defines `actions` that mutate one expression tree (`original`) into a new expression tree (`proposed`), and supports the calculation of transition probabilities based on the likelihoods of the `original` and `proposed` models.
- It designs and implements a Reversible-Jump Markov-Chain Monte-Carlo algorithm (RJ-MCMC), which iteratively accepts new samples (where each sample is a valid expression tree) based on the transition probabilities calculated above. In each iteration, `K` expression trees are obtained either from the `original` samples or the new `proposed` samples.
- With each iteration, the candidate prediction model is a linear mixture of the `K` trees, wherein the ground truth response is regressed on the results generated by the `K` expression trees to obtain the linear regression parameters \(\beta\).
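To make the tree representation and the linear mixture more concrete, here is a minimal sketch in plain Python. It is not AutoRA's code: the `Node` class, the operator tables, and the helpers `solve` and `fit_mixture` are hypothetical names chosen for illustration, and including an intercept term in \(\beta\) is an assumption of this sketch. Each node carries the three parts of the search space: the operator (part T), the feature index for leaves (part M), and optional operator parameters (part \(\Theta\)).

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence

# Hypothetical operator tables; AutoRA's actual operator set may differ.
BINARY = {"*": lambda a, b: a * b, "+": lambda a, b: a + b}
UNARY = {"sin": math.sin}


@dataclass
class Node:
    """One node of an expression tree (illustrative only)."""
    op: Optional[str] = None        # operator name (part T); None marks a leaf
    feature: Optional[int] = None   # feature index (part M); used by leaves only
    params: Optional[tuple] = None  # (intercept, slope) for "ln" (part Theta)
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def evaluate(self, x: Sequence[float]) -> float:
        if self.op is None:                 # leaf: look up a feature value
            return x[self.feature]
        if self.op == "ln":                 # linear operator: a + b * child
            a, b = self.params
            return a + b * self.left.evaluate(x)
        if self.op in UNARY:
            return UNARY[self.op](self.left.evaluate(x))
        return BINARY[self.op](self.left.evaluate(x), self.right.evaluate(x))


def solve(A, b):
    """Solve A z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [u - f * v for u, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_mixture(trees, X, y):
    """Regress y on the K tree outputs (plus an assumed intercept) to get beta."""
    D = [[1.0] + [t.evaluate(x) for t in trees] for x in X]
    k = len(D[0])
    DtD = [[sum(r[i] * r[j] for r in D) for j in range(k)] for i in range(k)]
    Dty = [sum(r[i] * yi for r, yi in zip(D, y)) for i in range(k)]
    return solve(DtD, Dty)  # ordinary least squares via the normal equations


# K = 2 trees: x0 * x1 and sin(x0)
tree_a = Node(op="*", left=Node(feature=0), right=Node(feature=1))
tree_b = Node(op="sin", left=Node(feature=0))
# A tree using the parameterised linear operator: 1 + 2 * x0
tree_ln = Node(op="ln", params=(1.0, 2.0), left=Node(feature=0))

X = [(0.5, 1.0), (1.0, 2.0), (1.5, 0.5), (2.0, 1.5), (2.5, 3.0)]
y = [1.0 + 2.0 * x0 * x1 + 3.0 * math.sin(x0) for x0, x1 in X]
beta = fit_mixture([tree_a, tree_b], X, y)  # close to [1.0, 2.0, 3.0]
```

Because the response here is generated exactly from the two trees, the recovered coefficients match the generating ones up to floating-point error. With real data the fit would only be approximate, and the residuals would feed into the likelihood used by the sampler.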
`AutoRA`'s implementation of BSR is adapted from the original authors' codebase, and includes comprehensive refactoring of data structures and MCMC computations. It also provides new priors that suit the cognitive and behavioral sciences.
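The accept/reject step that drives the sampler described above can be illustrated with a generic Metropolis-Hastings rule. The sketch below is not taken from AutoRA or from the original authors' code: the function names are hypothetical, the Gaussian noise model is an assumption, and the dimension-matching (Jacobian) term that a full reversible-jump move may require is left out for brevity. It only shows how the likelihoods of the `original` and `proposed` models, together with prior and proposal terms, could combine into an acceptance probability.

```python
import math
import random


def gaussian_log_likelihood(y, y_hat, sigma=1.0):
    """Log-likelihood under i.i.d. Gaussian noise (an assumed noise model)."""
    n = len(y)
    sse = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    return -0.5 * n * math.log(2 * math.pi * sigma ** 2) - sse / (2 * sigma ** 2)


def log_acceptance(log_lik_orig, log_lik_prop,
                   log_prior_orig=0.0, log_prior_prop=0.0,
                   log_q_forward=0.0, log_q_reverse=0.0):
    """Log of the Metropolis-Hastings acceptance probability (capped at 0).

    log_q_forward: log-probability of proposing `proposed` from `original`.
    log_q_reverse: log-probability of proposing `original` from `proposed`.
    """
    log_ratio = ((log_lik_prop + log_prior_prop + log_q_reverse)
                 - (log_lik_orig + log_prior_orig + log_q_forward))
    return min(0.0, log_ratio)


def accept(log_alpha, rng):
    """Accept the proposed tree with probability exp(log_alpha)."""
    u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is always defined
    return math.log(u) <= log_alpha


# Toy predictions from an `original` and a `proposed` model
y = [1.0, 2.0, 3.0]
pred_original = [1.5, 2.5, 3.5]
pred_proposed = [1.1, 2.0, 2.9]

ll_orig = gaussian_log_likelihood(y, pred_original)
ll_prop = gaussian_log_likelihood(y, pred_proposed)

log_alpha = log_acceptance(ll_orig, ll_prop)       # better fit: capped at 0
log_alpha_back = log_acceptance(ll_prop, ll_orig)  # worse fit: negative
accepted = accept(log_alpha, random.Random(0))     # always True when log_alpha == 0
```

Working in log space avoids numerical underflow when likelihoods are very small. With flat priors and symmetric proposals (the defaults here), the rule reduces to comparing likelihoods, so a proposal that fits better is always accepted and one that fits worse is accepted only some of the time.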