How It Works
Bayesian Symbolic Regression (BSR) has the following features:
- It models equations as expression trees, with root and intermediate tree nodes representing operators (e.g. `*` for a binary node and `sin` for a unary node) and leaf nodes representing features in the data. BSR then defines the search space as the union of the following three parts:
    - Tree structure (T): this represents the structure of the expression tree (e.g. how to recursively construct the tree and when to stop by using leaf nodes), and also specifies the assignment of operators to non-leaf nodes.
    - Leaf nodes (M): this assigns features to leaf nodes that are already defined from part T.
    - Operator parameters (\(\Theta\)): this uses a vector \(\Theta\) to collect additional parameters for certain operators which require them (e.g. a linear operator `ln` with intercept and slope params).
- It specifies priors for each of the three parts above. AutoRA's implementation of BSR allows users to either specify custom priors for part `T` or choose among a pre-specified set.
- It defines `actions` that mutate one expression tree (original) into a new expression tree (proposed), and supports the calculation of transition probabilities based on the likelihoods of the `original` and `proposed` models.
- It designs and implements a Reversible-Jump Markov-Chain Monte-Carlo algorithm (RJ-MCMC), which iteratively accepts new samples (where each sample is a valid expression tree) based on the transition probabilities calculated above. In each iteration, `K` expression trees are obtained either from the `original` samples or the new `proposed` samples.
- With each iteration, the candidate prediction model is a linear mixture of the `K` trees, wherein the ground truth response is regressed on the results generated by the `K` expression trees to obtain the linear regression parameters \(\beta\).
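
As a rough illustration of the first and last points, the sketch below builds small expression trees and regresses a response on their outputs to obtain \(\beta\). It is not AutoRA's code: the `Node` class, the `UNARY` and `BINARY` operator tables, the `fit_mixture` helper, and the parameter names `a` and `b` for the `ln` operator are all hypothetical, and including an intercept column in the regression is an assumption made here. The sampling of trees by RJ-MCMC is not shown; the trees are simply written by hand.

```python
from dataclasses import dataclass, field
from typing import Optional

import numpy as np

# Hypothetical operator tables for illustration; not AutoRA's actual API.
UNARY = {"sin": np.sin, "exp": np.exp}
BINARY = {"+": np.add, "*": np.multiply}


@dataclass
class Node:
    """One node of an expression tree.

    op      -- operator name for a non-leaf node (part T); None for a leaf
    feature -- column index of the feature assigned to a leaf (part M)
    params  -- extra operator parameters, e.g. slope and intercept
               for the linear operator "ln" (part Theta)
    """
    op: Optional[str] = None
    feature: Optional[int] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    params: dict = field(default_factory=dict)

    def evaluate(self, X: np.ndarray) -> np.ndarray:
        if self.op is None:  # leaf node: return one feature column
            return X[:, self.feature]
        if self.op == "ln":  # linear operator: slope * child + intercept
            return self.params["a"] * self.left.evaluate(X) + self.params["b"]
        if self.op in UNARY:
            return UNARY[self.op](self.left.evaluate(X))
        return BINARY[self.op](self.left.evaluate(X), self.right.evaluate(X))


def fit_mixture(trees, X, y):
    """Regress y on the outputs of K trees (plus an assumed intercept)."""
    columns = [np.ones(len(X))] + [tree.evaluate(X) for tree in trees]
    design = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta, design @ beta


# Synthetic data whose response is built from exactly the two trees below.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 1.0 + 2.0 * np.sin(X[:, 0]) + 0.5 * X[:, 0] * X[:, 1]

trees = [
    Node(op="sin", left=Node(feature=0)),                       # sin(x0)
    Node(op="*", left=Node(feature=0), right=Node(feature=1)),  # x0 * x1
]
beta, y_hat = fit_mixture(trees, X, y)
```

Because the synthetic response was generated from these two trees without noise, the fitted `beta` comes out close to `[1.0, 2.0, 0.5]`. In BSR itself the `K` trees would come from the sampler rather than being fixed in advance, and the regression would be repeated at each iteration.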
AutoRA's implementation of BSR is adapted from the original authors' codebase, with a comprehensive refactoring of the data structures and MCMC computations. It also provides new priors suited to the cognitive and behavioral sciences.