I’ve just stumbled across a new program: Eureqa.

It’s a program, developed by the Cornell Computational Synthesis Laboratory, that can detect equations in sets of data. Its primary goal is to identify the simplest mathematical formulas that could describe the underlying mechanisms that produced the data.

It’s best described by their instructional video:

I’ve been playing around with it, for example trying to find a good prediction equation for the Son of Darts competition I blogged about earlier.

The algorithms used in this program are based on “symbolic regression”. This is a form of genetic programming (GP) in which the computer recursively searches a tree of possible expressions, looking for the building blocks (operators, variables and constants) that best fit the data.
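
To make the idea a bit more concrete, here is a toy sketch of symbolic regression in Python. It is not Eureqa's actual algorithm, just my own minimal illustration: expressions are nested tuples, the search only uses mutation and keeps the best quarter of each generation, and the score is the mean squared error plus a small penalty on tree size (to favour simpler formulas). All the names here (`random_tree`, `evaluate`, `fitness`, `mutate`, `evolve`) and the parameter values are made up for the example.

```python
import math
import random

# Building blocks (assumed for this toy): three binary operators
# and a few terminals (the variable x and two constants).
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}
TERMINALS = ["x", 1.0, 2.0]


def random_tree(rng, depth=3):
    """Recursively build a random expression tree as nested tuples."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    op = rng.choice(list(OPS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))


def evaluate(tree, x):
    """Recursively evaluate an expression tree at the point x."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree  # variable or numeric constant


def size(tree):
    """Count the nodes in a tree (used as a simplicity measure)."""
    if isinstance(tree, tuple):
        return 1 + size(tree[1]) + size(tree[2])
    return 1


def fitness(tree, data, penalty=0.01):
    """Mean squared error plus a size penalty; lower is better."""
    mse = sum((evaluate(tree, x) - y) ** 2 for x, y in data) / len(data)
    if not math.isfinite(mse):
        return float("inf")  # treat numerical blow-ups as hopeless
    return mse + penalty * size(tree)


def mutate(tree, rng, depth=2):
    """Replace one randomly chosen subtree with a fresh random one."""
    if not isinstance(tree, tuple) or rng.random() < 0.2:
        return random_tree(rng, depth)
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, mutate(left, rng, depth), right)
    return (op, left, mutate(right, rng, depth))


def evolve(data, generations=40, pop_size=200, seed=0):
    """Keep the best quarter each generation and refill with mutants."""
    rng = random.Random(seed)
    population = [random_tree(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda t: fitness(t, data))
        survivors = population[: pop_size // 4]
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    return min(population, key=lambda t: fitness(t, data))


if __name__ == "__main__":
    # Synthetic data generated from y = x*x + x, purely for illustration.
    data = [(float(x), float(x * x + x)) for x in range(-3, 4)]
    best = evolve(data)
    print(best, fitness(best, data))
```

Running it prints the best expression tree it found and its score. Because the search is random, there is no guarantee that it recovers the exact formula behind the data; a real tool would add crossover, constant tuning and a much richer set of building blocks.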