Particle Swarm Curve Fitting

Particle swarm optimization is a powerful yet simple to understand procedure for searching a complex space. This space may represent almost anything that can be expressed as a set of numbers in which one is interested in finding the set of numbers that makes a particular function have its smallest (or largest) value. Checkout the Wikipedia article on particle swarm optimization for a much better explanation.

In our case, we want to find the set of parameters for a particular function that minimizes the error between measured Y values (for some X) and the function's own Y value at that particular X. We will use a particle swarm to search the space of all possible parameters for the function and will, hopefully, zero-in on the best set of parameters. I've used this approach before so I know that it works. Of course, that was using modern computers and CUDA, but with a fast machine and an emulator I suspect I'll get results in a reasonable amount of time. I'll post run times for a real Apple II if I get far enough. I also want to be able to plot the source data and the fitted function if I have enough free space in Lisp. P-Lisp does support simple hires plotting.

See here for a general explanation of curve fitting. It is worth noting that unlike most other curve fitting algorithms, our approach does not require any knowledge of the partial derivatives of the function being fitted, nor does it require any "good" initial parameter values.

I've just started the implementation in P-Lisp. I'm using, for speed and portability, Applewin running on Linux under wine. It works well. I'm also using the extensions for P-Lisp which are on my main page. This allows me to load source from a text file. As an editor, I'm using the simple Applesoft based LED I wrote for P-Lisp.

03-Jul
So far I have some utility functions defined. Simple wrappers for math functions so users can enter "+" instead of "ADD", etc. I also added a function to index a list like a vector and to compute the mean square error between two vectors of values. Lastly, I've implemented an expression evaluator which will evaluate a given expression for all the X values using given parameter values. For example,

(SETQ X '(1 2 3 4 5 6))
(SETQ E '(+ P0 (* X P1)))
(EXPR E X '(1 3))
  (4 7 10 13 16 19)
which is p0 + p1*x for p0 = 1 and p1 = 3.

Unfortunately, P-Lisp, while it supports floating point numbers, does not appear to have any transcendental functions so the curves will be limited to polynomials (at least at first). I also discovered that P-Lisp has a pitifully small stack and recursion is out of the question for any large amount of data, so all the routines are iterative instead making use of the (PROG ...) construct.

This isn't so bad, really. We still will be able to enter arbitrary functions to fit, as S-expressions, and then evaluate them on the fly. This is a cool-ish Lisp thing and would be hard to do in other languages for the Apple II.

19-Jul
I gave up on Lisp, simply too primitive. So, I wrote the program in Applesoft, you can see it here. Simply save it and EXEC it into memory, or type it in, it is short enough.

Even with the lousy random number generator (an oxymoron we've grown used to hearing!) it does manage to do an okay job of things. I present two examples:


In the graph above, the set of points was fit to a line, a two parameter fit. The red line was found by a swarm with 25 particles doing 25 iterations. The green was 50 particles for 100 iterations.


In this case, the function was

y = sin(3.1*x) + 2.04*cos(0.75*x)
and is plotted in white. A swarm of 100 particles and 300 iterations produced the red line. Not too good. If we instead increase the number of particles to 500 and iterate 100 times we get the green line, which is better but still not fantastic. If I do the same with 100 particles and 300 iterations in IDL I get the exact parameters since I added no noise in this case. So, I think the lousy Apple II RNG is to blame but I'd have to do many, many tests to get statistical power to back that up. Given that I used a real Apple IIgs for these experiments, that might take awhile. The 500 particle swarm ran for about 2 days! IDL does it in 1.5 seconds on a modest Linux box.
Last update: 19-Jul-2009
Back