Automatic Tuning of Evaluation Function
When it comes to Maverick’s evaluation function I’m frustrated and excited in equal measure!
My focus over the last couple of months has been to improve Maverick’s evaluation function. However I’ve found it quite difficult to improve on Maverick 0.51’s actual playing strength. Adding extra knowledge (e.g backward pawns) seems to add little in terms of playing strength. I’m struck by the massive number of ways to encode chess knowledge. I feel like a blind archer as I make wild guesses trying to calibrate the parameters.
There must be a better way!
Early on in Maverick’s development I came across Thomas Petzke‘s approach to tuning evaluation function. He uses a form of genetic algorithm (PBIL) to tune the parameters. PBIL optimization algorithms are really neat – they represents each number in binary format. Each bit of each number is represented as a floating point value between zero and one. As the system “learns” the floating point values are “trained” and gravitate to either zero or one based on the training data. In Thomas’ case he played a few games of chess to assess the fitness of a set of parameters. This is expensive – but ultimately game-playing-ability is the attribute we’d like to optimize. Sp maybe the training time is justified.
Back in 2000 I worked a lot with standard genetic algorithms. I used them to evaluate marketing campaigns. I think PBIL may be even better for evaluating marketing campaigns (but that’s a story for another day). I’m certainly interested in using them to tune Maverick’s evaluation function. The only problem is Thomas’ method take ages to complete (weeks!). I’d prefer a method which is quicker.
Then I came across a post on CCC by Peter Österlund:
How Do You Automatically Tune Your Evaluation Tables
Peter outlines a new way to tune an evaluation function. His approach takes 2.5 million chess positions and minimizes the following fitness function:
E = 1/N * sum(i=1,N, (result[i] – Sigmoid(QuiescientScore(pos[i])))^2)
This is really quite interesting for the following reasons:
- Since we’re not playing complete chess games this runs *much* faster – maybe less than one day of computing time
- The sigmoid function is *really* sensitive in the -100 to +100 centipawn range. This is a critical region where virtually all games are decided. If we can create an evaluation function which accurately evaluated this range then we’re likely to have a strong chess engine
- I suspect Houdini uses a similar technique to calibrate its evaluation function since it’s evaluation is linked to the probability of winning. Robert Houdart mentions this on his website, “Houdini 4 uses calibrated evaluations in which engine scores correlate directly with the win expectancy in the position. A +1.00 pawn advantage gives a 80% chance of winning the game against an equal opponent at blitz time control. At +2.00 the engine will win 95% of the time, and at +3.00 about 99% of the time. If the advantage is +0.50, expect to win nearly 50% of the time“
- Some people have tested this approach to training have reported good results
- When tested the pawn evaluation parameters (resulting from the PBIL optimization) they have varied considerably between the middlegame and endgame. The middlegame value of a pawn is 50 centipawns, while the endgame value is +145 centipawn. If these values are robust and uses in normal play they are likely to produce exciting chess where the engine is happy to sacrifices a pawn for a modest positional advantage. This sounds like the recipe for an interesting engine – which is one of the goals for Maverick!
So I’m in the process of rewriting Maverick evaluation code to accommodate the PBIL algorithm. I’m also writing a PGN parser (in Delphi) so I can train the evaluation function using different sets of training positions.
With all of this re-writing it may be a little while before I can release a new version of Maverick. My aim is still to take Maverick to 2500 ELO using only an improved evaluation function!
I’ll keep you posted!