Maverick 1.0 To Do List…

It’s been two months since I did any significant chess coding – life got in the way! I did tinker here and there, but nothing special. However, Christmas is around the corner, and I’m hoping to have some time to relax with an IPA and do some coding.

I thought it would be worthwhile to spend some time thinking about what I need to do to create version 1.0 of Maverick. I’ve said from the start that version 1.0 will have a playing strength of 2500 ELO (i.e. Grandmaster level) on the CCRL scale. So I’ll release Maverick 1.0 when I can beat all of the following opponents in a 100-game match at 5 mins per move:

  • Tao 5.6 – I’ve admired Bas Hamstra’s engine for a long time – I hope one day he continues the development
  • YACE  – another classic
  • Anmon 5.6
  • Hermann 2.8 64 bit 

I tested Maverick 0.51 at fast time controls; one minute per game was the slowest time control I used. In those tests Maverick rated at about 2370 ELO. However, in the CCRL Division 6 tournament Maverick struggled. I thought it would finish in the top half of the table, but it finished one off the bottom! I suspect this is due to the time controls. Maverick is quite a fast searcher, but I intentionally haven’t put any real selectivity into the search apart from null move. This probably results in a higher branching factor than engines which use Late-Move-Reductions. At super-fast time controls Maverick may do well thanks to its raw speed, but its higher branching factor holds it back at slower time controls. When version 0.51 is tested by CCRL I suspect it will come in about 100 ELO lower than my tests, i.e. about 2270 ELO. This means I have a long way to go to reach 2500 ELO.
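
For readers unfamiliar with the technique, the null-move pruning mentioned above (currently Maverick’s only selectivity) typically looks something like the sketch below inside the alpha-beta search. The helper names are placeholders, not Maverick’s actual code.

//-- Sketch of null-move pruning with R = 3; all helper names are placeholders
int alphabeta(struct t_board *board, int depth, int alpha, int beta)
{
    if (depth <= 0)
        return qsearch(board, alpha, beta);

    //-- Give the opponent a free move and search to a reduced depth; if the score
    //-- still comes back >= beta, the node is almost certainly a fail-high
    if (depth > 3 && !is_in_check(board) && has_non_pawn_material(board)) {
        make_null_move(board);
        int score = -alphabeta(board, depth - 1 - 3, -beta, -beta + 1);
        unmake_null_move(board);
        if (score >= beta)
            return score;
    }

    return search_moves(board, depth, alpha, beta);   //-- normal move loop (not shown)
}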

Before I set out my to-do list for version 1.0 it might be helpful to go over what Maverick already contains, i.e. the current state of the engine. Here’s a reasonably comprehensive list:

  • Magic bitboard engine
  • Move dictionary architecture
  • Hash tables (4-slot structure)
  • Killer Moves
  • Evaluation function with the following terms:
    • Middle-game to Endgame interpolation
    • Piece square tables
    • All basic endgames (e.g. K + B + N vs. K)
    • Mobility
    • Passed pawns (a major impact on strength)
    • Basic pawn weaknesses (extremely rudimentary)
  • Null Move (R = 3)
  • Static-Exchange-Evaluation for move ordering and quiescent move pruning
  • Checks in the first ply of the quiescent search
  • Extend on one-reply-to-check on the first ply of quiescent search
  • Internal Iterative Deepening
  • Enhanced Transposition cutoffs
  • Opening book
  • Pondering (really easy using the UCI protocol)

So I’d say Maverick isn’t too complex and doesn’t contain anything out of the ordinary.

Here’s the list of to-do’s for version 1.0:

  • CLOP tuning
  • Revisit some parameters for tuning:
    • Piece-Square-Tables
    • Mobility
    • Passed pawns
  • Add king safety
  • Add potential passed pawn terms to the evaluation
  • Add pawn storm code when castling on opposite sides
  • Add bad exchange terms to the evaluation (and tune)
  • Change the maximum depth measure to exclude the quiescent search (I think?)
  • Use magic multipliers to calculate mobility and center control
  • Improve knight placement and isolation code
  • Add pawn and king endgame code (this could be a big strength gain)
  • Add special endgame code for K + R vs. K + N (inspired by Thomas Petzke)
  • Tweak move ordering:
    • Evade check move history
    • Null Move capture-evade
  • Multiple PV
  • Support for Chess960
  • Possibly Mac and Linux support (need to get a development environment up and running for both)

I’m hoping this list will be enough to push Maverick to 2500 ELO – let’s see!

Have I missed anything?

Domain Space of a Chess Evaluation Function

I’m up to my chops in evaluation code. As I’ve said in previous posts, the engines I’ve written in the past have had basic evaluation functions. I want Maverick to be different. I’m using a number of good resources to inspire me. The first is a book by Hans Berliner, The System: A World Champion’s Approach to Chess, which takes quite a quantitative approach to chess. I’ve found it invaluable and would recommend it to any chess programmer.

As I’m writing the code I can’t help but feel the enormity of the domain space of a chess evaluation function. Firstly, there are the values I give to each element. For example, in the middlegame is a pawn on g4 worth more or less than a pawn on f4? The pawn on f4 is probably worth more, but how much more – is it 2 centipawns or 10 centipawns? I’ve got to make these decisions for all six piece types, in both the middlegame and the endgame. And then there are the values I assign to, say, a doubled pawn on h2 or on e4. The task is truly enormous. Secondly, there is the complexity of what I choose to evaluate and how I go about doing so.
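
To make the scale of the problem concrete, here’s a tiny, self-contained sketch of the middlegame-to-endgame interpolation described above. Every number in it is a made-up illustration, not one of Maverick’s values.

#include <stdio.h>

//-- Blend a middlegame and an endgame score according to game phase
//-- (here phase = 24 with all pieces on the board, 0 with just kings and pawns);
//-- all values below are illustrative, not tuned
static int tapered_score(int mg, int eg, int phase)
{
    return (mg * phase + eg * (24 - phase)) / 24;
}

int main(void)
{
    int phase = 16;                        //-- e.g. a position with some pieces traded
    int pawn_f4_mg = 10, pawn_f4_eg = 4;   //-- illustrative piece-square values in centipawns
    int pawn_g4_mg = 2,  pawn_g4_eg = 6;

    printf("f4 pawn: %d cp\n", tapered_score(pawn_f4_mg, pawn_f4_eg, phase));
    printf("g4 pawn: %d cp\n", tapered_score(pawn_g4_mg, pawn_g4_eg, phase));
    return 0;
}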

This all reinforces the idea of a solid testing framework. I also recently read an interview with Robert Houdart. Of course I would expect Robert to test Houdini, but I was struck by him playing 100,000 games per day! That’s a lot of games. This impressed upon me the need to create a testing setup (when Maverick starts to play chess). Testing was also one of the things Fabien Letouzey used to rocket Fruit up the rating lists back in 2005. Once I get Maverick playing chess I must make it my objective to be 100% test-driven.

Is a Lazy Eval a Good Thing?

Lazy evaluation is a technique often used to save time when evaluating chess positions. If a position’s true score is much higher than beta or lower than alpha, the full evaluation is really unnecessary. For example, if you’re a queen down it’s a waste of time checking to see if you have a weak doubled pawn on a4. So in the right situation this approach can save a lot of time and speed up an engine. Of course there is the risk that the lazy evaluation is way off and the position is incorrectly evaluated. The technique was quite popular in the 80’s (and probably the 90’s). Ed Schroeder noted it gave him an enormous speed increase in Rebel.
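
For anyone who hasn’t seen it, a lazy evaluation usually amounts to something like the sketch below: if a cheap material-only score is already far outside the alpha-beta window, the expensive positional terms are skipped. The margin and routine names are illustrative only.

//-- Sketch of a lazy evaluation; material_score(), positional_score() and the margin are placeholders
#define LAZY_MARGIN 300   //-- centipawns; must exceed the largest realistic positional swing

int evaluate(struct t_board *board, int alpha, int beta)
{
    int score = material_score(board);       //-- very cheap: just summed piece values

    //-- Already hopelessly outside the window? Then skip the full evaluation
    if (score - LAZY_MARGIN >= beta || score + LAZY_MARGIN <= alpha)
        return score;

    return score + positional_score(board);  //-- pawns, mobility, king safety, etc.
}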

I’ve been giving the issue some thought and I’ve come to the conclusion I will not add a lazy evaluation in Maverick.  My reason is simple – I’d like the leaf nodes to be evaluated as accurately as possible.  My hunch is that by evaluating the leaf nodes as accurately as possible I’ll be able to be more selective in positions closer to the root (which will dramatically reduce the tree size).  This type of decision is part of the design philosophy of Maverick. 

I also suspect the lazy evaluation technique is less relevant in modern programs which use null move and late-move-reduction. Back in the 80’s there was still a battle between the Spracklens’ brute-force approach and Lang’s selective search. In those days it was common to only evaluate at the leaf nodes (i.e. not evaluate the interior nodes). In this situation the leaf node evaluations could have a wide range of values, some much higher than beta and others much lower than alpha. So lazy evaluation worked well. In contrast, null move and late-move-reduction have the effect of keeping the search tree closer to alpha and beta. Null move reductions are good at quickly weeding out obvious fail-high moves (i.e. very good positions) and late-move-reductions reduce the tree size for obvious fail-low positions (i.e. very bad positions). So by utilizing these techniques lazy evaluation becomes less relevant and more risky.
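
As a point of reference, late-move-reduction in most modern engines boils down to a fragment like this inside the normal move loop; the trigger conditions and helper names here are placeholders, not a recipe.

//-- Sketch of a late-move-reduction inside the move loop; names and thresholds are placeholders
for (int i = 0; i < moves->count; i++) {
    make_move(board, &moves->move[i]);
    int new_depth = depth - 1;

    //-- Late, quiet, non-checking moves at sufficient depth get searched one ply shallower
    //-- (after make_move it is the opponent's turn, so !is_in_check means the move gives no check)
    if (i >= 4 && depth >= 3 && !is_capture(&moves->move[i]) && !is_in_check(board))
        new_depth--;

    int score = -alphabeta(board, new_depth, -beta, -alpha);

    //-- If the reduced search unexpectedly beats alpha, re-search at the full depth
    if (new_depth < depth - 1 && score > alpha)
        score = -alphabeta(board, depth - 1, -beta, -alpha);

    unmake_move(board, &moves->move[i]);

    if (score > alpha)
        alpha = score;
    if (alpha >= beta)
        break;   //-- beta cut-off
}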

Of course this is just theory – let’s see how it works in practice.

What are your thoughts on Lazy Evaluation?

Chess Position Evaluation Philosophy – Part 2

Tim Ferriss of “Four Hour Work Week / Body / Chef” fame has championed the idea of the Minimum Effective Dose. This is simply the smallest dose which will have the desired outcome. In the context of computer chess evaluation I think it’s an interesting idea.

I would like to try to create an automatic parameter tuning framework for the evaluation function using Probability Based Incremental Learning; Thomas Petzke is doing some cool stuff in this area. However, the more parameters I have the more difficult it will be to tune. So my aim is to create a Minimum Effective Evaluation. Of course over time I’ll add to the evaluation, but only when improvements are significant and proven (at least that’s the intention).
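
For anyone unfamiliar with it, Probability Based Incremental Learning keeps a probability for each bit of an encoded parameter vector, samples candidate vectors, and nudges the probabilities toward the best performer. Here’s a tiny, self-contained sketch; the fitness function is just a stand-in for “configure the engine from the bits and measure its match score”.

#include <stdio.h>
#include <stdlib.h>

#define BITS        16      //-- length of the encoded parameter vector
#define SAMPLES     32      //-- candidates generated per iteration
#define LEARN_RATE  0.1

//-- Stand-in fitness: a real tuner would play games with the decoded parameters;
//-- here it simply counts set bits so the example runs on its own
static int fitness(const int *bits)
{
    int score = 0;
    for (int i = 0; i < BITS; i++)
        score += bits[i];
    return score;
}

int main(void)
{
    double p[BITS];
    int best[BITS], candidate[BITS];

    for (int i = 0; i < BITS; i++)
        p[i] = 0.5;                                   //-- start with no preference

    for (int iter = 0; iter < 100; iter++) {
        int best_score = -1;

        //-- Sample candidate bit vectors from the probability vector
        for (int s = 0; s < SAMPLES; s++) {
            for (int i = 0; i < BITS; i++)
                candidate[i] = ((double)rand() / RAND_MAX) < p[i];
            int score = fitness(candidate);
            if (score > best_score) {
                best_score = score;
                for (int i = 0; i < BITS; i++)
                    best[i] = candidate[i];
            }
        }

        //-- Nudge each probability toward the best candidate of this iteration
        for (int i = 0; i < BITS; i++)
            p[i] = p[i] * (1.0 - LEARN_RATE) + best[i] * LEARN_RATE;
    }

    for (int i = 0; i < BITS; i++)
        printf("%.2f ", p[i]);
    printf("\n");
    return 0;
}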

To this end I’ve been going over some articles which I have previously saved to Evernote.  This one from Ed Schroeder is of particular interest – The Value of an Evaluation Function.  Ed shows the relative importance of each evaluation term.  In summary it would seem the high level order of importance is as follows:

  1. Mobility
  2. Passed Pawn
  3. Piece Square Tables
  4. King Safety
  5. Minor Piece “Stuff”
  6. Pawn Evaluation

I’m going to use this to prioritize my effort.

Chess Position Evaluation Philosophy – Part 1

I’m now starting to think about the evaluation function.  I’m doing this at a completely different stage than for Monarch (the Delphi or “C” version) – but that’s intentional – here’s why.

Monarch was my first chess engine. While I had been a computer chess fan (fanatic?) since the early 80’s, and liked programming, I didn’t write my first engine until 1999. I remember being giddy with excitement when I completed the move generator and the make / unmake routines. I don’t think I even tested Monarch rigorously using the perft methodology. Once I was at this stage my aim was to create a chess playing entity as quickly as possible. I slapped some piece-square tables together and bolted on a vanilla alpha-beta search to see what would happen. To my amazement, even with this rudimentary evaluation function, the first version played “OK” (maybe 1900 ELO).

However, I suspect a poor evaluation routine restricts and hampers improvements to the search. I look at creating a chess program as one big optimization task. As programmers we are doing a manual “hill climb” optimization – changing parameters, adding knowledge, improving the search, all the while keeping improvements and throwing out changes which don’t work. The number of dimensions and the domain space of the optimization are massive. In a hill-climb optimization, where you start has some impact on where you’ll finish. My theory is that if you start with a moderately good evaluation function (i.e. not just piece-square tables), you’ll have a better chance of improving it in the long term and avoiding local optima. It’s a hunch – I don’t have any proof that this is the case.

So I’m trying to put all of the components of a strong engine in place from day one.  I could be completely wrong but that’s why I’m putting some effort into creating the evaluation function before Maverick has even played one move in a real life chess game.

Debugging a Chess Move Generator

Perft is a fantastic way to debug your chess engine’s move generator (as well as the make and un-make routines).  Here are some tips which you may find helpful:

  • Find a set of test positions with known perft scores and which cover all the weird and obscure chess moves.  See my previous post Perfect Perft for some good examples.
  • Use an engine such as Sharper to split out the perft of each sub move for a position (for Sharper this is the divide command).  This way you can see which move gives you a problem.
  • Write a “flip_board” routine which (as the name suggests) flips the board. This means you can test each perft position from both white’s and black’s perspective, and it ensures you’re writing Color Blind Code! Here’s my perft code which automatically flips the board (a sketch of what a flip_board routine might look like appears after the integrity code below):

//--Position 1
for (i = 0; i <= 1; i++) {
    set_fen(position, "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1");
    ok &= (perft(position, 6) == 119060324);
    flip_board(position);
}

  • Use "asserts" liberally in conjunction with a "board_integrity" routine. The "board_integrity" routine is key. It simply ensures that everything makes sense in the chess board object / record – returning "true" if everything is OK and "false" if it finds something wrong. For example, does the "all_pieces" bitboard agree with each individual color's occupancy bitboards? Does the pawn hash, which is updated incrementally, agree with a hash value calculated from scratch? I add asserts at the beginning and end of almost every routine. Here's my board integrity code:

BOOL integrity(struct t_board *board)
{
    int i;
    t_chess_color c;
    t_bitboard b;

    //-- Link between squares and bitboards
    for (i = 0; i < 64; i++) {
        if (board->square[i] != BLANK)
            if ((board->piecelist[board->square[i]] & SQUARE64(i)) == 0)
                return FALSE;
    }

    //-- Color bitboards match up
    for (c = WHITE; c <= BLACK; c++) {
        b = 0;
        for (i = KNIGHT; i <= KING; i++)
            b |= board->pieces[c][i];
        if (b != board->occupied[c])
            return FALSE;
        if (board->square[board->king_square[c]] != PIECEINDEX(c, KING))
            return FALSE;
        if (SQUARE64(board->king_square[c]) != board->pieces[c][KING])
            return FALSE;
        if (popcount(board->pieces[c][KING]) != 1)
            return FALSE;
    }

    //-- All pieces match up with occupancy
    if ((board->occupied[WHITE] | board->occupied[BLACK]) != board->all_pieces)
        return FALSE;

    //-- Castling rights
    for (i = 0; i < 4; i++) {
        if (board->castling & ((uchar)1 << i)) {
            if (board->square[castle[i].king_from] != PIECEINDEX((i >> 1), KING))
                return FALSE;
            if (board->square[castle[i].rook_from] != PIECEINDEX((i >> 1), ROOK))
                return FALSE;
        }
    }

    //-- Hashing
    if (board->pawn_hash != calc_pawn_hash(board))
        return FALSE;
    if (board->hash != calc_board_hash(board))
        return FALSE;

    return TRUE;
}
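
As for the flip_board routine mentioned above, a vertical flip of a bitboard is just a byte swap, so a sketch might look like the fragment below. The field names follow the integrity example, but this is illustrative rather than Maverick's actual code.

//-- Sketch of a possible flip_board routine; illustrative only
static t_bitboard flip_vertical(t_bitboard b)
{
    //-- Mirror a bitboard about the horizontal axis (rank 1 <-> rank 8) by swapping its bytes
    b = ((b >>  8) & 0x00FF00FF00FF00FFULL) | ((b & 0x00FF00FF00FF00FFULL) <<  8);
    b = ((b >> 16) & 0x0000FFFF0000FFFFULL) | ((b & 0x0000FFFF0000FFFFULL) << 16);
    b = ( b >> 32)                          | ( b                          << 32);
    return b;
}

void flip_board(struct t_board *board)
{
    //-- Flip every piece bitboard vertically, then swap the two colors' piece sets
    for (int piece = KNIGHT; piece <= KING; piece++) {
        t_bitboard w = flip_vertical(board->pieces[WHITE][piece]);
        t_bitboard b = flip_vertical(board->pieces[BLACK][piece]);
        board->pieces[WHITE][piece] = b;
        board->pieces[BLACK][piece] = w;
    }

    //-- The square[] array, occupancy bitboards, king squares, castling rights,
    //-- en-passant square, side to move and hash keys all need the same treatment (not shown)
}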

If you follow these tips and your engine agrees with the perft scores of the suggested positions (in debug mode), then I'm 99% sure you'll have a bug free move generator and make / unmake move routine.  Writing test routines never sounds appealing, but having code which you know is (almost) bug free is a GREAT feeling!!

Perfect Perft!

Over the weekend I got the last couple of bugs out of Maverick’s perft routine. It now has a perfect node match for all the positions I’ve tried – Yippee!!

In my opinion getting to this stage is a major milestone in the development of a chess engine.  If you can get this far you’re almost certainly capable of writing a fully functional chess engine.

It may be interesting to some to see my approach to perft. When finding positions to use in a perft test you need to come up with “odd” positions – ones which don’t occur too often and ones which are special cases. Normally this means positions which contain the special chess moves, i.e. castling, en-passant, promotions and capture-promotions. In particular, I’ve found the best positions for perft are those which involve getting into, or out of, check using these special moves. Positions with lots of discovered checks and pinned pieces are also good.

Here’s the list of positions I used.  They are gleaned from Chess Programming on Wikispaces and this post on CCC:

http://chessprogramming.wikispaces.com/Perft+Results

rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1; perft 6 = 119060324

r3k2r/p1ppqpb1/bn2pnp1/3PN3/1p2P3/2N2Q1p/PPPBBPPP/R3K2R w KQkq -; perft 5 = 193690690

8/2p5/3p4/KP5r/1R3p1k/8/4P1P1/8 w - -; perft 7 = 178633661

r3k2r/Pppp1ppp/1b3nbN/nP6/BBP1P3/q4N2/Pp1P2PP/R2Q1RK1 w kq - 0 1; perft 6 = 706045033

1k6/1b6/8/8/7R/8/8/4K2R b K - 0 1; perft 5 = 1063513

TalkChess PERFT Tests (by Martin Sedlak)

//--Illegal ep move #1

3k4/3p4/8/K1P4r/8/8/8/8 b - - 0 1; perft 6 = 1134888

//--Illegal ep move #2

8/8/4k3/8/2p5/8/B2P2K1/8 w - - 0 1; perft 6 = 1015133

//--EP Capture Checks Opponent

8/8/1k6/2b5/2pP4/8/5K2/8 b - d3 0 1; perft 6 = 1440467

//--Short Castling Gives Check

5k2/8/8/8/8/8/8/4K2R w K - 0 1; perft 6 = 661072

//--Long Castling Gives Check

3k4/8/8/8/8/8/8/R3K3 w Q - 0 1; perft 6 = 803711

//--Castle Rights

r3k2r/1b4bq/8/8/8/8/7B/R3K2R w KQkq - 0 1; perft 4 = 1274206

//--Castling Prevented

r3k2r/8/3Q4/8/8/5q2/8/R3K2R b KQkq - 0 1; perft 4 = 1720476

//--Promote out of Check

2K2r2/4P3/8/8/8/8/8/3k4 w - - 0 1; perft 6 = 3821001

//--Discovered Check

8/8/1P2K3/8/2n5/1q6/8/5k2 b - - 0 1; perft 5 = 1004658

//--Promote to give check

4k3/1P6/8/8/8/8/K7/8 w - - 0 1; perft 6 = 217342

//--Under Promote to give check

8/P1k5/K7/8/8/8/8/8 w - - 0 1; perft 6 = 92683

//--Self Stalemate

K1k5/8/P7/8/8/8/8/8 w - - 0 1; perft 6 = 2217

//--Stalemate & Checkmate

8/k1P5/8/1K6/8/8/8/8 w - - 0 1; perft 7 = 567584    

//--Stalemate & Checkmate

8/8/2k5/5q2/5n2/8/5K2/8 b - - 0 1; perft 4 = 23527

Tomorrow I’ll talk about the perft debugging process – it’s quite simple once you put a couple of things in place!

Is Perft Speed Important?

The perft routine in Maverick is almost fully debugged! I believe this is an important milestone in any chess engine’s development. It’s really the first time the code “plays” chess, in the sense that it generates moves, makes and un-makes the moves, and iterates up and down the search tree.

I also think the speed of the perft routine is significant and is a measure which can be used to compare engines.  Now some people will disagree with me but here is my logic:

Validation of Structure: The structure of a chess program could be defined as how the board is represented and the way moves are generated and stored. This impacts the overall speed of an engine. A perft routine is a good measure of how quickly this chess board structure can be manipulated. Clearly there are other factors, most notably the size and complexity of the evaluation function. But I would even argue a fast perft implementation is an indicator of a chess structure which can support a good evaluation function. If you think about it, any half-decent evaluation function must iterate over the squares and find some proxy for mobility. At a basic level this is what is happening in a perft routine. So I would argue a fast perft speed is an indication of a solid foundation upon which a strong chess engine can be developed.

Difficult to “Fiddle”: Some people talk about an engine being “fast” based on the self-reported measure of “nodes processed per second”. The problem with nodes per second as a measure of speed is the definition of a node – there is no standard definition. Some engine authors define it as a call to the “alpha-beta” routine, others base it on the make / undo move routines, and still others use the “generate-moves” procedure. There isn’t one standard, and the measure can easily be fiddled. This is not the case with perft. There are really only two ways to count the number of nodes: one is the number of leaf nodes, the other is the total number of nodes (internal and leaf). The standard measure seems to be the number of leaf nodes.

Having said it’s difficult to fiddle, there are a few distinct approaches to perft routines, and each one impacts the speed:

Make & Unmake All Moves: Most chess engines generate pseudo-legal moves. These are moves which are generally legal but may expose their own king to a discovered attack, and so are not actually legal. The reason chess engines generate pseudo-legal moves is to save the time spent checking for a discovered check, which is costly and may not even be required if there is a “cut-off”. So the test for the discovered check is often carried out as part of the “Make-Move” routine. The simplest perft implementation simply iterates through the depths, generating the moves, making and unmaking each move in turn, and counting the number of nodes. This is the slowest type of perft implementation. I regard this type of perft as a measure of the efficiency of the make and unmake routines, since this is the task which is carried out most often.
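
A minimal “make & unmake all moves” perft looks something like this; the move-list type and helper routines are placeholders for whatever your engine provides, and make_move is assumed to return FALSE (leaving the board unchanged) when the pseudo-legal move turns out to be illegal.

//-- Sketch of the simplest perft; helper names are placeholders
t_nodes perft(struct t_board *board, int depth)
{
    struct t_move_list moves[1];
    t_nodes nodes = 0;

    if (depth == 0)
        return 1;

    generate_moves(board, moves);                  //-- pseudo-legal moves
    for (int i = 0; i < moves->count; i++) {
        if (make_move(board, &moves->move[i])) {   //-- FALSE if the king is left in check
            nodes += perft(board, depth - 1);
            unmake_move(board, &moves->move[i]);
        }
    }
    return nodes;
}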

Counting Legal Moves at The Leaf: In contrast to the first method, another approach is to generate only truly legal moves. In this approach a lot of time can be saved at the leaf nodes by simply returning the number of moves generated (without having to make and unmake each one). The cost is a slightly more complex move generator which must detect potential discovered checks. In general this approach will be quite a bit faster than the first. I regard this approach as a measure of the efficiency of the move generation routines.
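
With a strictly legal move generator the saving comes from a shortcut like this near the top of the same routine (again just a sketch):

//-- With a fully legal move generator, simply count the moves at the horizon
if (depth == 1) {
    generate_legal_moves(board, moves);   //-- hypothetical generator producing only legal moves
    return moves->count;
}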

Hashed Moves: In a perft search many transpositions occur.  This means the whole search can be significantly sped up by hashing the results and storing them in a hash table.  If the position reoccurs in the tree, the number of nodes in the sub-tree can be retrieved from the hash table and there is no need to search the sub tree.  I have not implemented this in Maverick.
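
Although I haven’t written one, a hashed perft is usually just a small table keyed on the position hash and the remaining depth; something along these lines, where the layout and names are purely illustrative:

//-- Illustrative perft hash entry; layout and names are made up
struct t_perft_hash {
    t_hash  key;      //-- Zobrist key of the position
    int     depth;    //-- remaining depth at which the count was recorded
    t_nodes count;    //-- number of leaf nodes in the sub-tree
};

//-- Inside perft(): probe before searching...
struct t_perft_hash *h = &perft_table[board->hash & (PERFT_TABLE_SIZE - 1)];
if (h->key == board->hash && h->depth == depth)
    return h->count;

//-- ...and store the result once the sub-tree has been counted
h->key = board->hash;
h->depth = depth;
h->count = nodes;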

Parallel Search: This is not something I have implemented in Maverick but the speed of the perft routine could be improved by implementing a parallel multiprocessor search.

Initial Perft Speed Results:

Based on the above logic I was eager to see how fast the new bitboard structure is compared to Monarch, which used a letter-box data structure. Monarch isn’t by any measure a fast searcher, so I was hoping for a speed improvement.  The position I used is from:

http://chessprogramming.wikispaces.com/Perft+Results


FEN: r3k2r/p1ppqpb1/bn2pnp1/3PN3/1p2P3/2N2Q1p/PPPBBPPP/R3K2R w KQkq -

The number of leaf nodes at a depth of 5 ply is 193,690,690. Monarch’s perft routine is of the “make / unmake all moves” type. It managed to crank out the five-ply perft for the above position in 71.5 seconds on my Core i7 2670QM running at 2.2 GHz. To my surprise, Maverick blew this away with a time of only 16.3 seconds – a speed-up of about 4.4 times – woohoo!! Maverick’s legal-move-generator approach completed perft 5 in exactly 5.0 seconds.

Both engines were compiled using Microsoft Visual Studio Express 2012 in 32 bit mode.  I imagine there will be a reasonable speedup when I move to 64 bit and add the SSE bit-twiddling routines.  This is a much bigger speed-up than I anticipated and illustrates the superiority of the magic bitboard approach.

What type of perft routine does your engine employ and how fast is it?

Global Move List

In the first version of Monarch I used an Exhaustive Global List of Chess Moves. This is a list of (from memory) about 47k moves which collectively describe every possible move on the chess board. So one entry might be the white bishop on b2 capturing a black rook on e5. I think this approach may work well with Magic Bitboards – so I intend to implement it in Maverick. The main advantage is you can store move-specific information in the move’s record (e.g. the changes in hash value). Also, generating moves is just a matter of storing a pointer to the move in question. Obviously the downside is the memory used to store the table and the negative impact on the cache.
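
To make the idea concrete, an entry in such a global table might carry fields along these lines; the names are illustrative, not (yet) Maverick's actual structure.

//-- Illustrative entry in a global move table; field names are hypothetical
struct t_global_move {
    uchar      move_type;       //-- one of the MOVE_* types defined below
    uchar      from_square;
    uchar      to_square;
    uchar      piece;           //-- moving piece (color + type)
    uchar      captured;        //-- captured piece, if any
    t_hash     hash_delta;      //-- precomputed change to the position's hash key
    t_bitboard from_to_mask;    //-- precomputed bitboard with the from and to bits set
};

//-- "Generating" a move then just means storing a pointer into this table
struct t_global_move *move_list[256];
int move_count;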

This structure was first popularized by Gnu Chess 3.  Subsequent engines which have used a similar approach are Ferret by Bruce Moreland and Diep by Vincent Diepeveen.

So my first task is to classify every possible move into “move types”.  This will be used in switch statements to make and unmake moves quickly.  Here’s my first pass:

//===========================================================//
// Types of Moves
//===========================================================//
#define MOVE_CASTLE 0
#define MOVE_PAWN_PUSH1 1
#define MOVE_PAWN_PUSH2 2
#define MOVE_PxPAWN 3
#define MOVE_PxPIECE 4
#define MOVE_PxP_EP 5
#define MOVE_PROMOTION 6
#define MOVE_CAPTUREPROMOTION 7
#define MOVE_PIECE_MOVE 8
#define MOVE_PIECExPIECE 9
#define MOVE_PIECExPAWN 10
#define MOVE_KING_MOVE 11
#define MOVE_KINGxPIECE 12
#define MOVE_KINGxPAWN 13
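
The point of classifying moves this way is that make_move (and unmake_move) can branch on the type with a simple switch. A skeletal, purely illustrative example – the helper routines are hypothetical:

//-- Skeletal make_move using the move types above; helper names are hypothetical
void make_move(struct t_board *board, struct t_global_move *move)
{
    switch (move->move_type) {
    case MOVE_PAWN_PUSH1:
        move_pawn(board, move->from_square, move->to_square);
        break;
    case MOVE_PAWN_PUSH2:
        move_pawn(board, move->from_square, move->to_square);
        set_ep_square(board, move->to_square);        //-- double push creates an en-passant target
        break;
    case MOVE_PxPIECE:
        remove_piece(board, move->captured, move->to_square);
        move_pawn(board, move->from_square, move->to_square);
        break;
    case MOVE_CASTLE:
        do_castle(board, move);                       //-- moves both king and rook
        break;
    //-- ...and so on, one case per move type
    }
}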

 Now I need to think about generating the Global Move List!

New Computer Chess Programming Blog

It’s been six years since I’ve done any serious chess development. During that time family life has been full-on, with two high-energy daughters and one demanding wife. The pace of work has also picked up. While I haven’t had time for computer chess coding I certainly haven’t been far from the computer chess scene (it’s so addictive!). There have been few days when I haven’t checked CCC.

During those years I’ve also observed many changes in the world of computer chess, some of which have piqued my desire to code another engine. From my perspective here are the notable developments:

  • Magic Bitboards: These provide the ability to quickly generate attack tables for sliding pieces. The tables can be used for move generation or to create sophisticated evaluation functions. Back in 2006 the jury was still out as to whether or not bitboards are any better than other board structures; Magic Bitboards seem to tip the scales heavily in favor of a bitboard architecture. I’d really like to develop a magic bitboard engine (see the sketch after this list).
  • Ubiquitous 64-bit Operating Systems: 64-bit is now the norm. This wasn’t the case back in 2006. This makes bitboard engines even more attractive.
  • Late Move Reduction: This seems to be one of the main developments in computer chess over the last few years. It was starting to be discussed back in 2006 but now it seems to be the norm. I think there are many possibilities for enhancing the basic LMR techniques and making a highly selective engine.
  • New Testing Tools & Methods: Fabien Letouzey managed to improve Fruit at an amazing pace. One of his secrets was his obsession with testing. It would seem that playing thousands of super-fast games (e.g. 1 second per game) is the best way to prove, or disprove, potential improvements. I find this interesting. One of my other passions is marketing. I suspect computer chess testing methods may provide a philosophical framework for improving businesses in general. Anyway, I’d like to set up such a rapid-fire testing environment.
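
For readers new to the idea, the core of a magic bitboard attack lookup (mentioned in the first bullet above) is a single multiply-and-shift per sliding piece; the precomputed tables below are the usual ingredients and are placeholders here.

//-- Sketch of the core magic bitboard lookup; the precomputed tables are placeholders
t_bitboard rook_attacks(int square, t_bitboard occupied)
{
    t_bitboard blockers = occupied & rook_mask[square];                       //-- keep only the relevant blockers
    int index = (int)((blockers * rook_magic[square]) >> rook_shift[square]);
    return rook_attack_table[square][index];                                  //-- precomputed attack set
}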

As a result of all of this I’ve downloaded Visual Studio Express, set up this website and rummaged around to find my old Monarch code. I thought about carrying on with Monarch chess but it doesn’t seem quite right. The new project will be a new code base (apart from basic i/o code), so I think it’s right I start a new engine. This new engine will be called “Maverick”, which conveys, in one word, my overall design philosophy and aims.

If you want to follow along you can access my live code base here. I’ve also set out what I’m striving to achieve on the development philosophy page.

I’d be interested to hear your comments and suggestions!