Is a Lazy Eval a Good Thing?

Lazy evaluation is a technique often used to save time when evaluating chess position.  If a position’s true score is much higher than beta or lower than alpha the full evaluation is really unnecessary.  For example, if you’re a queen down it’s a waste of time checking to see if you have a weak double pawn on a4.  So in the right situation this approach can save a lot of time and speed up an engine.  Of course there is the risk that the lazy evaluation is way-off and the position is incorrectly evaluated.  The technique was quite popular in the 80’s (and probably the 90’s).  Ed Shoeder noted it gave him an enormous speed increase in Rebel. 

I’ve been giving the issue some thought and I’ve come to the conclusion I will not add a lazy evaluation in Maverick.  My reason is simple – I’d like the leaf nodes to be evaluated as accurately as possible.  My hunch is that by evaluating the leaf nodes as accurately as possible I’ll be able to be more selective in positions closer to the root (which will dramatically reduce the tree size).  This type of decision is part of the design philosophy of Maverick. 

I also suspect the lazy evaluation technique is less relevant in modern programs which use null move and late-move-reduction.  Back in the 80’s there was still a battle between the Spracklen’s brute force approach and Lang’s selective search.  In those days it was common to only evaluate at the leaf nodes (i.e. not evaluate the interior nodes).  In this situation the leaf nodes evaluation could have a wide range of values, some much higher than beta and others much lower than alpha.  So lazy evaluation worked well.  In contrast null move and late-move-reduction have the effect of keeping the search tree closer to alpha and beta.  Null move reductions are good at quickly weeding out obvious fail high moves (i.e. very good positions) and late-move-reductions reduce the tree size for obvious fail low positions (i.e. very bad position).  So by utilizing these techniques lazy evaluation becomes less relevant and more risky.

Of course this is just theory – let’s see how it works in practice.

What are your thoughts on Lazy Evaluation?

50% Improvement in Perft Speed

I decided to see what Mavericks performance is like through the lens of a profiler.  I’m using CodeBlocks as my GCC IDE.  It was remarkably simple to get the profiler working.  The first thing I notices was the large amount of time taken to see if a move resulted in discovered check, and therefore illegal.  The time consumed by this one (small) procedure was almost as much as the move-generation code.  I thought something must be wrong.

After some prodding and poking of the code I realized there was a much better way to accomplish the same task.  In my original code I was doing a looking from the king through the “from_square” and seeing if there was a rook, bishop or queen on the relevant path.  This is what I did in Monarch but it’s really letter-box style thinking.  It’s much faster to calculate all of pins at the start of the move generation process and store them in a bitboard.  Then when you make the move you only need to check if there is a discovered check if the piece is pinned (a simple “AND”) – which is rare.  You still need to perform the check if the piece is pinned, it’s a king move or an en-passant move.  I also “inlined” the “is_in_check_after_move” procedure.

The result of these changes is a boost to the perft speed of about 50%.  Maverick now crunches 76 million nps on my humble 2.2 GHz i7 notebook. 

As an aside, my notebook is two years old.  On my wife’s newer 2.4 GHz i7 Maverick’s speed is 99 million nps. Somehow Maverick’s architecture is more suited to newer machines.  I assume it’s a bigger cache.

Chess Position Evaluation Philosophy – Part 2

Tim Ferriss of “Four Hour Work Week / Body / Chef” fame has championed the idea of Minimum Effective Dose.  This is simply the smallest does which will have the desired outcome.  In the context of computer chess evaluation I think it’s interesting. 

I would like to try to create an automatic parameter tuning framework for the evaluation function using Probability Based Incremental LearningThomas Petzke is doing some cool stuff in this area.  However the more parameters I have the more difficult it will be to tune.  So my aim is to create a Minimum Effective Evaluation.  Of course over time I’ll add to the evaluation but only when improvements are significant and proven (at least that’s the intention).

To this end I’ve been going over some articles which I have previously saved to Evernote.  This one from Ed Schroeder is of particular interest – The Value of an Evaluation Function.  Ed shows the relative importance of each evaluation term.  In summary it would seem the high level order of importance is as follows:

  1. Mobility
  2. Passed Pawn
  3. Piece Square Tables
  4. King Safety
  5. Minor Piece “Stuff”
  6. Pawn Evaluation

I’m going to use this to prioritize my effort.

Chess Position Evaluation Philosophy – Part 1

I’m now starting to think about the evaluation function.  I’m doing this at a completely different stage than for Monarch (the Delphi or “C” version) – but that’s intentional – here’s why.

Monarch was my first chess engine.  While I had been a computer chess fan (fanatic?) since the early 80’s, and liked programming, I didn’t write my first engine until 1999.  I remember being giddy with excitement when I completed the move generator and the make / unmake routines.  I don’t think I tested Monarch as rigorously using the perft methodology.  Once I was at this stage my aim was to create a chess playing entity as quickly as possible.  I slapped some piece-square tables together and bolted on a vanilla alpha-beta search to see what would happen.  To my amazement, even with this rudimentary evaluation function, the first version played “OK” (maybe 1900 ELO). 

However, I suspect a poor evaluation routine restricts and hampers improvements to the search.  I look at creating a chess program as one big optimization task.  As programmers we are doing a manual “hill climb” optimization – changing parameter, adding knowledge, improving the search, while all the time keeping improvements and throwing out changes which don’t work.  The number of dimensions and domain space for the optimization is massive.  If we are doing a hill climb optimization where you start has some impact as to where you’ll finish.  My theory is that if you start with a moderately good evaluation function (i.e. not just piece square tables), you’ll have a better chance to improve it in the long term and avoid local optimums.  It’s a hunch – I don’t have any proof that this is the case. 

So I’m trying to put all of the components of a strong engine in place from day one.  I could be completely wrong but that’s why I’m putting some effort into creating the evaluation function before Maverick has even played one move in a real life chess game.

Debugging a Chess Move Generator

Perft is a fantastic way to debug your chess engine’s move generator (as well as the make and un-make routines).  Here are some tips which you may find helpful:

  • Find a set of test positions with known perft scores and which cover all the weird and obscure chess moves.  See my previous post Perfect Perft for some good examples.
  • Use an engine such as Sharper to split out the perft of each sub move for a position (for Sharper this is the divide command).  This way you can see which move gives you a problem.
  • Write a “flip_board” routine which (as the name suggests), flips the board.  This means you can test each perft position from white and black’s perspective.  The ensures you’re writing Color Blind Code!  Here’s my perft code which automatically flips the board:

//--Position 1
for(i = 0; i <= 1; i++){ set_fen(position, "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"); ok &= (perft(position, 6) == 119060324); flip_board(position); }

  • Use "asserts" liberally in conjunction with a "board_integrity" routine.  The "board_integrity" routine is key.  It simply ensures that everything makes sense in the chess board object / record - returning "true" if everything is OK and "false" if it find something wrong.  For example does the "all_pieces' bitboard agree with each individual color's occupancy bitboards?  Does the pawn has which has been created incrementally agree with a hash value created from scratch?  I add asserts at the beginning and end of almost every routine.  Here's my board integrity code:

BOOL integrity(struct t_board *board){
int i;
t_chess_color c;
t_bitboard b;

//-- Link between squares and bitboards
for(i = 0; i < 64; i++){ if (board->square[i] != BLANK)
if ((board->piecelist[board->square[i]] & SQUARE64(i)) == 0)
return FALSE;

//-- Color bitboards match up
for(c = WHITE; c <= BLACK; c++){ b = 0; for(i = KNIGHT; i <= KING; i++){ b |= board->pieces[c][i];
if (b != board->occupied[c])
return FALSE;
if (board->square[board->king_square[c]] != PIECEINDEX(c, KING))
return FALSE;
if (SQUARE64(board->king_square[c]) != board->pieces[c][KING])
return FALSE;
if (popcount(board->pieces[c][KING]) != 1)
return FALSE;

//-- All pieces match up with occupancy
if ((board->occupied[WHITE] | board->occupied[BLACK]) != board->all_pieces)
return FALSE;

//-- Castling rights
for (i = 0; i < 4; i++){ if (board->castling & ((uchar)1 << i)){ if (board->square[castle[i].king_from] != PIECEINDEX((i >> 1), KING))
return FALSE;
if (board->square[castle[i].rook_from] != PIECEINDEX((i >> 1), ROOK))
return FALSE;

//-- Hashing
if (board->pawn_hash != calc_pawn_hash(board))
return FALSE;
if (board->hash != calc_board_hash(board))
return FALSE;

return TRUE;

If you follow these tips and your engine agrees with the perft scores of the suggested positions (in debug mode), then I'm 99% sure you'll have a bug free move generator and make / unmake move routine.  Writing test routines never sounds appealing, but having code which you know is (almost) bug free is a GREAT feeling!!

Perfect Perft!

Over the weekend I got the last couple of bugs out of Maverick perft routine.  It now has a perfect node match for all the positions I’ve tried – Yippee!! 

In my opinion getting to this stage is a major milestone in the development of a chess engine.  If you can get this far you’re almost certainly capable of writing a fully functional chess engine.

It may be interesting to some to share my approach to perft.  When finding positions to use in a perft test you need to come up with “odd” position – ones which don’t occur too often and ones which are special cases.  Normally this means positions which contain the special chess moves i.e. castling, en-passant, promotions and capture promotions.  In particular I’ve found the best positions for perft are those which involve getting into, or out of, check using these special moves.  Also position with lots of revealed checks, pinned pieces are good.

Here’s the list of positions I used.  They are gleaned from Chess Programming on Wikispaces and this post on CCC:

rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1; perft 6 = 119060324

r3k2r/p1ppqpb1/bn2pnp1/3PN3/1p2P3/2N2Q1p/PPPBBPPP/R3K2R w KQkq -; perft 5 = 193690690

8/2p5/3p4/KP5r/1R3p1k/8/4P1P1/8 w - -; perft 7 = 178633661

r3k2r/Pppp1ppp/1b3nbN/nP6/BBP1P3/q4N2/Pp1P2PP/R2Q1RK1 w kq - 0 1; perft 6 = 706045033

1k6/1b6/8/8/7R/8/8/4K2R b K - 0 1; perft 5 = 1063513

TalkChess PERFT Tests (by Martin Sedlak)

//--Illegal ep move #1

3k4/3p4/8/K1P4r/8/8/8/8 b - - 0 1; perft 6 = 1134888

//--Illegal ep move #2

8/8/4k3/8/2p5/8/B2P2K1/8 w - - 0 1; perft 6 = 1015133

//--EP Capture Checks Opponent

8/8/1k6/2b5/2pP4/8/5K2/8 b - d3 0 1; perft 6 = 1440467

//--Short Castling Gives Check

5k2/8/8/8/8/8/8/4K2R w K - 0 1; perft 6 = 661072

//--Long Castling Gives Check

3k4/8/8/8/8/8/8/R3K3 w Q - 0 1; perft 6 = 803711

//--Castle Rights

r3k2r/1b4bq/8/8/8/8/7B/R3K2R w KQkq - 0 1; perft 4 = 1274206

//--Castling Prevented

r3k2r/8/3Q4/8/8/5q2/8/R3K2R b KQkq - 0 1; perft 4 = 1720476

//--Promote out of Check

2K2r2/4P3/8/8/8/8/8/3k4 w - - 0 1; perft 6 = 3821001

//--Discovered Check

8/8/1P2K3/8/2n5/1q6/8/5k2 b - - 0 1; perft 5 = 1004658

//--Promote to give check

4k3/1P6/8/8/8/8/K7/8 w - - 0 1; perft 6 = 217342

//--Under Promote to give check

8/P1k5/K7/8/8/8/8/8 w - - 0 1; perft 6 = 92683

//--Self Stalemate

K1k5/8/P7/8/8/8/8/8 w - - 0 1; perft 6 = 2217

//--Stalemate & Checkmate

8/k1P5/8/1K6/8/8/8/8 w - - 0 1; perft 7 = 567584    

//--Stalemate & Checkmate

8/8/2k5/5q2/5n2/8/5K2/8 b - - 0 1; perft 4 = 23527

Tomorrow I’ll talk about the perft debugging process – it’s quite simple once you put a couple of things in place!

Is Perft Speed Important?

The perft routine in Maverick is almost fully debugged!  I believe this is an important milestone in any chess engine’s development.  It’s really the first time the code “plays” chess, in the sense it generates moves, make and un-makes the moves, and iterates up and down the search tree.

I also think the speed of the perft routine is significant and is a measure which can be used to compare engines.  Now some people will disagree with me but here is my logic:

Validation of Structure:  The structure of a chess program could be defined as how the board is represented and the way moves are generated and stored.  This impacts the overall speed of an engine.   A perft routine is a good measure of manipulating this chess board structure.  Clearly there are other factors, most notable the size and complexity of the evaluation function.  But I would even argue a fast perft implementation is a indicator of a chess structure which can support a good evaluation function.  If you think about it, any half decent evaluation function must iterate over the squares and find some proxy for mobility.   At a basic level this is what is happening in a perft routine.  So I would argue a fast perft speed is an indication of a solid foundation upon which a strong chess engine can be developed.

Difficult to “Fiddle”:  Some people talk about an engine being “fast” based on the self reported measure of “number of nodes processed per second’.  The problem with nodes per second as a measure of speed is the definition of a node.  There is no standard definition of a node.  Some engine authors definite it as a call to the “alpha-beta” routine, while other base it around the make / undo move routines, and then again others use the “generate-move” procedure.  There isn’t one standard.  And the measure can easily be fiddled.  This is not the case with perft.  There are only really two ways to count the number of nodes; one being the number of leaf nodes, and the other is the total nodes (internal and leaf).  It would seem the standard measure is the number of leaf nodes.

Having said it’s difficult to fiddle there are three distinct approaches to perft routines and each one impacts the speed:

Make & Unmake All Moves: Most chess engines generate pseudo legal moves.  These are moves which are generally legal but may expose their king to a discovered attack and so are not actually level.  The reason chess engines generate pseudo legal moves is to save the time checking to see if there is a discovered check (which is costly) and may not even be required is there is a “cut-off”.  So the test for the discovered check is often carried out as part of the “Make-Move” routine.  The simplest perft implementation simply iterates through the depths generating the moves, make and unmaking each move in turn and counting the number of nodes.  This is the slowest type of perft implementation.  I regard this type of perft as a measure of the efficiency of the make and unmake routines, since this is the task which is carried out the most.

Counting Legal Moves at The Leaf: In contrast to the first method, another approach is to generate only truly legal moves.  In this approach a lot of time can be saved at the leaf nodes by simply returning the number of moves generated (without having to make and unmake each one).  The cost is a slightly more complex move generator which must detect potential discovered check.  In general this approach will be quite a bit faster than the first approach.  I regards this approach as a measure of the efficiency of the move generation routines.

Hashed Moves: In a perft search many transpositions occur.  This means the whole search can be significantly sped up by hashing the results and storing them in a hash table.  If the position reoccurs in the tree, the number of nodes in the sub-tree can be retrieved from the hash table and there is no need to search the sub tree.  I have not implemented this in Maverick.

Parallel Search: This is not something I have implemented in Maverick but the speed of the perft routine could be improved by implementing a parallel multiprocessor search.

Initial Perft Speed Results:

Based on the above logic I was eager to see how fast the new bitboard structure is compared to Monarch, which used a letter-box data structure. Monarch isn’t by any measure a fast searcher, so I was hoping for a speed improvement.  The position I used is from:

perft chess

FEN: r3k2r/p1ppqpb1/bn2pnp1/3PN3/1p2P3/2N2Q1p/PPPBBPPP/R3K2R w KQkq –

The number of leaf nodes to a depth of 5 ply deep is 193,690,690.  Monarch’s perft routine is of the “make / unmake all moves” type.  It managed to crank out the six five ply perft for the above position in 71.5 seconds on my Core i7 2670QM running a 2.2 GHz.  To my surprise Maverick blew this away with a time of only 16.3 second.  This is 4.3 times speed up – woohoo!!  Maverick’s legal move generator approach completed perft 5 in exactly 5.0 seconds.

Both engines were compiled using Microsoft Visual Studio Express 2012 in 32 bit mode.  I imagine there will be a reasonable speedup when I move to 64 bit and add the SSE bit-twiddling routines.  This is a much bigger speed-up than I anticipated and illustrates the superiority of the magic bitboard approach.

What type of perft routine does your engine employ and how fast is it?

Up to my Knees in Debugging

I’ve been in Tokyo this week on business.  So there hasn’t been too much time for chess programming (sigh).  Hopefully I’ll have time on the 13 hour flight back via Dallas Fort-Worth. 

I’m in the process of using perft to debug the move generating and make / unmake routines.  For those who haven’t come across perft, it’s a routine which calculates all of the possible positions from a give starting position to a specified depth.  It’s a great way of debugging your core code as there are known possible where stable programs have calculated the correct nodes counts. I’ll come back and talk about perft in a later post – once I’ve managed to get all of these bugs out! 

As I’m debugging I’m struck by just how many bugs there are in my code.  I’ve always thoughts of myself as a reasonably defensive programmer and quite good at creating bug free code.  But there are a myriad of little bugs which are cropping up.  Take for example this macro:

#define SQUARE64(s) ((t_bitboard)1 << s)

It's not too complex.  It take a square number between zero and 63 and returns the bitboard version.  But, to my surprise, it doesn't work all of the time!  I have some code which return the potential en-passant square as a bitboard:

board->ep_square = SQUARE64((from + to) >> 1);

This gave spurios result!  The reason was to do with the precedence of operators.  The corrected version simply encloses the "s" in brackets. 

#define SQUARE64(s) ((t_bitboard)1 << (s))

This is most likely a newbie error and highlights my lack of deep experience with "C", but it also illustrate the subtleties required to simply get a chess engine up and running!



Make Move & Unmake Move…

I finally got the move generators working.  There are two primary move generators.  The first “generate_moves” is called when the side to move is not in check.  It creates a list of pseudo moves (i.e. some may be illegal due to discovered check).  The second routine, generate_evade_check creates a list of legal moves which get out of check.  There were quite a few bugs to hunt down in creating these routines but they now seem OK (although I’m sure the perft analysis will highlight some more bugs).

So I’m now looking at making and unmaking moves.  I’m trying to resist over-optimize at this stage.  It’s difficult.  It’s a balancing act between not over-optimizing and at the same time put in place all of the key elements which will enable Maverick to run at a reasonable pace.  I’m hoping to finish the make / unmake routines on my trip to Tokyo this week.  It’s a long trip via Dallas and the Dallas to Tokyo leg is 13 hours – I should have plenty of time.

Free Chess Bitboard Viewer

If your chess engine is bitboard based you will need to be able to visualize what the bitboard looks like as an actual chess board.  I can’t image developing a bitboard engine without this type of utility.  So I created a simple chess bitboard viewer. 

You download it here: Chess Bitboard Viewer

Chess Bitboard Viewer

It runs on Windows 32 / 64 bit systems.  It’s super-easy to use (i.e. if you cannot use it think about finding a different hobby).  I’ve zipped it up since I assume downloading a raw (and rare) exe file will make some virus detector scream like a banshee!  BTW it has been checked for viruses and it’s clean – (but use it at your own risk).

You can click on a square and the decimal number will update – or you change the decimal number and the board will update.  You can also copy and paste from the clipboard. 

I hope it’s useful – feedback and comments welcome!