Is a Lazy Eval a Good Thing?

Lazy evaluation is a technique often used to save time when evaluating a chess position. If a rough estimate shows that a position’s score is much higher than beta or much lower than alpha, the full evaluation is unnecessary. For example, if you’re a queen down it’s a waste of time checking to see if you have a weak doubled pawn on a4. So in the right situation this approach can save a lot of time and speed up an engine. Of course there is the risk that the lazy estimate is way off and the position is incorrectly evaluated. The technique was quite popular in the 80s (and probably the 90s). Ed Schröder noted it gave him an enormous speed increase in Rebel.
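Here’s a minimal sketch of the idea in Python. It isn’t code from Rebel or Maverick: the `Position` class, the function names and the 400 centipawn margin are all made up for illustration, and the single `positional` number simply stands in for whatever slow terms a real engine would compute.

```python
from dataclasses import dataclass

LAZY_MARGIN = 400  # hypothetical safety margin in centipawns


@dataclass
class Position:
    """Toy stand-in for a board; scores are centipawns from the side to move's view."""
    material: int    # cheap to compute
    positional: int  # stands in for the slow terms (pawn structure, mobility, king safety)


def full_eval(pos: Position) -> int:
    """The accurate score: the cheap term plus every expensive term."""
    return pos.material + pos.positional


def lazy_eval(pos: Position, alpha: int, beta: int, margin: int = LAZY_MARGIN) -> int:
    """Return early when the cheap score is already far outside (alpha, beta)."""
    quick = pos.material
    if quick - margin >= beta:   # even a pessimistic correction still fails high
        return quick
    if quick + margin <= alpha:  # even an optimistic correction still fails low
        return quick
    return full_eval(pos)        # close to the window: pay for the full evaluation


queen_down = Position(material=-900, positional=35)
print(lazy_eval(queen_down, alpha=-50, beta=50))  # -900, slow terms skipped

balanced = Position(material=20, positional=35)
print(lazy_eval(balanced, alpha=-50, beta=50))    # 55, full evaluation

# The risk: positional terms larger than the margin are silently ignored.
wild = Position(material=-500, positional=480)
print(lazy_eval(wild, alpha=-50, beta=50), full_eval(wild))  # -500 versus -20
```

In the last case the lazy score fails low while the full score of -20 sits inside the window, which is exactly the kind of error the margin is supposed to make rare.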

I’ve been giving the issue some thought and I’ve come to the conclusion that I will not add lazy evaluation to Maverick. My reason is simple – I’d like the leaf nodes to be evaluated as accurately as possible. My hunch is that accurate leaf evaluations will let me be more selective in positions closer to the root (which should dramatically reduce the tree size). This type of decision is part of the design philosophy of Maverick.

I also suspect the lazy evaluation technique is less relevant in modern programs which use null move and late move reductions. Back in the 80s there was still a battle between the Spracklens’ brute force approach and Lang’s selective search. In those days it was common to only evaluate at the leaf nodes (i.e. not evaluate the interior nodes). In this situation the leaf node evaluations could have a wide range of values, some much higher than beta and others much lower than alpha. So lazy evaluation worked well. In contrast, null move and late move reductions have the effect of keeping the scores in the search tree closer to alpha and beta. Null move pruning is good at quickly weeding out obvious fail-high moves (i.e. very good positions), and late move reductions shrink the tree for obvious fail-low positions (i.e. very bad positions). So with these techniques in place, lazy evaluation becomes less relevant and more risky.

Of course this is just theory – let’s see how it works in practice.

What are your thoughts on Lazy Evaluation?

  • Thomas Petzke

    Hi Steve,

    my PBIL optimization actually turned LAZY eval on, so far in both runs. But it mandates an extremely high safety margin, somewhere between a rook and a queen value. So I haven’t measured the hit rate yet, but I expect it to be small. With such a high margin I guess the effect of LE will be hard to measure.

    Thomas…

    • Steve Maughan

      This confirms my suspicions that LE is less relevant. I think Ed is quoted as saying LE doubled Rebel’s speed (although I can’t find the exact quote) – it’s a different picture for today’s engines.

  • Don Dailey

    I know that some very strong programs are using it, but my theory is still the same as yours: if you use lazy evaluation you have to compensate elsewhere. It is possible that this is all just a trade-off that gets you to the same place, but if that is the case I would prefer to have stable evaluations and play the risk game in other places.

    • Steve Maughan

      Hi Don – as I mentioned in another post, I view chess programming as a manual hill climbing optimization. I feel it would be a (big) mistake to trade accuracy of evaluation for a speed-up in the early stages of development. It almost certainly will lead to a local optimum.

    • Thomas Petzke

      I think one mistake a lot of people (including myself) make at some point in their chess programming career is to assume positions in the search tree look reasonable, like positions you would encounter when playing OTB. I once recorded positions where LE led to a wrong decision (where LE would have triggered, but the full eval, when performed, landed inside or beyond the window). Those positions were frequent and looked totally crazy, like suddenly one side had three connected passers. I then tried to reduce the errors by limiting LE to positions with no passed pawns, no queen near an enemy king etc… but I never really liked that bit, and in a later version I removed it.

  • I use lazy eval to avoid calculating mobility and king safety. I believe it’s a significant win for my engine, though I really need to test it formally. I suspect it benefits mailbox engines more than bitboard engines, because moves / attacks can be calculated much faster with bitboards.

    • Steve Maughan

      Hi Erik – did you implement the lazy eval before null-move and Late Move Reductions?

      • No. I already had null move and LMR. I implemented lazy eval after I noticed mobility and king safety slowed down the search significantly.

        • Steve Maughan

          Interesting!

  • I ran some games yesterday to test lazy eval. 2000 games (2 min + 1 sec) with the default setting, lazy eval = 400. 2000 games with lazy eval disabled. Same opponents and opening book.

    The lazy eval version scored better by 21 Elo. Each score is +/- 13 Elo, so I guess the results are within the error bar. I believe lazy eval is beneficial for my engine, but I should do more testing to narrow the error bar. Also, I should experiment with other lazy eval margins.

    • Steve Maughan

      Erik – thanks for testing! All interesting stuff. My only concern with Maverick is that implementing LE at an early stage may give an improvement now but hinder future improvements. I think Don’s comment was interesting when he said he’d “rather take risks in other places”. I’m trying to follow a philosophy of having as accurate an evaluation as possible at the tips. I can then “take risks” by pruning (or reducing the depth of) nodes which don’t seem to be attractive based on this evaluation. Let’s see if it works! Thanks for sharing the test.

  • I’d like to make one more comment here. I’d like to know what other people do, and whether this is less of an issue for bitboard engines.

    My engine estimates a static score in interior nodes. The estimate does not include mobility or king safety. The engine uses this estimate to decide whether to try a null move (score >= beta) and whether a move is futile (score + margin <= alpha).

    Because it doesn’t do a full eval, one could consider this lazy eval. I’m not sure how most programmers define lazy eval, whether the term implies lazy eval at the leaf nodes only or also at interior nodes. My engine does lazy eval at both kinds of node.

    The reason I do lazy eval at interior nodes is because I put a lot of effort into staged move generation. Doing a full eval would generate all moves to calculate mobility and king safety and would obliterate the performance improvement of staged move generation.

    How do others handle this? Or is it not a big concern in a bitboard engine?

  • Evert

    I wanted to add an evaluation hash to Jazz, but I didn’t want to hash “lazy” scores (it gets horribly complicated if you need to keep track of bounds as well), so I disabled lazy evaluation. This slowed things (i.e. time-to-depth) down a lot, but interestingly it didn’t make the program (much) weaker. When I added the evaluation hash it more than compensated for the loss in speed due to no longer doing lazy evaluation, so I certainly don’t regret pulling it. I still pass alpha and beta to the static evaluation function, but I should probably just remove them.

    • Steve Maughan

      Hi Evert – Interesting! This is what I suspect. Taking quick wins (with significant errors) early in the development of an engine (I think) can make true progress more difficult later when you need accuracy to make deep selectivity decisions. Thanks, Steve

  • “Taking quick wins (with significant errors) early in the development of an engine (I think) can make true progress more difficult later when you need accuracy to make deep selectivity decisions.”

    I don’t understand your statement. Lazy eval is not a major architectural decision that takes you far down a road with little hope of changing direction. Simply turn it off with a boolean switch. Or leave it on but periodically measure its interaction with other search and eval parameters. This is easy to do if you expose it as a UCI option.

    I’m not advocating for lazy eval. I just disagree with your characterization of lazy eval as an impediment to progress. Every search and eval parameter is an impediment to progress, if it’s not measured.

    • Steve Maughan

      Hi Erik – you’re right. Lazy Eval is not a structural component of a chess program. However, if it is part of your engine and is a potential impediment to future progress, you will need to try every possible improvement with and without Lazy Eval. The same goes for futility pruning (which is another source of leaf node errors). This just makes testing more complex.

      The only reason I see it as a potential “impediment to progress” is that it introduces errors at the leaf. While I realize I cannot completely eradicate leaf node / QSearch errors (by their very nature), I’d like to minimize them. That’s one of my philosophies in designing Maverick. I’m hoping I can use a good QSearch to make more radical pruning decisions, which will greatly reduce the tree size.

      Please don’t read my posts and articles as definitive statements on the way to develop a chess engine. It’s really just a record of my thinking as I develop Maverick. Some of it could be right – I’m sure some will be wrong. Hopefully I’ll be able to correct what is wrong!

      Steve

  • I hear you. Selectivity adds both strength and error. We have to pick and choose our battles. I’ve found that I’ll focus on search for a while, get obsessed with it. Then realize I haven’t paid much attention to evaluation and obsess over it for a while. And bounce back and forth like this. It’s difficult to work on a little bit of everything all at the same time. The nature of chess programming demands minute scrutiny and deep thought.