A brief survey of psychological studies of chess

I wrote this as a 3rd year term paper for my Human Information Processing class. The paper is presented in its original form although I may make odd minor corrections from time to time.

Chess is an ancient game of skill. It is one of the few such popular games that is devoid of luck. As a result, it has been a popular choice of topic for psychological research, as it provides psychologists with insights into how humans think about problems. Chess has also been a very popular choice for artificial intelligence implementations, and the results have been very successful. For the first time last year, a computer program beat the world champion in a game with regulation time controls.

The main question in psychological studies of chess is the following: how are better players able to win more often? Several factors have been investigated but there are no certain physiological advantages that always lead to better chess-playing. The most common finding of studies is that experience plays a great part in determining one's ability, which is certainly a promising thought for mediocre players. In essence, no chess gene has been found.

This paper will provide an overview of psychological research into chess, with an emphasis on theories that have been proven inaccurate by later research, to show the development of research in this field. As with any other branch of psychology, theories are made, and then refined when conflicting experimental evidence becomes available.

The first serious psychological study of the game of chess was conducted by Alfred Binet, in 1894. Binet, who was best known for his early intelligence tests, observed blindfold chess players as a subset of his investigations into memory. To the average person, playing a game of chess without sight of the board represents an extremely difficult, if not impossible challenge for the memory.1 Binet's experiment consisted of a survey which was taken by players of all skill levels, from novice to master. He came to the conclusion that blindfold chess players need knowledge and experience, imagination, and memory.2 The masters who took part in the survey gave introspective accounts that had some similarities and yet several differences concerning their blindfold play. A common thread among their responses was the fact that they did not use tactile imagery to represent the board. In addition, they were generally able to remember all the moves played in a sequence of blindfold games. One master, Goetz, was able to quickly recall all 336 moves that he made over 10 blindfold games played simultaneously.3 Binet concluded that verbal memory was an integral part of blindfold play. Finally the subjects reported the need to be aware of a general plan of action for each game,4 although this would seem to be a necessity for both blindfold and regular chess play.

The masters differed on whether they used visual or abstract imagery to represent the board. The majority said that they used only an abstract representation, combined with subvocalizations of previous moves, to mentally examine the board. A small minority, including the well-known master Blackburne, claimed to visualize an actual chessboard with pieces on it corresponding to the current position, "just as if before the eyes."5 Binet thus came to the realization that his original hypothesis of a strong visual memory being essential for blindfold play was wrong. In addition, he did not explore the almost direct correspondence between experience and ability in blindfold chess.6 Fine (1967) claims that any master (rating of 2200 or above) should be able to play at least one game of blindfold chess.7

Reuben Fine was a prominent chess-player during the thirties, who competed in the famous AVRO tournament of 1938. He also had considerable experience in psychoanalysis, and in 1956 the National Psychological Association for Psychoanalysis published his work, The Psychology of the Chess Player. The book gives a very Freudian account of the game of chess, and is useful only to demonstrate the advances that have been made in the realm of psychology with respect to chess within the past forty years.

Fine claimed that chess is a substitute for war. The king is held to represent the father, while the queen is the mother. In addition, the rook, bishop, knight and pawn are taken to be phallic symbols.8 Fine draws a lot of significance from the fact that promoted pawns may become any other piece except for the king/father. This restriction implies to Fine that chess-playing boys are discouraged from growing up to be like their fathers. Unfortunately, Fine's analysis suffers from its entirely armchair nature. There are no experiments or observations, other than a few biographies of well-known grandmasters, to support the hypotheses presented in the book. One consistent observation in Fine's work is that master chess players all have differing personalities and backgrounds.

The first true psychological enquiry into the minds of chessplayers was made by the Dutch psychologist Adriaan de Groot. De Groot was a master, although certainly below the level of the top players in the world. However, he had much more experience and knowledge of the game than did Binet. De Groot's book was titled Thought and Choice in Chess (translated from Dutch) and was largely based on his study of chess players of differing abilities. He was able to interview such giants of the chess world as Alekhine and Euwe (both World Champions), and "lesser" grandmasters such as Keres, Tartakower, Flohr, and even Fine, the chess-playing psychoanalyst. In addition, de Groot studied several masters, experts, and "class" (or lower-ranked) chess players.

De Groot gave his subjects a position set up on a chess board. Their task was to determine the best move to make, and to attempt to verbalize all of their thoughts. Fortunately, even mediocre chess players have a wide vocabulary of chess-specific terms ("pin", "fork", "back-row mate") that allow them to describe their thoughts very specifically. One problem with such talking protocols is that only conscious thought can be captured by the player talking about his or her ideas. Psychologists often refer to a process used in solving chess problems as an automatic one, meaning that it is not thought about consciously. This tends to lessen the weight attached to results of protocol experiments, but they are interesting nonetheless. The positions given differed in that some were decisive tactical positions (meaning that a relatively short combination of moves could force the opposition to resign), some were more positional in nature (meaning that a long-term strategy needed to be devised), and still others were random legal positions.

Four distinct stages in the task of choosing the next move were noted. The first stage was the phase of orientation, in which the subject assessed the situation and determined a very general idea of what to do next. The second stage, the phase of exploration, was manifested by looking at some branches of the game tree. The third stage, or phase of investigation, resulted in the subject choosing a probable best move. Finally, the fourth stage, the phase of proof, saw the subject convince himself or herself that the results of the investigation were correct.9 In a regular game, however, the orientation phase would be much shorter, as the player would already be aware of the long-term threats, and would only need to examine the tactical possibilities created by the opponent's last move.

De Groot greatly respected the work of Otto Selz, a psychologist from the Würzburg school of psychology. Selz sought to create a theory of directed thought, which took one from being presented with a problem to finding its solution. Selz used verbal reports (as did de Groot), but he asked subjects different questions from his contemporaries. They were asked to name opposites to items, and to determine what part an item was of a greater whole, or what whole was greater than a smaller part.10 In essence, Selz claims that upon being presented with a problem, subjects have some awareness of the goal, and that one anticipates the goal being reached through a schematic cognition.11 Selz believed that thinking about a problem was a continuous action, or a "linear chain of operations."12

De Groot divided his four phases of problem-solving into two broader progressions: integration to elaboration. In accordance with Selz's theory, each such transition forms a deeper understanding of the problem, which gets the problem-solver closer to the goal.13 These progressions occur right through to the fourth stage, in which the chess player believes that the correct move is chosen.

De Groot also contributed to the first exclusively scientific work on the psychology of chess skill. He exposed subjects to a position taken from a game, for a brief period (usually 3 to 4 seconds). The idea for this experiment came originally from Djakow, Petrowski and Rudik, who conducted their studies much earlier, in 1927. De Groot found that the top players (grandmasters and masters) were able to recall 93% of the pieces, while the experts remembered 72% and the class players merely 51%.14 De Groot interpreted the results by regarding the stronger performances by the higher-ranked players as a product of their experience, not of any difference in perceptual abilities. This is borne out by a study done by Chase and Simon, in 1973, in which they tested players' memories of game positions versus random positions. On positions taken from games, performance on this test declined as the player's Elo rating (chess rating) declined. However, given random positions, all levels of players did approximately the same.15 This suggests that the higher-ranked players are able to use some form of chunking, or pattern-matching, that allows them to rapidly encode macro features of the positions. For instance, even a mediocre player would be able to encode the six pieces comprising a castled king and rook, fianchettoed bishop and three surrounding pawns as a set, while a beginner would be forced to remember these separately. Further analysis by de Groot suggested that the functional relationships between the pieces were remembered better than the actual spatial relationships.16 In other words, a small subset of the board in which a bishop was pinning a knight to its queen would be remembered in terms of the pin relationship, rather than by recalling the bishop to be at g5, the knight at f6 and the queen at d8.17

Holding describes some of the experiments undertaken by Chase and Simon in 1973 on chunking. They defined a chunk as remaining intact through its encoding into long-term memory if at least two-thirds of its pieces remained together upon recall. Using this, they found that 96% of the class A players' chunks remained constant between trials. However, the results of masters were very surprising. Their chunks remained the same only 65% of the time, which is poor in comparison to the class A players, and similar to weaker class players.18 At first this seemed to dispel the theory of chunking, but a new hypothesis was created: that masters are able to make imaginative insights that involve the restructuring of pieces.
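The two-thirds criterion described above can be written down as a small check. This is only an illustrative sketch: the function name `chunk_intact`, the representation of a chunk as a set of piece-and-square labels, and the example squares are assumptions made for the example, not details taken from Chase and Simon's procedure.

```python
def chunk_intact(original_chunk, recalled_chunk):
    """Return True if at least two-thirds of the original chunk's pieces
    reappear together in the recalled chunk (hypothetical helper)."""
    original = set(original_chunk)
    if not original:
        raise ValueError("a chunk must contain at least one piece")
    kept = original & set(recalled_chunk)
    # Integer arithmetic avoids any floating-point doubt at exactly 2/3.
    return 3 * len(kept) >= 2 * len(original)


# A castled king position: king, rook, fianchettoed bishop and three pawns.
castled = {"Kg1", "Rf1", "Bg2", "Pf2", "Pg3", "Ph2"}

# Four of the six pieces recalled together: the chunk counts as intact.
print(chunk_intact(castled, {"Kg1", "Rf1", "Bg2", "Pf2"}))

# Only three of the six recalled together: the chunk counts as broken up.
print(chunk_intact(castled, {"Kg1", "Rf1", "Bg2"}))
```

Applied across trials, a measure like this is what yields figures such as the 96% and 65% quoted above.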

The concept of chunks, or structural units was neatly shown by the "pennies-guessing" task. Subjects are shown a chessboard with pennies representing each piece, taken from a real game position. Their task is to recreate the position by guessing which piece belongs on each square. In addition, the player is told the number of moves that have been made in the game, and whose turn it is to move. Masters are virtually perfect in placing the correct pieces on the board, while class A players still average over 90% correct.19 These results show that large libraries of likely piece configurations are known to skilled players. In fact, given a position that occurs fairly early on in a game (up to move 25-30), masters are generally able to reconstruct all of the moves that led up to it. During one of Alekhine's protocols taken by de Groot, the world champion (in 1938) commented that: "In half an hour I should be able to logically reconstruct the moves up to this position."20

Chess players need to be able to perceive threats in order to determine their next moves. Saariluoma conducted a series of simple experiments which suggest that grandmasters are much quicker than novices in certain lower-level perceptual processes. In the first of these experiments, a king of one colour was placed on the chessboard, along with a piece of the other colour. The subject had to state whether the king was in check or not. The average latencies were as follows: novices: 1550 ms, class players: 1250 ms, experts: 900 ms, grandmasters: 650 ms.21 The results show that reaction time falls steadily as skill increases.

However, this experiment is extremely artificial, and would not represent a position encountered in a real chess game. Saariluoma then conducted the test with a position containing 20 pieces. The purpose of this experiment was to determine if subjects used a serial comparison. If a serial comparison was indeed used, then one would expect the average reaction time to be 10 times longer than the latency found in the simple case. If one performs a sequential search on each of the pieces, then the average cost is O(N/2) for the positive trials (those in which a check relationship is present).22 However, if a parallel comparison was taking place, the latency would be not much longer than it was in the simple case (king and one other piece). A total of four conditions (the permutations of game position or random position, and check present or no check present) were tested on three groups of chess players: novices, class players and masters. Each group showed the same trend in their measured latencies. The most difficult task is determining that there is not a check relationship in a random position, which is to be expected, since 20 comparisons must take place. Also, since the positions were random, the usual patterns in which check is given would not be present, complicating the task. Second in difficulty was a game situation with no check present, followed by a random board with a check present, and finally a game situation with a check. The masters took only 1500 ms to identify a check in the easiest case.23 Since their average latency for the single piece condition was 650 ms, it is quite clear that adding 19 pieces to check does not cost a proportional amount of mental processing power. Some of this action by the "threat detectors"24 must occur in parallel.
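The arithmetic behind this argument can be spelled out in a few lines. The sketch below follows the paper's own simplification, namely that the whole 650 ms single-piece latency is the cost of one comparison and that a serial search which stops at the first check examines half the pieces on average. The function name `serial_prediction_ms` is an assumption made for the example.

```python
def serial_prediction_ms(single_piece_ms, n_pieces):
    """Predicted latency if pieces were examined one at a time and the
    search stopped at the first check, so that on average half of the
    pieces are examined on positive trials (simplified model)."""
    return single_piece_ms * n_pieces / 2


single_piece_ms = 650   # masters, king plus one piece (from the text)
observed_ms = 1500      # masters, 20-piece game position with a check

predicted_ms = serial_prediction_ms(single_piece_ms, 20)
print(predicted_ms)                   # 6500.0 under the serial model
print(observed_ms / single_piece_ms)  # how many times slower masters actually were
```

The observed latency is only a little over twice the single-piece figure, far short of the tenfold increase that a purely serial search would predict.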

More recently, computer simulations have been used to model human chess-playing behavior. Tikhomirov and Poznyanskaya performed an experiment in which subjects' eyes were linked to a device which showed the location of the board that they were focussing upon. They noted that the eyes followed the paths of the candidate moves and over primary attack or defense relationships.25 Simon and Barenfeld attempted to model this behavior with their program PERCEIVER. PERCEIVER was programmed with simple heuristics, similar to those that a typical chess-playing program would have. PERCEIVER's simulated eye movements were very close to those of human subjects.26 One difference evident between humans and PERCEIVER is that humans tend to focus more often on vacated squares. This suggests that spatial relationships between pieces are the most important quantities on the chessboard. Similar evidence has been amassed from studies on hand movements of blind players.27

Chess requires an immense amount of knowledge to be played at its highest levels. It has been estimated that grandmasters have learned between 50 000 and 100 000 patterns and moves.28 As with other difficult tasks, it takes years of intense study to achieve a high level of play.

The game of chess would be much different if humans had a larger capacity of working memory. It is this short-term memory that acts as a limit on our chess skill. Most amateur players find it difficult to "see" more than five moves ahead in a position. This limit on searching forces inaccuracies into play because such players are simply unaware of the long-term consequences of their moves. Although it is still possible to make long-term plans (for example, start a pawn storm on the king-side), it is the long-term tactical aspect of chess that suffers most as a result of our fairly small working memory. In other words, it is extremely difficult for average players to determine whether their short combinations are truly sound.

Computer chess programs do not suffer from this same problem. They are exceptionally strong tactically, since they can simply calculate every variation, to ascertain that their position will not worsen within the next few moves. However, the long-range planning of computer programs is terrible, because they have no way of doing it. There is an ongoing debate among programmers of computer chess programs concerning the proper approach to searching. One group believes that the best way to create a better chess-playing program is to build better hardware and simply search more of the game tree. This is very costly, however. The game tree expands by a factor of approximately 30 (assuming there are an average of 30 legal moves at each position) for each ply searched. This becomes prohibitive fairly quickly. The other group believes that more work must be done to generate better static evaluations, so that the positions will be evaluated more accurately. In reality a combination of these improvements is needed.29
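To see how quickly the search becomes prohibitive, one can tabulate the number of positions at each depth. This is a rough sketch that takes the branching factor of 30 from the text as a constant; the function name `positions_at_depth` is chosen for the example.

```python
def positions_at_depth(depth, branching=30):
    """Number of leaf positions after `depth` plies, assuming a constant
    branching factor (30 legal moves per position, as in the text)."""
    return branching ** depth


for ply in range(1, 7):
    print(f"{ply} ply: {positions_at_depth(ply):,} positions")
```

Each additional ply multiplies the work by 30, so a full search of only six plies already involves 729,000,000 positions.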

Several experiments have shown that stronger players are able to search deeper than weaker ones. Charness tested 34 chessplayers of varying skill levels. He found that the maximum depth averaged 5.7 plies; however, for each increase in skill level of one standard deviation, the depth increased by 1.4 plies, a very significant change.30 Although one should be careful with extrapolation, these findings suggest that a player with a rating of 2600 would be able to search to an average depth of 13.5 plies.31 This is consistent with statements from highly-ranked players. To further the support for this belief, Holding and Reynolds carried out an experiment in which subjects had to memorize random positions and then determine the best move. They found a correlation of r = 0.44 between skill level and depth of search.32 Although the advantage of one or two plies may not at first seem significant, it is a very practical advantage that can decide chess games.
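The ratings used throughout this section are on the Elo scale, and endnote 31 quotes a figure of about 91% for a 400-point rating gap. That figure can be reproduced from the standard Elo expected-score formula, sketched below. Note that the expected score counts a draw as half a point, so it is not strictly a probability of winning; the function name `expected_score` is chosen for the example.

```python
def expected_score(rating_a, rating_b):
    """Expected score of player A against player B under the standard
    Elo formula (a win counts 1, a draw 0.5, a loss 0)."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))


# A 400-point gap, e.g. a 2000-rated player facing a 1600-rated player.
print(round(expected_score(2000, 1600), 3))

# Equally rated players are expected to score half the points.
print(expected_score(1500, 1500))
```

For a 400-point gap the formula reduces to 1 / 1.1, or roughly 0.909.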

One piece of experimental evidence that seems to conflict with the previous findings comes from studies of players of different ages. Charness conducted a study comparing players' skill levels, ages and search protocols, given a test position. The protocol required players to state their thought processes while generating their new move. Charness recorded the number of episodes (distinct paths stemming from a specific move or move combination), total moves, and base moves. Data among players of similar ages supported the Holding and Reynolds conclusion. The average 20-year-old player rated 1569 considered nine episodes and 22 total moves, while the average 20-year-old rated 2000 considered 12 episodes and 49 total moves. However, 50-year-olds with a rating of 2000 averaged 9 episodes and 36 total moves, which are numbers much closer to the lower-ranked 20-year-old rather than the higher-ranked one.33 Clearly, age makes the search process more efficient, since the players are achieving the same results with different amounts of effort. In terms of base moves, the average 20-year-old searched a mean of 4.1 base moves (starting moves from that position) while the average 50-year-old started his search from only 2.8 base moves.34 This data suggests that brute force search is not the only way to become a better chess player. Presumably, the experience of the older players allows them to choose better candidate moves, and to search more efficiently. There may be a higher-level process which screens out certain moves which are not trivially poor, based on a database of chess knowledge.

One facet of chess skill is being able to make an accurate static evaluation of a position. In other words, given a chessboard, and the side to move, to be able to judge which side stands better at that time. Of course, creating static evaluations is the main challenge of programming computers to play chess. Heuristic functions are notoriously inaccurate, as they are ill-equipped to deal with specific situations. Humans are generally fairly good at static evaluations, and that is one advantage that we have against programs, due to our ability to use long-range planning. Holding conducted an experiment in which players of different skills had to perform a static evaluation on a position. Subjects used a scale of 10 to 20, where 10 represented an even game, 15 was an almost winning advantage, and 20 represented that they thought resignation from that side was the only solution.35 The play of grandmasters was used as a baseline (the "correct" solution), as the test was administered to class players from E through A.36 The mean number of errors dropped steadily as the skill level increased, E: 3.6 errors, D: 3.0, C: 2.9, B: 2.3 and A: 1.6.37
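As an illustration of what a heuristic static evaluation looks like in a program, and of why such functions can be inaccurate, here is about the simplest one possible: a material count. It is a sketch only and is not taken from any program or study discussed here. The piece values (pawn 1, knight and bishop 3, rook 5, queen 9) are the conventional rule-of-thumb values, and the input is assumed to be the piece-placement field of a FEN string, with uppercase letters for White.

```python
# Conventional rule-of-thumb piece values; the king is not counted.
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9, "k": 0}


def material_balance(placement):
    """Return White's material minus Black's for a FEN piece-placement
    string. Positive favours White, negative favours Black."""
    score = 0
    for ch in placement:
        value = PIECE_VALUES.get(ch.lower())
        if value is None:
            continue  # digits (empty squares) and '/' (rank separators)
        score += value if ch.isupper() else -value
    return score


start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"
print(material_balance(start))  # 0: material is level in the starting position

black_without_queen = "rnb1kbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"
print(material_balance(black_without_queen))  # 9: White is a queen up
```

A function like this knows nothing about king safety, pawn structure or piece activity, so it would rate a position as favourable even when the side that is ahead in material is about to be checkmated. Practical evaluation functions add many more terms, but the same kind of blind spot remains for situations the terms do not cover.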

An interesting case comes in certain ambiguous positions in which it is unclear to which side the advantage lies. These are often very good indicators of chess skill, since players of different rankings tend to predict different outcomes for the game. Comparing these with the "true" result (for practical purposes, the evaluation that a grandmaster would give it) of the game gives a very good indication of a player's relative strength. It is interesting to note that players of differing abilities tend to judge middle-game positions as a win, draw or loss equally well.38 However, in judging end game positions, weaker players tend to make poorer judgments, presumably because their knowledge of positional play is weaker.39

An interesting experiment to do with static evaluations was performed by Holding and Reynolds in 1982, bringing new evidence to current theory. They gave subjects a random position (as before, meaning a legal position, but one with pieces located randomly) and asked them to give an evaluation after they had reconstructed the position. Two evaluations were made, one immediately upon reconstruction and one after a five-minute period of analysis. The results were quite surprising: there was a very small correlation between rating strength and correct evaluations. The correlations were r = 0.15 for the first one and r = 0.09 for the second.40 This small correlation shows that higher-ranked players draw on a lot of experience from real games, rather than just on piece relationships to make their judgments. If only the piece relationships were important, then no significant difference would be found between the correlations of skill level and judgment ability, between the random position and game position situations.

Saariluoma takes a cognitive psychological approach to the game of chess. In his theory, the board is a mental space, upon which several operators may be applied. Some of these include the concept of a transfer (in which a piece is moved so that on a future move it may be moved to a better location), escape (a piece is moved so that it may not be captured) and pin (a piece is prevented from moving because if it did, a same-colored piece would be subject to immediate attack).41 Saariluoma uses these and other concepts to illustrate decision-making processes made during the course of chess games. These principles can be chained, or linked together to form a more complex plan, or embedded, in which the results of one principle depend on the existence of another relationship in parallel with it. Further research in this area is needed to clarify the types of associations made by chess players.

Current research by Saariluoma is on this subject of spatial relationships in chess. He has written the algorithm for a computer simulation model, M1, which is given a position, and is able to create search spaces consisting of the operators mentioned previously in a very similar way to humans.42 M1 generates a very small number of moves, as do humans. Unlike computer programs which statically evaluate a large number of positions, M1 and humans dynamically assess the position and thus need to evaluate a much smaller portion of the game tree.

The difference between static and dynamic evaluation, according to Saariluoma is that in a static evaluation, the defined operators are applied aimlessly, but with a dynamic evaluation, they are only applied in the correct context.43 For example, a computer program will evaluate a king-safety heuristic after each move, thus costing computing power. However, during large parts of the game, humans will not even consider the safety of their king, knowing that danger is virtually impossible in typical situations. These shortcuts allow humans to save processing power.

Of course, shortcuts bring about inaccuracies in certain situations. Saariluoma conducted an experiment with nine strong chess players, on a position that resembles one encountered in a standard mating technique, the "smothered mate."44 Players followed the standard sequence of moves, which involves a well-known pattern including a discovered check and a double check, forcing the enemy king into a checkmate. However the sequence takes five moves, whereas in Saariluoma's position, a mate in four moves was possible. None of the strong players found it without clues that such a mate existed from the experimenter.45 This experiment showed that players use their existing knowledge when at all possible, at the expense of finding the most efficient solution. A parallel to this phenomenon occurs in computer chess programs. State-of-the-art programs have tablebases, or precise instructions for optimum play in a variety of end game situations (for example, king, rook and pawn against king and rook). However, in many instances, the programs will sacrifice unnecessarily in order to reach a certainly won position in the tablebase.46 It is not efficient play, but the results are the same.

As with other branches of psychology, there are many unresolved issues in the psychology of chess skill. One of these is that of motivation. The question of whether players perform better under pressure (perhaps prize money for a win, or an electric shock for a loss) is an interesting, if unethical one.

Although it has not been proven that certain attributes lead to a higher ability for chess, general intelligence, verbal ability and spatial ability appear to be higher in strong chess players than in the general population.47 However, it is extremely unlikely that all people with those attributes would make strong chess players. There are possibly other factors, which have yet to be discovered. Another interesting possibility for research would be to take a number of people who have had no exposure to chess, to test them for several qualities, among these intelligence, perceptual abilities and memory, to teach them chess equally, and then find correlations between these factors and their performance after fixed periods of time.

Chess is a fascinating game to both play and study from a psychological perspective. Its complexity assures that the game will never be completely solved in the way that tic-tac-toe has been. Given an average of 30 possible moves per turn, and an average game length of 40 moves (80 half-moves), we can see that the game tree is at least 30^80 nodes big (on the order of 10^120).
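The size of this estimate is easiest to check with logarithms: 30 choices at each of 80 half-moves gives 30 raised to the 80th power, whose base-10 logarithm is 80 times log10(30). The snippet below is a quick check under exactly those assumptions; the function name `log10_tree_size` is chosen for the example.

```python
import math


def log10_tree_size(branching=30, plies=80):
    """Base-10 logarithm of branching**plies, i.e. the number of digits
    (roughly) in the game-tree size estimate."""
    return plies * math.log10(branching)


print(round(log10_tree_size(), 1))  # about 118.2
```

So the estimate works out to roughly 10 to the power 118, which is in the same range as the figure of 10 to the power 120 that is usually quoted for chess.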

In summary, there appear to be certain abilities possessed by chess players that tend to improve skill, but the dominating aspect is simply that better chess players have more knowledge and experience. This allows them to form patterns more easily, to apply specific knowledge of types of positions, and to simply recall useful elements from a vast quantity of memorized information which is directly applicable to the chess board. The research on the psychology of chess skill has developed greatly from Freudian analyses to very successful computer simulations of human behavior while trying to solve chess problems.

To close on a lighthearted note, perhaps one should not study the psychology behind chess too much. As Bobby Fischer, arguably the most brilliant chess player ever, said: "I don't believe in psychology. I believe in good moves."48

Endnotes

  1. I have tried playing blindfold chess once. It was extraordinarily taxing and I made my first illegal move after about 15 moves.
  2. de Groot, 1965.
  3. Holding, 1985.
  4. Holding, 1985.
  5. Binet, 1894, cited in Holding, 1985, p. 51.
  6. de Groot, 1965.
  7. Fine, 1967.
  8. Fine, 1967.
  9. de Groot, 1965.
  10. Murray, 1995.
  11. Murray, 1995.
  12. de Groot, 1965, p. 54.
  13. Saariluoma, 1995.
  14. Holding, 1985.
  15. Holding, 1985.
  16. Holding, 1985.
  17. Algebraic notation is commonly used to describe the position of chess pieces. The letters a to h represent the columns while the numbers 1 to 8 represent the ranks. a1 would be the lower left-hand square, from white's viewpoint.
  18. Chase and Simon, 1973, as cited in Holding, 1985.
  19. Holding, 1985.
  20. de Groot, 1965.
  21. Saariluoma, 1984, as cited in Saariluoma, 1995.
  22. "Big-Oh" notation is used to represent the cost of a function. O(N), for example, means that the number of calculations required grows in direct proportion to N as N becomes large. Strictly speaking, O(N/2) denotes the same growth rate as O(N); it is written here to indicate that a search which stops as soon as the target is found examines, on average, half of the N items.
  23. Saariluoma, 1995.
  24. Saariluoma, 1995.
  25. Saariluoma, 1995.
  26. Saariluoma, 1995.
  27. Holding, 1985.
  28. Saariluoma, 1995.
  29. This subject is frequently discussed in the rec.games.chess hierarchy on the newsgroups.
  30. Charness, 1981, cited by Holding, 1985.
  31. For comparison, Kasparov's (the world champion) rating is 2785. One is expected to beat a player ranked 400 points below one about 91% of the time.
  32. Holding and Reynolds, 1982, cited in Holding, 1985.
  33. Charness, 1981, cited in Holding, 1985.
  34. Charness, 1981, cited in Holding, 1985.
  35. Holding, 1979, cited in Holding, 1985.
  36. A is the highest class, and it is one step below expert. Class E players are very weak, rating approximately 1200.
  37. Holding, 1979, as cited in Holding, 1985.
  38. The middle-game is the portion of the game between the opening and the end game. It is the phase of the game where the player must show the most creativity, as openings are generally memorized from books, and end games usually follow complex but straightforward algorithms.
  39. Holding, 1985.
  40. Holding and Reynolds, 1982, cited in Holding, 1985.
  41. Saariluoma, 1995.
  42. Saariluoma, 1995.
  43. Saariluoma, 1995.
  44. In a smothered mate, the king is surrounded by his own pieces, and an enemy knight checks him. If the knight cannot be captured, then there is checkmate.
  45. Saariluoma, 1995.
  46. This phenomenon has been discussed at length on rec.games.chess.computer.
  47. Holding, 1985.
  48. Brady, 1989, p. 230.

References

Copyright © 2001 Mark Jeays. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation.