9.5 Diminishing Returns and Phase of the Game

Steenhuisen [Ste05] was the first to point out that the chance of new best moves being discovered at higher depth decreases faster for positions closer to the end of the game. However, since deep-search behavior depends on the values of the positions in a test set, it seems worthwhile to check whether his results were merely a consequence of dealing with positions that had a decisive advantage (at least on average) in the later phase of the game. For this experiment we took only a subset of more or less balanced positions, namely those whose evaluation at search depth 12 lay in the range between -0.50 and 0.50 (see Table 9.8). Our results show that in the positions that occurred at move 50 or later, the chance of new best moves being discovered indeed decreases faster, which agrees with Steenhuisen’s [Ste05] observations. The experiments in this and the following section were performed by CRAFTY.

Table 9.8: Five subsets of positions from different phases of the game, with evaluations obtained at search depth 12 in the range between -0.50 and 0.50 (CRAFTY).

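| Group        | 1     | 2       | 3       | 4       | 5    |
|--------------|-------|---------|---------|---------|------|
| Move no. (m) | m<20  | 20≤m<30 | 30≤m<40 | 40≤m<50 | m≥50 |
| Positions    | 7,580 | 6,106   | 3,418   | 1,356   | 961  |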
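The selection behind Table 9.8 can be sketched as a filter on the depth-12 evaluation followed by binning on the move number. This is only an illustrative sketch: the `(move_no, eval_at_depth_12)` tuple layout and the function names are hypothetical, and treating the bounds -0.50 and 0.50 as inclusive is an assumption, since the text does not say whether the range is open or closed.

```python
from collections import Counter


def phase_group(move_no):
    """Map a move number to its group (1-5) using the Table 9.8 boundaries."""
    if move_no < 20:
        return 1
    if move_no < 30:
        return 2
    if move_no < 40:
        return 3
    if move_no < 50:
        return 4
    return 5


def count_balanced_by_group(positions, bound=0.50):
    """Count balanced positions per game-phase group.

    `positions` is an iterable of (move_no, eval_at_depth_12) pairs
    (a hypothetical layout). A position counts as balanced when its
    depth-12 evaluation lies in [-bound, bound]; inclusive bounds are
    an assumption.
    """
    counts = Counter()
    for move_no, evaluation in positions:
        if -bound <= evaluation <= bound:
            counts[phase_group(move_no)] += 1
    return counts


# Toy data for illustration only; these are not positions from the thesis.
sample = [(12, 0.10), (25, -0.45), (25, 0.80), (49, -0.50), (50, 0.00)]
group_counts = count_balanced_by_group(sample)
print(sorted(group_counts.items()))
```

In the toy data the position with evaluation 0.80 is discarded as unbalanced, and the remaining four fall into Groups 1, 2, 4, and 5.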

The results presented in Figure 9.4 show that while there is no obvious correlation between move number and the chance of new best moves being discovered at higher depth, in the positions of Group 5 that occurred closer to the end of the game the Best Change curve nevertheless appears lower than the curves of the other groups. Table 9.9 shows the best-move properties for this group.

Figure 9.4: Go-deep results with positions of different phases of the game.

Table 9.9: Results for the 961 positions of Group 5.

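| Search depth | Best Change in % (SE) | Fresh Best in % | (d-2) Best in % | (d-3) Best in % | Mean evaluation |
|--------------|-----------------------|-----------------|-----------------|-----------------|-----------------|
| 3            | 37.04 (1.56)          | 100.00          | -               | -               | 0.07            |
| 4            | 34.03 (1.53)          | 72.78           | 27.22           | -               | 0.05            |
| 5            | 29.24 (1.47)          | 60.85           | 27.40           | 11.74           | 0.05            |
| 6            | 26.85 (1.43)          | 49.22           | 30.23           | 14.34           | 0.03            |
| 7            | 24.35 (1.39)          | 47.44           | 29.91           | 10.26           | 0.02            |
| 8            | 22.89 (1.36)          | 45.91           | 27.27           | 9.55            | 0.02            |
| 9            | 23.10 (1.36)          | 38.29           | 32.88           | 10.81           | 0.02            |
| 10           | 21.85 (1.33)          | 37.62           | 27.62           | 11.43           | 0.02            |
| 11           | 20.60 (1.31)          | 33.33           | 32.83           | 12.12           | 0.02            |
| 12           | 19.25 (1.27)          | 26.49           | 36.22           | 8.65            | 0.01            |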

The 95%-confidence bounds for Best Change at the highest level of search performed for the sample of 961 positions of Group 5 are [16.88;21.86].
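As a cross-check on these figures, the following minimal sketch recomputes the standard error and a 95% interval for the depth-12 Best Change rate. Two assumptions are made that the text does not state: the count of 185 changed best moves is inferred from 19.25% of 961 positions, and the interval is computed as a Wilson score interval with z = 1.96. The function names are illustrative only.

```python
import math


def standard_error(p, n):
    """Standard error of a sample proportion p estimated from n positions."""
    return math.sqrt(p * (1.0 - p) / n)


def wilson_interval(p, n, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 for about 95%)."""
    denom = 1.0 + z * z / n
    center = (p + z * z / (2.0 * n)) / denom
    half_width = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return center - half_width, center + half_width


n = 961          # positions in Group 5 (Table 9.8)
changed = 185    # assumption: inferred from the reported 19.25% of 961
p = changed / n

se = standard_error(p, n)
low, high = wilson_interval(p, n)

print(f"Best Change = {100 * p:.2f}% (SE {100 * se:.2f})")
print(f"95% bounds  = [{100 * low:.2f}; {100 * high:.2f}]")
```

Under these assumptions the sketch gives a Best Change of 19.25% with SE 1.27 and bounds of about [16.88; 21.86], which agrees with the values reported for depth 12. A plain normal-approximation interval (rate plus or minus 1.96 standard errors) would be centered on 19.25 and so would not reproduce the reported bounds exactly.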

Chapter 10 Conclusions

10.1 Critical Analysis and Open Questions

We view this thesis as a progress report on work that needs to be advanced in a great many directions. We do believe it describes some novel aspects of the comparison and combination of search and knowledge in human and machine problem solving, and that it provides one possible view of how heuristic-search methods can be developed for evaluating and improving problem-solving performance. Our hope is that the research done in the scope of this thesis will stimulate further extensions, emendations, or even refutations.

We are well aware that we have limited ourselves to only a small portion of the references that exist in a scientific area as large as Human and Machine Problem Solving. We addressed only a rather small part of the highly related branches of psychology, such as the huge field of Integrated Cognitive Architectures, and only a rather small part of the important and highly related field of Artificial Intelligence in Education, particularly with respect to building intelligent tutoring systems (an excellent book on the subject was recently written by B.P. Woolf [Woo08]). We do hope, however, that researchers from all kinds of areas related to Human and Machine Problem Solving will find in our work at least some fresh ideas for getting farther along our common way in the enormous “problem space” that we are all searching through.

The research done in the scope of the thesis was performed in the framework of human and computer game playing, and the game of chess was used as the experimental domain. While we noted that many researchers have used the game-playing platform and the domain of chess in their experiments, with the explicit or implicit message of their work being that the results for chess generalize to other domains, we did not actually provide any specific evidence that our work extends beyond the game of chess. This is clearly one possible future research direction.

An almost countless number of (even fundamental) scientific questions remain open. Below we mention five of them.

1. How to define a measure of understandability (or comprehensibility) of a body of knowledge (a theory, model, problem-solving strategy) for a human? This is an old fundamental question of AI that has rarely been considered, and for which not even a reasonable tentative solution exists.

2. What are the characteristics of human-understandable (or human-assimilable) representations of a theory, model, or problem-solving strategy? There are limits on memory (what can be memorized) and on search complexity (how many inference steps or moves is a human able to look ahead?). There should be a well-balanced mixture of knowledge requirements, useful key patterns, and some search (not too much knowledge, not too much search).

3. What is an effective “conceptualization” of a domain theory for a human, that is, one that can be effectively learned, memorized, and applied in problem solving?

4. How to generate human-meaningful and natural commentary, depending on a student’s level and type of knowledge and skill? A human may be solving the task using a particular strategy, so the commentator should be aware of this and make comments relative to this strategy; or alternatively, the commentator may be aware of a student’s limited knowledge (limited problem-solving strategy, limited declarative knowledge) and may decide to draw a student’s attention to another strategy that is better applicable to the problem at hand.

5. How to define a measure of cognitive difficulty of a given problem? Is it measurable at all or is it unique to each individual?

We address some more open questions and suggestions for further research directions in the sequel, where assessments of the contributions of the thesis are stated.
