Artificial Intelligence Laboratory
Electrical Engineering & Computer Science Department
University of Michigan
1101 Beal Avenue, Ann Arbor, MI 48109-2110
+1 313 763 6985
hornof@umich.edu, kieras@eecs.umich.edu
Researchers have proposed theories about the low-level strategies that people use to find a known item in an unordered menu. Norman [12] and Vandierendonck, Van Hoe, and De Soete [14] suggested that people process one menu item at a time. But they did not validate this low-level assumption empirically. There have also been conflicting theories. Card [3] proposed that people randomly choose which item to examine next, while Lee and MacGregor [9] provided evidence that people search systematically from top to bottom. The research presented here examines the plausibility of these theories by providing an empirically validated model of the low-level perceptual, cognitive, and motor processing that people use in a menu selection task.
EPIC consists of a production-rule cognitive processor and perceptual-motor peripherals. To model human performance aspects of accomplishing a task, a cognitive strategy and perceptual-motor processing parameters must be specified. A cognitive strategy is represented as a set of production rules, much the same way that CCT [2], ACT-R [1], and SOAR [8] represent procedural knowledge. The simulation is driven by a description of the task environment that specifies aspects of the environment that would be directly observable to a human, such as what objects appear at what times, and how the environment changes in response to EPIC's motor movements. EPIC computational models are generative in that the production rules represent only general procedural knowledge of the task; when EPIC interacts with the task environment, it generates the specific sequence of perceptual, cognitive, and motor activities required to perform each specific instance of the task.
EPIC takes as its input:
EPIC generates as output:

Figure 1. Subset of EPIC architecture, showing flow of information and control. The processors run independently and in parallel. Not shown: Auditory and vocal motor processors, task environment.
A single stimulus in the task environment can produce multiple outputs from a perceptual processor to be deposited in working memory at different times. First the detection of a perceptual event is sent, followed later by features that describe the event. The perceptual processors are "pipelined." If an object's features begin moving to working memory, the arrival of those features will not be delayed by any other processing. Working memory contains these items deposited by perceptual processors, as well as control information such as the current task goal. At the end of each simulated 50 msec cycle, EPIC fires all of the production rules whose conditions match the current contents of working memory. EPIC allows for parallel execution of production rules in the cognitive processor, and some parallelism in each motor processor.
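The cycle described above can be reduced to a few lines for illustration. This is a minimal sketch, not EPIC's actual implementation, and the rule contents are hypothetical:

```python
# Minimal sketch of EPIC's cognitive-processor cycle: every 50 msec,
# ALL rules whose conditions are contained in working memory fire in
# parallel. Rule contents below are hypothetical.

CYCLE_MS = 50  # duration of one simulated cognitive cycle

def run_cycle(working_memory, rules):
    """Fire every rule whose conditions all match working memory."""
    additions, deletions = set(), set()
    for conditions, adds, deletes in rules:
        if conditions <= working_memory:   # all conditions present in WM
            additions |= adds
            deletions |= deletes
    return (working_memory - deletions) | additions

# Hypothetical rules: (conditions, items to add, items to delete)
rules = [
    ({"goal:find-target", "visual:target-located"},
     {"motor:point-to-target"}, {"visual:target-located"}),
    ({"goal:find-target", "visual:target-not-in-fovea"},
     {"motor:move-eyes"}, {"visual:target-not-in-fovea"}),
]

wm = {"goal:find-target", "visual:target-located"}
wm = run_cycle(wm, rules)
print(sorted(wm))  # ['goal:find-target', 'motor:point-to-target']
```

Only the first rule fires here, since its conditions are all present; firing all matching rules in one cycle is what gives the cognitive processor its parallelism.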
In short, EPIC is applied to a task as follows: The production-rule strategy directs the eyes to objects in the environment. The eyes have a resolving power which determines the processing time required for different object features, such as location and text. When information needed to determine the next motor movement arrives in working memory, the strategy instructs the ocular motor and manual motor processors to move the eyes and hands.
Information processing and motor movement times are held constant across modeling efforts, and are based on human performance literature. Manual movement times, for example, are determined by Fitts' law (see [4], Ch. 2). For lack of space, EPIC cannot be described in full detail here. A more thorough description is presented in [6, 7].
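For example, the Welford formulation of Fitts' law presented in [4] can be computed directly. The coefficient, floor, and distances below are illustrative textbook-style values, not the parameters EPIC actually uses:

```python
import math

K_MS_PER_BIT = 100.0  # assumed index of performance, ~100 msec/bit (textbook value)

def fitts_time_ms(distance, size, k=K_MS_PER_BIT, floor_ms=100.0):
    """Welford form of Fitts' law: T = k * log2(distance/size + 0.5),
    with an assumed floor modeling a minimum movement time."""
    return max(floor_ms, k * math.log2(distance / size + 0.5))

# Farther (and relatively smaller) targets take longer to point at:
for serial_pos in (1, 3, 9):
    distance = 0.2 * serial_pos   # assumed cursor-to-item distance, inches
    print(serial_pos, round(fitts_time_ms(distance, 0.2)))
```

This monotonic growth with distance is the source of the dashed Fitts' law line in Figure 3.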
As shown in Figure 2, each trial consisted of the following steps: Using the mouse, move the cursor to the GO box which causes the precue of the target item to appear above the GO box. Commit the precue to memory. Click on the GO box. The GO box and precue disappear, the menu appears, and the clock starts. As quickly as possible, click on the target item in the menu. The clock stops.

Figure 2. Nilsen's task with six items in the menu.
This task isolates a subset of the processes required in a "real world" menu task. It is thus particularly well-suited for studying the low-level perceptual-motor processes of visual search and response selection. The task is not confounded with more complex processes of reading, comprehension, judgment, decision making, and problem solving. Though Nilsen mostly used the data to examine motor control, this modeling effort focuses on visual search. The data is particularly useful for modeling visual search of menus because Nilsen varied menu length and reported selection time as a function of the serial position of the target menu item. Few researchers have reported such data. As will be shown, this combination is critical for revealing search strategy.

Figure 3. Nilsen's observed data (solid lines). Mean selection times as a function of serial position of target item, for menus with three, six, or nine items. Also: Time required to move the mouse to each target position as predicted by Fitts' law (dashed line).
There are several key features to note in the observed data:
The discussion of each model includes a flowchart that summarizes the production rules written in EPIC to represent that model. Production rules were written to maximize performance within the constraints imposed by EPIC, and to be as simple as possible. EPIC was otherwise used 'as is' for all models. Details and parameters such as the availability of object features were established and validated in other modeling projects in different task domains, and are discussed in [6, 7].

Figure 4. Norman's [12] information processing model for search of an explicitly known target.
Both serial processing models were run only with an eye-to-screen distance of 8 inches, so that only one item would fit into the fovea at a time, ensuring the serial encoding process specified by Norman. At greater distances, more than one item would fit into the fovea simultaneously, and parallel encoding would ensue.
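The geometry behind this relationship can be sketched with simple trigonometry. The foveal angle and item spacing below are assumed values for illustration, not measurements from Nilsen's experiment:

```python
import math

FOVEA_DEG = 2.0         # assumed foveal diameter in degrees of visual angle
ITEM_SPACING_IN = 0.25  # assumed center-to-center item spacing, inches

def items_in_fovea(distance_in, fovea_deg=FOVEA_DEG, spacing=ITEM_SPACING_IN):
    """Whole number of menu items covered by the fovea at a viewing distance."""
    # Width of the fovea projected onto the screen plane.
    fovea_width_in = 2 * distance_in * math.tan(math.radians(fovea_deg / 2))
    return int(fovea_width_in // spacing)

print(items_in_fovea(8))   # at a close distance, a single item fills the fovea
print(items_in_fovea(24))  # at a greater distance, several items fit at once
```

Because the fovea subtends a fixed visual angle, it covers a larger region of the screen, and hence more items, as the viewer moves farther away.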

Figure 5. Selection times observed (solid lines) and predicted (dashed lines) by the Serial Processing Random Search model run with one item fitting into the fovea.
The results in Figure 5 suggest that the Serial Processing Random Search model is wrong. The only feature in the observed data that this model accounts for is that shorter menus are faster than longer menus. Otherwise, the model does not fit the observed data. Selection times are much too high overall. Slopes are very small because every item takes on average the same amount of time to find and select; any slope that appears is due to the mouse movement. A higher selection time for serial position 1 is not predicted. This model does not account for the observed data.

Figure 6. Selection times observed (solid lines) and predicted (dashed lines) by the Serial Processing Systematic Search model run with one item fitting into the fovea. The predicted times for the same serial position in different menu lengths are the same and are thus superimposed.
The results in Figure 6 suggest that this model is also wrong. The only feature in the observed data that this model accounts for is a positive slope greater than that of the predicted Fitts movement time. The model accounts for no other features in the observed data. Shorter menus are not faster. The slope of the predicted data is too steep. The selection time for serial position 1 is not higher than for serial position 2. This model does not account for the observed data.
The prediction has a slope resulting from more than just the mouse movement, but the predicted slope is too steep: about 380 msec per item, as opposed to about 100 msec per item in the observed data. The discrepancy between the predicted and observed data results from all of the processing that must take place before the gaze moves to the next menu item. The slope of approximately 380 msec per item arises because that is the time EPIC requires to move the eye, perceptually process a menu item, transfer its features to working memory, and decide whether it is the target. Serially processing each item cannot produce a slope of 100 msec per item; only by processing multiple items at once can a model produce such a small slope.
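The arithmetic can be made concrete. The stage durations below are illustrative figures chosen to sum to the 380 msec per-item slope, not parameters quoted from EPIC:

```python
# Per-item cost of a strictly serial scan: every item examined pays the
# full eye-movement + perception + transfer + decision pipeline.
# All durations are illustrative assumptions.
stages_ms = {
    "eye_movement": 80,          # assumed saccade preparation + execution
    "perceptual_encoding": 150,  # assumed time to encode the item's text
    "transfer_to_wm": 50,        # assumed transfer to working memory
    "decision_cycles": 100,      # two assumed 50-msec cognitive cycles
}
per_item = sum(stages_ms.values())
print(per_item)             # 380 msec per serially processed item
print(round(per_item / 3))  # ~127 msec/item if three items were encoded per fixation
```

Dividing the pipeline cost across several items encoded in one fixation is the only way to approach the observed 100 msec per-item slope.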
The results provided by the serial processing models provide strong evidence that, when scanning a menu, people process more than one menu item at a time. The serial processing models asserted by Norman [12] and Vandierendonck, Van Hoe, and De Soete [14] are highly implausible. Menu selection models should take this human capability into consideration. The remaining models presented in this paper utilize parallel processing of menu items.
Both parallel processing models were run with different eye-to-screen distances that resulted in one and three items fitting into the fovea simultaneously. When more than one item is visible in the fovea, all of those objects' features are sent to working memory in parallel. To prevent a random eye "movement" to essentially the same location while searching, both models choose the next item to look at from outside the fovea.
Figure 7 shows a flowchart that represents the production rules built in EPIC to investigate the possibility that subjects used a Parallel Processing Random Search strategy.

Figure 7. Parallel Processing Random Search model.
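This random walk of fixations can be sketched as a Monte Carlo simulation. The per-fixation cost and the three-item fovea below are illustrative assumptions, not EPIC's parameters:

```python
import random

MS_PER_FIXATION = 250  # assumed cost of one look (saccade + encoding)

def search_time(menu_len, target, fovea=3, trials=300):
    """Mean time (msec) to foveate the target, averaged over random searches."""
    total = 0
    for _ in range(trials):
        eye = random.randrange(menu_len)        # initial fixation lands anywhere
        fixations = 1
        while abs(eye - target) > fovea // 2:   # target not yet in the fovea
            outside = [i for i in range(menu_len)
                       if abs(i - eye) > fovea // 2]
            eye = random.choice(outside)        # next look: any item outside the fovea
            fixations += 1
        total += fixations * MS_PER_FIXATION
    return total / trials

# Items at the ends of the menu are covered by fewer possible fixation
# centers, so they take more looks on average, mirroring the elevated
# time at serial position 1 in the observed data.
times = [search_time(9, p) for p in range(9)]
print([round(t) for t in times])
```

Averaged over many trials, the simulated times are roughly flat across the middle positions and elevated at both ends, which is the qualitative shape of the predictions in Figure 8.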
The results from running the Parallel Processing Random Search model are shown in Figure 8. Each predicted selection time is averaged from 300 trials run for that menu length and serial position combination.


Figure 8. Selection times observed (solid lines) and predicted (dashed lines) by the Parallel Processing Random Search model run with one item (top graph) and three items (bottom graph) fitting into the fovea.
The predictions from the Parallel Processing Random Search model have some features that correspond to the observed data, but also have some problems.
As can be seen in Figure 8 (top graph), when one item at a time is visible in the fovea, the model accounts for shorter menus being faster, but no other features of the observed data. The overall predicted times are, however, significantly lower than in the Serial Processing Random Search model discussed above.
As can be seen in Figure 8 (bottom graph), when three items are visible in the fovea simultaneously, the model can account for some features of the observed data. Shorter menus are faster, and by about the right amount, as shown by the distance between the predicted lines approximating the distance between the observed lines. The predicted values fall entirely within the range of the observed values. Most importantly, this model accounts for serial position 1 being higher than serial position 2. However, the overall slope is still too small.
In Figure 8 (bottom graph), both the first and last serial positions are higher because the model combines random search with three menu items fitting into the fovea. Items at both ends of the menu have a lower probability of being in the fovea after any random fixation. Any of the middle menu items can be foveated by moving the eye to that item, or to either of the two adjacent items. But the first and last items only have one adjacent item. This might explain serial position 1 being higher than serial position 2 in the observed data.
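This coverage argument can be checked by counting, for each serial position, how many fixation centers would bring that item into an assumed three-item fovea:

```python
# Count the fixation centers that bring item p into a 3-item fovea
# in a 9-item menu; edge items have fewer covering fixations.
def covering_fixations(p, n, fovea=3):
    r = fovea // 2
    return sum(1 for eye in range(n) if abs(eye - p) <= r)

print([covering_fixations(p, 9) for p in range(9)])  # [2, 3, 3, 3, 3, 3, 3, 3, 2]
```

With fewer covering fixations, the first and last items are less likely to be foveated by any given random look, so on average they require more looks to find.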
The predictions from the Parallel Processing Random Search model suggest that the model is partly correct, and partly incorrect.
In this model, the first eye movement is made to any of the items that are within one foveal radius of the topmost item (to ensure the first gaze captures the topmost item). Each subsequent movement is made to an item one foveal diameter below the center of the current fixation. These details represent the belief that, when using a systematic search strategy, people attempt to maximize foveal coverage with a minimum number of eye movements.

Figure 9. Parallel Processing Systematic Search model.
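The fixation sequence of this strategy can be sketched as follows, assuming a three-item fovea and a first fixation on the second item (0-indexed item 1); both assumptions are illustrative:

```python
# Systematic top-to-bottom search: each fixation sits one foveal
# diameter below the last, tiling the menu without gaps or overlap.
def fixation_for_target(target, first_fix, fovea=3):
    """Number of fixations made before the target item is foveated."""
    fix, center = 1, first_fix
    while center + fovea // 2 < target:   # target still below the current fovea
        center += fovea                   # shift gaze one foveal diameter down
        fix += 1
    return fix

# Fixations cover items 0-2, then 3-5, then 6-8:
print([fixation_for_target(t, first_fix=1) for t in range(9)])  # [1, 1, 1, 2, 2, 2, 3, 3, 3]
```

Note that the fixation count for a given serial position does not depend on menu length, which is consistent with this strategy predicting identical times for the same serial position across menu lengths.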
The results from running the Parallel Processing Systematic Search model are shown in Figure 10. Each predicted selection time is averaged from one trial run for each possible combination of menu length, serial position, and first eye movement.


Figure 10. Selection times observed (solid lines) and predicted (dashed lines) by the Parallel Processing Systematic Search model run with one item (top graph) and three items (bottom graph) fitting into the fovea. In each graph, the predicted times for the same serial position in different length menus are the same and are thus superimposed.
The predictions from the Parallel Processing Systematic Search model have some features that correspond to the observed data, but also have some problems.
As can be seen in Figure 10 (top graph), when one item at a time is visible in the fovea, the model only accounts for a positive slope. The model does not predict that shorter menus will be faster, the slope is too steep, and serial position 1 is not higher.
As can be seen in Figure 10 (bottom graph), when three items are visible in the fovea simultaneously, the model can account for important features of the data. The slope is correct and the predicted values fall entirely within the range of the observed values. But again, the model does not account for shorter menus being faster, and serial position 1 is not higher.
These results show that the Parallel Processing Systematic Search model can partially explain how the subjects accomplished the task, but cannot account for all aspects of the observed data.
None of the models presented thus far can account for all of the features in the observed data. The serial processing models account for essentially none of the features of the observed data. But all features of the observed data are accounted for by at least one of the various parallel processing models, as shown in Figure 11.

Figure 11. Summary of how the parallel processing models account for (+) and do not account for (-) features in the observed data.
These models were motivated by observing, as shown in Figure 11, that all of the features in the observed data are accounted for by at least one of the parallel processing models when run with one or three items fitting into the fovea. The random search model accounts for faster selection times in shorter menus. When three items fit into the fovea, the random search model also accounts for serial position 1 being higher. The systematic search model accounts for the correct slope when three items fit into the fovea.
Predictions from this hybrid model can be obtained in two ways. The first is to build a set of EPIC production rules that contain the rules from both the Parallel Processing Random Search strategy and the Parallel Processing Systematic Search strategy; the strategy would randomly choose which search strategy to use at the start of each trial. The second is to average the predicted values produced by running the two models independently. Since both approaches would produce the same predictions, the second approach was chosen for expedience. Figure 12 shows the results of this model, as determined by taking an unweighted average of the results shown in Figure 8 and Figure 10.
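The averaging step amounts to an unweighted element-wise mean of the two models' predictions. The prediction values below are hypothetical placeholders, not the actual model outputs:

```python
# Unweighted mean of two prediction curves, equivalent to a strategy
# chosen at random (50/50) at the start of each trial.
# Values are hypothetical, in msec, indexed by serial position.
random_pred = [632, 523, 561, 544, 580, 590]
systematic_pred = [420, 490, 560, 630, 700, 770]

hybrid = [(r + s) / 2 for r, s in zip(random_pred, systematic_pred)]
print(hybrid)
```

The equivalence holds because each trial uses exactly one of the two strategies, so the expected selection time is simply the mean of the two strategies' expected times.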


Figure 12. Selection times observed (solid lines) and predicted (dashed lines) by the Dual Strategy Hybrid model, with one item (top graph) and three items (bottom graph) fitting into the fovea.
The predictions from the Dual Strategy Hybrid model can account for most of the features in the observed data, but do not fit the observed values perfectly.
As can be seen in Figure 12 (top graph), when one item fits into the fovea, the model accounts for faster selection times in shorter menus and produces a near-perfect slope. But the model does not account for the higher selection time in serial position 1, and overall the predicted values are higher than the observed values.
As can be seen in Figure 12 (bottom graph), when three items fit into the fovea, the model accounts for faster selection times in shorter menus, produces a comparable slope, accounts for the higher selection time in serial position 1, and predicts values that are in range of the observed data. The only shortcoming of this model is that the predicted values do not exactly match the observed values.
The predictions from the Dual Strategy Hybrid model suggest that the model is almost correct.
Predictions from this hybrid model can be obtained in two ways. The first is to build a task environment that varies the screen distance from trial to trial, and to run a set of production rules developed for the Dual Strategy Hybrid model using this task environment. The second is to average the predicted values produced by running the Dual Strategy Hybrid model in two task environments, each with a fixed screen-to-eye distance. Since both approaches would produce the same predictions, the second approach was chosen for expedience. Figure 13 shows the results of this model, as determined by taking a weighted average of the results shown in the two graphs in Figure 12, with 15% from the top graph (one item in fovea) and 85% from the bottom graph (three items in fovea).
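The weighted mixture over the two viewing distances is likewise a simple element-wise combination. The 15%/85% weights are from the model as described; the prediction values below are hypothetical:

```python
# Weighted mixture: 15% of trials at the one-item-in-fovea distance,
# 85% at the three-item distance. Prediction values are hypothetical, msec.
one_item = [900, 950, 1000]
three_item = [600, 560, 640]
w = 0.15
mixed = [w * a + (1 - w) * b for a, b in zip(one_item, three_item)]
print([round(m) for m in mixed])
```

Because the one-item predictions carry only 15% of the weight, the mixed curve stays close to the three-item curve while inheriting a small amount of its overall elevation and slope.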

Figure 13. Selection times observed (solid lines) and predicted (dashed lines) with a Dual Strategy Varying Distance Hybrid model, with 15% of the trials at a one-item-in-fovea distance, and 85% of the trials at a three-items-in-fovea distance.
The Dual Strategy Varying Distance Hybrid model accounts for all of the features in the observed data. As can be seen in Figure 13, the model predicts the observed values very well (r² = 0.99). By matching the observed values so closely, the Dual Strategy Varying Distance Hybrid model offers a highly plausible explanation of the task environment and the strategies used by subjects in Nilsen's experiment.
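A goodness-of-fit statistic of this kind can be computed as one minus the ratio of residual to total sum of squares, one common definition of r². The observed and predicted values below are hypothetical, for illustration only:

```python
# Coefficient of determination between observed and predicted times.
# Data values are hypothetical, in msec.
def r_squared(observed, predicted):
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1 - ss_res / ss_tot

obs = [720, 650, 700, 780, 860, 940]
pred = [710, 655, 705, 775, 870, 935]
print(round(r_squared(obs, pred), 3))
```

A value near 1 means the residuals are small relative to the spread of the observed data, which is what an r² of 0.99 indicates for the final model.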
Also looking to the future, successfully modeling menu search provides evidence that a general-purpose tool for evaluating the efficiency of the visual aspects of interfaces might be feasible. The tool would take as its input a definition of a screen layout and a task, and would provide as output a prediction of the time required for the user to execute the task. Previous researchers have set a precedent that such a tool can be built [10, 13]. Such a tool would analyze screen layouts and predict the cognitive effort required by a user to extract the information needed to accomplish a task.
This work was supported by the Advanced Research Projects Agency under order number B328, monitored by NCCOSC under contract number N66001-94-C-6036 awarded to David Kieras.
2. Bovair, S., Kieras, D. E., & Polson, P. G. (1990). The acquisition and performance of text editing skill: A cognitive complexity analysis. Human-Computer Interaction, 5, 1-48.
3. Card, S. K. (1984). Visual search of computer command menus. In H. Bouma & D. G. Bouwhuis (Eds.), Attention and Performance X: Control of Language Processes, (pp. 97-108). London: Lawrence Erlbaum Associates, Publishers.
4. Card, S. K., Moran, T. P., & Newell, A. (1983). The Psychology of Human-Computer Interaction. Hillsdale, NJ: Lawrence Erlbaum Associates.
5. John, B. E., & Kieras, D. E. (1994). The GOMS family of analysis techniques: Tools for design and evaluation (Technical Report No. CMU-CS-94-181): Carnegie Mellon University School of Computer Science.
6. Kieras, D. E., & Meyer, D. E. (1995). An overview of the EPIC architecture for cognition and performance with application to human-computer interaction (EPIC Tech. Rep. No. 5, TR-95/ONR-EPIC-5). Ann Arbor, Michigan: Department of Electrical Engineering and Computer Science.
7. Kieras, D. E., & Meyer, D. E. (in press). An overview of the EPIC architecture for cognition and performance with application to human-computer interaction. Human-Computer Interaction.
8. Laird, J., Rosenbloom, P., & Newell, A. (1986). Universal subgoaling and chunking. Boston: Kluwer Academic Publishers.
9. Lee, E., & MacGregor, J. (1985). Minimizing user search time in menu retrieval systems. Human Factors, 27(2), 157-162.
10. Lohse, J. (1991). A cognitive model for the perception and understanding of graphs. In Proceedings of CHI '91, New Orleans, Louisiana. New York: ACM.
11. Nilsen, E. L. (1991). Perceptual-motor control in human-computer interaction (Tech. Rep. No. 37). Ann Arbor, Michigan: The Cognitive Science and Machine Intelligence Laboratory, The University of Michigan.
12. Norman, K. L. (1991). The Psychology of Menu Selection: Designing Cognitive Control of the Human/Computer Interface. Norwood, N. J.: Ablex.
13. Sears, A. (1993). Layout appropriateness: A metric for evaluating user interface widget layout. IEEE Transactions on Software Engineering, 19(7).
14. Vandierendonck, A., Van Hoe, R., & De Soete, G. (1988). Menu search as a function of menu organization, categorization and experience. Acta Psychologica, 69(3), 231-248.