Human-monkey gaze correlations reveal convergent and divergent patterns of movie viewing
The neuroanatomical organization of the visual system is largely similar across primate species [1, 2], predicting similar visual behaviors and perceptions. Although responses to trial-by-trial presentation of static images suggest that primates share visual orienting strategies [3–8], these reduced stimuli fail to capture key elements of the naturalistic, dynamic visual world in which we evolved [9, 10]. Here, we compared the gaze behavior of humans and macaques when they viewed three different 3-minute movie clips. We found significant intersubject and interspecies gaze correlations, suggesting that both species attend to a common set of events in each scene. Comparing human and monkey gaze behavior with a computational saliency model revealed that interspecies gaze correlations were driven by biologically relevant social stimuli overlooked by low-level saliency models. Additionally, humans, but not monkeys, tended to gaze toward the targets of viewed individuals' actions or gaze. Together, these data suggest that human and monkey gaze behavior comprises converging and diverging informational strategies, driven by both scene content and context, and is not fully described by simple low-level visual models.

Report

One approach is to study a species that shares relevant neural structures involved in gaze control. The gaze control system of the macaque is the best-studied primate model of the nested, iterative, sensorimotor decision loops that make up our natural behavior and comprises an important substrate in which to address the evolution of behavior. A second approach is to examine whether gaze behavior can be predicted by neurally inspired computational models of visual saliency. Such models have proven effective at locating areas of interest in static scenes based on low-level visual cues [26, 27]. In the present study, we test the hypothesis that humans and monkeys have adapted shared neural mechanisms to identify, localize, and monitor distinct sets of behaviorally relevant stimuli.
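The two analyses described above — intersubject gaze correlation and comparison of gaze against a saliency map — can be sketched in a few lines. The following is a minimal illustration, not the authors' analysis pipeline: the function names, the (T, 2) gaze-trace format, and the use of per-coordinate Pearson correlation averaged over subject pairs are all assumptions made for the example.

```python
# Hypothetical sketch of two gaze analyses (illustrative, not the paper's code).
import numpy as np


def pairwise_gaze_correlation(traces):
    """Mean Pearson correlation of gaze position across all subject pairs.

    traces: list of (T, 2) arrays of (x, y) gaze samples, one per subject,
    recorded at the same sampling rate while viewing the same clip.
    The x- and y-coordinate correlations are averaged per pair, then
    averaged over all unordered pairs of subjects.
    """
    n = len(traces)
    corrs = []
    for i in range(n):
        for j in range(i + 1, n):
            a, b = traces[i], traces[j]
            # Correlate horizontal and vertical gaze separately, then average.
            rx = np.corrcoef(a[:, 0], b[:, 0])[0, 1]
            ry = np.corrcoef(a[:, 1], b[:, 1])[0, 1]
            corrs.append((rx + ry) / 2.0)
    return float(np.mean(corrs))


def mean_saliency_at_gaze(saliency, trace):
    """Average saliency-map value at each gaze sample (nearest-pixel lookup).

    saliency: (H, W) array, e.g. one frame's output from a low-level
    saliency model. trace: (T, 2) array of (x, y) pixel coordinates.
    """
    h, w = saliency.shape
    xs = np.clip(np.round(trace[:, 0]).astype(int), 0, w - 1)
    ys = np.clip(np.round(trace[:, 1]).astype(int), 0, h - 1)
    return float(saliency[ys, xs].mean())
```

In this toy form, a high `pairwise_gaze_correlation` within or across species indicates that viewers look at similar places at similar times, while comparing `mean_saliency_at_gaze` for observed gaze versus shuffled or random locations asks how much of that agreement a low-level saliency model can explain.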