Overview of Binocular Rivalry Research
During normal vision, both eyes typically converge on an object, with each eye receiving a near-identical image and the slight difference between the two images enabling the perception of depth (Howard & Rogers, 2012). However, when conflicting images are presented, one to each eye (i.e., dichoptically), perception alternates spontaneously between the two unitary stimuli (Figure 1). Such binocular rivalry (BR) involves states of perceptual dominance (i.e., the visible image) and suppression (i.e., the invisible image), with the alternations typically occurring every one to two seconds. For example, if vertical gratings are presented to one eye and horizontal gratings to the other, observers perceive the vertical image for a few seconds, followed by the horizontal image for a few seconds, then back to perceiving the vertical image, and so on, for as long as the stimuli are presented dichoptically.
BR has been the subject of scholarly and scientific inquiry for over two centuries, stemming from late 17th-century observations concerning binocular single vision (Wade & Ngo, 2013). From the 1830s onwards, Wheatstone, Panum, Helmholtz and Hering engaged in seminal work on the phenomenon. Before the turn of the 20th century, the advent of experimental psychology was soon followed by detailed quantitative studies of BR, with a focus on its psychophysical characterization. Early psychologists then explored rivalry from various perspectives, such as recording alternation rate along with other perceptual, motor, and cognitive measures as potential indices of personality traits and psychiatric disorders. These measures were also used to explore heritability of rivalry parameters in twin studies (reviewed in Wade & Ngo, 2013).
More recently, electrophysiological and brain-imaging studies in both humans and animals have used the phenomenon as a powerful tool to dissociate neural activity mediating states of visual consciousness during rivalry from that associated with the constant visual input (Blake & Logothetis, 2002; Crick & Koch, 1998; Miller, 2013). In humans, brain-imaging studies of BR have revealed perception-dependent activity in lateral geniculate nucleus and early visual cortex through to temporal, parietal, and frontal lobe regions, while rivalry alternations are associated with right-sided frontoparietal activation for BR and other bistable phenomena (Buckthought & Mendola, 2012; Sterzer, 2013). Rivalry phenomena have also been examined in animals, including cats (e.g., Fries et al., 1997; Sengpiel & Vorobyov, 2005), monkeys (e.g., Keliris et al., 2010; Maier et al., 2008; Panagiotaropoulos et al., 2012), mice (Zhang et al., 2012), and even fruit flies (e.g., Heisenberg & Wolf, 1984; Tang & Juusola, 2010; reviewed in Miller et al., 2012). Monkey electrophysiological experiments in particular have shown that perception dependency of neural activity in striate and extrastriate neurons is substantially less than that in higher-level inferior temporal and lateral prefrontal regions (Panagiotaropoulos & Logothetis, 2013).
In humans, various extrinsic factors are well known to influence BR dynamics. These include stimulus characteristics such as the contrast, luminance, spatial frequency, and temporal frequency of the stimuli. Such factors contribute to the signal strength of a presented stimulus or its ‘stimulus strength’, which can affect the relative perceptual dominance and alternation rate of the dichoptic images (Howard & Rogers, 2012). Intrinsic human factors that have been shown to be associated with BR include visual acuity (Fahle, 1982), stereoscopic acuity (Enoksson, 1964; Halpern et al., 1987), age (Bannerman et al., 2011; Jalavisto, 1964; Norman et al., 2007; Ukai et al., 2003), voluntary attentional control, and neurochemical state (Bressler et al., 2013; Lack, 1978; van Loon et al., 2013).
Interest in BR research in the clinical domain entered the modern era following reports that the rate of BR was slow in the heritable psychiatric condition, bipolar disorder (BD; Miller et al., 2003; Pettigrew & Miller, 1998), and may thus represent a trait marker or endophenotype for the condition (reviewed in Ngo et al., 2011). This finding was independently replicated in subsequent studies (Nagamine et al., 2009; Vierck et al., 2013), which supported slow BR as a potential endophenotype for BD. The high sensitivity of slow BR in BD (~80%) prompted a large-scale twin study of the heritability of rivalry rate (see Ngo et al., 2013). Rivalry rate was known to vary widely between individuals but to be relatively stable within individuals (Aafjes et al., 1966; Breese, 1899; George, 1936; McDougall, 1906; Mull et al., 1956; Wade & Ngo, 2013). In a large 10-year study of healthy monozygotic and dizygotic twins (N = 722; Miller et al., 2010) that remains ongoing (Wright & Martin, 2004), the heritability and reliability of BR rate were examined. A substantial genetic contribution to individual variation in BR rate was found, with 52% of the variance attributable to additive genetic factors (while unique environmental factors [18%] and measurement unreliability [30%] accounted for the remaining variance).
BR rate was also shown to be highly reliable within (R = 0.93) and between (R = 0.70) testing sessions, and the heritability finding was confirmed in a subsequent smaller twin study (Shannon et al., 2011). The twin study results further supported the notion of using slow BR rate as an endophenotype for BD by confirming high heritability and reliability for rivalry rate.
Regarding other factors relevant to endophenotype status (Gottesman & Gould, 2003; Gould & Gottesman, 2006; Hasler et al., 2006; Kendler & Neale, 2010), the evidence thus far suggests that clinical state and medication do not account for the slow BR trait (Miller et al., 2003; Nagamine et al., 2009; Pettigrew & Miller, 1998; Vierck et al., 2013; see also reviews in Ngo et al., 2011, 2013), while family studies of rivalry rate remain to be performed. Furthermore, there is currently only a small amount of data available on the specificity of slow rivalry rate to BD (Miller et al., 2003). Hence, to definitively assess the trait's utility as a BD endophenotype, large-scale investigation is required of BR rate in BD and other psychiatric disorders that present diagnostic confusion with BD (such as schizophrenia and major depression; Conus & McGorry, 2002; Hirschfeld et al., 2003; Joyce, 1984), taking account of state and medication issues, as well as studies of family members of BD probands. The ultimate aim is to utilize slow BR as an endophenotype in genome-wide association studies (GWAS) to increase the power of such studies by reducing the misclassification of affected and non-affected genotypes that is associated with purely clinical diagnostic procedures (see Ngo et al., 2011, 2013).
Finally, the candidate gene approach is also amenable to utilization of the rivalry rate trait (e.g., Kondo et al., 2012; Schmack et al., 2013), and with the recent advent of a Drosophila model of visual rivalry, new approaches toward understanding the molecular mechanisms of rivalry rate and BD pathophysiology have emerged (see Miller et al., 2012; Ngo et al., 2013).
Research on BR and related phenomena has also involved examination with new psychophysical techniques (e.g., Alais et al., 2010; Logothetis et al., 1996; Tsuchiya & Koch, 2005; Wilson et al., 2001) and brain stimulation methods (reviewed in Ngo et al., 2013). In addition, these phenomena have been further characterized with methodologies probing learning effects (e.g., Chopin & Mamassian, 2010; Raio et al., 2012), cross-modal input effects (e.g., Jantzen et al., 2012; Salomon et al., 2013), plasticity (reviewed in Klink et al., 2013; see also Lunghi et al., 2013), non-conscious problem-solving (Sklar et al., 2012; Zabelina et al., 2013), comparison of different rivalry types (Naber et al., 2010; van Ee, 2005), as well as the influence of individual differences and semantic context (e.g., Geng et al., 2012; Gray et al., 2009; Mudrik, Breska et al., 2011; Mudrik, Deouell et al., 2011; Nagamine et al., 2007; Paffen et al., 2011; Sheth & Pham, 2008; Tao et al., 2012).
Furthermore, rivalry and related phenomena have been subject to computational modeling (e.g., Bruce & Tsotsos, 2013; Hayashi et al., 2004; Pastukhov et al., 2013; Wilson, 2013) and used to explore neurological conditions (Bonneh et al., 2004; Ritchie et al., 2013; Shine et al., 2012, in press; Windmann et al., 2006), pain conditions (Cohen et al., 2012; Hall et al., 2011; McKendrick et al., 2011; Wilkinson et al., 2008), developmental psychiatric disorders (Amador-Campos et al., 2013, in press; Aznar-Casanova et al., 2013; Robertson et al., 2013; Said et al., 2013), anxiety disorders (Anderson et al., 2013; Singer et al., 2012), major depression (Sterzer et al., 2011; Yang et al., 2011), and ocular disorders (Black et al., 2011, 2012; Hess & Thompson, 2013; Hess et al., 2012; Knox et al., 2011; Li et al., 2013; Spiegel et al., 2013; To et al., 2011; Zhou et al., 2013).
Nevertheless, despite such renewed research interest in BR and related phenomena, and widening of the field to include clinical, genetic, and other directions, investigation of new methods of dichoptic display in human BR studies has been lacking. This gap in the field is particularly surprising given the rapid, concurrent research and development of 3D display technology over recent decades. Achieving BR requires accurate and reliable dichoptic presentation, with mirror stereoscopes or liquid crystal shutter (LCS) goggles typically being used. Traditionally, the technical expertise required for such presentation has been limited to vision researchers, psychophysicists, engineers, computer scientists, optical physicists, digital/video imaging professionals, and display technologists. As such, investigators new to the field of rivalry research, such as clinical and genetic researchers, may face challenges in choosing a suitable rivalry display method for their studies. The research literature on 3D displays is also highly specialized, with a strong focus on improving hardware and software technology for stereoscopic depth perception (stereopsis) rather than for BR viewing. In distinct contrast, BR studies have focused on characterizing the phenomenon, wherein the dichoptic display method used is often peripheral to the research question of interest. To bridge this apparent divide between the two fields, the current article draws upon the relevant theoretical and empirical literature on 3D display technology, and applies it to the study of BR. The aim is to provide (1) a framework for cross-disciplinary research that is directly relevant to both fields (i.e., rivalry research and 3D display technology development) and (2) a detailed but accessible resource for investigators new to rivalry research who do not have visual science expertise. In the section to follow, we begin by outlining the distinction and overlap between rivalry research and 3D display technology development.
Application of Technology: Research and Development for Stereopsis and BR
During normal vision, stereoscopic depth perception is achieved as a function of the binocular disparity (i.e., parallax) between two images, which slightly differ because each eye views a scene from a slightly different perspective (Howard & Rogers, 2012; see also Vishwanath & Hibbard, 2013). Both stereopsis and BR involve binocular disparity between two images, resulting in a unitary (cyclopean) percept. The key difference between them, however, is the extent to which the two images differ: stereopsis is induced by a small binocular disparity with near-identical images (i.e., stereoscopic presentation), whereas BR is a consequence of large binocular disparity from sufficiently dissimilar images (i.e., dichoptic presentation). Thus, the key distinction between stereopsis and BR is stimulus dependent. More specifically, it has been reported that disruption of binocular fusion to induce BR occurs, for example, with the following interocular stimulus parameter differences: (1) ~15–20% for spatial frequency in cycles/degree (Blakemore, 1970; see also Yang et al., 1992); (2) 15°–30° for relative orientation (Braddick, 1979; Kertesz & Jones, 1970); (3) 30° for direction of motion (Blake et al., 1985); (4) two octaves for temporal frequency in Hz (i.e., a factor of 4; Alais & Parker, 2012); and (5) 15–76 nm for color (subject to luminance, retinal eccentricity, and wavelength; Ikeda & Nakashima, 1980; Ikeda & Sagawa, 1979; Qin, Jiang et al., 2009; Qin, Takamatsu et al., 2009; see also Hollins & Leung, 1978; summarized in Blake, 1989, 1995).
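As a rough illustration of how the reported interocular differences might be used when designing stimuli, the following Python sketch flags the stimulus dimensions on which a left/right image pair meets the lower bound of each range quoted above. It is a minimal sketch under stated assumptions rather than an established procedure: the function and key names are hypothetical, the percentage difference in spatial frequency is taken relative to the lower of the two frequencies, temporal frequencies are assumed to be greater than zero, and the reported ranges are treated as hard cut-offs even though they vary with viewing conditions.

```python
import math

# Lower bounds of the ranges quoted in the text. Treating them as hard
# cut-offs is a simplifying assumption made for this sketch.
RIVALRY_THRESHOLDS = {
    "spatial_frequency_pct": 15.0,      # percent difference in cycles/degree
    "orientation_deg": 15.0,            # relative orientation
    "motion_direction_deg": 30.0,       # direction of motion
    "temporal_frequency_octaves": 2.0,  # two octaves = factor of 4
    "wavelength_nm": 15.0,              # color difference
}


def interocular_differences(left, right):
    """Return the left/right difference on each stimulus dimension."""
    sf_low = min(left["sf_cpd"], right["sf_cpd"])
    ori = abs(left["orientation_deg"] - right["orientation_deg"]) % 180.0
    direction = abs(left["direction_deg"] - right["direction_deg"]) % 360.0
    return {
        # Assumption: percentage is expressed relative to the lower frequency.
        "spatial_frequency_pct": 100.0 * abs(left["sf_cpd"] - right["sf_cpd"]) / sf_low,
        # Grating orientation wraps at 180 degrees, motion direction at 360.
        "orientation_deg": min(ori, 180.0 - ori),
        "motion_direction_deg": min(direction, 360.0 - direction),
        # Requires both temporal frequencies to be greater than zero.
        "temporal_frequency_octaves": abs(math.log2(left["tf_hz"] / right["tf_hz"])),
        "wavelength_nm": abs(left["wavelength_nm"] - right["wavelength_nm"]),
    }


def dimensions_favoring_rivalry(left, right, thresholds=RIVALRY_THRESHOLDS):
    """List the dimensions whose interocular difference meets its threshold."""
    diffs = interocular_differences(left, right)
    return [name for name, limit in thresholds.items() if diffs[name] >= limit]


# Orthogonal gratings that are otherwise identical (illustrative values).
vertical = {"sf_cpd": 3.0, "orientation_deg": 90.0, "direction_deg": 0.0,
            "tf_hz": 4.0, "wavelength_nm": 550.0}
horizontal = dict(vertical, orientation_deg=0.0)
print(dimensions_favoring_rivalry(vertical, horizontal))
```

For the orthogonal gratings in the example, only the orientation difference (90°) meets its threshold, so a pair of this kind would be expected to rival rather than fuse.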
In the field of 3D display technology research, the literature has primarily had an engineering focus to improve the quality of stereopsis and its error-resilience for the 3D display market (e.g., Barkowsky et al., 2010; Blundell & Schwarz, 2006; Choi, 2010; Fernando et al., 2013; Forman, 2012; Gotchev, Strohmeier et al., 2011; Gutiérrez et al., 2011; Hodges, 1992; Holliman et al., 2011; Knorr et al., 2012; Li, Ma et al., 2013; Lipton, 1982; McAllister, 1993; Mendiburu, 2011; Moorthy et al., 2013; Reichelt et al., 2010; Symanzik, 2011; Urey et al., 2011; Zhu et al., 2013).
Stereoscopic 3D technology now has widespread utility in entertainment (e.g., movies, gaming), medical imaging and surgery, construction, oil and gas exploration, geological mapping, industrial design, aviation and navigation, data visualization, molecular modeling, drug development, and education (e.g., Cai et al., 2006; Clements et al., 2010; Corkidi et al., 2008; De Araújo et al., 2013; Ersoy et al., 2010; Held & Hui, 2011; Hofmeister et al., 2001; Van Orden & Broyles, 2000; Zone, 2012). Demonstrated benefits of 3D displays over 2D viewing include enhanced understanding of spatial relationships, spatial manipulation of objects, and performance of difficult tasks in complex environments (e.g., Smallman et al., 2001; Van Orden & Broyles, 2000; for review see McIntire et al., 2012). In the field of BR research, studies examining the relationship between stereopsis and BR have found that they can occur simultaneously (Andrews & Holmes, 2011; Su et al., 2009; Wolfe, 1986). However, perfecting stereoscopic image quality with 3D display technology has required that BR be eliminated or at least substantially minimized (Hoppe & Melzer, 1999; McAllister, 1993; Mendiburu, 2011; Patterson, 2009). That is, any simultaneously occurring BR is targeted as an unwanted perceptual artifact for removal.
This objective has recently been achieved, for example, using computational strategies and algorithmic frameworks that model BR in order to eliminate or conceal its occurrence, and thus improve the perceived quality of 3D images (e.g., Aflaki et al., in press; Barkowsky et al., 2010; Bensalma & Larabi, 2009; Chen et al., 2013; Li, Ma et al., 2013). Given the recent intensive interest in BR research, however, the following question arises: how can 3D display technology be utilized to reliably induce and study BR, rather than eliminate it?
In principle, any type of stereoscopic 3D display technology can be used to achieve BR. Instead of viewing near-identical images to achieve stereopsis, dichoptic viewing of two sufficiently different images should yield BR. In practice, however, this objective is not without its challenges, given the following factors. First, BR has typically been an experimentally induced phenomenon of interest in neuroscience, psychology, and vision research. As noted above, however, 3D display technology has been primarily developed to improve stereoscopic image quality to the exclusion of BR, with such work typically done in the field of engineering and computer science. Second, this apparent divide is reflected by the fact that different types of stereoscopic display techniques have yet to be conceptualized within the context of dichoptic viewing methods for BR research (see next section). Third, the rapid development and wide commercialization of 3D display technology over recent years reflects economic factors such as industry-driven advances and general consumer demand (Chinnock, 2012; McIntire et al., 2012; Mendiburu, 2011, 2012; Ng & Funk, 2013; cf. Grasnick, 2013); thus, the specialized hardware and software development required for BR research has generally been overlooked by manufacturers.
Indeed, display metrology variables and human factors (e.g., individual differences) that affect stereoscopic image quality have been specifically examined to improve 3D technology (e.g., Abileah, 2011, 2013; Barkowsky et al., 2013; Fernando et al., 2013; Hurst, 2012; IJsselsteijn et al., 2005; Larimer, 2008; Miseli, 2013; Patterson, 2012; Yamanoue et al., 2012), rather than for the purpose of BR research. Thus, the specialized software and technical system information required for BR viewing specific to each type/model of 3D display device has either been overlooked, is not readily available, or cannot be readily used. Typically, such specialized software is developed in-house by BR researchers for use with a particular type/model of 3D display device, and thus may not be readily available for, or applicable to, the different systems used by other investigators. As mentioned above, one major aim of the current review is to address this apparent divide between 3D display technology and its application in BR research by bringing together the relevant methodological and technical considerations from both fields.
Taxonomy of Dichoptic Display Methodologies for BR
Nearly two centuries ago, Sir Charles Wheatstone's invention of the mirror stereoscope and prism stereoscope revolutionized the empirical study and understanding of binocular vision (Wade & Ono, 2012; Wade et al., 1984). Using the mirror stereoscope, he went on to conduct the first systematic study of BR (Wheatstone, 1838). Along with anaglyphs (see Figure 3), both types of stereoscope (Figure 6a and 6b) are the traditional methods used to investigate BR and stereoscopic vision. In a recent review of these techniques, their respective advantages and disadvantages were evaluated for studying BR (Carmel, Arcaro et al., 2010). Choosing a method most suitable for BR testing was shown to depend on the specific research question and study design. For example, anaglyphs are generally considered unsuitable for research questions involving multi-colored BR images, given that the technique relies on a different color filter for each eye to elicit rivalry. In the current article, we review traditional methods of stereoscopic viewing along with modern 3D display technology as further avenues for BR research. The latter methods that are surveyed include LCS goggles (Figure 4), passive polarized filters (PPF; Figure 5), dual-input head-mounted display (HMD) goggles (Figure 6c), and autostereoscopy (ATS) displays (Figure 7).
In the stereopsis and 3D display literature, classification of such viewing methods has been a common and recurring theme (e.g., Abileah, 2011, 2013; Benton, 2001; Benzie et al., 2007; Blundell, 2011; Borel & Doyen, 2012; Diner & Fender, 1993; Forman, 2012; Hodges, 1992; Holliman, 2006; Howard & Rogers, 2012; Järvenpää & Salmimaa, 2008; Kejian & Fei, 2010; Konrad & Halle, 2007; Kovács & Balogh, 2010; Lee & Kim, 2012; Lee & Park, 2010; Lueder, 2011; Pastoor, 2005; Peddie, 2013; Su et al., 2013; Surman, 2013; Symanzik, 2011; Valyus, 1962; Urey et al., 2011). Most recently, the historical development of devices, hardware, and software for stereoscopic viewing was succinctly covered by Peddie (2013). Critically, however, such information has yet to be systematically examined and presented within a specific framework for BR research. As a starting point, we have reconceptualized the prevailing classification of stereoscopic viewing techniques cited above. Figure 2 shows a taxonomy of dichoptic viewing methods that are most commonly used for BR, categorized according to the principle of how images are multiplexed (i.e., combined).
In summary, spectral multiplexing involves image selection by wavelengths (i.e., colors) at either end of the light spectrum. This method is commonly recognized in the form of cardboard frame glasses with complementary monochrome filters (e.g., red and blue), which create the impression of depth when viewing illustrations in books and magazines. When two different stimuli are viewed in the corresponding colors (i.e., red and blue), each filter passes one image and blocks the other, so that one image reaches each eye. Time-multiplexed methods (also known as temporal interleaving or temporal multiplexing) involve the rapid alternating presentation of left- and right-eye images via shutters to produce a seamless view of the respective image for each eye. Polarization-multiplexed methods involve two images of different planes of polarization, which are viewed through complementary polarizer filters that block and pass the respective image to each eye. Spatially multiplexed methods display two images, one to each eye, by reflecting, refracting, physically displacing, or selectively redirecting light to the left and right retinal channels.
In the sections to follow, an explanation for each method is given along with a schematic diagram to illustrate its basic mechanism of BR induction. Clear and accessible presentation of such technical information, particularly for studying BR (e.g., Carmel, Arcaro et al., 2010), has been lacking in the research literature and is therefore additional grounds for the current review. Each dichoptic viewing method is discussed in the context of a typical model. It is beyond the current article's scope to provide an exhaustive review of all currently available models, notwithstanding the lack of applied research on the wide variety of devices available. Furthermore, it is important to note that variations exist between particular models for each dichoptic viewing technique, such that some models are specialized for certain testing conditions (e.g., brain imaging). It is also beyond the current article's scope to survey in detail the several extrinsic and intrinsic factors that affect BR parameters, which have been reviewed elsewhere in the BR literature (see references in the preceding section). Although such factors can influence the choice of dichoptic display method (e.g., depending on the BR study of interest), the current article's focus is on other such variables that have received relatively less attention (see sections from ‘Comparison of BR Display Methods’ onwards). Specific 3D display models mentioned in the present review are commercially available and current at the time of writing. Where applicable, a particular model that has been used in published BR research is also indicated accordingly.
Spectrally Multiplexed Techniques — Anaglyphs
The anaglyphic method, demonstrated by Heinrich Wilhelm Dove in the early 1840s (see Gosser, 1977), is a convenient and relatively simple method of inducing BR. The viewer wears glasses with monochrome filters of different wavelength bands (e.g., 370–550 nm and 550–730 nm), one for each eye, such as red over the left eye and blue over the right eye (Figure 3). A corresponding stimulus is constructed by superimposing two monochrome images, each with the same pass-band as the corresponding filter (e.g., a house image in a shade of red and a face image in a shade of blue).
During anaglyph viewing of the presented stimulus, each monochrome image can only pass through the monochrome filter with the matching wavelength band, and is attenuated by the other filter. Each image therefore is selectively transmitted through the intended filter, resulting in different stimuli that are spectrally separated, one to the left retinal channel of the observer and the other to the right retinal channel.
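The spectral separation described above can be illustrated with a minimal sketch. It assumes idealized filters that pass exactly one color channel, images held as plain lists of 0–255 intensities, and hypothetical function names (`compose_anaglyph`, `view_through_filter`); real filters have overlapping pass-bands and will leak some of the unintended image.

```python
# Illustrative only: pixel values are 0-255 intensities and filters are idealized.

def compose_anaglyph(left_img, right_img):
    """Fuse two grayscale images (lists of rows) into one RGB image.

    The left-eye image is drawn in the red channel and the right-eye image
    in the blue channel; the green channel is left empty.
    """
    if len(left_img) != len(right_img):
        raise ValueError("images must have the same number of rows")
    fused = []
    for row_l, row_r in zip(left_img, right_img):
        if len(row_l) != len(row_r):
            raise ValueError("rows must have the same length")
        fused.append([(l, 0, r) for l, r in zip(row_l, row_r)])
    return fused


def view_through_filter(fused, channel):
    """Return what an ideal filter passing only `channel` (0 = red, 2 = blue) transmits."""
    return [[pixel[channel] for pixel in row] for row in fused]


vertical = [[255, 0, 255, 0], [255, 0, 255, 0]]      # coarse vertical grating
horizontal = [[255, 255, 255, 255], [0, 0, 0, 0]]    # coarse horizontal grating

stimulus = compose_anaglyph(vertical, horizontal)
left_eye = view_through_filter(stimulus, 0)   # red filter over the left eye
right_eye = view_through_filter(stimulus, 2)  # blue filter over the right eye
```

Under these idealized assumptions each eye recovers only its intended grating from the single fused stimulus.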
Time-Multiplexed Techniques — Liquid Crystal Shutter Goggles
Patented by Lenny Lipton in 1985 (U.S. Patent No. 4,523,226; see Lipton et al., Reference Lipton, Starks, Stewart and Meyer1985), the LCS method presents a frame-sequential dichoptic display using electro-optical shutters (i.e., filters), one for each eye (Figure 4). Each LCS filter rapidly alternates in polarization state (i.e., either transmitting or occluding light) in temporal synchrony with the frame rate (or refresh rate) of the monitor (e.g., Fergason et al., Reference Fergason, Robinson, McLaughlin, Brown, Albileah, Baker, Green, Woods, Bolas, Merritt and McDowall2005). Frame rate is the temporal frequency at which successive frame-sequential images are drawn on the screen within a given period.
In one frame sequence, when the left-eye's image is displayed on the monitor, voltage is applied to the right-eye shutter, which results in polarization that causes the right-eye filter to darken and be temporarily occluded. The simultaneous absence of voltage to the left-eye shutter leaves the filter transparent, allowing the image to transmit to the intended (left) eye. The process is repeated when the right-eye's image is presented: the left shutter is closed while the right shutter is transparent, thus allowing the image to be viewed by the right eye. Such alternations in polarization of the shutters are in synchrony with the monitor frame rate. The temporary occlusion of one of two shutter filters means that the monitor's frame rate is divided (i.e., halved) between the shutters for the two eyes.
In humans, binocular critical flicker frequency is the transition point where perception of a flickering light source appears as a continuous light (e.g., Isono & Yasuda, Reference Isono and Yasuda1990), with an average of 55 Hz (noting individual variation across age groups and sex; Misiak, Reference Misiak1947). As such, the presented frame sequence for each eye's image requires a greater frequency, such as 60 Hz, to ensure each stimulus merges to generate monocular perception of that stimulus as a continuous (i.e., non-flickering) unitary image. The monitor frame rate, typically at 120 Hz, is therefore double the shutter frame rate for each eye (e.g., 60 Hz) to avoid persistent flicker and ensure that a continuous stream of discordant images is dichoptically viewed, especially if the stimuli are animated in real time. In essence, this doubles the hardware requirements of the LCS setup. LCS goggle setups that have been used for BR research include the FE-1 by Cambridge Research Systems (e.g., Alais & Melcher, Reference Alais and Melcher2007; Ross & Ma-Wyatt, Reference Ross and Ma-Wyatt2004) and CrystalEyes 3 by StereoGraphics Corporation (e.g., Buckthought et al., Reference Buckthought, Kim and Wilson2008).
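The arithmetic behind this requirement can be sketched as follows. The 55 Hz threshold is the average value cited above; the function names and the candidate refresh rates are illustrative assumptions, and individual flicker thresholds vary.

```python
CFF_HZ = 55.0  # average binocular critical flicker frequency cited in the text


def per_eye_rate(monitor_hz, n_eyes=2):
    """Frame rate delivered to each eye when frames alternate between the eyes."""
    return monitor_hz / n_eyes


def is_flicker_free(monitor_hz, cff_hz=CFF_HZ):
    """True if each eye's frame rate exceeds the flicker-fusion threshold."""
    return per_eye_rate(monitor_hz) > cff_hz


for hz in (60, 100, 120):
    print(hz, per_eye_rate(hz), is_flicker_free(hz))
```

In this sketch a 120 Hz monitor yields 60 Hz per eye and clears the threshold, whereas 60 Hz and 100 Hz monitors (30 Hz and 50 Hz per eye) do not.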
Polarization-Multiplexed Techniques — Passive Polarized Filters
First proposed and patented by John Anderton in 1891 (U.S. Patent No. 542,321; see Anderton, Reference Anderton1895), the use of passive polarizer filters for stereoscopic viewing with glasses was subsequently demonstrated by Edwin Land in 1936 (Land & Hunt, Reference Land and Hunt1936; for a chronology of polarizer development see Walworth, Reference Walworth, Woods, Holliman and Favalora2013). This method, albeit different in multiplexing principle from LCS goggles, is based on the same polarization technology. That is, rather than monocularly viewing unaltered light in alternation, image separation is achieved by viewing light from each image in different planes (i.e., angles) of polarization (Figure 5a). Polarization states used for dichoptic displays are generally either linear or circular (see section ‘Interocular Crosstalk’). The former is most commonly used and will therefore be the focus of the current article.
A typical linear PPF setup requires a dual-screen unit, with the two screens oriented at right angles to each other and a half-silvered mirror in between (e.g., Planar StereoMirror™; Fergason et al., Reference Fergason, Robinson, McLaughlin, Brown, Albileah, Baker, Green, Woods, Bolas, Merritt and McDowall2005; see also Kollin & Hollander, Reference Kollin, Hollander, Woods, Bolas, McDowall, Dodgson and Merritt2007). During dichoptic viewing, different images are independently and simultaneously presented at the same position on separate screens. Light from each screen passes through polarizing filters arranged at perpendicular (90°) orientations, usually at 45° and 135°. As the incident beams from the two screens hit the half-silvered mirror, one beam transmits through the mirror unchanged, while the other beam is reflected at a 45° angle off the mirror, resulting in a stimulus of the two (spatially) superimposed images when naturally viewed. The observer views the stimulus by wearing passive glasses with polarized filters, whose orthogonal planes of polarization correspond with the orientations of the polarizing filters on the screens. As polarized light only transmits through the filter with the same plane of polarization (and is absorbed by the other filter of orthogonal orientation), each of the two different images within the stimulus is separated to corresponding retinal channels.
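The selective transmission of linearly polarized light can be illustrated with Malus's law, under which an ideal polarizer passes a fraction cos²θ of light whose plane of polarization differs from the filter's by θ. The sketch below assumes ideal filters and the 45° and 135° orientations mentioned above; the function names, and the modelling of a rotation of the glasses relative to the screen filters, are illustrative assumptions.

```python
import math


def transmitted_fraction(light_angle_deg, filter_angle_deg):
    """Malus's law: fraction of linearly polarized light passed by an ideal polarizer."""
    theta = math.radians(light_angle_deg - filter_angle_deg)
    return math.cos(theta) ** 2


def leakage_with_rotation(rotation_deg):
    """Fraction of the 135° image passed by the eye filter meant for the 45° image
    when the glasses are rotated by `rotation_deg` relative to the screen filters."""
    return transmitted_fraction(135.0, 45.0 + rotation_deg)


intended = transmitted_fraction(45.0, 45.0)     # matched planes of polarization
unintended = transmitted_fraction(135.0, 45.0)  # orthogonal planes of polarization
leak_at_10_deg = leakage_with_rotation(10.0)    # roughly 3% of the unintended image
```

With matched planes essentially all of the intended light is passed and the orthogonal image is blocked; a 10° rotation of the glasses lets through about 3% of the unintended image in this idealized model.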
Recent advances in PPF technology have also produced single-screen interleaved PPF monitors (Figure 5c and 5d; e.g., U.S. Patent No. 605,776, 2010; AOC e2352Phz, Zalman ZM-M215W/M240W; Trimon ZM-M240W by Zalman used in Fahle et al., Reference Fahle, Stemmler and Spang2011). Single-screen interleaved PPF monitors use a similar multiplexing principle to the dual-screen PPF setup, whereby each adjacent pixel row (or column) on the screen is placed directly behind a corresponding polarizing filter arranged in orthogonal orientations. BR viewing with these monitors therefore requires a dichoptic image pair to be presented (i.e., projected) in each adjacent pixel row (or column), which are then transmitted through the corresponding polarizing filter on the monitor and glasses to each eye. Multiplexing of the dichoptic image pair for BR viewing is therefore achieved without the need for a half-silvered mirror.
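A minimal sketch of row interleaving for such a monitor is given below. It assumes images stored as lists of pixel rows and a hypothetical helper `interleave_rows`; which eye is mapped to the even rows is model-specific, so it is exposed as a parameter rather than fixed.

```python
def interleave_rows(left_img, right_img, left_on_even=True):
    """Build one frame whose alternating pixel rows come from the two images.

    Which eye is assigned to the even rows depends on the monitor, so the
    mapping is a parameter here rather than a fixed convention.
    """
    if len(left_img) != len(right_img):
        raise ValueError("images must have the same number of rows")
    frame = []
    for y, (row_l, row_r) in enumerate(zip(left_img, right_img)):
        use_left = (y % 2 == 0) == left_on_even
        frame.append(list(row_l if use_left else row_r))
    return frame


left = [["L"] * 4 for _ in range(4)]   # placeholder left-eye image
right = [["R"] * 4 for _ in range(4)]  # placeholder right-eye image
frame = interleave_rows(left, right)
rows_per_eye = len(frame) // 2  # each eye receives half of the rows
```

Because each eye receives only every second row, the vertical resolution available to each image is halved in this scheme.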
Spatially Multiplexed Techniques — Mirror Stereoscope, Prism Lens Stereoscope, Head-Mounted Display, Autostereoscopy
Invented by Wheatstone in the early 1830s (Wade, Reference Wade2012), the modern-day mirror stereoscope setup is regarded by vision scientists as a versatile and precise viewing method for research, and as such has traditionally been used for fine-scale psychophysical studies of BR. A common setup involves two different images presented on a single monitor in a vertical split-screen fashion (Figure 6a). To reflect light from each image separately to the corresponding eye, two inner (central) mirrors and two outer (lateral) mirrors are used. One inner mirror is placed directly in front of each eye, and one outer mirror is placed beside each inner mirror, in front of the observer and toward the periphery. For the left eye's inner-outer mirror pair, the outer mirror is oriented at 135° and the inner mirror at 45° to the eye's line of viewing, and vice versa for the right eye's mirror pair. In open designs, a divider is centrally placed between the two sets of mirrors (i.e., between the eyes, extending from the viewer's head to the monitor screen) to partition each eye's line of vision from the image intended for the other eye. To stabilize the observer's head, a chin rest or head rest is also used.
An alternative to this common setup involves the use of two monitors, each placed laterally to the observer and displaying a different image at the center of each screen. The observer dichoptically views both images via two mirrors oriented — one at 135° and the other at 45° — to reflect light exclusively from one monitor to the corresponding eye. However, this article will focus on the more commonly used single monitor setup. It is also important to differentiate the mirror stereoscope from the more general term ‘haploscopes’, which defines a class of devices that displace the visual fields of the two eyes (including the mirror and prism stereoscopes as well as anaglyphs). In clinical orthoptic and ophthalmic settings for instance, haploscopes such as the amblyoscope and synoptophore are common diagnostic and treatment instruments (Ohmi et al., Reference Ohmi, Fujikado, Ohji, Saito and Tano1997; Rutstein & Daum, Reference Rutstein and Daum1998; Stanworth, Reference Stanworth1958).
The prism lens stereoscope — also invented by Wheatstone in the early 1830s (Bowers, Reference Bowers1975; Wade, Reference Wade2012) — adopts the same image presentation and equipment as the mirror stereoscope setup (e.g., divider and head stabilizing apparatus). The key difference is that instead of using mirrors to reflect light, the observer views each image through a convex prism lens that refracts light (Figure 6b). Different images are independently transmitted from the screen separately to each corresponding lens, which refracts light to corresponding retinal locations of the observer's eyes.
One issue with custom-built mirror and prism stereoscopes is that observers might see both images simultaneously, one through the mirror/lens and the other directly, thereby causing additional images to appear beside the rivaling percept. To rectify this issue, a physical barrier of sufficient size can be placed between the two eyes from the inner mirrors or lenses to the screen. In addition, for the mirror stereoscope the distance between the inner mirrors and screen can be fine-tuned for each individual subject (and subsequent rotation of the outer mirrors), such that each eye cannot directly view the image intended for the other eye.
Another spatially multiplexed method for inducing BR involves presenting the stimuli at corresponding screens mounted directly in front of the eyes, that is, using HMDs. While two different HMD methods — dual-input and single-input — can both achieve 3D viewing, only the former can present dichoptic stimuli to induce reliable BR viewing. The single-input method involves frame-sequential presentation of images in successive frames on a single screen that is viewed by both eyes. This method leads to unavoidable crosstalk (see section ‘Interocular Crosstalk’), and therefore it will not be discussed further in the current article.
First put forward by Morton Heilig in 1960 (U.S. Patent No. 2,955,156; see Heilig, Reference Heilig1960), dual-input HMDs consist of two independent miniature output screens placed at a comfortable position, one in front of each eye (Figure 6c). Each screen is capable of projecting a different image, one directly to each corresponding eye, thus creating a dichoptic display for BR (e.g., Sony HMZ-T1, zSight™, Oculus Rift by Oculus VR™; eMagin z800 3DVisor used in Huang et al., Reference Huang, Baker and Hess2012; see also Mizuno et al., Reference Mizuno, Hayasaka, Yamaguchi and Jobbágy2012; Travers, Reference Travers, Bhowmik, Li and Bos2008). In military and aviation settings, various types of HMDs, also known as helmet-mounted displays, are used under applied conditions in which the occurrence of BR is considered problematic (Haitt et al., Reference Haitt, Rash, Heinecke, Brown, Marasco, Harding and Jennings2008; Melzer, Reference Melzer and Spitzer2007; Patterson et al., Reference Patterson, Winterbottom, Pierce and Fox2007; Temme et al., Reference Temme, Kalich, Curry, Pinkus, Task, Rash, Rash, Russo, Letowski and Schmeisser2009).
Another spatially multiplexed method for viewing dichoptic stimuli involves autostereoscopic viewing, that is, without the need for optical equipment and goggles (e.g., mirrors, prism lenses, HMDs). ATS screens are generally divided into three classes: re-imaging, volumetric, and parallax. The current article will focus on the third variety as they are the most common and most compatible with computer graphics video cards (e.g., Nvidia GeForce 8800 GTS) typically used in vision research (Halle, Reference Halle1997).
The first ATS display was invented by James Clerk Maxwell in 1868 by constructing a prism stereoscope-like setup such that an observer was not aware of using any optical apparatus (Maxwell, Reference Maxwell1868). Today, conventional ATS setups consist of a liquid crystal display (LCD) monitor for image projection, with an optical filter directly in front of the screen for dichoptic image separation. Two types of optical filters are commonly used (see Travis, Reference Travis, Chen, Cranton and Fihn2012): (1) the parallax barrier, first proposed by Auguste Berthier (Reference Berthier1896a, b) and later applied by Frederic Ives in 1903 (U.S. Patent No. 725,567; see Ives, Reference Ives1903), selectively blocks light from traveling in particular directions (Figure 7a); and (2) the lenticular screen, enunciated by Walter Hess in 1915 (U.S. Patent No. 1,128,979; see Hess, Reference Hess1915), consists of a micro-lens array that selectively refracts light toward separate directions (Figure 7b). A third type of optical filter is the directional backlight, which will not be discussed here (see Sasagawa et al., Reference Sasagawa, Yuuki, Tahata, Murakami and Oda2003; Schultz et al., Reference Schultz, Brott, Sykora, Bryan, Fukamib, Nakao and Takimoto2009, for more detail).
BR via ATS display requires an image pair to be presented via adjacent pixel columns on the monitor. Light projected from each set of pixel columns passes through the filter directly in front of it, which alters the direction that the light travels as a function of the angle. Because the eyes are horizontally adjacent to each other, subjects positioned at optimal observation zones (where each retinal channel, i.e., eye, is stimulated by pixel columns of one image) will see separate images, one by each eye. Positioning within the zone is optimal when both eyes have maximum visibility of the intended image and minimum visibility of the unintended image. ATS displays are commonly classified as a spatial-multiplexed method but they can also be considered a directional-multiplexed method, that is, light from one set of pixels (for one image) is redirected toward a particular direction while light from the remaining set of pixels (for the other image) is redirected in the opposite direction. Models that have been used for BR research include the 2018XLC by Dimension Technologies Inc. (e.g., Miller et al., Reference Miller, Harvey and Dobson2004) and the Sanyo THD-10P3 (e.g., Nagamine et al., Reference Nagamine, Yoshino, Yamazaki, Obara, Sato, Takahashi and Nomura2007, Reference Nagamine, Yoshino, Miyazaki, Takahashi and Nomura2008, Reference Nagamine, Yoshino, Miyazaki, Takahashi and Nomura2009; Qin et al., Reference Qin, Takamatsu and Nakashima2006; Qin, Jiang et al., Reference Qin, Jiang, Takamatsu and Nakashima2009; Qin, Takamatsu et al., Reference Qin, Takamatsu and Nakashima2009).
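Why an optimal observation zone exists can be illustrated with a simplified two-view parallax-barrier geometry. The sketch below is not taken from any manufacturer's specification: it assumes a thin barrier, straight-line rays (no refraction in cover glass), a 63 mm eye separation, and hypothetical pitch and distance values. By similar triangles through a single slit, adjacent pixel columns reach different eyes when pixel pitch / gap = eye separation / viewing distance.

```python
def barrier_gap(pixel_pitch_mm, viewing_distance_mm, eye_separation_mm=63.0):
    """Gap between pixel plane and barrier so adjacent columns reach different eyes.

    Similar triangles through one slit:
    pixel_pitch / gap = eye_separation / viewing_distance.
    """
    return pixel_pitch_mm * viewing_distance_mm / eye_separation_mm


def barrier_pitch(pixel_pitch_mm, viewing_distance_mm, gap_mm):
    """Slit spacing, slightly under two pixel pitches so that the views
    through all slits converge on the same eye position."""
    return 2.0 * pixel_pitch_mm * viewing_distance_mm / (viewing_distance_mm + gap_mm)


pitch = 0.1       # mm, hypothetical pixel-column pitch
distance = 630.0  # mm, hypothetical barrier-to-eye viewing distance
gap = barrier_gap(pitch, distance)                  # about 1 mm
slit_spacing = barrier_pitch(pitch, distance, gap)  # just under 0.2 mm
```

Because the gap and slit spacing are fixed at manufacture for one design viewing distance, moving the head away from that position shifts which pixel columns each eye sees, consistent with the restricted observation zones described above.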
Portable ATS technology is also an emerging area in the 3D display market. For example, such technology is commercially available on laptops (e.g., Toshiba ‘Qosmio F750, F755, and X770’; Sharp ‘Actius RD3D’), smart phones (e.g., HTC EVO 3D), and game consoles (e.g., Nintendo 3DS). However, such devices along with portable 3D display technology in general are still in their infancy and remain an area of ongoing development (Boev & Gotchev, Reference Boev and Gotchev2013; Fattal et al., Reference Fattal, Peng, Tran, Vo, Fiorentino, Brug and Beausoleil2013; Gotchev, Akar et al., Reference Gotchev, Akar, Capin, Strohmeier and Boev2011; Harrold & Woodgate, Reference Harrold, Woodgate, Woods, Bolas, McDowall, Dodgson, Merritt and Holliman2007; Kimmel, Reference Kimmel, Chen, Cranton and Fihn2012; Ogniewski & Ragnemalm, Reference Ogniewski and Ragnemalm2011; Travis, Reference Travis, Bhowmik, Li and Bos2008; e.g., MOBILE3DTV, IEE, Hitachi, MasterImage 3D). For ATS mobile devices in particular, several technical challenges such as miniaturization and bandwidth issues still need to be addressed (Dodgson, Reference Dodgson2013; Su et al., Reference Su, Lai, Kwasinski and Wang2011). Nevertheless, the prospect of well-developed mobile ATS displays could enable new and broader avenues for BR research (see section ‘Convenience and Design’). In the sections to follow, the various dichoptic viewing methods covered thus far will be comprehensively compared within a framework of BR research.
Comparison of BR Display Methods
Across the different methods of dichoptic presentation, several parameters warrant consideration in determining the most suitable one for a particular BR study. Table 1 provides a summary assessment of these methods in this regard, and can be used accordingly as a decision-making aid. Dark-shaded gray boxes (green online) indicate the method (column) is advantageous for BR in regard to the parameter or experimental factor of interest (row). Light-shaded gray boxes indicate that the attribute applies only under restricted conditions. These points are discussed in more detail below by comparing the advantages and disadvantages of each method depending on the study of interest along with other considerations.
Dark-shaded gray cells (green online) denote a favorable attribute of the method, and light-shaded gray cells indicate an important caveat to the corresponding attribute. Letter denotations qualify the advantageous attribute, explain the caveat, or provide additional technical information.
a. An exception is the anaglyph system by Infitec® GmbH, which supports full-color (RGB) image presentation, although the target stimulus viewed by each eye comprises different wavelength bands. This anaglyph model has been used in studies of BR and related phenomena (e.g., de Jong et al., Reference de Jong, Kourtzi and van Ee2012; Watanabe et al., Reference Watanabe, Cheng, Murayama, Ueno, Asamizuya, Tanaka and Logothetis2011). With conventional anaglyphs, observers may also perceive a color different from that of the original image presented.
b. The refractive index of a lens material differs across wavelengths of light (Hecht, Reference Hecht2002); thus some colors of the presented image may be distorted when viewed through lens-based setups (e.g., chromatic aberration), which consequently may affect other stimulus parameters (e.g., luminance, contrast). It is beyond the current article's scope, however, to compare the different display methods across such various stimulus characteristics, and this remains to be reviewed in the literature. It is also beyond the current article's scope to systematically discuss the numerous features of rivalry and the various investigative methods to examine them.
c. Greater angular extent can be achieved with a dual-monitor setup, reportedly up to 90° (Howard & Rogers, Reference Howard and Rogers2012).
d. This artifact will be more severe with sub-standard models. Higher-end shutter goggle systems are available that eliminate crosstalk by using monochrome (yellow-green) cathode ray tube (CRT) monitors with ultra-short persistence phosphor (P46; Vision Research Graphics). The artifact can also be eliminated by using digital light processing (DLP) projectors that are capable of rapid pixel response times (~2 μs; Hornbeck, Reference Hornbeck1998; e.g., ‘Galaxy series’ by Barco, ‘Mirage series’ by Christie Digital, ‘DepthQ’ by Infocus/Lightspeed Design; for a list of tested models, see Woods & Rourke, Reference Woods, Rourke, Woods, Bolas, McDowall, Dodgson, Merritt and Holliman2007).
e. Negligible if subject maintains a level head position (i.e., pitch), or if circular PPF are used (see also Figure 10).
f. If the subject's eyes are positioned within the optimal observation zones.
g. May be considerably less depending on a particular model's specifications. The prices indicated do not include the PC system used for running the stimulus presentation program.
h. This estimated cost range is for the typical glasses only, that is, models without full-color rendering.
i. More expensive for non-ferromagnetic models. The recent Sony HMZ-T1 HMD model offers dichoptic display with high resolution 1280 × 720 organic light-emitting diode (OLED) panels for about USD$800. Note also that the price range indicated is for goggles only, and not for an accompanying monitor as well, given their extensive variety and cost differences.
j. Due to rapid advances in 3D display technology, cheaper PPF monitors with higher specifications have also recently become available (e.g., ‘Cobox’ by Ekeren 3D Equipment; http://www.ekeren3d.com/viewing.html). Single-screen interleaved PPF monitors that do not suffer the disadvantage of high cost, albeit with lower spatial resolution, are also available (e.g., AOC e2352Phz).
k. Sufficiently large multiple-viewer scenarios may incur increased signal congestion and thus transmission error and response lag. This issue could disrupt accurate synchrony between the monitor and LCS goggles to cause perceptual artifacts (e.g., flicker).
l. Studies using stimuli that are presented intermittently (i.e., with blank intervals) may be affected by factors such as flicker artifact and phosphor decay rate depending on the interstimulus interval. This issue is also relevant to experiments employing flickering stimuli, such as rapid eye-swap protocols and frequency-tagging MEG/EEG studies (e.g., see ‘Neuroimaging’ section for references); however, it is beyond the current paper's scope to systematically review their compatibility with the wide range of display types available.
m. Sufficient crosstalk may disrupt BR dynamics and cause particular visual features to break perceptual suppression (see section ‘Interocular Crosstalk’), especially if the image contains emotional and semantic content (i.e., image valence). Studies examining sub-threshold processing would thus need any crosstalk effects to be non-consequential, or alternatively the use of crosstalk-free methods of dichoptic viewing.
n. For functional magnetic resonance imaging (fMRI) and magnetoencephalography (MEG): if the material or model is non-ferromagnetic or does not contain ferromagnetic components. The ferromagnetic restriction becomes even more critical for MEG, as no metal devices can be present at all. Even small amounts of metal can carry a current flow that generates magnetic fields stronger than the fields being measured. For PPF setups, a dual-projector system positioned outside the fMRI/MEG room projects the stimuli through a window onto the screen inside the fMRI/MEG room. Currently available ATS displays are suitable for electroencephalography (EEG) studies of BR but not for fMRI or MEG experiments.
o. Rapid frequent switching within each LCS may induce a minor but noticeable current to the EEG electrodes and leads close to the goggles (e.g., Fp1, Fp2, Fz).
p. If the eye-tracking equipment is mounted so that it is not detecting the point of gaze through the filters. For EOG, rapid frequent switching within each LCS may induce a minor but noticeable current to the electrodes and leads close to the goggles. Other dichoptic viewing methods also have their own workarounds in order to allow for concurrent eye-movement recording and BR viewing (see section ‘Eye-Movement Recording and Optokinetic Nystagmus’).
q. Offline CVS/TMS/tDCS/tACS/tRNS/GVS (i.e., stimulation applied before or between sessions of BR viewing) is compatible with all dichoptic viewing methods reviewed here. CVS = caloric vestibular stimulation; TMS = transcranial magnetic stimulation; tDCS = transcranial direct current stimulation; tACS = transcranial alternating current stimulation; tRNS = transcranial random noise stimulation; GVS = galvanic vestibular stimulation.
r. Online TMS/tDCS/tACS/tRNS (i.e., stimulation during BR viewing) may not be compatible with HMD/LCS goggle setups depending upon other considerations (see section ‘Compatibility With Brain Stimulation Techniques’).
s. The mirror and prism lens stereoscopes may require individual subject adjustments and are therefore highly impractical in the intraoperative setting given the physical space restrictions, the extreme time limitations, and the sensitive nature of clinical subjects being tested. Regarding the latter point, LCS setups may not be suitable for use with particular epilepsy patients (especially individuals with photic epilepsy), as poor synchronization, slow response time in the LCD panels, or slow refresh rate per eye's panel (i.e., less than 60 Hz; see also Figure 4) can lead to clearly visible flicker. In comparison, extraoperative chronic experiments in epilepsy patients following electrode implantation for electrocorticography (ECoG; ~1 week) are more amenable to mirror stereoscopes. ECoG studies of CFS, a form of interocular suppression (Figure 11), have used this dichoptic display method, in which presentation of the masking stimulus at 10 Hz did not evoke seizures in more than 10 subjects tested (N. Tsuchiya, personal communication).
t. The severity of simulator sickness and visual fatigue varies depending on the interaction of human factors and the magnitude of perceptual artifacts (e.g., flicker, crosstalk).
By way of summary, different methods of dichoptic display are first compared across various display metrology factors, such as viewing parameters and perceptual artifacts, which can affect the precision of stimulus presentation. Financial and practical considerations are also discussed, particularly as they have received little attention in the BR literature. Next, the application of dichoptic viewing methods to study BR-related phenomena is addressed, followed by discussion of their compatibility with brain-activity recording techniques, eye-movement tracking equipment, and brain stimulation methods. Thereafter, we discuss subject population issues, including subjects’ experiences associated with different dichoptic viewing methods. Finally, we address sample size issues, with a particular focus on the very large sample sizes needed for clinical and genetic rivalry studies, including a proposed solution to this barrier.
It is important to note that, where relevant, research from the field of stereoscopic 3D display technology is discussed in the context of BR research. It is also worth noting that on certain methodological points in the current article, opinions and preferences may vary considerably among researchers and industry professionals. This wide variation is expected given differences in technical expertise and the fact that such specialized practical knowledge is seldom shared widely in extensive detail. Nevertheless, the current review seeks to provide a framework for stimulating the development and dissemination of testing protocols that bridge BR and 3D display technology research, as well as a resource for non-specialist investigators new to rivalry research.
Image Viewing Parameters
For the anaglyph method, a major disadvantage is that full-color image presentation is precluded, with stimuli restricted to only shades of a single color (Carmel, Arcaro et al., Reference Carmel, Walsh, Lavie and Rees2010; Woods & Harris, Reference Woods, Harris, Woods, Holliman and Dodgson2010). Anaglyph images may also suffer from luminance and color distortion, that is, colors perceived through the anaglyph filters differ from those of the presented stimuli. Discrepancies in the luminance profiles of the two images may also arise due to different optical properties of the corresponding anaglyph filters. Therefore, if color is an important aspect of the experimental design, typical anaglyphs may not be suitable (see also Table 1). This is a non-issue for the other dichoptic display methods, which support the presentation of two different images in the full-color spectrum.
The refractive index of a lens material varies with the wavelength of light (i.e., color; Hecht, Reference Hecht2002), and the resulting dispersion depends on the material and shape of the lens. For the prism stereoscope and lenticular ATS setups, this means the presented image may be distorted as some colors are not in focus (i.e., chromatic aberration). With polarization-based LCS and PPF setups, although full-color viewing of dichoptic stimuli is enabled, these methods can also cause the images to physically darken (Choubey et al., Reference Choubey, Jurcoane, Muckli and Sireteanu2009; Pastoor & Wöpking, Reference Pastoor and Wöpking1997; Stevens, Reference Stevens2004). For ATS displays, related issues to consider include the following: (1) reduced image brightness in parallax ATS displays as half of the subpixel columns are blocked by the barrier (Choubey et al., Reference Choubey, Jurcoane, Muckli and Sireteanu2009; Pastoor & Wöpking, Reference Pastoor and Wöpking1997); (2) interpixel gaps (i.e., black mask matrix) in lenticular-type displays can be projected, causing dark regions to appear between observation zones (Holliman et al., Reference Holliman, Dodgson, Favalora and Pockett2011; though it can be resolved by slanting the monitor's lenticular lenses: Lipton & Feldman, Reference Lipton, Feldman, Woods, Merrit, Benton and Bolas2002; Lipton et al., Reference Lipton, Starks, Stewart and Meyer1985; Lueder, Reference Lueder2011; van Berkel & Clarke, Reference van Berkel, Clarke, Fisher, Merritt and Bolas1997); and (3) light scatter at lenticular lens boundaries causing variation in illumination intensity (Hill & Jacobs, Reference Hill and Jacobs2006). Furthermore, because half of the ATS monitor's pixels are viewed by each eye, the horizontal spatial resolution of each image is reduced by 50%. This condition may cause a ‘picket fence’ effect, especially when viewed at close range, whereby vertical lines in an image are visible due to the black mask between subpixel columns.
Single-screen interleaved PPF monitors also inherently produce half-resolution dichoptic displays as each image is drawn on adjacent pixel rows and projected to each corresponding eye (e.g., AOC e2352Phz, Zalman ZM-M215W). A potential workaround to address the reduced spatial resolution of ATS and single-screen PPF displays is to use a hybrid display protocol, where dichoptic images are presented in successive frames reciprocally to adjacent pixel rows corresponding to each eye (Johnson et al., Reference Johnson, Kim and Banks2013). Furthermore, for lenticular-type ATS monitors, the pixels are magnified in viewing window planes, such that non-uniform features within the pixel (including the black mask between pixels) are also magnified (Hill & Jacobs, Reference Hill and Jacobs2006). Stimulus parameters (e.g., luminance, contrast, spatial frequency, temporal frequency) therefore need to be considered when choosing between lenticular and parallax methods for studies that involve (or are sensitive to) variations in stimulus signal strength (e.g., Fahle, Reference Fahle1982; Hollins, Reference Hollins1980; Wade et al., Reference Wade, de Weert and Swanston1984).
For ATS and single-screen interleaved PPF displays, it is also important to note that target dichoptic image pairs need to be interlaced as one image at a subpixel level, in accordance with monitor specifications. Each filter method and model has its own properties, and monitor manufacturers each have their own process for generating interlaced images. Investigators can therefore work either directly with the manufacturer to obtain the correct parameters of a particular model, or with a digital imaging specialist with expertise in subpixel rendering. As such, it is beyond the current article's scope to review all currently available display models for each method of dichoptic viewing. In general, investigators’ choice of a particular display model will be determined (in part) by its technical capabilities and compatibility with their BR study of interest.
Binocular Visual Field
Field of view is the angular extent of the observable field that is perceived by a subject (Figure 8), indexed by degrees of visual angle. The binocular visual field (BVF) is the region within this total field of view in which a target object is visible to both eyes. For dichoptic presentation techniques, the BVF is the allowable visuospatial area where dissimilar stimuli integrate (i.e., align) to induce BR.
In order to induce exclusive, unitary BR without any periods of piecemeal rivalry, the dichoptic stimuli should not exceed 0.1° visual angle (Blake, Reference Blake2001). This angular subtense increases with greater retinal eccentricity (Blake et al., Reference Blake, O'Shea and Mueller1992) and with lower spatial frequency (O'Shea et al., Reference O'Shea, Sims and Govan1997). Rivalry stimuli that are larger than this angular subtense induce locally perceived (piecemeal) rivalry within spatially restricted zones of the unitary image; however, stimuli of 1–1.5° visual angle induce a large proportion of exclusive rivalry visibility across the viewing time (with mixed states readily able to be identified by the subject and excluded from analysis). It is also worth noting that a reduced angular extent of the BVF may consequently limit the types of experimental design. Therefore, setups that allow a large BVF are more suitable for studies examining the presentation of large, peripheral, and/or multiple rivaling stimuli. As a general rule, the BVF can be modified by changing the distance from which the observer views the display monitor (i.e., the size of the display screen viewed) and correspondingly adjusting the angular size of the rivalry stimuli. Naturally, this option does not apply to methods that are not amenable to adjustments of subjects’ viewing distance (e.g., HMDs, ATS displays).
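The relationship between viewing distance, stimulus size, and visual angle described above follows from standard geometry and can be sketched as follows. The function names, the 57 cm viewing distance, and the screen dimensions are illustrative assumptions rather than values taken from any particular setup.

```python
import math


def size_for_visual_angle(angle_deg, distance_cm):
    """Physical size (cm) that subtends `angle_deg` at `distance_cm`."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def visual_angle_for_size(size_cm, distance_cm):
    """Visual angle (degrees) subtended by `size_cm` at `distance_cm`."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


def cm_to_pixels(size_cm, screen_width_cm, screen_width_px):
    """Convert an on-screen size to pixels, assuming square pixels."""
    return size_cm * screen_width_px / screen_width_cm


# Hypothetical setup: 1.5 degree stimulus, 57 cm viewing distance,
# 52 cm wide screen with 1920 horizontal pixels.
stimulus_cm = size_for_visual_angle(1.5, 57.0)
stimulus_px = cm_to_pixels(stimulus_cm, 52.0, 1920)
```

Moving the observer farther from the screen requires a proportionally larger on-screen stimulus to hold the same angular size, which is why changing viewing distance and rescaling the stimuli go together.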
Both mirror and prism stereoscope setups have restrictively small BVFs, as half of the total visual field is used to present one image to each eye (Carmel, Arcaro et al., Reference Carmel, Walsh, Lavie and Rees2010). For prism stereoscopes in particular, stimuli must remain small as images presented far from central fixation can suffer from fish-eye distortion by the lens. Such distortions can be eliminated by viewing the images with a prism stereoscope through lenses of similar width, allowing for wider stimuli to be presented.
Anaglyphs, HMD and LCS goggles, PPF setups, and ATS displays are not particularly prone to this spatial restriction given their large BVF. As angular subtense will vary depending on the model, the BVF may extend for some models from approximately 60° horizontal/vertical for HMD goggles (e.g., zSight™; Nagahara et al., Reference Nagahara, Yagi and Yachida2003), 100° horizontal and 55° vertical for LCS goggles (e.g., ‘CrystalEyes’ by StereoGraphics Inc.), 89° horizontal/vertical for PPF (e.g., True3Di™ ‘SDM-400’), and 150° horizontal/vertical for ATS displays (e.g., Philips BDL5571VS/00; see also Lee et al., Reference Lee, Travis and Lin2008, for a prototype). With anaglyphs, LCS goggles and PPF setups, the BVF can also be increased by positioning the observer closer to the screen. For many ATS displays that are sensitive to head/eye location, the BVF is restricted although models with an integrated head- or eye-tracking function (e.g., HHI ‘Free2C 3D-Display’) allow greater flexibility for positioning the subject closer to the screen (see also section ‘Convenience and Design’).
Eye Vergence Stabilization via Fusion Cues
Normally, both left and right eyes make vergence movements so that similar images are fixated to corresponding retinal locations. If incongruent images are presented dichoptically, binocular vergence cannot be sustained as stable gaze in both eyes is disrupted. Spatially multiplexed dichoptic displays (except ATS) therefore require vergence stabilization via binocular fusion cues, that is, identical visual features positioned in corresponding retinal locations of both eyes. Binocular fusion cues facilitate the eyes’ convergence on the object of fixation to maintain reliable BR. For mirror and prism stereoscopes, the required fusion cues within the area of binocular integration further restrict the visual space allowed for BR. For studies that require BR viewing in the periphery or stimuli with a large angular subtense, these setups may not be ideal. If there is insufficient space in the BVF to allow for the positioning of fusion cues, observers may need fixation training to ensure accurate and reliable BR response recording.
Interocular Crosstalk
Another important factor to consider in choosing between dichoptic display options is crosstalk. Despite variations in the mathematical conception of the term (see Woods, Reference Woods, Woods, Holliman and Dodgson2011, Reference Woods2012), crosstalk is commonly defined as light presented to one monocular channel leaking into another (Figure 9). That is, the imperfectly isolated presentation of an image to one retinal channel can unintentionally result in that image being partially perceived by the other eye. The subjective perceptual consequence of crosstalk is typically called ‘ghosting’ or ‘bleed-through’, and its visibility is often dependent on image parameters such as contrast and luminance (Daly et al., Reference Daly, Held and Hoffman2011; Pastoor, Reference Pastoor1995; Woods, Reference Woods, Woods, Holliman and Dodgson2011; Woods et al., Reference Woods, Apfelbaum and Peli2010; for a detailed discussion of crosstalk metrology for stereoscopy see Abileah, Reference Abileah2011, Reference Abileah2013; Barkowsky et al., Reference Barkowsky, Brunnström, Ebrahimi, Karam, Lebreton, Le Callet, You, Zhu, Zhao, Yu and Tanimoto2013; Blondé et al., Reference Blondé, Sacré, Doyen, Huynh-Thu and Thébault2011; Huang et al., Reference Huang, Chen, Lin, Lin, Chou, Liao and Lee2013; Hurst, Reference Hurst2012; Woods, Reference Woods2012). From the perspective of BR, it is also important to distinguish crosstalk-induced ‘ghosting/bleed-through’ from subjects’ experience of mixed percepts — i.e., perceiving a mixture of both images even with perfect dichoptic presentation (see Figure 1). In this case, pre-experimental testing of dichoptic display equipment is required to ensure accurate and reliable recording of BR parameters.
For studies of BR, crosstalk can also present other problems that require consideration. For instance, crosstalk may instigate or terminate perceptual dominance of one image over the other, which can affect temporal dynamics by preferentially breaking binocular suppression. This effect is especially salient when the perceptually masked visual feature(s) contain semantic and emotional content (e.g., fearful faces, words). Ultimately, the perceptual consequences of crosstalk can confound experimental results. Studies involving color rivalry also cannot be carried out if crosstalk is present. Moreover, even in subjects who do not report ghosting, the possibility remains that one retinal channel is processing the other eye's intended image via sub-threshold (indiscernible) crosstalk. This issue is problematic for studies examining sub-threshold processes either during BR or related phenomena using dichoptic viewing methods (see section ‘Compatibility With Related Phenomena’).
The mirror stereoscope, prism stereoscope, and dual-input HMD goggle setups avoid this problem as they present different stimuli independently to separate spatial regions of each retina. Because the images are spatially presented entirely separately, monocular channel images cannot leak into each other. Therefore, these methods are crosstalk-free and do not suffer from perceptual artifacts caused by crosstalk, provided that a divider/septum is maintained to prevent one eye seeing a portion of the other eye's presented image. It has been reported that to eliminate ghosting for filter-based methods (e.g., anaglyphs, LCS goggles, PPF, ATS), crosstalk to one eye should be ≤0.2% of the transmitted light of the unintended image (with artificially induced crosstalk using a mirror stereoscope and high-contrast CRT display; Hanazato et al., Reference Hanazato, Okui and Yuyama2000). Interocular crosstalk has also been estimated to be 0.1–0.3% for PPF and >4% for time-multiplexed techniques (Pastoor, Reference Pastoor1995). However, such work has not yet been specifically tested with BR (see Woods, Reference Woods2012, for an overview of crosstalk reduction strategies). It is worth noting though that from our experience, using red/blue anaglyph filters to view BR stimuli on a single-screen interleaved PPF monitor (AOC™ e2352Phz; Figure 5d), there is virtually no perceptible crosstalk.
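For pre-experimental testing, one commonly used way of expressing crosstalk is the leaked luminance as a percentage of the intended luminance, with the display's black level subtracted from both. The sketch below assumes that definition (the text notes that mathematical definitions vary) and uses hypothetical photometer readings; the 0.2% limit is the ghosting threshold quoted above from Hanazato et al.

```python
GHOSTING_LIMIT_PERCENT = 0.2  # threshold quoted in the text (Hanazato et al.)


def crosstalk_percent(leak, signal, black=0.0):
    """Crosstalk as a percentage of the intended signal.

    leak   -- luminance reaching the eye from the unintended channel
    signal -- luminance reaching the same eye from its intended channel
    black  -- luminance with both channels showing black
    All values in the same unit (e.g., cd/m^2). The black-level correction
    is one common convention, assumed here.
    """
    if signal <= black:
        raise ValueError("signal luminance must exceed the black level")
    return 100.0 * (leak - black) / (signal - black)


# Hypothetical readings taken through one lens of the viewing glasses.
measured = crosstalk_percent(leak=0.35, signal=100.0, black=0.05)
ghosting_expected = measured > GHOSTING_LIMIT_PERCENT
```

With these made-up readings the crosstalk is roughly 0.3%, which would exceed the quoted limit.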
For stereoscopic 3D viewing using anaglyphs, crosstalk remains an outstanding issue (for detailed discussion see Woods & Harris, Reference Woods, Harris, Woods, Holliman and Dodgson2010). Imperfect transmission and separation characteristics of filters (e.g., gradual roll-off in transmission), spectral quality of the monitor, as well as image compression and encoding are factors commonly cited as sources of spectral leakage (Woods & Rourke, Reference Woods, Rourke, Woods, Bolas, Merritt and Benton2004; Woods et al., Reference Woods, Yuen and Karvinen2007). These factors lead to slight overlap of the presented stimuli's wavelength bands, causing both images to be viewed by the retina. Such overlap results in ghosting with image elements intended for one eye leaking into the other eye's channel (Woods & Harris, Reference Woods, Harris, Woods, Holliman and Dodgson2010; Zeng & Zeng, Reference Zeng, Zeng, Eschbach, Marcu and Rizzi2011), with varying effect based on the choice of anaglyph model and filter combination (e.g., very low crosstalk with cyan and yellow; Woods & Harris, Reference Woods, Harris, Woods, Holliman and Dodgson2010). Optical leakage may also be due to the quality of the anaglyph image generation matrix (Woods, Reference Woods2010; see also Woods & Rourke, Reference Woods, Rourke, Woods, Bolas, Merritt and Benton2004; Woods et al., Reference Woods, Yuen and Karvinen2007), although it can be significantly mitigated by calibrating the anaglyph system via signal processing methods such as crosstalk cancellation (Sanftmann & Weiskopf, Reference Sanftmann and Weiskopf2011; Woods, Reference Woods2012), and using different algorithms to calculate color values of the image (Dietz, Reference Dietz, Baskurt and Sitnik2012; Dubois, Reference Dubois2001; Woods et al., Reference Woods, Harris, Leggo and Rourke2013). 
Multi-band spectral multiplexing (e.g., Infitec® GmbH) — where the visible light spectrum is divided into two complementary wavelength bands — can also be used as an improved alternative for separating channels (i.e., reducing crosstalk) by about 1:1,000 (Jorke & Fritz, Reference Jorke, Fritz, Woods, Bolas, McDowall, Dodgson and Merritt2006). It is important to note, however, that any solution for resolving crosstalk is system specific, that is, any algorithm needs to be adapted specifically to each particular anaglyph method and model.
While LCS goggles and PPF mostly eliminate ghosting, shutter leakage and angulation of the observer's head are major factors that contribute to crosstalk. With current active LCS technology, light obstruction to one eye does not always involve the shutter being 100% opaque (Woods & Tan, Reference Woods, Tan, Woods, Merritt, Benton and Bolas2002). As a proportion of light (~0.1%) is still transmitted while the shutter is closed, the occluded eye may still see a small percentage of the image not intended for it (Woods, Reference Woods2010). Such crosstalk can increase especially if the presented images have relatively high luminance (i.e., brightness). The 0.1% value is widely accepted but may vary across different models, that is, optical filters (see Pastoor, Reference Pastoor, Schreer, Kauff and Sikora2005).
With LCS setups, another factor is the temporal synchrony between the monitor refresh rate and the LCS goggles. Because precise dichoptic presentation depends on fixed temporal synchrony between the shutters and the monitor refresh rate, disruption to this synchrony can also cause ghosting (Woods, Reference Woods2010; for review of crosstalk in LCS goggles, see Woods & Tan, Reference Woods, Tan, Woods, Merritt, Benton and Bolas2002). This outcome is especially the case if the monitor's refresh rate is low (for a detailed explanation, see Woods, Reference Woods2010), or if the LCS uses an infrared communication protocol (Woods & Helliwell, Reference Woods, Helliwell, Woods, Holliman and Favalora2012).
The type of monitor synchronized with LCS goggles also warrants major consideration (see Cowan, Reference Cowan, Bass, DeCausatis, Enoch, Lakshminarayanan, Li, MacDonald, Mahajan and Van Stryland2010, for a detailed discussion of monitor characteristics for vision research). Conventional use of CRT monitors with LCS goggles can be problematic due to their phosphor decay characteristics (i.e., afterglow; see also Lee et al., Reference Lee, Kim, Soh and Ann2002), which contributes to crosstalk (Peli & Lang, Reference Peli and Lang2001). Because CRT monitors draw images from top to bottom sequentially, the time taken for phosphor decay from its peak luminance — combined with the propagation delay between polarization states of corresponding shutters — can further increase crosstalk (Woods, Reference Woods2010). For LCD monitors, screen specifications such as temporal accuracy of the changes in luminance are critical factors contributing to crosstalk in LCS goggle setups. The rise and decay of an element in an LCD screen is heavily biased, with substantially slower decay times than high-end CRT or ultra-short persistence phosphor CRT monitors (e.g., Vision Research Graphics). As such, it is important to examine an LCD monitor's capabilities and system testing reports (e.g., Wang & Nikolić, Reference Wang and Nikolić2011) before using it in LCS goggle setups for BR presentation. Another consideration is examining whether a display system (e.g., CRT, LCD, digital light processing [DLP]) has the adjustable luminance and contrast range capable of reducing crosstalk. Though typically used for BR stimuli, green phosphor has the longest afterglow and therefore produces the most crosstalk. If ghosting is a perceptible problem in time-sequential methods (e.g., LCS), it has been suggested that the amount of green in the stimuli may need to be reduced (Lipton, Reference Lipton and McAllister1993). 
Angular displacement may also become a problem with any kind of LCD panel as they have a fixed viewing angle beyond which contrast, luminance, and color cannot be guaranteed.
The temporal accuracy with which an LCS panel shifts from opaque to clear can also contribute to crosstalk (Woods, Reference Woods2010). In the worst case, the unintended image of the previous frame is only partially occluded from (i.e., is partially visible to) the eye viewing the intended image of the next frame. Thus, imperfect synchronization between the monitor and time-sequential mechanism of LCS goggles remains a significant contributing factor to crosstalk. However, ghosting can be avoided by using DLP projection (which minimizes the sustained luminance problem of CRT/LCD displays and mechanical shutters; Packer et al., Reference Packer, Diller, Verweij, Lee, Pokorny, Williams and Brainard2001; Woods et al., Reference Woods, Apfelbaum and Peli2010). DLP projectors, which are capable of rapid pixel response times (~2 μs; Hornbeck, Reference Hornbeck1998) due to the Digital Micromirror Device chip™, are thus more suitable for LCS setups (e.g., ‘Galaxy series’ by Barco, ‘Mirage series’ by Christie Digital, ‘DepthQ’ by InFocus/Lightspeed Design; for a list of tested DLP projector models, see Woods & Rourke, Reference Woods, Rourke, Woods, Bolas, McDowall, Dodgson, Merritt and Holliman2007).
With PPF setups, perceptual artifacts caused by crosstalk are largely eliminated (e.g., 0.1–0.3% with linear filters; Pastoor, Reference Pastoor1995; Pastoor & Wöpking, Reference Pastoor and Wöpking1997). Nevertheless, most models primarily rely on linear polarizer filters that are highly dependent on angulation of the observer's head (i.e., angulation of filters in the glasses) in relation to the monitor (i.e., filters on the display system; Peli, Reference Peli1986). As the filters are tilted toward the perpendicular, their ability to absorb and block light of an inconsistent plane of polarization declines, and more so with greater image contrast and monitor luminance (Pastoor, Reference Pastoor1995; Woods, Reference Woods2010, Reference Woods, Woods, Holliman and Dodgson2011). In particular, crosstalk in PPF rises sharply as a function of pitch rotation (y axis), compared with yaw rotation (z axis) angles for similar levels of crosstalk (Andrén et al., Reference Andrén, Wang and Brunnström2012). Head tilt also contributes to crosstalk for LCS goggles because the optical performance of liquid crystal cells drops as head tilt deviates away from the center and toward the perpendicular (Woods, Reference Woods2010). In general, PPF are not perfect in attenuating the light of other polarizations when placed on LCD screens (e.g., transistor panels, but not in-plane switching [IPS] panels), and therefore crosstalk can result (Woods, Reference Woods2010). The extent of this crosstalk is determined primarily by the quality of the filters used on the monitor screens and those worn by the viewer.
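To give a feel for why linear polarizers are sensitive to filter angulation, the sketch below applies Malus's law to an idealized case: perfect linear polarizers, orthogonal polarization for the two eyes' images, and a rotation of the glasses about the line of sight. This is a textbook simplification and an assumption on our part; it does not model the pitch and yaw measurements cited above, nor imperfect filters or display-dependent effects.

```python
import math


def linear_ppf_crosstalk_percent(roll_deg):
    """Idealized crosstalk (%) for crossed linear polarizers when the
    glasses are rotated by `roll_deg` about the line of sight.

    By Malus's law the intended image is transmitted in proportion to
    cos^2(roll) and the unintended image in proportion to sin^2(roll),
    so their ratio is tan^2(roll).
    """
    if not -90.0 < roll_deg < 90.0:
        raise ValueError("roll must lie strictly between -90 and 90 degrees")
    return 100.0 * math.tan(math.radians(roll_deg)) ** 2


# Idealized crosstalk for a few head-roll angles (degrees).
table = {angle: linear_ppf_crosstalk_percent(angle) for angle in (0, 2, 5, 10)}
```

Even in this ideal case a roll of about 5° yields crosstalk of roughly 0.8%, which suggests why a level head position matters for linear PPF.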
The process of filtering light into specific orientations of polarization for dichoptic viewing introduces additional levels of complexity to the experimental protocol. Two types of polarization states (and therefore filters) are commonly used: linear and circular (see Figure 10; for detailed discussion, see Hecht, Reference Hecht2002). However, as both attenuate the light signal (first via the PPF on the monitor and then via the PPF on the glasses), the luminance of the images will be reduced (Choubey et al., Reference Choubey, Jurcoane, Muckli and Sireteanu2009; Stevens, Reference Stevens2004). Another issue is that dual-screen PPF setups rely on a partially silvered mirror, which is neither a perfect reflector nor a perfect transmitter. The coating on either side of the mirror is set for a specific transmission ratio (60/40, 70/30, etc.), so the luminance profile of the two screens needs to be adjusted accordingly to compensate for this difference. Furthermore, incidental to the stimulus presentation, whether or not non-reflective coatings are used on one or both sides of the mirror can affect glare and light spill.
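The luminance compensation for a partially silvered mirror can be sketched with simple arithmetic. The example assumes an idealized mirror that transmits a fraction T and reflects a fraction R of the incident light, and it ignores absorption and the additional attenuation by the polarizing filters; the function name and the target luminance are hypothetical.

```python
def screen_luminances(target_at_eye, transmitted_fraction, reflected_fraction):
    """Luminance each screen must emit so that both images reach the eye
    at `target_at_eye` (same unit as the returned values).

    Assumption: one screen is viewed through the mirror and the other via
    reflection, with no other losses.
    """
    if not (0.0 < transmitted_fraction <= 1.0 and 0.0 < reflected_fraction <= 1.0):
        raise ValueError("fractions must lie in (0, 1]")
    through_screen = target_at_eye / transmitted_fraction
    reflected_screen = target_at_eye / reflected_fraction
    return through_screen, reflected_screen


# Hypothetical 60/40 mirror and a target of 30 cd/m^2 at the eye.
through_screen, reflected_screen = screen_luminances(30.0, 0.60, 0.40)
```

For a 60/40 mirror the reflected screen must therefore be set about 1.5 times brighter than the transmitted one; in practice the final match would be verified with a photometer at the viewing position.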
For linear PPF, as head stabilization is required to avoid artifacts caused by significant tilting, additional procedures need to be incorporated into the experimental protocol (e.g., pre-testing instructions, reminders, and checking by the experimenter). Formal head stabilization such as with a chin rest may also be advantageous, but is not necessary. The viewing angle to the monitor also needs to be considered to minimize crosstalk. Care with these additional experimental procedures particularly applies to studies in pediatric and some psychiatric subject groups (see section ‘Subject Group Considerations’). This potential for artifact, however, can be mitigated by employing circular passive polarization or IPS screens (e.g., LG ‘E2370V-BF’/‘IPS231P-BN’/‘IPS226V-PN’/‘IPS236V-PN’, Mitsubishi ‘RDT233WX’).
Dichoptic viewing through circular PPF is achieved based on the same principle as a linearly polarized setup (see Figure 10a). A different image (light) projected from each monitor passes through circular passive polarizing filters of opposite handedness (left and right, Figure 10b; for detailed discussion, see Bennett, Reference Bennett, Bass, DeCausatis, Enoch, Lakshminarayanan, Li, MacDonald, Mahajan and Stryland2010). The observer views the superimposed or interleaved stimulus through glasses with circular PPF of different handedness, each matching the handedness of the monitor filters. Hence, light that is left-circularly polarized only passes through the polarized filter of left-handedness (i.e., counter-clockwise rotation) and is blocked by the filter of right-handedness (i.e., clockwise rotation), and vice versa. The resultant effect is similar to that of linear PPF, except that the observer can tilt their head further and still maintain dichoptic separation of images. While crosstalk is largely mitigated with circular PPF, a negligible amount of light may still transmit through due to the imperfect extinction properties of the polarized filters. This issue also raises a related concern for crosstalk with selective colors. Because the effectiveness of circular PPF depends on the wavelength of light (i.e., color; Hecht, Reference Hecht2002), it is possible that crosstalk may be more prominent for some colors but not others depending on which wavelength the PPF is optimized for. Monitors utilizing circular polarization technology are commercially available (e.g., Marshall Electronics ‘3D-241-HDSDI’, Zalman ‘ZM-M215W’), however they are still at a relatively early stage of development.
For ATS displays, crosstalk as a function of misalignment of the eyes is problematic (for proposed methods to alleviate such issues, see Konrad & Halle, Reference Konrad and Halle2007; Lipton & Feldman, Reference Lipton, Feldman, Woods, Merrit, Benton and Bolas2002; Lipton et al., Reference Lipton, Starks, Stewart and Meyer1985; Lueder, Reference Lueder2011; Wang et al., Reference Wang, Li, Zhou, Wang and Li2011; van Berkel & Clarke, Reference van Berkel, Clarke, Fisher, Merritt and Bolas1997). Several causes have been attributed to the imperfect separation of each eye's different view. First, because optimal viewing positions — where both images are most separated — are specific and limited, any horizontal deviation in observation angle can cause pixels intended for one eye to partially leak into the other eye (Boev et al., Reference Boev, Gotchev and Egiazarian2007; Boev, Georgiev et al., Reference Boev, Gotchev and Egiazarian2009; Boev, Gotchev et al., Reference Boev, Gotchev and Egiazarian2009). For instance, if a subject is positioned far enough to the left or right of a parallax barrier-type ATS display, they would look through adjacent slits and see subpixels intended for the other eye (Boev, Georgiev et al., Reference Boev, Gotchev and Egiazarian2009; Halle, Reference Halle1997). In addition, part of the light can still pass through the parallax barrier while within the optimal viewing position (Boev, Gotchev et al., Reference Boev, Gotchev and Egiazarian2009) due to the optical quality of the barrier (Woods, Reference Woods2010). This leakage effectively determines the minimum crosstalk level in parallax ATS displays, which is reportedly between 5% and 25% (Boev, Gotchev et al., Reference Boev, Gotchev and Egiazarian2009; Chen et al., Reference Chen, Tu, Liu and Li2008; for list of tested models, see Boev & Gotchev, Reference Boev, Gotchev, Snoek, Akopian, Sebe, Creutzburg and Kennedy2011). 
In such types of ATS displays, crosstalk can also result from alignment error of the lenticular screen during installation (Järvenpää & Salmimaa, Reference Järvenpää and Salmimaa2008; Woods, Reference Woods2010).
Flicker Artifact
With LCS setups, flicker is observed as an abrupt and intermittent (artifactual) visible flashing effect caused by temporal asynchrony between the LCS goggles and the refresh rate of the display. In addition to causing viewer discomfort (discussed further below), flicker can also potentially confound data in measures that are highly sensitive to noise, such as those obtained from electrophysiological and brain-imaging recordings (Brown et al., Reference Brown, Candy and Norcia1999; Spang et al., Reference Spang, Gillam and Fahle2012). While flicker on a display screen is largely imperceptible to the naked eye (Andrews et al., Reference Andrews, White, Binder and Purves1996; Carmel et al., Reference Carmel, Lavie and Rees2006; Kristofferson, Reference Kristofferson1967), it can be apparent when viewed through LCS goggles (Fergason et al., Reference Fergason, Robinson, McLaughlin, Brown, Albileah, Baker, Green, Woods, Bolas, Merritt and McDowall2005).
Because LCS goggles rely on precise temporal synchronization with the corresponding monitor, they are susceptible to introducing flicker. Flicker is most apparent with LCS goggles synchronized to CRT monitors that operate at (low) refresh rates typically less than 120 Hz (Woods & Rourke, Reference Woods, Rourke, Woods, Bolas, McDowall, Dodgson, Merritt and Holliman2007), as each eye is provided with a refresh rate less than 60 Hz. Such frequencies do not exceed the binocular critical flicker frequency (see also Table 1 and Figure 4). Incidentally, the non-continuous illumination nature of CRT displays (Bridgeman, Reference Bridgeman1998) can also make their precise computation of the timing for stimulus presentation difficult, although the determination of display characteristics for CRTs is considerably more straightforward than for LCD monitors. While more recent models largely eliminate this problem (e.g., ‘FE-1’ by Cambridge Research Systems), the vertical refresh rate of most CRT monitors is a source of constant flicker. This inherent flicker artifact of LCS goggles may make them relatively less favorable than other methods for reliable BR viewing. Although CRT monitors are no longer commercially available, some models can still be acquired (e.g., ‘Mitsubishi Diamond Pro 2070SB’ by Cambridge Research Systems). Other displays that are commercially available include LCD monitors that behave and function like CRT displays (i.e., raster scan backlight; for example, ‘VIEWPixx3D’/‘ViewPixx3D Lite’/‘ProPixx’ by VPixx Technologies, Display++ by Cambridge Research Systems), and OLED displays with more precise timing characteristics for LCS goggles (e.g., Sony PVM-254; Ito et al., Reference Ito, Ogawa and Sunga2013; see also Cooper et al., Reference Cooper, Jiang, Vildavski, Farrell and Norcia2013; Wang & Nikolić, Reference Wang and Nikolić2011). 
Alternatively, DLP projection enables all pixels of the screen to update at once and is therefore the more suitable display option for LCS goggle setups (Woods & Rourke, Reference Woods, Rourke, Woods, Bolas, McDowall, Dodgson, Merritt and Holliman2007; see section ‘Flicker Artifact’). In general, it is important to note that a monitor suitable for time-multiplexing will not be universally compatible with every LCS goggle model available (for quantitative performance assessment of LCS goggles, see Hoffmeister, Reference Hoffmeister, Woods, Holliman and Favalora2013).
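The refresh-rate arithmetic behind the flicker problem with time-multiplexed displays can be sketched as follows. Because the goggles alternate the two eyes on successive frames, each eye receives half of the monitor's refresh rate. The 60 Hz per-eye threshold is taken from the figures quoted in this section and is treated here as an assumed working value; actual flicker sensitivity varies with luminance, stimulus, and observer.

```python
PER_EYE_THRESHOLD_HZ = 60.0  # assumed working value based on the figures above


def per_eye_refresh_hz(monitor_hz):
    """Refresh rate delivered to each eye by frame-sequential LCS goggles."""
    if monitor_hz <= 0:
        raise ValueError("monitor refresh rate must be positive")
    return monitor_hz / 2.0


def flicker_likely(monitor_hz, threshold_hz=PER_EYE_THRESHOLD_HZ):
    """True if the per-eye rate falls below the assumed threshold."""
    return per_eye_refresh_hz(monitor_hz) < threshold_hz


# Check a few common monitor refresh rates (Hz).
report = {hz: flicker_likely(hz) for hz in (60, 85, 100, 120, 144)}
```

Under this assumption, monitors running at 60, 85, or 100 Hz would be flagged, whereas 120 Hz and above would not.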
The financial cost involved in purchasing a BR setup is a key consideration, which should be weighed against the specific requirements of the experimental design. Noting that various models differ in price, with fMRI-compatible instruments and high-resolution PPF methods being substantially more expensive, the cost of each setup can be classified as low, mid-range, or high (see Table 1).
Anaglyphs, mirror stereoscope, and prism stereoscope setups are the least expensive of the methods reviewed here. These instruments are relatively easy to set up or improvise with, using low-cost materials ranging from US$1–50 that can be purchased via the Internet or from specialist optical providers and general hardware stores. Professional mirror and prism stereoscopes can also be constructed by purchasing an optical bench and mirror mounts (e.g., AnchorOptics, Data Optics Inc., Edmund Optics). Building a stereoscope from the ground up for BR research can be challenging for investigators without a background in vision science and engineering. For such investigators, complete pre-constructed mirror and prism stereoscopes are commercially available (e.g., US$40–300, Pocket StereoScope/Geoscope/GeoscopePro/ScreenScope by Stereo Aids; PokeScope 3D by PokeScope®). However, expensive equipment is often unnecessary, as vision researchers typically build their own stereoscopes using low-cost materials. Stimuli are easily prepared and displayed on a conventional monitor, which investigators can readily obtain.
LCS and HMD goggles have become considerably less expensive due to their growing popularity in stereovision applications. Because the technology is relatively more advanced than typical anaglyphs and mirror/prism stereoscopes, most models sell for more than US$100, especially if they involve wireless technology (e.g., ‘3DPixx’ by VPixx Technologies Inc.) or eye-tracking components. The cost of different types of monitors for LCS goggle setups also varies considerably (e.g., CRT, LCD, DLP). Likewise, the cost of ATS displays varies widely depending on the model, resolution, and screen size, although a typical system can start from US$400 (e.g., Sharp LL-15D).
PPF glasses typically cost three times more than comparable anaglyphs, while circular PPF glasses also cost more than linear PPF glasses. In addition, compared to all the other types of dichoptic display, the hardware requirement of a dual-screen and half-silvered mirror makes a PPF system considerably more expensive. For instance, the total cost of the True3Di™ 22-inch PPF monitor is ~US$4,000, which is the price for a typical linear PPF system (Thompson et al., Reference Thompson, Farivar, Hansen and Hess2008). However, recent advances in PPF technology mean that cheaper single-screen interleaved PPF monitors are now available (~US$300; e.g., AOC e2352Phz). These systems can achieve dichoptic display for BR without the need of a half-silvered mirror, but with the disadvantage of reduced spatial resolution and configurability of monitor settings (e.g., luminance).
Convenience and Design
With the variety of options for choosing a dichoptic display, the most accessible and straightforward are anaglyphs. For large-scale studies, the anaglyph method is especially suitable because chromatic filter-based glasses are inexpensive and thus can be purchased in bulk. It is therefore ideal for use in online-offsite BR testing via a website as anaglyphs can be feasibly distributed to large cohorts in the order of thousands, even tens of thousands, of subjects. This novel approach to subject testing has significant potential for large-scale clinical and genetic studies (discussed in detail in sections ‘Subject Group Considerations’ and ‘Large-Scale Clinical and Genetic Studies’). It should be noted that with anaglyphs, as different monochromatic input projects to distinct retinal channels, different sets of photoreceptors are activated in each eye, irrespective of how equivalent both images are in other respects. Some testing protocols therefore, depending on the experimental question of interest, may need repeating with filters in reverse positions (see Choubey et al., Reference Choubey, Jurcoane, Muckli and Sireteanu2009).
With mirror and prism stereoscopes, one disadvantage is that they require the observer's head to be stabilized in location and orientation. A chin rest is typically sufficient but requires individual adjustment to accommodate differences in head dimensions. Pre-experimental optical adjustment is also inherent to prism stereoscope and HMD goggle setups to accommodate interpupillary distance, which varies between 52 and 78 mm (Dodgson, Reference Dodgson, Woods, Bolas, Merritt and Benton2004) across individuals and population groups. For the mirror stereoscope, individual adjustment and fixation training may not be necessary if BR stimuli are presented at the fixation depth of the monitor (i.e., non-parallel lines of sight for the two eyes). This arrangement can be achieved by shifting the BR stimuli and fusion cues closer toward the center of the screen, or by adjusting the outer mirrors on a four-mirror stereoscope further outward. When establishing the distance between the mirrors, applying an average interpupillary distance is often sufficient for BR purposes without the need for pre-experimental adjustment. However, for subjects with a considerably wide or narrow interpupillary distance, the mirror stereoscope may need to be fine-tuned by moving the inner mirrors further apart or closer together, respectively. Furthermore, it is important to note that, when using mirror or prism stereoscope assemblies, angular optical position needs to be calibrated to ensure both images fall on corresponding retinal locations. For the prism stereoscope, re-orientation of the lens to adjust for interpupillary distance may subsequently distort the corresponding view of the image. This distortion can be corrected by repositioning the image on the screen, but at the expense of increasing pre-test configuration time.
The disadvantages of individual adjustment and head stabilization are largely avoided with anaglyphs, LCS, PPF, and ATS display setups. While formal head stabilization is not necessary with linear PPF and ATS displays, the viewer is generally required to maintain a level head position to prevent the filters from tilting (see also section ‘Interocular Crosstalk’). During dichoptic viewing, subjects may experience ‘simulator sickness’ (see ‘Subject Group Considerations’); accurate dichoptic presentation is therefore important to minimize such symptoms, which are exacerbated by undesirable perceptual artifacts (e.g., ghosting, flicker).
These limitations can be overcome by using circular PPF (see section ‘Interocular Crosstalk’), HMDs, or ATS displays, which allow for head tilt. While some ATS monitors require the subject's head to be positioned in an optimal observation zone, higher-end models with integrated head-tracking equipment can mechanically shift the filter horizontally to allow for lateral and frontal head movements (e.g., HHI ‘Free2C 3D-Display’; see also Zschau & Reichelt, Reference Zschau, Reichelt, Chen, Cranton and Fihn2012). These setups are also suitable for large-scale studies because pre-test adjustment and calibration are not required to ensure reliable binocular integration for each individual subject (see section ‘Large-Scale Clinical and Genetic Studies’).
For typical LCS goggles and PPF/ATS displays, subjects can also view a single dichoptic display from different observation angles. These setups allow multiple subjects to view BR simultaneously, without sacrificing image quality or increasing the likelihood of binocular artifacts (cf. anaglyphs; although models such as Infitec® GmbH may be suitable). With ATS or PPF displays, multiple subjects can be seated across several observation points (i.e., next to each other). This arrangement will work as long as observers are positioned at optimal observation zones for ATS displays, or are viewing the stimuli within their foveal region for PPF setups. With LCS goggles, numerous goggles can be simultaneously linked to a single receiver via wired or wireless transmission (e.g., infrared, Bluetooth, DLP-Link, radio frequency; e.g., NVIDIA ‘3D Vision®’, Bit Cauldron ‘BC5000’). However, wireless options may introduce response lag that can disrupt accurate temporal synchrony between shutter frequency and the monitor refresh rate (e.g., Razavi et al., Reference Razavi, Fleury and Ghanbari2006). In multiple-observer scenarios, such asynchrony, which impairs precise dichoptic presentation for BR viewing (see also sections ‘Interocular Crosstalk’ and ‘Flicker Artifact’), is further exacerbated by increasing signal congestion and thus transmission error (Tian et al., Reference Tian, Xu and Ansari2005). Nevertheless, simultaneous multi-observer capability with such display types enables a novel approach to investigating individual differences in BR as well as group BR effects, for example, in social psychology contexts (cf. Anderson et al., Reference Anderson, Siegel, Bliss-Moreau and Barrett2011; see also Fox, Reference Fox1964). Indeed, further advances in hand-held ATS displays, given their unique glasses-free and portability advantage (see also section ‘Polarization-Multiplexed Techniques’), will expand the range of novel studies on BR and related phenomena.
For example, the effect of simultaneous postural (vestibular), motor, and proprioceptive manipulations on rivalry can be explored, which would not otherwise be possible with fixed, heavy display devices consisting of multiple working parts (see also section ‘Infrared Camera Tracking’). Furthermore, portable devices equipped with rivalry data collection software (e.g., mobile/tablet application) to record individuals’ behavioral responses (key presses) can also facilitate novel population-level and twin studies along with large-scale clinical and genetic studies (see also sections ‘Subject Group Considerations’ and ‘Large-Scale Clinical and Genetic Studies’).
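As an illustration of how such key-press records might be reduced to the BR parameters discussed in this article, the following minimal Python sketch computes alternation rate and predominance from a list of timestamped percept reports. The `Report` structure, the percept labels, and the convention of counting a switch only between two different exclusive percepts are assumptions made for illustration; published studies differ in how they treat mixed percepts and the onset period.

```python
from dataclasses import dataclass


@dataclass
class Report:
    onset_s: float  # time of the key press, in seconds from trial start
    percept: str    # hypothetical labels: "A", "B", or "mixed"


def summarize(reports, trial_end_s):
    """Return (alternation rate in Hz, predominance per exclusive percept)."""
    if not reports:
        raise ValueError("no reports recorded for this trial")
    durations = {}
    switches = 0
    previous_exclusive = None
    for i, report in enumerate(reports):
        # Each reported state lasts until the next report (or the trial end).
        end = reports[i + 1].onset_s if i + 1 < len(reports) else trial_end_s
        durations[report.percept] = (
            durations.get(report.percept, 0.0) + (end - report.onset_s)
        )
        # Assumption: mixed percepts do not count as alternations themselves.
        if report.percept != "mixed":
            if previous_exclusive is not None and report.percept != previous_exclusive:
                switches += 1
            previous_exclusive = report.percept
    viewing_time = trial_end_s - reports[0].onset_s
    rate_hz = switches / viewing_time if viewing_time > 0 else 0.0
    exclusive_total = sum(d for p, d in durations.items() if p != "mixed")
    predominance = {
        p: d / exclusive_total
        for p, d in durations.items()
        if p != "mixed" and exclusive_total > 0
    }
    return rate_hz, predominance


# Example trial: five reports over 10 s of viewing.
trial = [
    Report(0.0, "A"),
    Report(2.0, "B"),
    Report(4.0, "mixed"),
    Report(5.0, "A"),
    Report(8.0, "B"),
]
rate, predominance = summarize(trial, trial_end_s=10.0)
print(rate, predominance)
```

Predominance is expressed here as a proportion of exclusive (non-mixed) viewing time; expressing it as a proportion of total viewing time instead would only require changing the denominator.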
Compatibility With Related Phenomena
The dichoptic viewing methods reviewed are in theory suitable for delivering any phenomena within the class of binocular illusions. These variants include rapid eye-swap BR, flash suppression, continuous flash suppression (CFS), binocular switch suppression (BSS), and coherence rivalry (Figure 11). This section will briefly describe each phenomenon and discuss key issues relating to dichoptic viewing methods.
During rapid eye-swap BR, a dichoptic image pair is flickered and exchanged at a constant rate between the eyes (e.g., 18 Hz and ~333 ms respectively; Lee & Blake, Reference Lee and Blake1999; Logothetis et al., Reference Logothetis, Leopold and Sheinberg1996; van Boxtel et al., Reference van Boxtel, Knapen, Erkelens and van Ee2008). Despite this rapid physical change in monocular channel input, subjects experience smooth and slow perceptual alternations every few seconds, indistinguishable from conventional BR. As such, the variables of interest in common between both forms of BR include alternation rate and predominance. Other variants of BR are marked by their utility in controlling perceptual suppression of a target stimulus. They include flash suppression, CFS, and BSS, wherein a target image presented at the fovea can be erased from visual awareness (cf. change blindness, visual crowding). By manipulating perceptual suppression in such a predictable manner, they enable the examination of non-conscious visual processing, and thus have been increasingly used, particularly CFS, to examine mechanisms of visual consciousness. Variables of interest are indices of non-conscious visual processing and suppression dynamics, which include the strength and duration of perceptual suppression (Brascamp & Baker, Reference Brascamp, Baker and Miller2013; Sterzer, Reference Sterzer and Miller2013).
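The timing of rapid eye-swap BR can be made concrete as a frame-by-frame schedule. The Python sketch below assumes a 72 Hz monitor (so that one 18 Hz flicker cycle spans exactly four frames), a 50% on/off duty cycle, and the image labels "A" and "B"; none of these choices is taken from the cited studies, and the ~333 ms swap interval is rounded to the nearest whole frame.

```python
from fractions import Fraction


def eye_swap_schedule(n_frames, refresh_hz=72, flicker_hz=18, swap_ms=333):
    """Return a list of (left_image, right_image, visible), one per video frame.

    refresh_hz and flicker_hz are expected to be integers so that frame
    arithmetic stays exact.
    """
    # Round the swap interval to a whole number of frames (24 at 72 Hz).
    frames_per_swap = max(1, round(Fraction(swap_ms, 1000) * refresh_hz))
    # Number of frames in one full on/off flicker cycle (4 at 72 Hz, 18 Hz).
    frames_per_cycle = Fraction(refresh_hz, flicker_hz)
    schedule = []
    for frame in range(n_frames):
        # The two images exchange eyes on every swap interval.
        swapped = (frame // frames_per_swap) % 2 == 1
        left, right = ("B", "A") if swapped else ("A", "B")
        # Assumption: stimuli are shown for the first half of each cycle.
        phase = (Fraction(frame) / frames_per_cycle) % 1
        visible = phase < Fraction(1, 2)
        schedule.append((left, right, visible))
    return schedule


# One full swap cycle at the default settings (48 frames, about 667 ms).
for frame, state in enumerate(eye_swap_schedule(48)):
    print(frame, state)
```

With refresh rates that are not integer multiples of the flicker frequency, the on and off periods will vary in length from cycle to cycle, which is one reason the temporal characteristics of the display need to be checked before using such a protocol.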
In flash suppression, one eye views a blank field while a target stimulus is briefly presented to the other eye (Lansing, Reference Lansing1964; McDougall, Reference McDougall1901; Wolfe, Reference Wolfe1984). This stimulus is removed from visual awareness by the abrupt subsequent onset of a different (masking) stimulus, typically a Mondrian pattern or white noise, presented to the previously unstimulated eye at the location corresponding to the initial target stimulus; the masking stimulus is then perceived until its offset. Transient suppression of the initial stimulus is optimal at ~85 ms stimulus-onset asynchrony, i.e., the amount of time between the onset of initial and subsequent stimuli (Tsuchiya et al., Reference Tsuchiya, Koch, Gilroy and Blake2006).
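Because a display can only change its content on a video frame, a stimulus-onset asynchrony such as ~85 ms has to be approximated by a whole number of frames. The following Python sketch shows this arithmetic; the function name and the example refresh rates are hypothetical and not drawn from the cited studies.

```python
def soa_in_frames(soa_ms, refresh_hz):
    """Return (number of frames, SOA actually achieved in ms)."""
    frame_ms = 1000.0 / refresh_hz
    # Nearest whole number of frames, with at least one frame of asynchrony.
    n_frames = max(1, round(soa_ms / frame_ms))
    return n_frames, n_frames * frame_ms


for hz in (60, 85, 120):
    frames, achieved_ms = soa_in_frames(85, hz)
    print(f"{hz} Hz: {frames} frames, {achieved_ms:.1f} ms")
```

The achievable SOA therefore depends on the refresh rate of the display, and the achieved rather than the nominal value is the one that would be reported.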
A powerful variant of flash suppression is CFS (Tsuchiya & Koch, Reference Tsuchiya and Koch2005). Here, a fixed target stimulus is presented to one eye while a dynamic masking stimulus (e.g., Mondrian pattern or white noise) of higher signal strength is presented to the other eye at the corresponding retinal location. During such dichoptic presentation, the target stimulus is persistently suppressed from visual awareness by the masking stimulus, which rapidly updates at an optimal stimulation frequency (e.g., a stream of white noise or Mondrian patterns; Arnold et al., Reference Arnold, Law and Wallis2008; Tsuchiya et al., Reference Tsuchiya, Koch, Gilroy and Blake2006). This extended suppression of the target image lasts on the scale of minutes (Tsuchiya & Koch, Reference Tsuchiya and Koch2005), which is an order of magnitude larger in duration than with conventional BR, rapid eye-swap BR, and flash suppression. Prolonged and reliable perceptual suppression of target stimuli from visual awareness is advantageous because: (1) it enables examination of task-dependent consequences of extended non-conscious processing; (2) it is well suited to examination of non-conscious processing associated with animated stimuli; and (3) it is compatible with recording techniques where the response time is inherently limited by low temporal resolution (e.g., fMRI).
Another variant of BR that can induce persistent perceptual suppression is BSS (Arnold et al., Reference Arnold, Law and Wallis2008). Like CFS, BSS involves dichoptic presentation of a target stimulus and a masking stimulus of higher signal strength to corresponding retinal locations of the eyes. A key difference with BSS is that the presented masking stimulus is static (cf. dynamic mask in CFS). In addition, both stimuli are continually exchanged between the eyes at a constant rate (like rapid eye-swap BR), thereby resulting in the target stimulus being erased from visual awareness. With BSS, swapping the stimuli at a specific rate of 1 Hz has been shown to induce greater suppression strength and longer perceptual suppression of a target stimulus compared to CFS (Arnold et al., Reference Arnold, Law and Wallis2008).
Another variant of conventional BR that involves constant rather than flickering stimulus presentation is coherence rivalry. Here, dichoptic viewing of two complementary half-field images induces alternations between these two presented images and two perceptually regrouped coherent images (Díaz-Caneja, Reference Díaz-Caneja1928; Kovács et al., Reference Kovács, Papathomas, Yang and Feher1996; Ngo et al., Reference Ngo, Miller, Liu and Pettigrew2000, Reference Ngo, Liu, Tilley, Pettigrew and Miller2007). Therefore, coherence rivalry entails alternations among four different stable images rather than two as in conventional BR. Other forms of visual rivalry that are viewed normally (i.e., dioptically) without requiring dichoptic presentation to induce perceptual alternations include ambiguous figures (e.g., Necker cube, Rubin's vase), monocular rivalry, plaid motion rivalry, structure-from-motion (SFM) bistable rotating sphere/cylinder, bistable rotating trapezoid, bistable dot quartet motion, spinning wheel illusion, motion-induced blindness, and spinning dancer illusion.
Returning to flash suppression, CFS, and BSS, their suppression efficacy depends on the signal strength difference between the presented stimulus pair (see section ‘Overview of Binocular Rivalry Research’). As such, a target stimulus with greater signal strength (i.e., more salient) than the complementary stimulus is less likely to be suppressed from visual awareness. Therefore, it is critical to ensure the target stimulus is completely suppressed, as any potential unintended conscious experience of the target stimulus will confound the results. One situation where this could occur is if a monocular view can perceive both the target stimulus and masking stimulus. Crosstalk is therefore an important criterion when evaluating dichoptic viewing methods to present binocular illusions for the purpose of examining non-conscious processing. To date, the level of crosstalk across different display methods and its effect on CFS and BSS suppression dynamics has not yet been investigated (see Troiani & Schultz, Reference Troiani and Schultz2013). Crosstalk is also a particular concern in studies that record highly sensitive measures (e.g., galvanic skin response) and where there is no established testing protocol to verify image suppression during each viewing trial. In such cases, crosstalk-free options such as mirror/prism stereoscopes and HMDs would be more suitable. Finally, variants of classical BR that involve intermittent stimulus presentation such as those already mentioned, as well as studies of perceptual stabilization (see Pearson & Brascamp, Reference Pearson and Brascamp2008; Pitts & Britz, Reference Pitts and Britz2011), may also be affected by phosphor decay rate and flicker artifacts. The display model used would therefore need to be examined to ensure it has the suitable temporal characteristics for the study parameters of interest.
Compatibility With Other Recording Techniques
In addition to the basic considerations discussed thus far for BR studies, experimental designs will also vary according to the suitability of each dichoptic presentation method with equipment for recording other variables of interest. This section discusses the compatibility of each dichoptic display method with neuroimaging and ocular recording equipment (e.g., eye tracking, electrooculography), and offers possible solutions where incompatibility may arise.
A critical factor in brain-activity recording experiments involving visual psychophysics is reliable and accurate stimulus presentation. For instance, several fMRI studies of BR have used the anaglyphic method (e.g., Amting et al., Reference Amting, Greening and Mitchell2010; Brouwer et al., Reference Brouwer, van Ee and Schwarzbach2005; de Jong et al., Reference de Jong, Kourtzi and van Ee2012; Fang & He, Reference Fang and He2005; Haynes & Rees, Reference Haynes and Rees2005; Haynes et al., Reference Haynes, Deichmann and Rees2005; Hsieh et al., Reference Hsieh, Colas and Kanwisher2012; Lee & Blake, Reference Lee and Blake2002; Lerner et al., Reference Lerner, Singer, Gonen, Weintraub, Cohen, Rubin and Hendler2012; Lumer et al., Reference Lumer, Friston and Rees1998; Lumer & Rees, Reference Lumer and Rees1999; Meng et al., Reference Meng, Remus and Tong2005; Moutoussis et al., Reference Moutoussis, Keliris, Kourtzi and Logothetis2005; Pasley et al., Reference Pasley, Mayes and Schultz2004; Shimono & Niki, Reference Shimono and Niki2013; Stephan et al., Reference Stephan, Kasper, Harrison, Daunizeau, den Ouden, Breakspear and Friston2008; Tong & Engel, Reference Tong and Engel2001; Tong et al., Reference Tong, Nakayama, Vaughan and Kanwisher1998; Troiani et al., Reference Troiani, Price and Schultz, in press; Wunderlich et al., Reference Wunderlich, Schneider and Kastner2005). Whether crosstalk influences the recorded data in such studies remains unknown, given that information on anaglyphic crosstalk artifacts comes predominantly from research on stereoscopic 3D display technology rather than BR. However, as mentioned above, in our experience of using red/blue anaglyph filters to view BR stimuli on a single-screen interleaved PPF display (Figure 5d), there is virtually no perceptible crosstalk.
In other fMRI studies of BR, dichoptic display methods used include mirror stereoscopes (Lee et al., Reference Lee, Blake and Heeger2005, Reference Lee, Blake and Heeger2007; Moradi & Heeger, Reference Moradi and Heeger2009; Sterzer & Rees, Reference Sterzer and Rees2008), a combined mirror and lens-based setup (Polonsky et al., Reference Polonsky, Blake, Braun and Heeger2000), HMDs (Wilcke et al., Reference Wilcke, O'Shea and Watts2009) and PPF displays (Buckthought & Mendola, Reference Buckthought and Mendola2011; Buckthought et al., Reference Buckthought, Jessula and Mendola2011; Hesselmann & Malach, Reference Hesselmann and Malach2011; Hesselmann et al., Reference Hesselmann, Hebart and Malach2011). For CFS and related phenomena, fMRI studies have used anaglyphs (Bahrami et al., Reference Bahrami, Lavie and Rees2007; Jiang & He, Reference Jiang and He2006; Tse et al., Reference Tse, Martinez-Conde, Schlegel and Macknik2005; Vizueta et al., Reference Vizueta, Patrick, Jiang, Thomas and He2012; Watanabe et al., Reference Watanabe, Cheng, Murayama, Ueno, Asamizuya, Tanaka and Logothetis2011; Williams et al., Reference Williams, Morris, McGlone, Abbott and Mattingley2004), the mirror stereoscope (Sterzer et al., Reference Sterzer, Haynes and Rees2008), prism stereoscopes (Schurger et al., Reference Schurger, Pereira, Treisman and Cohen2010; Yuval-Greenberg & Heeger, Reference Yuval-Greenberg and Heeger2013) and HMDs (Troiani & Schultz, Reference Troiani and Schultz2013).
MEG studies of BR have also employed anaglyphs (Srinivasan & Petrovic, Reference Srinivasan and Petrovic2006; Srinivasan et al., Reference Srinivasan, Russell, Edelman and Tononi1999; Tononi et al., Reference Tononi, Srinivasan, Russell and Edelman1998), mirror stereoscopes (Kobayashi et al., Reference Kobayashi, Akamatsu and Natsukawa2000; Sandberg et al., Reference Sandberg, Bahrami, Kanai, Barnes, Overgaard and Rees2013), a prism stereoscope (Vanni et al., Reference Vanni, Portin, Virsu and Hari1999), a customized lens-based setup (Kamphuisen et al., Reference Kamphuisen, Bauer and van Ee2008), a hybrid LCS/PPF setup (Shinozaki & Takeda, Reference Shinozaki and Takeda2004, Reference Shinozaki and Takeda2008) and PPF displays (Cosmelli et al., Reference Cosmelli, David, Lachaux, Martinerie, Garnero, Renault and Varela2004; David et al., Reference David, Cosmelli and Friston2004; Rudrauf et al., Reference Rudrauf, Douiri, Kovach, Lachaux, Cosmelli, Chavez and Le Van Quyen2006), while MEG studies of CFS have thus far used anaglyphs (Sakuraba et al., Reference Sakuraba, Kobayashi, Sakai and Yokosawa2013) and mirror stereoscopes (Sterzer et al., Reference Sterzer, Jalkanen and Rees2009; Suzuki et al., in press).
In experiments that involve brain-activity recordings, compatibility of BR setups with environments of strong magnetic interference (e.g., fMRI) and environments highly sensitive to electromagnetic noise (e.g., MEG) is critical. Anaglyphs, mirror and prism stereoscopes, and PPF glasses are completely compatible in such studies because they can be made of non-metallic material such as plastic. Generally, however, it is critical to check for potential metal or ferromagnetic components in the framework and assembly of non-paper-based glasses/goggles for fMRI and MEG studies of BR and related phenomena.
Mirror stereoscopes, like all foreign material, need to be non-ferromagnetic to be used in the magnetically shielded environment during fMRI and MEG recording (Mackert et al., Reference Mackert, Wübbeler, Leistner, Trahms and Curio2001; Mäkelä et al., Reference Mäkelä, Forss, Jääskeläinen, Kirveskari, Korvenoja and Paetau2006; Thompson et al., Reference Thompson, Farivar, Hansen and Hess2008). While mirrors are typically included in fMRI head coils, modifications to these, while not trivial, are also not particularly onerous. One setup option is the divided bore that involves relatively simple modifications to the bore (e.g., a hanging divider) and the head coil mirror (e.g., a plastic or similar divider; Thompson et al., Reference Thompson, Farivar, Hansen and Hess2008).
The projection system on which stimuli are displayed also demands significant attention. Despite the sensitive environment of the fMRI and MEG theatre, researchers have a number of options to present dichoptic stimuli (see also Table 1). One option is to project stimuli onto a flat, partially opaque rear-projection plastic screen (e.g., lenticular pitch) placed at the end of the scanner bore or at the head coil. An important consideration with this method is that due to the non-uniform and non-linear function of the screen, distortions in luminance and contrast characteristics can exist across the screen (Kwak & MacDonald, Reference Kwak and MacDonald2000; Majumder & Stevens, Reference Majumder and Stevens2004). These artifacts, such as over-proportionally bright regions or dark regions not being dark enough, reportedly lead to errors in brain-activation recordings (Strasburger et al., Reference Strasburger, Wüstenberg and Jäncke2002). Disproportionate luminance and contrast on the screen can also contribute to greater crosstalk for anaglyphs (see section ‘Interocular Crosstalk’). However, this artifact can be addressed with position-dependent gamma correction and luminance homogenization (Strasburger et al., Reference Strasburger, Wüstenberg and Jäncke2002), although the process can be time consuming (~6 hours; Choubey et al., Reference Choubey, Jurcoane, Muckli and Sireteanu2009). Alternatively, there are commercially available screens that do not suffer from non-linear effects and enable full luminance adjustable display (e.g., Avotec Inc. ‘Silent Vision 7021 or 6011’, PST Inc. ‘Hyperion™’). Another, albeit more costly, solution is to use commercially available non-ferromagnetic monitors (e.g., Cambridge Research Systems Ltd. ‘BOLDscreen’ for mirror/prism stereoscopes and anaglyphs; Cambridge Research Systems Ltd. ‘BOLDscreen 3D’ for PPF setups at ~US$25,000).
Typical LCS, HMD, and ATS display setups — due to electronic circuitry within these instruments — are incompatible with sensitive and high-field environments (e.g., MEG, fMRI; Thompson et al., Reference Thompson, Farivar, Hansen and Hess2008). Although fMRI- and MEG-compatible LCS goggles and ATS monitors have not yet been developed, HMD systems specialized for fMRI environments are commercially available (e.g., Avotec Inc. ‘SV-7021’ for >US$40,000 — used in Wilcke et al., Reference Wilcke, O'Shea and Watts2009; Resonance Technology Inc. ‘VisuaStim Digital’ or ‘CinemaVision’; NordicNeuroLab ‘VisualSystem’).
For EEG-based studies of BR and related phenomena, anaglyphs, mirror and prism stereoscopes, PPF setups and ATS displays are all fully compatible. LCS goggles though could present a problem in that the rapid switching within the LCD panels of the goggles may induce a small but noticeable current in the EEG electrodes and leads that lie close to the goggles (e.g., Fp1, Fp2, Fz). In addition, physical contact between the goggles and electrodes may introduce movement artifacts with subject movement. These potential artifacts would therefore require examination prior to designing formal data collection protocols. Similarly, for an EEG setup with HMD goggles, pre-experimental testing for possible electromagnetic interference and undue pressure and/or movement of the electrodes is also warranted (see Kramberger & Kirbiš, Reference Kramberger and Kirbiš2011, for a prototype combined HMD/EEG design for studying BR). It is beyond the current paper's scope however to systematically review the dichoptic display methods used across the extensive literature on EEG and evoked potential studies of rivalry (for overview, see Kornmeier & Bach, Reference Kornmeier and Bach2012; Pitts & Britz, Reference Pitts and Britz2011; Railo et al., Reference Railo, Koivisto and Revonsuo2011; Regan, Reference Regan2009; Thomson & Fitzgerald, Reference Thomson, Fitzgerald and Miller2013).
In regard to concurrent imaging techniques such as simultaneous EEG-fMRI, TMS-fMRI, TMS-EEG, EEG-MEG, and EEG-fMRI-TMS (e.g., Daskalakis et al., Reference Daskalakis, Farzan, Radhu and Fitzgerald2012; Reithler et al., Reference Reithler, Peters and Sack2011; Sandrini et al., Reference Sandrini, Umiltà and Rusconi2011; Siebner et al., Reference Siebner, Bergmann, Bestmann, Massimini, Johansen-Berg, Mochizuki and Rossini2009; Taylor & Thut, Reference Taylor and Thut2012), catering to the imaging technique or modality that has the stricter requirements will help to ensure that the dichoptic viewing method used is cross-compatible. For example, as fMRI has greater technical constraints, addressing its compatibility with dichoptic display methods will generally also be sufficient for EEG requirements in order to conduct an EEG-fMRI study of BR (e.g., Kobayashi et al., Reference Kobayashi, Akamatsu and Natsukawa2013). In the case of TMS, restrictive space inside the scanner bore can limit the orientation and placement of the TMS coil and thus the range of stimulation sites for TMS-fMRI investigation of BR. Even with the use of MRI-compatible non-ferromagnetic goggles, the transient (dynamic) pulses generated by TMS can cause significant electromagnetic disruption and result in potentially large artifacts being observed.
Eye-Movement Recording and Optokinetic Nystagmus
Saccadic and smooth pursuit eye movements are further variables of interest in BR experiments. The response parameters that are commonly collected (e.g., alternation rate, predominance, mixed percepts, onset state) in studies of BR have typically relied on the voluntary behavioral response of the observer (e.g., verbalization, button press). While such response methods may be adequate when data are acquired from trained and capable subjects, one problem with relying on subjective response methods to assess visual perception is that responses can be influenced by observer confidence, response bias, and level of cooperation (Hannula et al., Reference Hannula, Simons and Cohen2005). For instance, as observers are often under-confident about their perceptual experiences (Björkman et al., Reference Björkman, Juslin and Winman1993), they may treat uncertainty as a lack of perceptual change. For example, during viewing of a presented target image under CFS (see section ‘Compatibility With Related Phenomena’), a fragment of the image may intermittently reach the detection threshold but not be recognized by the observer. The observer, unsure whether a genuine perceptual change had occurred and according to their own individual subjective criteria, may regard this ambiguity as null awareness of the target image. Hence, the observer may (erroneously) not report awareness of the presented image that was genuinely detected. Likewise, the reliance on voluntary behavioral responses as an indicator of perceptual state may not be suitable for subjects who do not reliably adhere to response instructions or have impaired neuromuscular or cognitive function (e.g., certain neurological diseases or disabilities). These issues can affect the reliability and accuracy of BR data collection, especially with pediatric and elderly populations, or, as mentioned, in subjects with various clinical disorders.
Clearly, it would be desirable if an objective indicator of perceptual experience could be adopted. Such a measure would facilitate BR research in providing an assay for the validity of subjective indicators. Optokinetic nystagmus (OKN) during BR with orthogonally drifting stimuli has been shown to provide an objective indicator of the temporal course of perceptual alternations in humans (e.g., Enoksson, Reference Enoksson1963, Reference Enoksson1968; Enoksson & Mortensen, Reference Enoksson and Mortensen1968; Fox et al., Reference Fox, Todd and Bettinger1975; Hayashi & Tanifuji, Reference Hayashi and Tanifuji2012; Hugrass & Crewther, Reference Hugrass and Crewther2012; Leopold et al., Reference Leopold, Fitzgibbons and Logothetis1995; Sun et al., Reference Sun, Tong, Yang, Tian and Hung2002, Reference Sun, Tong, Yang and Zhao2004), monkeys (Logothetis & Schall, Reference Logothetis and Schall1990), and cats (Fries et al., Reference Fries, Roelfsema, Engel, König and Singer1997). In tethered flies, alternating perceptual states in response to dichoptically presented conflicting images can also be tracked objectively with a torque meter and even direct electrophysiological recordings (Heisenberg & Wolf, Reference Heisenberg and Wolf1984; Tang & Juusola, Reference Tang and Juusola2010; reviewed in Miller et al., Reference Miller, Ngo and van Swinderen2012).
First reported by Enoksson (Reference Enoksson1963), dichoptic stimulation using gratings that drift in opposite directions, such as vertical contours moving left to right and right to left (Figure 12a), induces an involuntary oculomotor response whereby conjugate eye movements correlate with the percept's drift direction (i.e., the dominant image; Fox et al., Reference Fox, Todd and Bettinger1975; Logothetis & Schall, Reference Logothetis and Schall1990). For instance, if the percept is a vertical grating drifting from left to right, movement of both eyes will mimic the percept's direction of movement (Figure 12b). Conjugate eye movements during OKN can be recorded and quantified objectively using infrared (IR) camera eye-tracking and electrooculography (EOG). While IR eye-tracking has less noise than EOG, this advantage needs to be weighed against any potential response decrement, lower temporal resolution, and increased cost of the device relative to EOG (Abel et al., Reference Abel, Wall, Troost and Black1980). In comparison, EOG has superior temporal resolution to video-based (i.e., IR) methods, and is therefore more sensitive and thus best used to quantify eye movements that correspond to perceptual alternations during BR with drifting gratings. However, this high sensitivity means that EOG is susceptible to noise caused by muscle artifact (e.g., blinking, rolling of the eyes; Anderer et al., Reference Anderer, Roberts, Schlögl, Gruber, Klösch, Herrmann and Saletu1999). EOG is also a relatively less convenient setup for accurate measurement of eye movements and is far more time-consuming than video-based setups, although both methods require calibration.
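To illustrate how a recorded eye-position trace might be turned into an objective percept label, the following Python sketch classifies the direction of the OKN slow phase within a short analysis window. It is a simplified illustration rather than the analysis used in the cited studies: the velocity threshold for discarding fast phases, the sign convention (positive values meaning rightward), and the function name are all assumptions.

```python
from statistics import median


def okn_direction(position_deg, sample_hz, fast_phase_thresh_deg_s=50.0):
    """Classify slow-phase direction from horizontal eye position (degrees)."""
    # Sample-to-sample velocity in degrees per second.
    velocities = [
        (later - earlier) * sample_hz
        for earlier, later in zip(position_deg, position_deg[1:])
    ]
    # Discard fast phases (resetting saccades), which exceed the threshold.
    slow = [v for v in velocities if abs(v) < fast_phase_thresh_deg_s]
    if not slow:
        return "undetermined"
    slow_velocity = median(slow)
    # Assumption: positive velocity corresponds to rightward eye movement.
    if slow_velocity > 0:
        return "rightward"
    if slow_velocity < 0:
        return "leftward"
    return "undetermined"


# Synthetic sawtooth: 5 deg/s rightward drift with a reset every 0.5 s,
# sampled at 100 Hz for 2 s.
trace = [0.05 * (i % 50) for i in range(200)]
print(okn_direction(trace, sample_hz=100))
```

Applying such a classifier to successive windows yields a time course of slow-phase direction that can be compared with the observer's button-press reports; real recordings would additionally need filtering and handling of blinks.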
Experimental designs in BR research have recorded eye movements either as a separate task (e.g., Hancock et al., Reference Hancock, Gareze, Findlay and Andrews2012) or simultaneously during BR viewing (e.g., Sabrin & Kertesz, Reference Sabrin and Kertesz1983; van Dam & van Ee, Reference van Dam and van Ee2006a, Reference van Dam and van Ee2006b). All the dichoptic viewing methods outlined in this article for BR presentation are compatible with IR camera and EOG eye-movement recording as separate tasks. However, concurrent eye-movement recording and BR viewing is more restricted, especially with LCS setups (see below).
Finally, it is interesting to note the recent demonstration of a novel HMD setup in which subjects could actively control the oculomotor activity of their left and right eye independently (Mizuno et al., Reference Mizuno, Hayasaka, Yamaguchi and Jobbágy2012). With two independently controlled CCD (charge-coupled device) cameras connected, this ‘Virtual Chameleon’ device enabled two independent (dichoptic) monocular views to be shown on the HMD, thus inducing BR. Such non-conjugate (unyoked) eye movements in humans are analogous to the independent alternating eye movements that have been observed in the chameleon and sandlance (Pettigrew et al., Reference Pettigrew, Collin and Ott1999). Indeed, this observation in the sandlance formed part of the basis for proposing a novel neural model of BR (see Ngo et al., Reference Ngo, Barsdell, Law, Miller and Miller2013). Employing the aforementioned HMD setup in further investigations of BR (e.g., individual variation in alternation rate) may thus help to provide additional mechanistic clues to the phenomenon (see also Miller et al., Reference Miller, Ngo and van Swinderen2012, for discussion of evolutionary context).
Infrared camera tracking
IR eye-tracking is compatible with all dichoptic presentation methods outlined in the current article for simultaneous recording of eye movements during BR viewing (e.g., Mirametrix Inc. ‘S2 Eye Tracker’; Carmel, Arcaro et al., Reference Carmel, Arcaro, Kastner and Hasson2010; Schurger, Reference Schurger2009). IR eye-tracking is also known as active light eye-tracking (cf. passive/visible light eye-tracking; for a comprehensive review of camera-based methods, see Hansen & Ji, Reference Hansen and Ji2010). For anaglyphs, eye movements can be recorded by allowing IR light to transmit through each filter (see Wismeijer & Erkelens, Reference Wismeijer and Erkelens2009; using red-green anaglyphs). With mirror stereoscope setups, the eye-tracking device is placed near the periphery such that the IR beam is not blocked by the mirrors. An alternative workaround is to use ‘cold mirrors’ that allow transmission of IR light. This arrangement enables the eye-tracking camera/device to be placed in front of the viewer between the mirrors and monitor, in order to measure the point of gaze while light in the visible spectrum is reflected. Likewise with prism stereoscope and LCS goggle setups, the eye-tracking equipment should be mounted above or below the lenses and goggles so that it is not detecting the point of gaze through the lenses and filters. If eye-tracking equipment cannot be set up this way with LCS goggles, eye movements cannot be detected through the shutters due to their temporal multiplexing constraints for BR viewing. However, eye-tracking modules compatible with LCS goggles for simultaneous stimulus viewing are available (e.g., ‘JAZZ-novo’ by Ober Consulting).
For prism stereoscopes, eye-tracker calibration is required to account for distortion of the image by the prism lenses (Carmel, Arcaro et al., Reference Carmel, Arcaro, Kastner and Hasson2010). More generally, eye-tracker calibration should be undertaken for all setups where the eye-tracking camera is not viewing the eye directly (i.e., where it views the eye through mirrors or filters). For PPF setups, eye-gaze detection through the polarizing filter via IR has been reported even with rotation of the filter, showing that PPF does not impede IR eye-tracking (e.g., Cambridge Research Systems ‘Eyetracker and Eyetracker Toolbox’ using Edmund Optics polarizers; Thompson et al., Reference Thompson, Farivar, Hansen and Hess2008). However, eye-tracking equipment warrants testing with a PPF over the tracked eye to verify its compatibility with a PPF setup before developing the concurrent eye-movement recording protocol.
HMD models that integrate dichoptic image presentation and eye-movement recording are commercially available (e.g., NVIS ‘xVisor MH60’; for prototypes see Beach et al., Reference Beach, Cohen, Braun and Moody1998; Curatu et al., Reference Curatu, Hua, Rolland, Sasian, Koshel and Juergens2005; David et al., Reference David, Apter, Thirer, Baal-Zedaka, Efron, Derenial, Longshore, Sood, Hartke and LeVan2009; see also review by Hua, Reference Hua, Woods, Bolas, Bolas, Merritt and Benton2001), and are an area of ongoing development (e.g., Hua & Gao, Reference Hua, Gao, Woods, Holliman and Favalora2012; Pansing et al., Reference Pansing, Hua, Rolland, Sasian, Koshel and Juergens2005). Other such models are also available, as is customized integration of eye-tracking modules with typical standalone HMD setups (see Arrington Research, Inc.). Such integrated setups could potentially be further developed with software functions to enable combined stimulus presentation and rivalry data recording. For instance, dichoptic images presented via the HMD screens could induce OKN, which would be detected by the eye-tracking device and automatically analyzed to provide an objective measure of BR parameters. This hands-free capability of BR data recording opens up further multimodal research avenues such as experimental conditions involving rivalry viewing with simultaneous upper limb, sensorimotor, and proprioceptive manipulations (e.g., Jalavisto, Reference Jalavisto1965; Lunghi et al., Reference Lunghi, Binda and Morrone2010; Maruya et al., Reference Maruya, Yang and Blake2007; Salomon et al., Reference Salomon, Lim, Herbelin, Hesselmann and Blanke2013; van Ee et al., Reference van Ee, van Boxtel, Parker and Alais2009).
Another potential avenue for combining OKN recording with BR presentation would involve using anaglyphs to view orthogonally drifting gratings on, for example, a conventional domestic monitor, with the webcam sitting on top used as the eye-tracking device (e.g., Lee et al., Reference Lee, Kwon and Kim2013). The stimulus presentation, test procedure, and accompanying OKN-tracking software for the webcam could be made available via a website to enable large-scale rivalry data collection (see also section ‘Large-Scale Clinical and Genetic Studies’).
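As a rough illustration of how OKN traces might be converted into rivalry data, the following Python sketch labels successive time windows by the direction of slow-phase eye velocity. It assumes gaze position in degrees sampled at a fixed rate, one grating (‘A’) drifting rightward and the other (‘B’) drifting upward, and hypothetical values for the window length and velocity thresholds. The function names and parameters are illustrative only and are not taken from any of the studies or products cited above.

```python
from itertools import groupby
from statistics import median


def classify_okn_windows(x_deg, y_deg, fs_hz, window_s=0.5,
                         saccade_thresh=30.0, min_speed=1.0):
    """Label each window 'A' (rightward pursuit), 'B' (upward pursuit) or 'mixed'.

    Assumption: stimulus A drifts rightward (+x) and stimulus B upward (+y).
    Thresholds are in deg/s and are placeholder values, not validated settings.
    """
    # Sample-to-sample velocity in deg/s.
    vx = [(x_deg[i + 1] - x_deg[i]) * fs_hz for i in range(len(x_deg) - 1)]
    vy = [(y_deg[i + 1] - y_deg[i]) * fs_hz for i in range(len(y_deg) - 1)]
    n = max(1, round(window_s * fs_hz))
    labels = []
    for start in range(0, len(vx) - n + 1, n):
        # Discard fast phases (resetting saccades); keep slow-phase samples only.
        slow = [(a, b) for a, b in zip(vx[start:start + n], vy[start:start + n])
                if abs(a) < saccade_thresh and abs(b) < saccade_thresh]
        if not slow:
            labels.append("mixed")
            continue
        mx = median(a for a, _ in slow)
        my = median(b for _, b in slow)
        if mx >= min_speed and mx > abs(my):
            labels.append("A")
        elif my >= min_speed and my > abs(mx):
            labels.append("B")
        else:
            labels.append("mixed")
    return labels


def dominance_periods(labels, window_s):
    """Merge runs of identical labels into (label, duration in seconds) pairs."""
    return [(label, len(list(run)) * window_s) for label, run in groupby(labels)]
```

Merged dominance periods of this kind would give alternation rate and mean dominance duration without any button press. A real recording would additionally require calibration, blink handling, and validation against subjective reports.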
Finally, along with such studies of BR, other areas for collaborative research with 3D display technology developers that warrant consideration include the following: (1) further characterization of the relationship between stereopsis and BR (e.g., for overview, see Blake & Wilson, Reference Blake and Wilson2011; Howard & Rogers, Reference Howard and Rogers2012) and applying computational modeling of their co-occurrence (e.g., Bruce & Tsotsos, Reference Bruce, Tsotsos, Pomplun and Suzuki2013; Hayashi et al., Reference Hayashi, Maeda, Shimojo and Tachi2004) for the purpose of eliminating BR during stereoscopic 3D viewing (see also section ‘Application of Technology: Research and Development for Stereopsis and BR’); (2) using different dichoptic display types to characterize individual variation in, and human factors involved in, interocular stimulus parameter differences that disrupt binocular fusion and induce BR (see also section ‘Application of Technology: Research and Development for Stereopsis and BR’), given that this perceptual transition between stereopsis and BR (e.g., Buckthought et al., Reference Buckthought, Kim and Wilson2008; see also Buckthought & Mendola, Reference Buckthought, Mendola, Molotchnikoff and Rouat2012) has yet to be capitalized upon in electrophysiological and brain-imaging studies to help identify the neural locus of ambiguity resolution during rivalry, either in humans or other species (e.g., cats; see Sengpiel, Reference Sengpiel and Miller2013); and (3) as with HMDs (above), ATS models with eye-tracking equipment (e.g., HHI ‘Free2C 3D-Display’) could potentially be further developed with software functions to present BR stimuli that induce OKN, which are detected by the eye-tracking device for analysis to provide BR data output.
The hands-free capability of such a system similarly enables examination of simultaneous postural (vestibular), motor, and proprioceptive manipulations on rivalry perception and dynamics, while the glasses-free feature of ATS displays offers an additional advantage, for example, of employing concurrent brain stimulation protocols (TMS, tDCS/tACS/tRNS/GVS; see section ‘Compatibility With Brain Stimulation Techniques’). Moreover, such a system that combines BR stimuli presentation, objective BR response recording, and BR data analysis would be advantageous in small- and large-scale studies involving pediatric and clinical groups (see also sections ‘Subject Group Considerations’ and ‘Large-Scale Clinical and Genetic Studies’).
EOG can be fully implemented with all the dichoptic display setups outlined above for simultaneous, conjugate eye-movement recording during BR viewing. With LCS goggle and HMD setups though, minor noise may be induced by slight but noticeable current to the electrodes and leads close to the goggles from the rapid switching within each LCS. However, because electric current flows to each LCS or through the HMD periodically (i.e., predictably), techniques may be applied to reduce potential noise. Specialized shielded electrodes may also be used, but they nonetheless warrant testing to verify their efficacy in removing current-induced artifacts from LCS goggles and HMDs. Requiring minimal tactile contact, Ag/AgCl (i.e., silver/silver chloride) sintered skin electrodes are placed at the lateral canthus of both eyes to obtain a separate recording from each eye (Figure 13). With rotation of an eye in the orbit toward the nearest electrode, the result is a positive-going charge from the resting potential (i.e., when the eye is at center position) in that electrode. Alternatively, pupil rotation away from the external canthus of an eye results in a negative-trending charge in the electrode nearest to that external canthus. The potential difference in charge is detected by electrodes in the nearest vicinity and measured with a DC or AC amplifier (Joyce et al., Reference Joyce, Gorodnitsky, Teder-sälejärvi, King and Kutas2002, Reference Joyce, Gorodnitsky and Kutas2004; Young & Sheena, Reference Young and Sheena1975).
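Because the switching artifact is periodic, one simple noise-reduction approach (offered here only as a sketch, not as a technique reported in the studies cited) is to estimate the artifact waveform by averaging the EOG signal at each phase of the switching cycle and then subtracting that template. The Python example below assumes the artifact period is known and equals a whole number of samples; the function name is hypothetical.

```python
def subtract_periodic_template(signal, period_samples):
    """Remove a repeating artifact of known period from a sampled EOG trace.

    Assumptions: the artifact repeats exactly every `period_samples` samples,
    and the eye-movement signal is not phase-locked to that cycle.
    """
    if period_samples <= 0 or len(signal) < period_samples:
        raise ValueError("need a positive period no longer than the signal")
    sums = [0.0] * period_samples
    counts = [0] * period_samples
    for i, value in enumerate(signal):
        sums[i % period_samples] += value
        counts[i % period_samples] += 1
    # Mean at each phase of the cycle, minus the overall mean so that the
    # DC level of the EOG trace is preserved.
    grand_mean = sum(signal) / len(signal)
    template = [s / c - grand_mean for s, c in zip(sums, counts)]
    return [value - template[i % period_samples]
            for i, value in enumerate(signal)]
```

Slow drifts in the underlying signal leak slightly into the template, so the approach works best over short segments or after detrending. A notch filter at the switching frequency would be an alternative, and either method would still need to be verified against recordings made with the goggles or HMD switched off.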
For MEG compatibility, it is important to note that EOG recording requires carbon or other non-magnetic electrodes (e.g., Compumedics ‘Maglink Synamps RT’, Brain Products Fast’nEasy) to avoid image distortion, although this is much less of an issue with fMRI. Specialized fMRI-compatible EOG systems are also commercially available (e.g., ‘BrainAmp-MR’ by Brain Products GmbH). To combat the effects of radio frequency pulses during fMRI, electrodes have current-limiting resistors bonded to them and do not directly touch the observer while leads are thermally insulated from the observer.
Compatibility With Brain Stimulation Techniques
Non-invasive brain stimulation techniques have been used increasingly over recent years to directly examine the neural basis of various attentional and perceptual phenomena (e.g., Antal & Paulus, Reference Antal and Paulus2008; Been et al., Reference Been, Ngo, Miller and Fitzgerald2007; Romei et al., Reference Romei, Driver, Schyns and Thut2011; Szczepanski & Kastner, Reference Szczepanski and Kastner2009), including mechanisms of visual rivalry. To date, brain stimulation studies of rivalry have employed caloric vestibular stimulation (CVS; reviewed in Been et al., Reference Been, Ngo, Miller and Fitzgerald2007, Miller & Ngo, Reference Miller and Ngo2007), transcranial magnetic stimulation (TMS; Sandrini et al., Reference Sandrini, Umiltà and Rusconi2011; reviewed in Ngo et al., Reference Ngo, Barsdell, Law, Miller and Miller2013), and transcranial alternating current stimulation (tACS; Strüber et al., in press). For TMS, recent rivalry experiments (e.g., Carmel, Walsh et al., Reference Carmel, Walsh, Lavie and Rees2010; de Graaf et al., Reference de Graaf, de Jong, Goebel, van Ee and Sack2011; Kanai et al., Reference Kanai, Carmel, Bahrami and Rees2011; Nojima et al., Reference Nojima, Ge, Katayama, Ueno and Iramina2010) have adopted an ‘offline’ stimulation protocol — that is, stimulation applied prior to or in between sessions of rivalry presentation — compared with other studies that employed ‘online’ stimulation protocols — that is, TMS during BR viewing (Miller et al., Reference Miller, Liu, Ngo, Hooper, Riek, Carson and Pettigrew2000; Pearson et al., Reference Pearson, Tadin and Blake2007; Zaretskaya & Bartels, Reference Zaretskaya and Bartels2013; Zaretskaya et al., Reference Zaretskaya, Thielscher, Logothetis and Bartels2010; see also Nuruki et al., 2013). 
Earlier experiments using CVS as the stimulation technique employed offline protocols (Miller et al., Reference Miller, Liu, Ngo, Hooper, Riek, Carson and Pettigrew2000; Ngo et al., Reference Ngo, Liu, Tilley, Pettigrew and Miller2007, Reference Ngo, Liu, Tilley, Pettigrew and Miller2008; cf. Spiegel et al., Reference Spiegel, Li, Hess, Byblow, Deng, Yu and Thompson2013), while a recent study administered the technique during BR (Arshad et al., Reference Arshad, Nigmatullina and Bronstein2013). Other non-invasive non-convulsive brain stimulation techniques that have yet to be applied to rivalry phenomena are transcranial direct current stimulation (tDCS; Been et al., Reference Been, Ngo, Miller and Fitzgerald2007; Nitsche et al., Reference Nitsche, Cohen, Wassermann, Priori, Lang, Antal and Pascual-Leone2008; cf. Spiegel et al., Reference Spiegel, Li, Hess, Byblow, Deng, Yu and Thompson2013), transcranial random noise stimulation (tRNS; Guleyupoglu et al., Reference Guleyupoglu, Schestatsky, Edwards, Fregnio and Dikson2013), and galvanic vestibular stimulation (GVS; Utz et al., Reference Utz, Dimova, Oppenländer and Kerkhoff2010), all of which can be readily employed to examine mechanisms of BR and other rivalry types.
For offline stimulation, all the abovementioned dichoptic viewing methods (Table 1) are compatible with CVS, TMS, tDCS/tACS/tRNS, and GVS. For online stimulation though, compatibility of TMS and tDCS/tACS/tRNS with HMD and LCS goggles depends on the proximity of the goggles to the fields generated by the TMS coil and tDCS/tACS/tRNS electrodes. For instance, if prefrontal cortex stimulation is integral to the experimental design, close proximity of the TMS coil to the HMD or LCS goggles (which rest anteriorly on the head) may inadvertently disrupt or damage electronic circuitry in the goggles, with the resultant induced currents potentially also producing inadvertent artifacts in the HMD panels or shutters (respectively). In addition, TMS of frontal areas (e.g., primary motor cortex) may activate the facial nerve thus causing eye twitching (Sohn et al., Reference Sohn, Voller, Dimyan, St Clair Gibson, Hanakawa, Leon-Sarmiento and Hallet2004). Such transient stimulations are well known to artificially induce a change in perceptual state during BR (i.e., terminate suppression of a percept; Blake, Reference Blake2001). While a more posterior target site for stimulation (e.g., temporal/parietal/occipital cortex) may reduce electromagnetic interference, it would depend on the goggles’ design, the TMS coil orientation, and the stimulation intensity being applied. With HMDs, the headstrap component can also act as a physical barrier to the close proximity required for the TMS coil placement relative to the scalp. Ferromagnetic headstraps in particular and the relative location of connector cables can also introduce further experimental limitations. As such, the componentry of particular HMD/LCS goggle models would first need to be verified with the manufacturers, in order to determine their compatibility with TMS and tDCS/tACS/tRNS according to the study design and planned stimulation protocol.
In regard to invasive intracranial techniques, one study previously examined flash suppression with direct microelectrode recordings in epilepsy subjects (Kreiman et al., Reference Kreiman, Fried and Koch2002; see also Mukamel & Fried, Reference Mukamel and Fried2012). LCS goggles were employed to present the dichoptic stimuli, although this viewing method may not be suitable for particular individuals with epilepsy because of the potential flicker artifacts (see also Table 1 and section ‘Flicker Artifact’). As such, anaglyphs, PPF setups, or ATS displays may be preferred for studies in neurosurgery subjects generally. More recently, anaglyphs were used to examine CFS during intracranial event-related potential recordings of amygdala and insula activity (Willenbockel et al., Reference Willenbockel, Lepore, Nguyenm, Bouthillier and Gosselin2012). These particular dichoptic presentation methods can also be used to study the effect of direct electrical cortical/subcortical stimulation (e.g., deep brain stimulation) on BR perception and related phenomena, an area that has yet to be examined. It has been argued, however, that for direct cortical stimulation studies, interpretation of mechanisms underlying any observed effects has certain inherent limitations (Borchers et al., Reference Borchers, Himmelbach, Logothetis and Karnath2012). Other considerations for choosing a dichoptic display with respect to subject group factors are outlined next.
Subject Group Considerations
For studies of BR involving particular subject populations, a number of issues need to be considered. This section will discuss some key aspects of experimental testing protocols for optimal subject compliance and reliable data collection in typical small-scale studies (N = 5–50). The section to follow will address such aspects for large-scale studies (e.g., N = 500–100,000). In experiments with only healthy subjects and where adequate power can be obtained with a small sample size (e.g., traditional psychophysical studies), dichoptic viewing methods that may require individual adjustment such as the mirror/prism stereoscope are sufficiently practical options. The simplicity and minimal cost of anaglyphs also make them advantageous in various small-scale studies (see Table 1). Other considerations for choosing a particular dichoptic display in regard to subject group include, for example, age range, type of clinical disorder, subjects’ clinical state, level of task compliance, subjects’ susceptibility to perceptual artifacts, and visual/physical discomfort associated with dichoptic viewing.
Over the past decade, evidence supporting the clinical relevance of BR, in particular the finding of slow rivalry rate in BD (see section ‘Overview of Binocular Rivalry Research’), has exemplified the need for examining the suitability of dichoptic viewing methods in particular subject groups. Particular care in designing the experimental protocol in such contexts, and in other contexts such as with adolescents (e.g., Miller et al., Reference Miller, Hansell, Ngo, Liu, Pettigrew, Martin and Wright2010) and pediatric subjects (e.g., Kovács & Eisenberg, Reference Kovács, Eisenberg, Alais and Blake2005; using anaglyphs; see also Hudák et al., Reference Hudák, Jakab, Kovács and Albertazzi2013), is required to ensure reliable data collection. Notable issues are apparent with mirror and prism stereoscopes, which require the subject's head to be physically restrained at a fixed position and orientation with a chin rest or head restraint. Children may find such requirements unacceptable. Adolescents and some psychiatric subject groups (e.g., mania) may struggle with maintaining compliance in keeping a fixed position for extended viewing times. Psychiatric subjects with paranoid beliefs may also be mistrustful of any type of head restraint. As such, anaglyphs, HMD and LCS goggles, PPF setups, and ATS displays that do not rely on formal head stabilization may be preferable in these populations (notwithstanding the potential still for some degree of paranoia in some psychiatric subjects in relation both to the phenomenon of BR itself and to unusual computer monitors and headsets). However, with HMD and LCS goggles, the front is heavier due to the placement of the power source and electronic circuitry (for review of visual factors in HMDs, see Tsou & Shenker, Reference Tsou, Shenker, Bass, DeCausatis, Enoch, Lakshminarayanan, Li, MacDonald, Mahajan and Van Stryland2010). 
Pediatric subjects and some adult subjects may find this asymmetric distribution of weight uncomfortable for prolonged viewing times. For mirror/prism stereoscopes and HMDs, the inconvenience in configuring the setup and the observer to ensure vergence stabilization also needs to be considered (e.g., pre-experimental adjustments and calibrations to achieve stimuli alignment). In psychiatric groups, task compliance for such configuring, and indeed for all aspects of BR testing, may present difficulties in acutely unwell states (see further below). As mentioned previously (section ‘Eye-Movement Recording and Optokinetic Nystagmus’), an objective measure of BR data using OKN recordings would be suitable for subjects who (1) cannot reliably adhere to response instructions, (2) have impaired neuromuscular or cognitive function (e.g., certain neurological diseases or disabilities), (3) are in pediatric or elderly populations, and (4) have low observer confidence or exhibit response bias. It is important to note, however, that while all the display methods reviewed can be used to dichoptically view OKN-inducing stimuli, their compatibility with OKN recording using IR eye-tracking equipment varies (cf. EOG; see above).
Another consideration is the potential for ‘simulator sickness’ and ‘visual fatigue’ (e.g., Bando et al., Reference Bando, Iijima and Yano2012; Urvoy et al., in press), which can interfere with testing and confound results. Surveys have suggested that around a quarter to half of individuals may experience symptoms of visual discomfort associated with viewing 3D displays (see McIntire et al., Reference McIntire, Havig, Geiselman, Desjardins, Marasco, Sarma and Havig II2012). It should be noted, however, that in the field of stereoscopic 3D display research and development (see section ‘Application of Technology’), BR itself is considered a type of visual discomfort (e.g., ‘painful rivalry’; Hornung et al., Reference Hornung, Smolic and Gross2011; Lang et al., Reference Lang, Hornung, Wang, Poulakos, Smolic and Gross2010). Conversely, it is also important to note that in studies of BR, assessment of subjects’ visual discomfort and fatigue associated with dichoptic viewing is typically not an experimental question of interest. Nevertheless, visual fatigue that could occur during BR may be due to binocular asymmetry, such as geometric disparity inherent with BR stimuli and photometric differences between the two eyes due to perceptual artifacts (e.g., ghosting). Therefore, the severity of visual fatigue can vary, depending on the severity of perceptual artifacts, which is subject to the quality of the dichoptic display. The human visual system has a limited ability to tolerate perceptual artifacts such as flicker and ghosting, which can cause dizziness, headaches, eye strain, and even nausea (Howarth, Reference Howarth2011; Kim et al., Reference Kim, Jung, Kim, Ro and Park2011; Kooi & Toet, Reference Kooi and Toet2004; Lambooij et al., Reference Lambooij, IJsselsteijn, Heynderickx, Woods, Bolas, McDowall, Dodgson, Merritt and Holliman2007, Reference Lambooij, Fortuin, Heynderickx and IJsselsteijn2009).
These symptoms have been reported under prolonged stereoscopic viewing using anaglyphs (Ostnes et al., Reference Ostnes, Abbot and Lavender2004), LCS goggles (Bruck & Watters, Reference Bruck and Watters2009; Yano et al., Reference Yano, Emoto and Mitsuhashi2004), HMDs (Woods et al., Reference Woods, Apfelbaum and Peli2010), and PPF (Yano et al., Reference Yano, Ide, Mitsuhashi and Thwaites2002). Nausea can also be experienced following prolonged viewing of 3D cinema projections, which adopt the same principle as the dichoptic display methods reviewed here.
With LCS goggles, induced flicker due to time-sequence asynchrony can also cause headaches and nausea. This outcome is especially the case for observers who are sensitive to these symptoms, such as migraine sufferers, individuals with epilepsy, or individuals who are photosensitive (see also Table 1 and section ‘Compatibility With Brain Stimulation Techniques’). Because flicker manifests as a function of (low) refresh rate, projectors and monitors with a refresh rate over 120 Hz are preferred for the LCS setup (e.g., DLP monitors; see section ‘Flicker Artifact’). If flicker is deemed too problematic given the subject group of interest, instruments such as HMDs that eliminate flicker may be preferable. Studies that adopt the dual-screen approach for dichoptic viewing (e.g., dual-monitor mirror stereoscope, HMDs, dual-screen PPF) should ensure the display settings are consistent across both screens, as visual discomfort increases with greater photometric difference in dichoptic luminance and contrast (i.e., blur).
Crosstalk can also trigger increased subject discomfort (Pastoor, Reference Pastoor1995; Stevens, Reference Stevens2004). For linear PPF and parallax ATS displays, it has been suggested that crosstalk ≥5% is exponentially associated with increased visual discomfort such as eye strain and visual fatigue (Chen et al., Reference Chen, Tu, Liu and Li2008; Kooi & Toet, Reference Kooi and Toet2004). Circular polarizer filters, which allow for significant flexibility in head movement, may be the preferred choice for PPF systems, especially if the study involves the testing of children who may be less capable of complying with instructions to keep their heads level with the stimuli. ATS displays, which do not require subjects to wear glasses or maintain head stabilization, could also be considered. For lenticular-type ATS displays though, crosstalk threshold for moderate/intolerable viewer discomfort has been reported to be between 5% and 10% (Nojiri et al., Reference Nojiri, Yamanoue, Hanazato, Emoto, Okano, Woods, Bolas, Merritt and Benton2004; Yeh & Silverstein, Reference Yeh and Silverstein1990). Overall, it is important to emphasize that various factors (including individual variation) affecting precise dichoptic presentation covered above (e.g., crosstalk, flicker, visual discomfort) have primarily been examined in the field of 3D display technology, but have yet to be tested in the context of BR research. However, from our experience with testing more than 1,500 subjects for total BR viewing periods of 14 to 21 minutes (using 100-second trials with interspersed rest breaks), there have seldom been any significant complaints beyond a mild headache, fatigue, or eye strain and virtually never nausea. In the section to follow, we focus on a new approach to BR testing for studies requiring very large subject sample sizes.
Large-Scale Clinical and Genetic Studies
As mentioned in the introduction, the demonstration of slow BR rate in BD has added a new focus toward clinical diagnostic and genetic studies of rivalry in the modern era. These studies require the collection of very large datasets — in the range of thousands to tens of thousands — to accurately assess the potential diagnostic utility (which relies on specificity for BD) and endophenotype utility of the rivalry rate trait. A major barrier to meeting such recruitment targets, particularly for psychiatric subject groups, is the enormous cost associated with recruitment, transportation, and testing in a formal laboratory setting. In genetic epidemiology, there has traditionally been emphasis on meticulous clinician-derived phenotyping, because mistakes in classification of clinical cases and healthy controls erode power. However, if the expense of meticulous phenotyping is too great, greater power can be achieved by accepting a small error rate in much larger samples obtained with cheaper, minimal phenotyping. For example, using online questionnaires, a recent genetic study replicated hundreds of previously reported GWAS loci that were originally identified by studies using stricter measures (Tung et al., Reference Tung, Do, Hinds, Kiefer, Macpherson, Chowdry and Eriksson2011). Thus, answers to ‘Have you ever been diagnosed by a doctor with high cholesterol (over 200 mg/dl) or hypercholesterolemia?’ were sufficient to replicate 19 associations with cholesterol level. On this basis, and to overcome the major challenges associated with large-scale lab-based testing, we propose a new model of rivalry testing — the development of an online BR test website. An online BR test would not require research personnel to administer or to oversee testing. 
Despite potentially introducing some small margin of error in the data due to lack of oversight, online BR testing would be very widely and readily applicable such that statistical power for gene-finding will be greatly enhanced (by very large sample sizes obtainable with this approach). Furthermore, it is expected that subjects will more readily participate in a standalone test that is completed entirely online, and that does not involve extensive phone-based or clinician-based assessments and test batteries (which in the case of psychiatric subjects will have been completed already as part of existing psychiatric genetics consortia).
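The trade-off between phenotyping accuracy and sample size can be illustrated with a simplified power calculation. The Python sketch below assumes a two-group comparison of a quantitative trait with equal group sizes, a two-sided z-test, and a normal approximation that ignores the far tail and any variance inflation from mixing. It further assumes that misclassification shrinks the observed standardized difference by the factor (1 − e_cases − e_controls), where each e is the fraction of a labelled group that truly belongs to the other group. The effect size, error rates, and sample sizes shown are hypothetical and are not estimates from any BR dataset.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(true_d, n_per_group, err_cases=0.0, err_controls=0.0,
                 alpha=0.05):
    """Approximate power to detect a standardized group difference `true_d`.

    err_cases / err_controls: fraction of each labelled group that is
    misclassified (hypothetical inputs for illustration).
    """
    # Misclassification dilutes the observed group difference.
    observed_d = true_d * (1.0 - err_cases - err_controls)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # Normal approximation; contribution of the opposite tail is ignored.
    return NormalDist().cdf(observed_d * sqrt(n_per_group / 2.0) - z_crit)


if __name__ == "__main__":
    # Small, carefully phenotyped sample versus a tenfold larger sample
    # with 5% misclassification in each group.
    print(round(approx_power(0.2, 500), 3))
    print(round(approx_power(0.2, 5000, err_cases=0.05, err_controls=0.05), 3))
```

Under these assumptions, the larger sample with modest misclassification yields higher power than the smaller error-free sample, in line with the argument above. The actual gain would depend on the true effect size and error rates, which would need to be estimated empirically.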
In regard to the method of dichoptic display for online-offsite BR testing via a website, the accessibility and minimal cost of anaglyphs are especially advantageous because BR can be induced without specialized optical apparatus and displays. A conventional monitor, standard desktop computer, and stable internet connection would generally be sufficient to run a web-based program that simultaneously presents the BR stimuli and collects BR data via keyboard presses. The cardboard frame (foldable) anaglyph glasses also enable convenient and low-cost postage to subjects for dichoptic viewing. Furthermore, BR viewing with anaglyphs does not require individual adjustment and extensive pre-experimental preparation, thus eliminating pre-test configuration time.
The online BR test we propose will be based on the combined stimulus presentation and data collection program (written with MATLAB™) used in our current lab-based studies of BR. More specifically, the online BR test will consist of procedures for stimulus presentation, subject task familiarization and training, test start/rest/stop prompts, exclusion screening, visual acuity testing, subject consent, keyboard-based BR data collection, and subject questionnaire feedback. The stimulus presentation code will automatically adjust for variations in monitor resolution to maintain a uniform stimulus size. Importantly, a catch-trial component will also be included, in which the stimuli are physically alternated to mimic rivalry, thus providing an objective means for verifying subject compliance with perceptual reporting, and hence providing a basis on which to accept or reject an individual subject's data. Recorded BR data will be analyzed offline with the analysis program used in our current studies of BR. Quality assurance and pilot testing will be required on the data collection process, security procedures, data backup, data analyses, and incorporation of subject feedback.
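As an indication of how the catch-trial component might be scored, the following Python sketch compares a subject's key-press record with the known sequence of physical stimulus alternations, allowing for a fixed reaction-time lag. It is not the MATLAB program referred to above. The function names, the 500 ms lag, the 100 ms sampling step, and the 0.8 acceptance criterion are all hypothetical placeholders that would need to be set from pilot data.

```python
def label_at(events, t_ms):
    """Return the label in force at time t_ms, or None before the first event.

    `events` is a time-sorted list of (onset_ms, label) pairs.
    """
    current = None
    for onset_ms, label in events:
        if onset_ms > t_ms:
            break
        current = label
    return current


def catch_trial_agreement(stimulus_events, response_events, duration_ms,
                          lag_ms=500, step_ms=100):
    """Fraction of sampled time points at which the reported percept matches
    the stimulus that was physically shown lag_ms earlier."""
    hits = 0
    total = 0
    for t_ms in range(0, duration_ms, step_ms):
        shown = label_at(stimulus_events, t_ms - lag_ms)
        if shown is None:
            continue  # no stimulus yet once the lag is taken into account
        total += 1
        if label_at(response_events, t_ms) == shown:
            hits += 1
    return hits / total if total else 0.0


def accept_subject(agreement, criterion=0.8):
    """Hypothetical acceptance rule: keep data if agreement meets the criterion."""
    return agreement >= criterion
```

For example, a subject whose key presses follow each physical alternation by about half a second would score close to 1.0, whereas a subject who holds one key throughout would score well below the criterion and could be flagged for exclusion.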
Building on our studies of BR in BD and in twins (Miller et al., Reference Miller, Gynther, Heslop, Liu, Mitchell, Ngo and Geffen2003, Reference Miller, Hansell, Ngo, Liu, Pettigrew, Martin and Wright2010; Pettigrew & Miller, Reference Pettigrew and Miller1998), large-scale clinical diagnostic and endophenotype studies of rivalry employing the online BR test are expected to make significant inroads toward identifying genes determining rivalry rate and genes involved in the pathophysiology of BD, with subsequent understanding of molecular mechanisms therein. In addition, twin studies and GWAS provide tangible prospects for understanding not only the genetics and heritability of BR and related phenomena, but also the genetics and heritability of conscious and non-conscious processing.
This article has provided a detailed review of the common methods used for inducing and studying BR. The methods were compared according to (1) multiplexing principle, (2) advantages and disadvantages regarding image presentation parameters, (3) financial costs, convenience, and design, (4) compatibility with related visual phenomena, (5) compatibility with brain-imaging, eye-tracking, and brain stimulation techniques, and (6) suitability for particular subject populations and sample sizes. The information highlighted in this article aims to assist investigators in selecting a dichoptic display setup that is suitable for their research question, study design, and study population, and to be a resource for those new to the field, particularly clinicians and geneticists. It may also help to stimulate new research questions in BR science and 3D display technology development, with a view to bridging the gap between these two rapidly growing fields.
TTN is supported by NHMRC (ID 490976). SMM is supported by NHMRC, the Defence Health Foundation, and a 2012 NARSAD Young Investigator Grant (ID 19163) from the Brain & Behavior Research Foundation. We thank Loes van Dam for helpful comments on an earlier version of the article.
Disclosure of Interests
SMM is a co-inventor on a granted University of Queensland, national and international patent concerning slow binocular rivalry in bipolar disorder. There are currently no commercialization activities. The remaining authors declare no conflict of interest.