Optical see-through head-mounted displays are expected to become the most convenient hardware for presenting augmented reality (AR). Current systems of this kind use a single focal plane and therefore impose a vergence-accommodation conflict on the human visual system, which limits wide acceptance. In this work, we analyze an optical see-through AR head-mounted display prototype with four focal planes operating in time-sequential mode, which mitigates the limitations of single-focal-plane devices. Nevertheless, the optical see-through design requires a very short motion-to-photon latency to avoid noticeable misalignment between the digital content and the real-world scene. The prototype display relies on a commercial visual-SLAM spatial tracking module (Intel RealSense T265), and in this work we analyze factors that improve motion-to-photon latency with the given hardware setup. Performance analysis of the T265 module revealed slight translational and angular jitter, on the order of <1 mm and <15 arcseconds respectively, and a velocity readout of a few cm/s from a completely still IMU. The experimentally determined motion-to-photon and render-to-photon latencies were 46±6 ms and 38 ms, respectively. To overcome IMU positional jitter, pose averaging with a variable-width averaging window was implemented; the window size was adjusted based on instantaneous acceleration and velocity data. For pose prediction, a basic rotational-axis offset model was verified. Using prerecorded head movements, a training model reduced the error between the predicted and the actual recorded pose. The optimization parameters were the offset values of the IMU's rotational axis, the translational and angular velocities, and the angular acceleration. As expected, the highest weights for the most accurate predictions were observed for the velocities, followed by the angular acceleration. The role of the offset values was not significant.
To improve the perceived experience and reduce motion-to-photon latency, we propose further investigation of simple trained neural networks for more accurate real-time pose prediction, as well as of content-driven adaptive image output that overrides the default order of image-plane output in the time-sequential sequence.
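The variable-width pose averaging and the kinematic pose prediction described above can be illustrated with a minimal sketch. It is not the authors' implementation: the thresholds, the window limits, the linear speed-to-window mapping, and all function and class names are hypothetical, and the window is driven by speed only, whereas the abstract states that both acceleration and velocity were used.

```python
from collections import deque

# All constants are hypothetical illustration values, not taken from the paper.
MAX_WINDOW = 16      # samples averaged when the head is still
MIN_WINDOW = 1       # samples averaged during fast motion
STILL_SPEED = 0.03   # m/s, roughly the "few cm/s" readout of a still IMU
FAST_SPEED = 0.30    # m/s, above this no averaging is applied


def window_size(speed):
    """Map translational speed (m/s) to an averaging window length."""
    if speed <= STILL_SPEED:
        return MAX_WINDOW
    if speed >= FAST_SPEED:
        return MIN_WINDOW
    # Linear interpolation between the two extremes.
    t = (speed - STILL_SPEED) / (FAST_SPEED - STILL_SPEED)
    return max(MIN_WINDOW, round(MAX_WINDOW - t * (MAX_WINDOW - MIN_WINDOW)))


class AdaptivePositionFilter:
    """Averages the most recent positions over a speed-dependent window."""

    def __init__(self):
        self.history = deque(maxlen=MAX_WINDOW)

    def update(self, position, speed):
        """Store a new (x, y, z) sample and return the smoothed position."""
        self.history.append(position)
        n = min(window_size(speed), len(self.history))
        recent = list(self.history)[-n:]
        return tuple(sum(axis) / n for axis in zip(*recent))


def predict_angle(angle, angular_velocity, angular_acceleration, latency_s):
    """Extrapolate one rotation angle over the latency (constant acceleration)."""
    return (angle
            + angular_velocity * latency_s
            + 0.5 * angular_acceleration * latency_s ** 2)
```

A long window suppresses jitter while the head is still, and a short window avoids adding lag during motion. The prediction step would be evaluated with the measured latency, for example `predict_angle(angle, omega, alpha, 0.046)` for the reported 46 ms.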
Inconsistency between binocular and focus cues in stereoscopic augmented reality overburdens the visual system and causes visual stress. However, high individual variability in tolerance for visual stress makes it difficult to predict and generalize the user gain associated with implementing alternative visualization technologies. In this study, we investigated the relationship between binocular function and perceptual judgments in augmented reality. We assessed task completion time and the accuracy of perceptual distance matching as a function of the consistency of binocular and focus cues in a stereoscopic augmented reality environment. The head-mounted display was driven in two modes, multifocal and monofocal, providing the consistent-cues and inconsistent-cues conditions, respectively. Participants matched the distance of a real object with images displayed at three viewing distances (concordant with the distances of the display focal planes in the consistent-cues condition). A thorough vision screening was performed before the experiment. In the inconsistent-cues condition, individuals with low convergent fusional reserves and a receded near point of convergence misjudged distances to a greater extent than other participants. In contrast, in the consistent-cues condition perceptual judgments were fast, distances were less overestimated, and no significant effect of binocular function was found. We suggest that binocular function measures characterizing individual tolerance for visual stress could be used as predictors of user gain in the comparative assessment of new visualization technologies for augmented reality.
Recent advancements in visualization systems have triggered a growing demand for objective and accurate comparison of the cognitive demands placed on users when they perceive three-dimensional images presented in different ways. In this work, we present the first comparative assessment of brain activity in subjects viewing stereoscopic and volumetric images. Electroencephalography (EEG) was employed to assess short-term changes in event-related potentials and neural oscillations, which were then interpreted in terms of the cognitive requirements of relative depth judgments. Considerably higher activity was registered in the beta and gamma bands when subjects judged the relative depth of stereoscopic images than when they performed a similar task on the volumetric display. In addition, higher neural activity in the parietal and occipital areas was observed for stereoscopic images at the moments that reflect cognitive responses to the depth component of the visual stimulus. We suggest that the greater cognitive load may lead to a faster onset of fatigue in the long term. Overall, the EEG-based assessment of brain activity indicates that depth extraction from volumetric images requires less cognitive effort than from stereoscopic images.
In this work, we investigate the design parameters of a stereoscopic head-worn augmented reality display that would facilitate wider uptake of the technology by enterprise and professional users. The emphasis is on mimicking the natural way in which the human visual system perceives the ambient world. To this end, we propose a solid-state multi-focal display architecture tailored to near-work-oriented tasks. The core of the proposed technology is a solid-state multi-plane volumetric screen with four physical image depth planes, which forms the secondary image source. The volumetric screen uses electrically controllable liquid-crystal-based diffuser elements, which receive the image information from the primary source, a pico-projection unit. The volumetric screen is coupled with a bird-bath-type optical image combiner/eyepiece to yield a 40-degree horizontal field of view covering a representable depth space from 0.35 m to infinity, within which no effects of vergence-accommodation conflict are experienced.
In the field of 3D display technologies, accommodation-based depth cues have long been dismissed. On the one hand, they are treated as weak depth cues; on the other hand, including them has been technologically challenging. Either way, accommodation depth cues are essential for natural image perception: they add realism to the 3D scene and help overcome the technologically inhibiting effects of vergence-accommodation conflict. In this work, we examine the implementation of optical diffuser technology, and the associated considerations, in the form of a spatial volume demultiplexer chip (SVDC) within a stereoscopic augmented reality (AR) wearable display. The role of the SVDC is to demultiplex a series of two-dimensional image depth planes into a perceivably three-dimensional scene with the aforementioned focus depth cues. The SVDC is designed as an entirely solid-state solution that requires only a voltage driving signal for the image demultiplexing action. When an SVDC is used in a multi-plane display architecture, the image source is a rear image projection unit that delivers a high-refresh-rate stream of the required 2D image depth planes. The SVDC technology is scalable and offers improved light efficiency due to controlled internal reflections, which allows for diverse optical designs in AR as well as VR settings. We provide an indicative evaluation and comparison of different optical image combiner solutions with respect to their use with an SVDC display architecture in near-eye stereoscopic AR display systems. The optical image combiner designs considered include a flat beam splitter with a refractive eyepiece, “bird-bath” optics, and a single curved (free-form) reflective image combiner.
LightSpace Technologies has developed a prototype of an integrated head-mounted stereoscopic display system based on a proprietary multi-plane optical diffuser technology. The system is entirely solid-state and has six focal planes that cover ~3 diopters (from 32 cm to 8 m). No eye tracking is used in its operation. The new display system almost entirely eliminates vergence-accommodation conflict and adds monocular accommodation as an important depth cue for improved 3D realism. With regard to content rendering, the processing load is only slightly higher than for conventional single-focal-plane stereoscopic displays of similar image resolution. The performance difference is largest for simple 3D scenes, while for high-complexity scenes it tends to decrease slightly. On average, the processing burden for multi-plane stereoscopic displays is no more than 1.5% higher than for conventional stereoscopic displays. Furthermore, increasing the number of physical focal planes does not notably worsen image rendering performance, which allows the display device to be driven efficiently by readily available hardware, including high-performance mobile platforms. Overall, user feedback on the developed multi-plane stereoscopic 3D display prototype confirms earlier assumptions that a multi-plane architecture yields a higher acceptance rate due to improved 3D realism and eradicated vergence-accommodation conflict, making it currently one of the most noteworthy advancements in the field of 3D stereoscopic displays.
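The quoted coverage of ~3 diopters follows directly from the stated distance range, since optical power in diopters is the reciprocal of distance in meters. The short sketch below checks that arithmetic. The `plane_distances` helper additionally assumes that the six planes are evenly spaced in diopters, which is a hypothetical layout chosen for illustration; the abstract does not state the actual plane positions.

```python
def diopters(distance_m):
    """Optical power (D) corresponding to a viewing distance in meters."""
    return 1.0 / distance_m


near_m, far_m = 0.32, 8.0
# 1/0.32 m = 3.125 D and 1/8 m = 0.125 D, so the covered span is 3.0 D.
span_d = diopters(near_m) - diopters(far_m)


def plane_distances(near_m, far_m, planes):
    """Hypothetical layout: focal planes evenly spaced in diopters."""
    near_d, far_d = diopters(near_m), diopters(far_m)
    step = (near_d - far_d) / (planes - 1)
    return [1.0 / (near_d - i * step) for i in range(planes)]


if __name__ == "__main__":
    print(f"span: {span_d:.3f} D")
    for d in plane_distances(near_m, far_m, 6):
        print(f"plane at {d:.3f} m")
```

Under the even-spacing assumption, neighboring planes would be separated by 0.6 D, which places most planes at near distances and only one close to the 8 m far limit.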
To visualize naturally observable 3D scenes with a continuous range of observation angles on a multi-plane volumetric 3D display, specific data processing and rendering methods have to be developed and tailored to the architecture of the display device. Because one of the most important requirements is the capability to provide real-time visual feedback, the data processing pipeline has to be optimized for efficient execution on general consumer-grade hardware. In this work, the technological aspects and limitations of a volumetric 3D display based on a static multi-planar projection volume are analyzed in the context of developing an effective, real-time-capable volumetric data processing pipeline. A basic architecture of the data processing pipeline was developed and tested. Initial results showed very slow performance when the pipeline was executed on the central processing unit (CPU). Based on these results, the pipeline was optimized to use graphics processing unit (GPU) acceleration, which substantially decreased execution times and reached the goal of real-time-capable volumetric refresh rates.
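One step that such a pipeline plausibly contains is the distribution of rendered pixels across the physical depth planes. The sketch below is an assumption-laden illustration, not the pipeline described in the abstract: it takes a color image with a per-pixel depth map and assigns every pixel to the nearest plane. The function name, the nearest-plane rule, and the list-of-lists image format are all hypothetical.

```python
def slice_into_planes(color, depth_m, plane_depths_m):
    """Split one color + depth frame into per-plane images.

    color and depth_m are row-major lists of lists of equal size.
    Each pixel is copied to the plane whose depth is closest to the
    pixel depth; all other planes keep None at that position.
    """
    height, width = len(depth_m), len(depth_m[0])
    planes = [[[None] * width for _ in range(height)]
              for _ in plane_depths_m]
    for y in range(height):
        for x in range(width):
            z = depth_m[y][x]
            # Index of the plane with the smallest depth difference.
            nearest = min(range(len(plane_depths_m)),
                          key=lambda i: abs(plane_depths_m[i] - z))
            planes[nearest][y][x] = color[y][x]
    return planes
```

Every pixel is processed independently of the others, so a loop of this kind is slow when run serially on a CPU at display resolution but maps naturally onto one GPU thread per pixel, which is consistent with the speed-up reported for GPU acceleration.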
This work provides a detailed analysis of the technologies and methods required for the construction and operation of a passive multi-plane volumetric 3D display based on an arrangement of electrically controllable optical diffuser elements. Current methods of displaying 3D images are compared. The challenges of representing realistic-looking 3D content with the associated physical depth cues using a multi-plane approach are highlighted, together with their solutions. The main focus is on improving the user experience of viewing and interacting with 3D content on a multi-plane volumetric display by using various task-specific computational methods in the data processing pipeline.