Review

Holoscopic 3D Imaging Systems: A Review of History, Recent Advances and Future Directions

1 New Media Department, Beijing Institute of Graphic Communication, Qingyuan Rd, Beijing 102699, China
2 Electronic and Electrical Engineering Department, Brunel University of London, Kingston Ln, London UB8 3PH, UK
* Authors to whom correspondence should be addressed.
Appl. Sci. 2025, 15(18), 10284; https://doi.org/10.3390/app151810284
Submission received: 2 August 2025 / Revised: 12 September 2025 / Accepted: 19 September 2025 / Published: 22 September 2025
(This article belongs to the Special Issue State-of-the-Art 3D Imaging, Processing and Display Technologies)

Abstract

As the demand for high-quality visual experiences continues to grow, advanced imaging technologies offering higher realism and immersion are being increasingly integrated into various fields. Among them, glasses-free 3D imaging has gained significant attention for enhancing user experience without the need for wearable equipment. Holoscopic 3D imaging systems, known for their capability to reconstruct true volumetric images and provide natural depth perception, have emerged as a promising direction within this domain. Originating from early 20th-century optical theory, holoscopic imaging has evolved in response to diversified application scenarios and rapid advancements in micro-optics and computational imaging. This paper presents a representative historical overview of the development of holoscopic 3D systems, their unique features compared to other glasses-free 3D technologies, and their expanding presence across application domains. By analyzing representative use cases across sectors such as healthcare, education, cultural heritage, and media entertainment, this review offers a broader and more detailed perspective on the deployment of holoscopic 3D systems. Furthermore, this paper discusses current technical challenges and outlines future research directions, with a particular focus on the transformative potential of holoscopic 3D in the creative and entertainment industries. This study aims to provide both theoretical grounding and practical insights to support the next generation of holoscopic 3D imaging technologies.

1. Introduction

Holoscopic 3D (H3D) imaging (also known as integral imaging) is an emerging technique that records and reconstructs the full spatial–angular light field of a scene using a single-aperture optical system combined with a microlens array. Unlike stereoscopic or camera-array approaches, which require multiple lenses or viewpoints and often produce visual fatigue due to limited parallax, H3D simultaneously encodes multiple perspectives as a dense grid of elemental images (EIs). These EIs can later be processed into viewpoint images (VPIs) that reproduce continuous parallax and natural depth cues, enabling glasses-free 3D visualization. Because of its compactness and ability to capture all directional information in a single shot, H3D provides a unique platform for real-time and embedded 3D imaging applications.
The theoretical foundations of H3D can be traced back to Lippmann’s pioneering concept of integral photography at the beginning of the 20th century, which proposed recording a three-dimensional light field using an array of small lenses [1]. Although limitations in materials and fabrication initially hindered practical implementation, subsequent researchers refined and extended the idea. In particular, Eugène Estanave [2] developed pinhole and multi-lens configurations that improved light throughput and angular sampling compared to Lippmann’s original design, laying an important bridge between theory and early prototypes. With advances in micro-optics, sensor technology, and computation during the late 20th and early 21st centuries, integral imaging evolved into the present-day holoscopic 3D systems [3,4,5].
Recent developments have accelerated this progress. High-precision microlens arrays, high-resolution CMOS sensors and GPU-accelerated reconstruction pipelines now make it possible to implement compact H3D cameras for autonomous vehicles [6], gesture recognition [7,8], minimally invasive medical procedures [5], industrial inspection and cultural heritage digitization [9]. These advances have also stimulated research into hybrid architectures, such as combining holoscopic capture with time-of-flight depth sensing or neural radiance-field rendering to enhance depth fidelity and reduce computational cost.
Although the global 3D display market is projected to reach USD 190 billion in the coming years, with glasses-free or autostereoscopic technologies identified as the fastest-growing subsegment [10,11,12,13,14], there remains a pressing need to better understand the underlying technological principles and practical implications of advanced 3D imaging systems. In this context, the present review aims to provide an overview of holoscopic 3D imaging, including its fundamental theoretical concepts, recent technological advances, and representative applications across diverse fields such as medical imaging, entertainment, virtual reality, and industrial inspection. Furthermore, the review critically discusses the major technical and practical challenges currently faced by holoscopic 3D systems, such as limitations in resolution, depth perception, and real-time processing, and explores promising directions for future research and development to enhance the performance, accessibility, and commercial viability of this emerging technology.

2. Holoscopic 3D Imaging System Principle and Development

The development of Holoscopic 3D (H3D) imaging traces a rich lineage that bridges classical optical theory, early stereoscopic exploration, and contemporary light field technologies. This section provides an overview of the key milestones that have shaped the evolution of holoscopic imaging and explains the foundational principles upon which it is built.

2.1. From Stereoscopic Vision to Integral Photography

Three-dimensional perception stems from binocular disparity—the brain’s ability to fuse slightly different images from each eye into a single depth-aware view. In 1838, Charles Wheatstone applied this principle to invent the stereoscope [15], marking the development of the first three-dimensional visualization system.
As photography evolved in the 19th century, various techniques aimed to simulate depth. Anaglyph stereoscopy, introduced by Rollmann in 1853, used red-cyan filters to deliver stereo pairs, but suffered from color distortion and eye fatigue [16]. Later, polarized stereoscopy in the 1930s improved image quality by projecting two images with orthogonal polarization and using matching glasses [17]. While widely adopted in cinemas, it still required passive eyewear and specialized screens.
In the 1980s, active shutter stereoscopy offered full-resolution imaging by alternating left and right images rapidly on a single display, synchronized with electro-optical shutter glasses [18]. Despite excellent quality, the need for powered glasses and precise synchronization limited comfort and accessibility.
These stereoscopic techniques provided limited depth realism, offering mostly horizontal parallax and requiring wearable devices. They failed to record the full angular variation of light rays in natural scenes [19].
Efforts to eliminate glasses led to two key autostereoscopic methods: the parallax barrier and the lenticular lens. Parallax barriers used vertical slits to direct alternating image columns to each eye, but suffered from brightness loss and narrow viewing zones [20]. Lenticular lenses, which refract light rather than block it, improved brightness and viewing angles, but still lacked vertical parallax and could produce distortion when viewer alignment shifted [21].
A breakthrough came in 1908, when Gabriel Lippmann proposed integral photography, inspired by the compound eyes of insects [1]. His system used a microlens array to capture multiple angular views of a scene, encoding spatial, angular, and intensity data in a single image. Unlike stereoscopy’s two fixed viewpoints, integral photography aimed to record a continuous light field offering glasses-free viewing, natural parallax, and richer depth [22]. Notably, in the 1930s, Eugène Estanave advanced integral photography by experimenting with pinhole and multi-lens configurations that improved light throughput and angular sampling compared to Lippmann’s original design. These early refinements provided an important bridge between theoretical foundations and later digital holoscopic prototypes [23].
Although Lippmann’s theory was revolutionary, limitations in optics and materials delayed its practical implementation. Nonetheless, it laid the theoretical groundwork for modern holoscopic 3D imaging.

2.2. Experimental Refinements and Optical Principles

While Lippmann’s concept of integral photography was groundbreaking, its practical realization was hindered for decades by limitations in optics, sensor resolution, and system integration [6]. Early implementations suffered from optical aberrations, low microlens density, and alignment issues, leading to artifacts like pseudoscopic imaging, poor depth accuracy, and spatial aliasing [24,25,26,27]. Inefficient light throughput and internal scattering further degraded image brightness and usable depth [28].
The emergence of digital sensors, especially high-resolution CCD and later CMOS technologies, marked a turning point. These enabled precise capture of elemental images across dense microlens arrays and supported on-chip processing, facilitating real-time acquisition and playback of holoscopic 3D scenes.
The integration of computational optics and high-speed sensors allowed for dense plenoptic data capture using a single-aperture device. This compact, scalable alternative to multi-camera light field systems helped distinguish holoscopic 3D imaging from traditional light field techniques [29]. Holoscopic systems balance angular sampling density [30], optical simplicity, and display compatibility, making them ideal for immersive, space-constrained scenarios [31,32].
Recent advances have also revitalized Lippmann’s original vision. Innovations in AI-powered 3D reconstruction, multi-depth microlens arrays, and volumetric display systems have extended the capabilities of holoscopic imaging. These developments support new applications in biomedical visualization [33], AR/VR [34], and intelligent sensing [35].
To summarize, the evolution of holoscopic imaging reflects the convergence of classical optics, sensor innovation, and computational imaging, transforming a theoretical idea into a practical platform for next-generation 3D technologies.
Table 1 summarizes the key historical and technological milestones that shaped the evolution of holoscopic 3D imaging. Starting from early stereoscopic principles and Lippmann’s proposal of integral photography [1], through Estanave’s refinements in the 1930s [2] and the commercial adoption of stereoscopic display technologies in the 1960s [20], to the emergence of digital holoscopic prototypes in the 1990s, the field has progressed incrementally. The integration of high-resolution CMOS/CCD sensors, computational optics, and microlens array design in the late 20th century enabled practical implementation [22,23,24,31], while more recent advances, such as AI-driven reconstruction, compact optics, and light-field displays have expanded holoscopic imaging into real-time and immersive applications. These developments reflect the transformation of a theoretical concept into a scalable platform for next-generation glasses-free 3D systems.
Holoscopic 3D capture systems have also been validated in various practical scenarios beyond laboratory settings, including minimally invasive surgery, artifact digitization, and autonomous driving. For example, Liu et al. [33] demonstrated micro-gesture recognition for human–machine interaction, Alfaqheri et al. [36] proposed low-delay conversion for multi-view displays in endoscopic procedures, and Cao et al. [6] applied H3D cameras for semantic scene classification in autonomous vehicles.
Table 1. Key historical and technological milestones in the development of stereoscopic, integral, and holoscopic 3D imaging.
Year | Milestone | Description
1838 | Charles Wheatstone | Invention of the stereoscope, based on the theory of binocular disparity [15].
1908 | Gabriel Lippmann | Proposal of integral photography using a microlens array to capture 3D scenes [1].
1930s | Eugène Estanave | Extended Lippmann's concept using pinhole arrays and multi-lens configurations [2].
1960s | Anaglyph, Polarized, Stereo Film | Commercial adoption of stereoscopic display technologies in cinema and media [17,18,19].
1990s | Digital Holoscopic Prototypes | Emergence of digital systems combining CCD sensors with microlens arrays [22,23,24].
2010s | Light Field and Holoscopic | Rise of computational light field imaging and holoscopic displays with AI [37,38,39].
2020s | AI + Holoscopic | Integration of deep learning with holoscopic systems for applications [35,36,40].

2.3. Holoscopic 3D Imaging vs. Light Field Imaging vs. Stereoscopic Imaging

Three-dimensional (3D) imaging technologies have evolved significantly over the past century, leading to multiple paradigms for acquiring and displaying spatial information. Among the most prominent are stereoscopic imaging, light field imaging (LFI), and holoscopic 3D imaging (H3D). While these systems all aim to recreate the perception of depth and volume, they differ fundamentally in optical architecture, data representation, system complexity, and viewer experience.
Stereoscopic imaging operates by delivering two offset images, one for each eye, based on the principle of binocular disparity. It requires either physical or optical separation between the images and relies on viewer-side equipment (e.g., polarized or shutter glasses) to deliver 3D perception [41]. Although this approach is widely adopted due to its simplicity and low cost, it suffers from issues such as visual fatigue, limited depth realism, and the conflict between vergence and accommodation mechanisms [42].
Light field imaging, in contrast, captures a dense array of spatial and angular light rays (the plenoptic function), often using microlens arrays or multi-camera setups. It enables post-capture refocusing, synthetic aperture generation, and depth estimation. However, it typically requires substantial computational processing, and many consumer-grade light field systems only capture horizontal parallax, limiting immersion [37].
Holoscopic 3D imaging builds upon integral imaging principles and uses a single-aperture system with a microlens array (MLA) to encode both spatial and angular information in a single shot [43,44,45]. Unlike LFI, holoscopic systems often support full-parallax reconstruction (horizontal and vertical) and offer real-time 3D rendering through optical replay without glasses or head tracking [45,46,47]. Moreover, H3D systems are compact, energy-efficient, and better suited for embedded or mobile platforms. Table 2 illustrates the fundamental distinctions and overlaps among Holoscopic 3D Imaging (H3D), Light Field Imaging (LFI), and Stereoscopic Imaging across various dimensions of imaging workflow, viewer experience, and technical performance.
Table 3 presents representative quantitative metrics for spatial resolution, depth accuracy, processing speed and relative hardware cost across typical Holoscopic 3D (H3D), Light-Field Imaging (LFI) and Stereoscopic systems. The values are drawn from representative configurations reported in the recent literature and commercial datasheets, and reflect typical ranges under specific system parameters rather than absolute limits. In MLA-based systems (both H3D and LFI), the effective spatial resolution of each reconstructed view is not simply the total sensor pixel count, but depends on the hardware configuration—particularly the microlens array (MLA) pitch, arrangement and the number of pixels per microlens. For clarity, the effective spatial resolution can be approximated as
$$\mathrm{Spatial\ Resolution}_{\mathrm{effective}} \approx \frac{N_{\mathrm{pixels,sensor}}}{N_{\mathrm{lenslets}}}$$
where $N_{\mathrm{pixels,sensor}}$ is the total number of pixels on the image sensor and $N_{\mathrm{lenslets}}$ is the total number of microlenses (or angular samples). This trade-off between spatial and angular resolution explains why the effective per-view resolution of H3D and LFI systems is typically much lower than the raw sensor megapixel count. This quantitative overview complements the qualitative comparison in Table 2, illustrates the strengths and limitations of each approach, and clarifies how the reported performance metrics depend on sensor specifications, microlens array design and reconstruction algorithms.
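As a small arithmetic sketch of this relationship (the sensor size and the number of pixels behind each microlens below are illustrative assumptions rather than the specification of any particular camera), note that each viewpoint image draws roughly one pixel from beneath each microlens, so the per-view resolution is governed by the microlens count rather than the raw pixel count:

```python
# Illustrative estimate of effective per-view resolution; all figures are assumed.
sensor_px      = 4096 * 3072   # raw sensor pixel count (~12.6 MP, assumed)
px_per_lenslet = 8 * 8         # angular samples recorded behind each microlens (assumed)

num_lenslets = sensor_px // px_per_lenslet   # 196,608 microlenses
per_view_px  = num_lenslets                  # ~one pixel per microlens per reconstructed view

print(f"Raw sensor: {sensor_px / 1e6:.1f} MP, "
      f"effective per-view resolution: {per_view_px / 1e6:.2f} MP")
# -> Raw sensor: 12.6 MP, effective per-view resolution: 0.20 MP
```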
From a capture architecture perspective, both H3D and LFI share a common heritage in integral imaging, leveraging microlens arrays to record spatial and angular information. However, holoscopic systems typically employ a single-aperture configuration, which simplifies optical alignment and mechanical design compared to LFI systems that often require multi-camera arrays or custom plenoptic optics [44]. In contrast, stereoscopic systems rely on dual-viewpoint capture and do not intrinsically encode angular light field information, limiting their flexibility in post-capture depth manipulation.
In terms of depth reconstruction and realism, H3D stands out by naturally preserving full-parallax cues, both horizontal and vertical, within the encoded elemental image array [53]. This offers more complete motion parallax and viewpoint continuity compared to most LFI implementations, which tend to focus on horizontal parallax only due to microlens layout or sensor limitations [44]. While stereoscopic systems simulate binocular disparity, they often fail to reproduce other essential depth cues such as occlusion, shading variation, and multi-view coherence, resulting in eye strain and limited depth realism.
Regarding computational demands, stereoscopic systems are relatively lightweight, primarily involving stereo matching or disparity computation [43]. LFI systems, by contrast, are computationally intensive—requiring algorithms for depth estimation, refocusing, and view synthesis—often implemented using deep learning or epipolar image analysis. Holoscopic 3D systems strike a middle ground: while optical replay is computationally light, elemental to viewpoint image transformations [54] and depth enhancement techniques still require algorithmic support, especially when integrated into machine learning pipelines for gesture or object recognition [39].
On the display side, holoscopic systems can be seamlessly coupled with display-integrated microlens arrays for real-time, glasses-free 3D playback [26,51]. This offers a distinct advantage over stereoscopic systems, which rely on wearable displays, and LFI displays, which are still largely experimental or limited in resolution and field of view. Importantly, H3D also preserves natural accommodation-convergence relationships, reducing viewer fatigue in prolonged usage scenarios [22,34,48].
Finally, the application scope of H3D is expanding rapidly due to its compatibility with compact hardware, real-time performance, and integrability with existing CMOS/CCD sensors. While LFI is dominant in computational photography and VR, and stereoscopic imaging remains widespread in entertainment and communication, H3D is increasingly being deployed in medical diagnostics, gesture recognition, mobile devices, and cultural heritage visualization—areas where depth fidelity, non-invasiveness, and energy efficiency are critical.
In summary, while each of the three 3D imaging paradigms offers unique strengths, holoscopic 3D imaging presents an attractive compromise between optical simplicity, depth fidelity, and display compatibility—making it a promising candidate for the next generation of immersive imaging systems.

3. Holoscopic 3D Imaging System Architecture

Holoscopic 3D (H3D) imaging systems are based on the principle of optically capturing and reproducing the light field of a scene using a single-aperture system integrated with a microlens array. This section describes the architectural components and signal flow of a holoscopic system, from image capture through 3D reconstruction to display, as shown in Figure 1.

3.1. Capture System: Microlens Array-Based Single Aperture Acquisition of 3D Information

Building on the principles outlined in Section 2, the holoscopic camera implements a single-aperture optical system combining an objective lens, a microlens array (MLA) and a 2D image sensor. This arrangement, shown schematically in Figure 2, enables single-shot capture of the light field, reducing hardware alignment complexity and simplifying calibration compared to stereo or multi-camera setups. It consolidates capture into a compact device while preserving full parallax and compatibility with standard CMOS sensors, making it well-suited for real-time and embedded applications.
Figure 3 shows a holoscopic 3D camera prototype [48] comprising a single-aperture objective lens, an integrated microlens array (MLA), relay optics and an image sensor. In operation, the objective lens first forms an intermediate real image of the 3D scene, which is relayed onto the MLA positioned directly in front of the sensor. Each microlens then re-images the scene from a slightly different perspective, determined by its spatial position in the array (see Figure 4), so that every microlens produces an elemental image (EI) representing a distinct view of the scene. The resulting dense grid of EIs encodes both spatial and angular information in a single exposure and can later be resampled into viewpoint images (VPIs) for reconstruction and analysis. This single-shot optical configuration underpins the system's full-parallax capability, simplifies hardware alignment and calibration compared with stereo or multi-camera setups, and remains compatible with standard CMOS sensors, making it suitable for real-time and embedded applications.
The spatial-angular structure encoded by the microlens array enables the formation of a dense grid of Elemental Images (EIs), each representing a slightly different view of the 3D scene. During the capture process, each microlens projects an elemental image onto the sensor, storing both intensity and directional information in a 2D array format. To extract meaningful directional views from this encoded data, Viewpoint Images (VPIs) [7] are synthesized through a computational resampling process [30]. This process involves selecting corresponding pixels—located at the same relative position—across all elemental images. Each viewpoint image effectively represents the scene as observed from a unique angular direction, mimicking a virtual camera perspective. The number and resolution of these VPIs are directly determined by the microlens density and sensor resolution [23], reflecting a trade-off between spatial resolution and angular sampling density.
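Because each VPI corresponds to a distinct virtual viewpoint, the parallax between any two VPIs can be decoded into a coarse relative-depth estimate. The following is a minimal, hedged sketch of such decoding using plain block matching; it is an illustrative baseline only, not the reconstruction algorithm of any cited system, and the window size and search range are arbitrary assumptions.

```python
# Minimal block-matching sketch: horizontal disparity between two grayscale VPIs.
import numpy as np

def block_match_disparity(vpi_left: np.ndarray, vpi_right: np.ndarray,
                          block: int = 7, max_disp: int = 16) -> np.ndarray:
    """Per-pixel horizontal disparity between two viewpoint images (larger = closer)."""
    h, w = vpi_left.shape
    half = block // 2
    disp = np.zeros((h, w), dtype=np.float32)
    for y in range(half, h - half):
        for x in range(half, w - half):
            ref = vpi_left[y - half:y + half + 1, x - half:x + half + 1]
            best_cost, best_d = np.inf, 0
            for d in range(0, min(max_disp, x - half) + 1):
                cand = vpi_right[y - half:y + half + 1, x - d - half:x - d + half + 1]
                cost = np.sum((ref - cand) ** 2)   # sum of squared differences
                if cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y, x] = best_d
    return disp
```

In practice, all available VPIs (or a learned model) would be used jointly rather than a single stereo pair, but the same parallax cue underlies both approaches.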
The performance of holoscopic image acquisition is also highly dependent on the specific configuration of the microlens array. Various MLA geometries and layouts have been developed to optimize angular sampling, light throughput, and system compactness, each with its own advantages [55].
The most common configuration is the rectangular (orthogonal grid) microlens array, as illustrated in Figure 5, where spherical or plano-convex microlenses are arranged in a uniform grid. This layout simplifies calibration and is compatible with standard CMOS sensors, but may introduce angular aliasing when the scene contains fine textures or rapid depth transitions [51,54].
Some holoscopic systems adopt free form or customized MLA designs, such as variable pitch or gradient-index microlenses, to target specific depth ranges or viewing zones [24,29,44]. Additionally, multi-focal MLA architectures have been proposed to extend the depth-of-field and improve reconstruction at multiple focal planes. These approaches are particularly beneficial for wide-angle scenes, biomedical microscopy, and augmented reality.
In terms of fabrication, microlens arrays can be produced via precision injection molding, photolithography, laser writing, or replication techniques, depending on the desired optical quality and production volume. The choice of material—glass, polymer, or hybrid—also affects lens curvature accuracy, transmission, and chromatic behavior.
Therefore, the MLA configuration must be carefully co-designed with both the objective lens and the image sensor to ensure an optimal trade-off among spatial resolution, angular resolution, and depth fidelity [24].
This arrangement allows the system to encode both spatial and directional (angular) information from the scene into a 2D elemental image array. Although the data are recorded on a flat sensor, the parallax variations across elemental images contain sufficient cues to infer depth, shape, and structure when decoded properly. In addition to its microlens array-based architecture, the holoscopic 3D camera distinguishes itself from other 3D imaging systems through several unique characteristics. First, the system operates with a single optical axis, which simplifies calibration and ensures better geometric consistency across all elemental views. Unlike stereo or multi-camera systems, holoscopic cameras do not suffer from baseline disparity misalignment or inter-camera synchronization issues, making them particularly well-suited for compact, real-time, and mobile platforms.
Second, holoscopic cameras inherently preserve full-parallax capability, capturing angular variations in both horizontal and vertical directions. This contrasts with many commercial 3D or light field cameras that typically provide only horizontal parallax, limiting their spatial realism. The rich angular information encoded by holoscopic systems supports natural motion parallax, accurate occlusion representation, and smooth viewpoint transitions—all essential for immersive and fatigue-free viewing experiences.
Third, the system is fully self-contained, requiring no external tracking or structured illumination for depth estimation. The use of passive optical components (objective lens, MLA, and sensor) makes the hardware both scalable and power-efficient, while also reducing latency and increasing acquisition speed. These traits are particularly advantageous in applications such as medical visualization, cultural heritage digitization, and 3D telepresence, where non-intrusive, real-time, and photorealistic depth capture is essential.
Finally, the compatibility with standard imaging sensors and optical fabrication techniques allows holoscopic cameras to be economically viable. The modular design facilitates integration into existing camera platforms, and recent research has demonstrated their adaptability in consumer-grade devices and head-mounted displays. Together, these architectural and operational strengths position the holoscopic camera as a promising solution for next-generation 3D imaging technologies.
Beyond their theoretical and structural advantages, H3D capture systems have been successfully applied in several real-world domains, demonstrating their versatility and effectiveness in practical settings. In the field of gesture recognition, Liu et al. proposed a microgesture recognition framework using holoscopic 3D data, achieving high accuracy through deep learning-based viewpoint image analysis and decision fusion [34]. For facial recognition and tracking, the system’s full-parallax capability and high angular resolution provide enhanced robustness under pose variation and occlusion [35]. Moreover, H3D imaging has been experimentally applied in intelligent driving systems [6], where it has shown promise for object recognition and spatial scene understanding under complex lighting and motion conditions.
These case studies validate that the microlens-based single-aperture architecture not only captures rich 3D information in a compact form, but also supports computationally efficient feature extraction, making it suitable for integration into human–machine interfaces, mobile vision platforms, and autonomous systems.

3.2. Display System: Image Reconstruction, MLA Integration, and Spatial Parallax

The display subsystem in holoscopic 3D imaging plays a critical role in decoding and presenting the encoded 3D information captured in the elemental image array. The goal of this stage is to reconstruct the original spatial-angular structure of the light field in such a way that viewers can perceive depth, parallax, and immersion without the need for auxiliary glasses or head tracking. Holoscopic display systems support both optical reconstruction and computational rendering, depending on the hardware configuration and application context.
In optical replay configurations, a display panel—typically an LCD or OLED screen—is overlaid with a microlens array that is geometrically matched to the array used during image capture. When the recorded elemental image is back-illuminated through this display-integrated MLA, each microlens emits light in the same angular distribution as during the original scene capture. This arrangement enables each eye of the viewer to receive slightly different views depending on position and angle, thereby recreating the spatial parallax and depth cues experienced in the original scene. Importantly, this reconstruction is physically realized in the light path, which means it operates in real-time without reliance on computational rendering or view synthesis algorithms.
A key step in the computational processing of holoscopic 3D images is the extraction and rearrangement of viewpoint images (VPIs) from the elemental image array [7]. Figure 6 illustrates the VPI extraction process for the case of 3 × 3 pixels under each elemental image (EI). Mathematically, a VPI can be defined as a periodic subset of pixels from the Holoscopic 3D Image (H3DI), such that for a given pixel location (u,v) under each microlens, all corresponding pixels form a single VPI:
$$VPI_{u,v}(k,l) = H3DI(k \cdot U + u,\ l \cdot V + v)$$
where (k,l) denote the microlens indices, and (U,V) represent the number of pixels per elemental image. This structured transformation preserves depth and parallax cues across the reconstructed views, forming the foundation for downstream rendering, reconstruction, or recognition tasks (see Figure 6).
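In code, this periodic resampling reduces to strided indexing over the raw holoscopic image. The sketch below assumes a simple rectangular MLA layout with exactly U × V pixels under every microlens and no calibration offsets; real captures typically require rectification before this step.

```python
# Viewpoint-image extraction from a holoscopic 3D image via periodic pixel sampling.
import numpy as np

def extract_vpi(h3di: np.ndarray, u: int, v: int, U: int, V: int) -> np.ndarray:
    """VPI_{u,v}(k, l) = H3DI(k*U + u, l*V + v): one pixel from under every microlens."""
    return h3di[u::U, v::V]

def extract_all_vpis(h3di: np.ndarray, U: int, V: int) -> np.ndarray:
    """Return all U*V viewpoint images as an array of shape (U, V, K, L, ...)."""
    return np.array([[h3di[u::U, v::V] for v in range(V)] for u in range(U)])

# Example: a synthetic capture with 3x3 pixels behind each of 100x100 microlenses
raw = np.random.rand(300, 300)
vpis = extract_all_vpis(raw, U=3, V=3)   # shape (3, 3, 100, 100)
```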
Historically, image reconstruction in holoscopic 3D systems has relied heavily on traditional image processing techniques such as pixel rearrangement, interpolation, and filtering to synthesize viewpoint images and render depth cues effectively [22,55]. At the same time, the elemental image array displayed on the 2D panel is composed of thousands of sub-images, each corresponding to a slightly different perspective of the 3D scene. These microimages are carefully arranged such that each microlens captures and projects them into space, reconstructing the original light rays of the scene. Precise alignment between the MLA and the underlying pixels is critical; even subpixel misalignments can introduce significant visual artifacts such as crosstalk, moiré patterns, or depth inversion [26,54].
To address these challenges and improve display fidelity, a variety of pixel mapping techniques have been proposed. One approach leverages slanted lens arrays to interleave RGB subpixels, enhancing horizontal angular resolution at the cost of vertical detail; a refinement of this strategy distributes viewpoint information across multiple layers of subpixels [31]. More recent innovations such as Distributed Smart Pixel Mapping (DSPM) [25] dynamically allocate pixel subcomponents across microlenses to improve both resolution and angular coherence. These mapping techniques enable better control over the trade-off between image sharpness and depth continuity, especially when applied to omnidirectional displays [26].
The perceptual quality of holoscopic 3D displays is largely attributed to the richness and continuity of the depth cues they offer. Binocular disparity arises naturally from the angular offsets encoded in the elemental image array, allowing the human visual system to infer relative depth based on interocular differences. As the viewer moves laterally or vertically, the changing perspectives observed through the MLA provide smooth motion parallax, which enhances the spatial realism of the scene. Furthermore, holoscopic displays inherently preserve the natural linkage between accommodation and convergence responses, reducing the visual fatigue commonly associated with stereoscopic systems. The fine sampling density achieved by the microlens array also supports the preservation of occlusion boundaries, edge detail, and surface layering, which are crucial for depth continuity and shape recognition [54].
Compared to stereoscopic and lenticular-based displays, holoscopic 3D displays present several advantages in visual comfort and immersion. Because they deliver full-parallax viewing—across both horizontal and vertical directions—they allow multiple observers to perceive consistent depth from different positions without experiencing visual conflicts or alignment artifacts. Moreover, as they do not require any glasses or active shutter devices, holoscopic displays are well-suited for applications in public installations, scientific visualization, and professional workstations where accessibility and user comfort are paramount.
Holoscopic display systems can take various physical forms depending on the target application and performance requirements. Direct-view displays with built-in microlens overlays offer a compact and portable solution for personal and tabletop 3D visualization. Larger-scale systems may employ projector arrays in combination with optical lens sheets to generate walk-around viewing experiences. In immersive environments, head-mounted displays may incorporate microlens layers to support naturalistic depth rendering for VR content. Regardless of the configuration, maintaining proper alignment between the display pixels and the microlens array, as well as ensuring optical conjugacy with the capture system, is critical to achieving high-quality depth reproduction.

4. Recent Advances and Technical Innovations

The evolution of holoscopic 3D imaging has been significantly driven by interdisciplinary progress in computer vision, deep learning, and computational optics. Traditional limitations in resolution, depth accuracy, and system complexity are increasingly being overcome by algorithmic innovations and hardware–software co-design. This section outlines the most recent technical developments that have enhanced the capabilities, flexibility, and real-world applicability of holoscopic 3D systems.

4.1. High-Precision Sensing and Recognition in Holoscopic 3D Capture Systems

Holoscopic 3D imaging systems, due to their ability to simultaneously capture spatial and angular light field data in a single exposure, have increasingly been employed as high-precision sensors in domains that demand fine motion analysis and accurate recognition. This capability allows these systems to encode the 4D light field (comprising two spatial and two angular dimensions) within a micro-scale elemental image array, thereby capturing multi-view information in a single snapshot [28]. This makes them particularly suitable for applications such as microgesture tracking, facial recognition, and scene understanding in autonomous vehicles.
In the gesture recognition domain, holoscopic 3D systems offer dense angular sampling that is highly sensitive to subtle changes in hand position and orientation. Unlike conventional 2D cameras or even stereo setups, holoscopic systems can distinguish micro-movements with sub-millimeter accuracy, which is particularly beneficial for non-contact interaction interfaces in human–machine interaction (HMI). For example, Liu et al. proposed a deep learning-based microgesture recognition model that utilizes elemental images (EIs) from holoscopic data and achieves high recognition rates by leveraging viewpoint diversity and spatiotemporal decision fusion techniques [7]. Figure 7 shows the real micro-gesture data acquisition setup used in experiments.
In facial recognition and tracking, the full-parallax and high angular resolution offered by holoscopic systems significantly improves recognition robustness, especially under varying lighting conditions, pose variations, and partial occlusions. Unlike traditional 2D-based or structured light approaches, holoscopic imaging does not rely on external depth estimation hardware, enabling a compact, passive, and accurate solution for 3D facial analysis [50].
In the context of autonomous driving, holoscopic 3D cameras have been investigated as compact alternatives to LiDAR or multi-camera arrays. Their ability to reconstruct real-time depth from a single-aperture image with embedded angular information reduces hardware complexity while maintaining scene depth fidelity. Recent studies have shown that holoscopic imaging can support robust object detection, distance estimation, and semantic segmentation in dynamic driving environments when integrated with machine learning algorithms [6]. Figure 8 depicts the holoscopic 3D camera deployed for autonomous-vehicle scene capture.
Furthermore, the integration of holoscopic 3D imaging with deep neural networks has led to a new class of recognition systems. This integration is pivotal because DNNs provide the necessary computational framework to directly process the unique data structure of holoscopic imagery. Specifically, deep learning architectures, such as convolutional neural networks (CNNs), are employed to perform critical tasks including feature extraction from elemental images, synthesis and enhancement of viewpoint images, and accurate depth estimation from the implicit parallax information [7,8]. This enables robust 3D classification, detection, and segmentation, which are essential for applications such as microgesture tracking, facial recognition, and autonomous driving scene understanding.
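As a hedged illustration of such a pipeline (not a reproduction of the cited models), the sketch below applies a small CNN independently to each viewpoint image and fuses the per-view class scores by averaging, a simple form of decision fusion. The layer sizes, input resolution and class count are arbitrary assumptions, and PyTorch is used purely for convenience.

```python
# Per-viewpoint CNN features with late decision fusion over viewpoint images (VPIs).
import torch
import torch.nn as nn

class ViewpointCNN(nn.Module):
    """Small CNN applied to each VPI, with averaged (late-fusion) class scores."""
    def __init__(self, num_classes: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, vpis: torch.Tensor) -> torch.Tensor:
        # vpis: (batch, n_views, 3, H, W), one RGB image per angular viewpoint
        b, v, c, h, w = vpis.shape
        feats = self.features(vpis.reshape(b * v, c, h, w)).flatten(1)   # (b*v, 32)
        logits = self.classifier(feats).reshape(b, v, -1)                # (b, v, classes)
        return logits.mean(dim=1)   # decision fusion: average the per-view scores

# Example: 9 VPIs (a 3x3 angular grid) per sample, 5 gesture classes
model = ViewpointCNN(num_classes=5)
scores = model(torch.randn(2, 9, 3, 64, 64))   # -> shape (2, 5)
```

Attention- or recurrence-based fusion over the angular dimension can replace the simple average when individual viewpoints contribute unequally to the decision.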
These applications underscore the utility of holoscopic 3D imaging not just as a visualization tool, but as a computational sensor that tightly integrates capture, encoding, and intelligent analysis in a unified framework.

4.2. Synthetic Holoscopic Data Generation

While holoscopic 3D imaging systems offer rich spatial and angular information, their adoption in deep learning-based applications such as gesture recognition, face analysis, and autonomous driving is still constrained by the lack of large-scale, annotated datasets. Capturing real holoscopic data is often expensive, complex, and limited by hardware constraints, which has led to a growing interest in synthetic data generation.
Advances in computer graphics, physically based rendering, and simulation frameworks have enabled the realistic modeling of holoscopic imaging pipelines. Tools such as Blender, Unity, and custom ray tracing engines now support accurate modeling of camera arrays, microlens-based sampling, and light transport simulation, allowing researchers to generate high-fidelity synthetic holoscopic datasets under fully controllable conditions.
One common approach is to render a set of virtual viewpoint images using a camera grid, then re-map these images into elemental images to synthesize a complete holoscopic 3D image (H3DI). This method provides pixel-level control over scene content, lighting, and motion, and allows for the automatic generation of ground-truth annotations such as depth maps, object masks, gesture labels, or facial landmarks.
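A minimal sketch of this re-mapping step is shown below: it interleaves a grid of rendered viewpoint images back into a single raw holoscopic image, the inverse of the VPI extraction described in Section 3. The rectangular, offset-free layout is an assumption; a physically based simulator would additionally model microlens optics, vignetting and sensor noise.

```python
# Re-map a (U, V) grid of rendered viewpoint images into a synthetic holoscopic image.
import numpy as np

def vpis_to_h3di(vpis: np.ndarray) -> np.ndarray:
    """Interleave a (U, V, K, L, ...) stack of viewpoint images into a (K*U, L*V, ...) H3DI."""
    U, V, K, L = vpis.shape[:4]
    h3di = np.zeros((K * U, L * V) + vpis.shape[4:], dtype=vpis.dtype)
    for u in range(U):
        for v in range(V):
            h3di[u::U, v::V] = vpis[u, v]   # pixel (u, v) under every microlens
    return h3di

# Example: nine 128x128 views rendered from a 3x3 virtual camera grid
views = np.random.rand(3, 3, 128, 128)
raw_h3di = vpis_to_h3di(views)              # -> (384, 384) synthetic holoscopic image
```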
Recent progress in holoscopic image simulation has notably improved the availability and accessibility of synthetic datasets. Notably, Almatrouk et al. [53] proposed a novel raw holoscopic image simulator and dataset that addresses several key limitations of prior simulators, such as rendering inefficiency, lack of native raw output, and insufficient resolution flexibility. Their system, built as a Blender-based add-on, generates viewpoint images with adjustable baselines, remaps them into elemental images, and creates high-quality ground truth depth maps using multi-view stereo algorithms. The resulting dataset includes raw holoscopic 1.0 images, a 5D holoscopic matrix, and a versatile toolbox for data manipulation. This comprehensive and efficient platform offers a valuable benchmark for the development and evaluation of algorithms for holoscopic 3D image analysis, stereo reconstruction, and 3D visualization.
In addition to physically based simulation, generative models—particularly GANs and diffusion models—could potentially be used to synthesize elemental images conditioned on semantic input. These models are capable of learning joint distributions of spatial-angular variations and generating plausible multi-view representations for novel objects or poses. Such synthetic datasets could be particularly useful for data augmentation, class balancing, and improving generalization in neural networks, although their application to holoscopic 3D data remains largely unexplored.
Another promising direction is the development of application-specific synthetic datasets, such as those tailored for micro-gesture classification, driver attention detection, or human activity recognition. By simulating realistic environments with parametrized control, researchers can pre-train and validate architectures under ideal conditions before deployment.
Beyond training, synthetic holoscopic datasets also serve as standardized benchmarks for algorithm comparison under known and repeatable settings, facilitating reproducible research in depth estimation, light field reconstruction, and computational refocusing.
In summary, synthetic holoscopic data generation has become an essential complement to physical imaging systems. It reduces cost and complexity while enabling scalable, annotated, and task-relevant datasets for training robust and efficient 3D vision models.

5. Applications and Comparative Evaluation of Holoscopic 3D Imaging Systems

Holoscopic 3D imaging systems have witnessed substantial growth in both theoretical development and practical implementation over the past two decades. As their optical design and computational pipelines continue to mature, they have found increasing relevance across various domains, including human–computer interaction (HCI), intelligent transportation, medical diagnostics, and cultural heritage digitization. This section synthesizes recent application outcomes of holoscopic imaging and offers a comparative evaluation against other major 3D imaging modalities.
(a) Human–Machine Interaction and Gesture Recognition
Holoscopic imaging’s full-parallax encoding and micro-perspective resolution make it particularly effective for fine-grained gesture recognition. Liu et al. [7] demonstrated a convolutional neural network (CNN)-based microgesture recognition model using elemental images (EIs), significantly outperforming stereo vision systems due to better angular sampling and motion discriminability. These systems enable intuitive and contactless interaction interfaces, with potential deployment in AR/VR environments and assistive technologies.
(b) Autonomous Driving and Spatial Perception
Recent studies have explored the use of holoscopic cameras for dynamic depth perception in intelligent vehicles. Cao et al. [6] applied H3D imaging for object tracking and obstacle detection, showing promise in scenarios involving complex motion and lighting. The single-aperture structure offers compact integration without the calibration overhead of multi-camera setups.
(c) Medical Imaging and Surgical Assistance
Holoscopic imaging has been applied in minimally invasive surgery and endoscopy, providing surgeons with accurate spatial cues and depth continuity without the need for stereoscopic displays. Ongoing studies focus on integrating holoscopic visualization with robotic surgical systems for better depth-guided manipulation [5].
(d) Cultural Heritage and 3D Documentation
In the digital preservation of cultural artifacts and immersive exhibition design, holoscopic imaging facilitates non-invasive, high-fidelity 3D recording [9]. Its full-parallax rendering supports realistic visualizations for museums, virtual tourism, and education platforms, especially when combined with holographic or volumetric displays.

6. Discussion

Although holoscopic 3D imaging systems offer unique advantages such as full-parallax capture, compact single-aperture design, and glasses-free display compatibility, several practical limitations and emerging trends warrant critical reflection and future investigation.

6.1. Technical Challenges

(1) Spatial–Angular Resolution Trade-off
Due to the finite number of pixels on the sensor, increasing angular resolution through denser microlens arrays inevitably reduces the spatial resolution of each elemental image. This trade-off presents a persistent challenge in holoscopic system design and motivates the development of novel reconstruction algorithms capable of upscaling or compensating for lost spatial fidelity.
(2) Parallax Aliasing and Angular Artifacts
Scenes with sharp depth transitions, repetitive textures, or fine structural detail often produce visual artifacts during VPI synthesis, particularly when microlens pitch is too wide or depth sampling is insufficient. These angular inconsistencies degrade the quality of reconstructed depth maps and impair motion parallax perception.
(3) Dataset Scarcity and Benchmarking Difficulties
Due to the limited accessibility and high cost of physical holoscopic cameras, the availability of large-scale annotated datasets is restricted. This hinders the development and evaluation of machine learning-based reconstruction models. As discussed above, the use of simulated data generation pipelines is an emerging strategy to circumvent this constraint.

6.2. Emerging Trends and Opportunities

(1) Neural Rendering and Learning-Based Depth Estimation
Recent progress in deep learning enables better extraction of 3D features from elemental image arrays. Models such as CNNs, transformers, or neural radiance fields (NeRF) are increasingly employed to refine EI-to-VPI transformations, enhance depth prediction, and facilitate real-time applications. These learning-based strategies are gradually replacing traditional stereo or block-matching algorithms.
(2) Cross-Modal Sensing Fusion
Holoscopic systems are now being integrated with complementary modalities, such as time-of-flight (ToF), event-based cameras, or structured light sensors, to address limitations in depth range or motion capture. Such hybrid configurations combine the angular richness of holoscopic data with the high temporal or depth precision of other techniques, paving the way for robust 3D perception in complex environments.
(3) Miniaturization and Device-Level Integration
The trend toward hardware miniaturization, especially in head-mounted displays (HMDs) and mobile devices, demands compact and energy-efficient 3D sensors. Holoscopic cameras, with their single-aperture, passive optical architecture, are well-suited to this need. Ongoing research focuses on optimizing microlens fabrication, CMOS integration, and GPU-based processing pipelines to meet mobile deployment constraints.
(4) Commercial Viability, Practical Barriers and Economic Considerations
Although holoscopic 3D imaging has achieved substantial progress in laboratory prototypes and pilot demonstrations, only a limited number of systems have reached pre-commercial deployment. Recent studies illustrate the integration of holoscopic cameras in autonomous vehicles, medical endoscopy and cultural-heritage digitization, demonstrating their readiness for niche markets. However, large-scale adoption is still hindered by several practical constraints.
From a cost perspective, the manufacturing of high-precision microlens arrays and the tight assembly tolerances of single-aperture systems remain significantly more expensive than conventional stereo or plenoptic cameras. The trade-off between spatial and angular resolution also limits scalability to larger displays, while compatibility with existing flat-panel display infrastructures and the lack of standardized pixel-mapping interfaces slow down mass production.
Beyond these technical barriers, the economic feasibility and business models of H3D are equally important. Volume manufacturing and integration with CMOS sensor foundries are expected to reduce unit costs significantly within 3–5 years, following trajectories similar to those of plenoptic and ToF cameras. In the near term, viable business models are likely to focus on high-value niches—such as medical endoscopy, industrial inspection, cultural-heritage digitization, and gesture recognition—where H3D’s unique capabilities justify premium pricing. In the longer term, standardization of data formats and display interfaces may enable licensing of H3D intellectual property (IP)—such as patented microlens designs, calibration algorithms, and reconstruction software—to third-party manufacturers, as well as offering holoscopic 3D sensor modules as OEM components that can be embedded into consumer devices or industrial systems. These economic pathways suggest a gradual transition from bespoke research prototypes to scalable commercial products, with cost reductions driven by cross-industry collaborations and economies of scale.

6.3. Implications for the Broader 3D Imaging Landscape

The evolution of holoscopic 3D imaging technologies presents notable implications for the future of the broader 3D imaging ecosystem. As a self-contained, optically efficient system capable of capturing and displaying full-parallax images, holoscopic imaging offers a compelling alternative to more complex or resource-intensive methods such as stereo vision, structured light scanning, or conventional light field imaging. Its compact architecture and optical coherence make it particularly suited to applications requiring real-time 3D acquisition and display, including biomedical visualization, cultural heritage documentation, and human–machine interaction. In addition, compared with other prevalent modalities such as light-field cameras, VR/AR sensing pipelines and LiDAR systems, holoscopic 3D offers a distinctive balance of full-parallax capture, compact single-aperture hardware and passive operation, making it an effective and energy-efficient alternative for real-time volumetric imaging in mobile or embedded scenarios.
One significant implication is the potential for holoscopic imaging to become a standard modality in embedded and mobile 3D sensing. Unlike multi-camera arrays or depth sensors that require synchronized hardware and external illumination, holoscopic cameras operate passively and require only a single image sensor and a microlens array. This simplicity reduces calibration errors and system cost, while enhancing energy efficiency—factors that are crucial for deployment in consumer electronics, wearable displays, and robotic systems.
Moreover, the compatibility of holoscopic imaging with conventional CMOS fabrication and digital processing pipelines enables smoother integration into existing hardware platforms. With advances in GPU-accelerated rendering and neural network-based decoding, holoscopic data can now be reconstructed and interpreted in real-time. This makes it particularly relevant for applications involving dynamic scenes or user interaction, such as gesture recognition, eye tracking, or 3D telepresence.
In terms of data representation, the structured nature of holoscopic image capture—specifically its dense elemental image array encoding both spatial and angular cues—offers a well-aligned format for emerging paradigms in light field compression, machine vision, and neural rendering. Unlike many traditional light field systems that rely on post-capture depth estimation and suffer from angular redundancy or disparity noise, holoscopic images encode viewpoint diversity at the point of capture, allowing for more direct and stable 3D reconstruction. This not only improves the fidelity of depth perception, but also enables more efficient compression and data transmission strategies in networked applications such as remote diagnostics or virtual inspection.
At a theoretical level, the ability of holoscopic systems to capture a 4D or even 5D plenoptic function—incorporating spectral or temporal dimensions—positions them at the intersection of computational optics and multimodal perception. As researchers continue to explore hybrid sensor fusion techniques and deep learning models that integrate holoscopic data with complementary signals (e.g., ToF, event-based cameras), new forms of robust, high-fidelity 3D sensing are expected to emerge.
In summary, the continued refinement of holoscopic 3D imaging systems not only enhances their viability for specific technical tasks, but also contributes to a broader redefinition of how volumetric visual information is acquired, processed, and experienced. As such, holoscopic imaging is poised to play a central role in the next generation of 3D vision technologies, offering an elegant and efficient alternative to current paradigms while fostering new directions in immersive media and computational perception. Holoscopic 3D imaging, however, remains a niche technology in terms of industry and mass-market applications.

7. Conclusions

Holoscopic 3D (H3D) imaging has emerged as a promising approach to capturing and displaying true 3D information with a compact, single-aperture optical system. This review summarizes the theoretical foundations, key technological advances, and application areas of H3D, highlighting its distinctive ability to encode full spatial–angular light-field information within elemental images and reconstruct realistic viewpoint images without the need for multiple cameras or glasses.
Despite these advantages, several technical and practical challenges remain. High-precision microlens fabrication, trade-offs between spatial and angular resolution, and the lack of standardized display interfaces still limit large-scale deployment. Moreover, commercial adoption has been slowed by manufacturing cost, scalability issues, and integration with existing display infrastructures.
Future Outlook and Time Horizons: In the short term (3–5 years), incremental improvements in microlens manufacturing, CMOS sensor integration, and GPU-accelerated reconstruction pipelines are expected to enable compact, consumer-grade holoscopic cameras for applications such as gesture recognition, medical endoscopy, and cultural-heritage digitization. The standardization of pixel-mapping interfaces and the availability of synthetic training datasets are also likely within this horizon. Over the longer term (5–10 years), more disruptive developments—such as hybrid holoscopic-ToF sensor fusion, neural radiance-field-based real-time rendering, and 5D plenoptic capture (including spectral and temporal dimensions)—are anticipated to move from research prototypes to specialized markets. These longer-term innovations will be crucial for large-scale immersive displays, fully autostereoscopic VR/AR headsets, and embedded robotic vision systems.
Looking further ahead, the convergence of holoscopic imaging with deep learning, neural rendering, and hybrid sensor platforms presents a fertile ground for innovation. As hardware miniaturization and computational efficiency continue to improve, holoscopic 3D systems are poised to evolve from niche prototypes into widely adopted solutions, shaping the next decade of 3D imaging and display technologies.

Author Contributions

Conceptualization, H.M. and Y.L.; methodology, Y.L.; validation, M.R.S., H.M., Y.L. and C.Y.; formal analysis, Y.L.; investigation, Y.L. and C.Y.; writing—original draft preparation, Y.L. and H.M.; writing—review and editing, H.M. and M.R.S.; visualization, Y.L., Y.H. and C.Y.; supervision, H.M. and M.R.S.; project administration, Y.L., Y.H. and C.Y.; funding acquisition, Y.L., Y.H. and C.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Beijing Institute of Graphic Communication Excellent Project Fund (grant numbers Ea202417 and Ea202421), the Scientific Research Platform Construction Project of the Beijing Institute of Graphic Communication (grant number KYCPT202511), and the China National Undergraduate Innovation and Entrepreneurship Competition Fund (grant number 202410015024).

Acknowledgments

The authors gratefully acknowledge the partial financial support from the funding sources mentioned above.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
H3D    Holoscopic 3D
MLA    Microlens array
VPIs    Viewpoint images
EI    Elemental image
CNNs    Convolutional neural networks
RNNs    Recurrent neural networks
GANs    Generative adversarial networks
CCD    Charge-coupled device
CMOS    Complementary metal oxide semiconductor
LFI    Light field imaging
ToF    Time of flight
HMDs    Head-mounted displays
IP    Intellectual property
OEM    Original equipment manufacturer
DSPM    Distributed pixel mapping

References

  1. Lippmann, G. La photographie intégrale. Comptes-Rendus l’Acad. Sci. 1908, 146, 446–451. [Google Scholar]
  2. Estanave, E. Le stéréophotographie et la photographie intégrale. C. R. Acad. Sci. 1930, 190. [Google Scholar]
  3. Steurer, J.; Pesch, M.; Hahne, C.; Kauff, P. 3D Holoscopic Video Imaging System. Proc. SPIE 2012, 8291, 829109. [Google Scholar] [CrossRef]
  4. Aggoun, A.; Tsekleves, E.; Zarpalas, D.; Dimou, A.; Daras, P.; Nunes, P.; Ducla Soares, L. Immersive 3D Holoscopic Video System. IEEE Multimed. 2013, 20, 28–37. [Google Scholar] [CrossRef]
  5. Makanjuola, J.K.; Aggoun, A.; Swash, M.; Grange, P.C.R.; Challacombe, B.; Dasgupta, P. 3D-Holoscopic Imaging: A New Dimension to Enhance Imaging in Minimally Invasive Therapy in Urologic Oncology. J. Endourol. 2013, 27, 535–539. [Google Scholar] [CrossRef] [PubMed]
  6. Cao, C.; Swash, M.R.; Meng, H. Semantic 3D Scene Classification Based on Holoscopic 3D Camera for Autonomous Vehicles. In Proceedings of the International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery, Xi'an, China, 1–3 August 2020; Springer International Publishing: Cham, Switzerland, 2020; pp. 897–904. [Google Scholar]
  7. Liu, Y.; Peng, M.; Swash, M.R.; Chen, T.; Qin, R.; Meng, H. Holoscopic 3D Microgesture Recognition by Deep Neural Network Model Based on Viewpoint Images and Decision Fusion. IEEE Trans. Hum.-Mach. Syst. 2021, 51, 162–171. [Google Scholar] [CrossRef]
  8. Alnaim, N.; Abbod, M.; Swash, R. Recognition of Holoscopic 3D Video Hand Gesture Using Convolutional Neural Networks. Technologies 2020, 8, 19. [Google Scholar] [CrossRef]
  9. MarketsandMarkets. 3D Imaging Market by Component (Hardware, Software, Services), Deployment Mode, Organization Size, Application Area (Healthcare, Entertainment & Media, Automotive, Industrial) and Region—Global Forecast to 2028; MarketsandMarkets: Maharashtra, India, 2023; Available online: https://www.marketsandmarkets.com/Market-Reports/3d-imaging-market-998.html (accessed on 18 June 2025).
  10. IMARC Group. 3D Display Market: Global Industry Trends, Share, Size, Growth, Opportunity and Forecast 2024–2033. Available online: https://www.imarcgroup.com/3d-display-market (accessed on 20 June 2025).
  11. Fact.MR. Autostereoscopic 3D Display Market Forecast 2023–2033. Available online: https://www.factmr.com (accessed on 20 June 2025).
  12. Verified Market Reports. Naked Eye 3D LED Display Market Size and Forecast 2024–2033. Available online: https://www.verifiedmarketreports.com (accessed on 20 June 2025).
  13. Data Insights Market Research. Global Autostereoscopic Display Market Report. Available online: https://www.datainsightsmarket.com (accessed on 20 June 2025).
  14. Wheatstone, C. Contributions to the Physiology of Vision.—Part the First. On Some Remarkable, and Hitherto Unobserved, Phenomena of Binocular Vision. Philos. Trans. R. Soc. Lond. 1838, 128, 371–394. [Google Scholar]
  15. Rollmann, W. Zwei neue stereoskopische Methoden. Ann. Phys. 1853, 165, 186–187. [Google Scholar] [CrossRef]
  16. Kim, J.; Kim, Y.; Hong, J.; Park, G.; Hong, K.; Min, S.W.; Lee, B. A Full-Color Anaglyph Three-Dimensional Display System Using Active Color Filter Glasses. J. Inf. Disp. 2011, 12, 37–41. [Google Scholar] [CrossRef]
  17. Turner, T.L.; Hellbaum, R.F. LC shutter glasses provide 3-D display for simulated flight. Inf. Disp. 1986, 2, 22–24. [Google Scholar]
  18. Javidi, B.; Okano, F. Three-Dimensional Imaging, Visualization, and Display. In Three-Dimensional Imaging, Visualization, and Display; Son, J.Y., Ed.; Springer: New York, NY, USA, 2009; Volume 14, pp. 281–299. [Google Scholar]
  19. Okoshi, T. Three-Dimensional Imaging Techniques; Elsevier: Amsterdam, The Netherlands, 2012. [Google Scholar]
  20. Roberts, D.E. History of Lenticular and Related Autostereoscopic Methods; Leap Technologies: Hillsboro, OR, USA, 2003. [Google Scholar]
  21. Swash, M.R.; Aggoun, A.; Abdulfatah, O.; Li, B.; Fernández, J.C.; Tsekleves, E. Holoscopic 3D Image Rendering for Autostereoscopic Multiview 3D Display. In Proceedings of the 2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB), London, UK, 5–7 June 2013; pp. 1–4. [Google Scholar] [CrossRef]
  22. Almatrouk, B.; Meng, H.; Swash, M.R. Holoscopic Elemental-Image-Based Disparity Estimation Using Multi-Scale, Multi-Window Semi-Global Block Matching. Appl. Sci. 2024, 14, 3335. [Google Scholar] [CrossRef]
  23. Swash, M. Holoscopic 3D Imaging and Display Technology: Camera/Processing/Display. Ph.D. Thesis, Brunel University London, Uxbridge, UK, 2013. [Google Scholar]
  24. Swash, M.R.; Aggoun, A.; Abdulfatah, O.; Li, B.; Fernández, J.C.; Alazawi, E.; Tsekleves, E. Pre-Processing of Holoscopic 3D Image for Autostereoscopic 3D Displays. In Proceedings of the 2013 International Conference on 3D Imaging, Palo Alto, CA, USA, 3–5 December 2013; pp. 1–5. [Google Scholar]
  25. Swash, M.R.; Aggoun, A.; Abdulfatah, O.; Fernandez, J.C.; Alazawi, E.; Tsekleves, E. Distributed pixel mapping for refining dark area in parallax barriers based holoscopic 3D Display. In Proceedings of the 2013 International Conference on 3D Imaging, Seattle, WA, USA, 29 June–1 July 2013; IEEE: Piscataway, NJ, USA, 2013; pp. 1–4. [Google Scholar]
  26. Huang, Y.; Swash, M.R.; Lei, T.; Li, K.; Xiong, N.; Wang, L. Implementation and Evaluation of Innovative 3D Pixel Mapping Method for LED Holoscopic 3D Wall Display. In Advances in Natural Computation, Fuzzy Systems and Knowledge Discovery. ICNC-FSKD 2020; Lecture Notes on Data Engineering and Communications Technologies; Springer: Cham, Switzerland, 2021; Volume 88, pp. 104–117. [CrossRef]
  27. Alazawi, E.; Swash, M.R.; Abbod, M. 3D Depth Measurement for Holoscopic 3D Imaging System. J. Comput. Commun. 2016, 4, 41–49. [Google Scholar] [CrossRef]
  28. Fernández, J.C.J. Capturing of 3D Content Using a Single Aperture Camera. Ph.D. Thesis, University of Bedfordshire, Luton, UK, 2018. [Google Scholar]
  29. Aggoun, A.; McCormick, M.; Spilsbury, M.; Velisavljevic, V.; Reid, D.; Davies, P. Immersive 3D Holoscopic Video System. IEEE MultiMedia 2013, 20, 28–37. [Google Scholar] [CrossRef]
  30. Belhi, A.; Bouras, A.; Alfaqheri, T.; Aondoakaa, A.S.; Sadka, A.H. Investigating 3D Holoscopic Visual Content Upsampling Using Super-Resolution for Cultural Heritage Digitization. Signal Process. Image Commun. 2019, 75, 188–198. [Google Scholar]
  31. Alazawi, E.; Aggoun, A.; Abbod, M.; Swash, M.R.; Fatah, O.A.; Fernandez, J. Scene Depth Extraction from Holoscopic Imaging Technology. In Proceedings of the 2013 3DTV-Conference: Vision Beyond Depth (3DTV-CON), Aberdeen, UK, 7–9 October 2013; IEEE: Piscataway, NJ, USA, 2013; pp. 1–4. [Google Scholar] [CrossRef]
  32. Almatrouk, B.; Swash, M.R.; Sadka, A.H. Innovative 3D depth map generation from a holoscopic 3D image based on graph cut technique. arXiv 2018, arXiv:1811.04217. [Google Scholar] [CrossRef]
  33. Liu, Y.; Meng, H.; Swash, M.R.; Gaus, Y.F.A.; Qin, R. Holoscopic 3D Micro-Gesture Database for Wearable Device Interaction. In Proceedings of the 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), Xi'an, China, 15–19 May 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 802–807. [Google Scholar] [CrossRef]
  34. Zhang, W.; Zhang, W.; Shao, J. Classification of Holoscopic 3D Micro-Gesture Images and Videos. In Proceedings of the 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), Xi'an, China, 15–19 May 2018; pp. 815–818. [Google Scholar] [CrossRef]
  35. Fang, L. Plenoptic Imaging and Processing; Springer: Cham, Switzerland, 2024; ISBN 978-981-97-6914-8. eBook ISBN: 978-981-97-6915-5. [Google Scholar] [CrossRef]
  36. Alfaqheri, T.; Aondoakaa, A.S.; Swash, M.R.; Sadka, A.H. Low-delay single holoscopic 3D computer-generated image to multiview images. J. Real-Time Image Process. 2020, 17, 2015–2027. [Google Scholar] [CrossRef]
  37. Lei, T.; Jia, X.; Zhang, Y.; Zhang, Y.; Su, X.; Liu, S. Holoscopic 3D Micro-Gesture Recognition Based on Fast Preprocessing and Deep Learning Techniques. In Proceedings of the 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), Xi'an, China, 15–19 May 2018; pp. 795–801. [Google Scholar] [CrossRef]
  38. Sharma, G.; Jyoti, S.; Dhall, A. Hybrid Neural Networks Based Approach for Holoscopic Micro-Gesture Recognition in Images and Videos. In Proceedings of the 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), Xi'an, China, 15–19 May 2018; pp. 808–814. [Google Scholar] [CrossRef]
  39. Liu, Y.; Yang, S.; Meng, H.; Swash, M.R.; Shan, S. A Novel Pseudo Viewpoint Based Holoscopic 3D Micro-Gesture Recognition. In Proceedings of the Companion Publication of the 2020 International Conference on Multimodal Interaction (ICMI ′20 Companion), Virtual, 25–29 October 2020; ACM: New York, NY, USA, 2021; pp. 77–81. [Google Scholar] [CrossRef]
  40. Peng, M.; Wang, C.; Chen, T. Attention-based Residual Network for Micro-Gesture Recognition. In Proceedings of the 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), Xi'an, China, 15–19 May 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 790–794. [Google Scholar] [CrossRef]
  41. Fatah, O.A.; Aggoun, A.; Swash, M.R.; Alazawi, E.; Fernandez, J. Generating Stereoscopic 3D from Holoscopic 3D. In Proceedings of the 2013 3DTV-Conference: Vision Beyond Depth (3DTV-CON), Aberdeen, UK, 6–8 October 2013; pp. 1–3. [Google Scholar] [CrossRef]
  42. Swash, M.R.; Aggoun, A.; Abdulfatah, O.; Li, B.; Fernández, J.C.; Tsekleves, E. Omnidirectional Holoscopic 3D Content Generation Using Dual Orthographic Projection. In Proceedings of the 2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB), London, UK, 5–7 June 2013; IEEE: Piscataway, NJ, USA, 2013; pp. 1–4. [Google Scholar] [CrossRef]
  43. Alazawi, E.; Abbod, M.; Aggoun, A.; Swash, M.R.; Fatah, O.A.; Fernandez, J. Super Depth-Map Rendering by Converting Holoscopic Viewpoint to Perspective Projection. In Proceedings of the 2014 3DTV-Conference: The True Vision—Capture, Transmission and Display of 3D Video (3DTV-CON), Budapest, Hungary, 2–4 July 2014; pp. 1–4. [Google Scholar] [CrossRef]
  44. Almatrouk, B.; Meng, H.; Swash, M.R. Disparity Estimation from Holoscopic Elemental Images. In Advances in Natural Computation, Fuzzy Systems and Knowledge Discovery. ICNC-FSKD 2020; Lecture Notes on Data Engineering and Communications Technologies; Meng, H., Lei, T., Li, M., Li, K., Xiong, N., Wang, L., Eds.; Springer: Cham, Switzerland, 2021; Volume 88. [Google Scholar] [CrossRef]
  45. Huang, Y.; Swash, M.R.; Sadka, A.H. Real-Time Holoscopic 3D Video Interlacing. In Proceedings of the 2020 7th International Conference on Signal Processing and Integrated Networks (SPIN), Noida, India, 27–28 February 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1–4. [Google Scholar] [CrossRef]
  46. Almatrouk, B.; Meng, H.; Swash, R. Elemental Images Labelling and Grouping to Minimise Disparity Error in Texture-less Regions of Holoscopic Images. In Proceedings of the 2023 8th International Conference on Image, Vision and Computing (ICIVC), Dalian, China, 27–30 June 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 531–536. [Google Scholar] [CrossRef]
  47. Cao, C.; Swash, M.R.; Meng, H. Reliable Holoscopic 3D Face Recognition. In Proceedings of the 7th International Conference on Signal Processing and Integrated Networks (SPIN), Noida, India, 27–28 February 2020; pp. 696–701. [Google Scholar]
  48. Aondoakaa, A.S.; Swash, M.R.; Sadka, A. 3D Depth Estimation from a Holoscopic 3D Image. In Imaging and Applied Optics 2017 (3D, AIO, COSI, IS, MATH, pcAOP); Paper DW1F.5; OSA Technical Digest (online); Optica Publishing Group: Washington, DC, USA, 2017. [Google Scholar] [CrossRef]
  49. Kim, M.; Jeong, M.; Lee, M.; Kim, J.; Choi, Y.J.; Kim, S.S.; Jeon, H.G.; Shin, J. The path-tracing simulation of light-field camera system: SurfCam/GrainCams for lunar surface exploration. Adv. Space Res. 2025, 75, 4050–4060. [Google Scholar] [CrossRef]
  50. Huang, Z.; Fessler, J.A.; Norris, T.B. Focal stack camera: Depth estimation performance comparison and design exploration. Opt. Contin. 2022, 1, 2030–2042. [Google Scholar] [CrossRef]
  51. Liu, Y.; Wang, L.; Zhao, L.; Yu, Z. (Eds.) Advances in Intelligent Systems and Computing; Springer: Cham, Switzerland, 2020; Volume 1075. [Google Scholar] [CrossRef]
  52. Shin, C.; Jeon, H.-G.; Yoon, Y.; Kweon, I.S.; Kim, S.J. EPINET: A fully-convolutional neural network using epipolar geometry for depth from light-field images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2018), Salt Lake City, UT, USA, 18–22 June 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 4748–4757. [Google Scholar] [CrossRef]
  53. Almatrouk, B.; Meng, H.; Aondoakaa, A.; Swash, R. A New Raw Holoscopic Image Simulator and Data Generation. In Proceedings of the 2023 8th International Conference on Image, Vision and Computing (ICIVC), Dalian, China, 14–16 July 2023; pp. 489–494. [Google Scholar] [CrossRef]
  54. Qin, R.; Liu, Y.; Wang, L.; Zhao, L.; Yu, Z. A Fast Automatic Holoscopic 3D Micro-Gesture Recognition System for Immersive Applications. In Advances in Natural Computation, Fuzzy Systems and Knowledge Discovery, ICNC-FSKD 2019; Springer: Cham, Switzerland, 2020. [Google Scholar]
  55. Hosseini, S.; Swash, M.R.; Sadka, A. Immersive 360 Holoscopic 3D Imaging System Design. In Imaging and Applied Optics 2017 (3D, AIO, COSI, IS, MATH, pcAOP); Paper DW1F.4; OSA Technical Digest (online); Optica Publishing Group: Washington, DC, USA, 2017. [Google Scholar] [CrossRef]
  56. Starks, M. Stereoscopic Imaging Technology; 3DTV Corporation: New York, NY, USA, 1996; Available online: http://www.3dmagic.com/articles/sit.html (accessed on 10 June 2025).
Figure 1. Holoscopic 3D capture, computational processing and replay device. (a) Holoscopic capture device. (b) Processing/Reconstruction. (c) Holoscopic replay device.
Figure 2. The standard holoscopic 3D imaging architecture is designed to capture the target object.
Figure 3. Holoscopic 3D Camera prototype [48].
Figure 4. H3D microgesture images consist of multiple 2D elemental images (EIs) [7].
Figure 5. Rectangular microlens array illustrating array dimensions and microlens pitch.
Figure 6. Illustration of the principle of H3D viewpoint image extraction. (a) The 3 × 3 pixels under each micro-lens. (b) One viewpoint image extracted from the same position under different micro-lenses. (c) Nine viewpoint images extracted from 3 × 3 EIs [27].
Figure 7. Holoscopic 3D microgesture data acquisition scene [56].
Figure 8. Scene capturing using a Holoscopic 3D Camera for autonomous vehicle testing [6].
Table 2. Comparison of stereoscopic, light field, and holoscopic 3D imaging.

| Category | H3D | LFI | Stereoscopic |
| --- | --- | --- | --- |
| Capture method | Holoscopic 3D camera: single sensor + MLA; elemental image array | Plenoptic camera / MLA-based camera / camera arrays | Dual-camera / dual-view capture |
| Parallax | Full (horizontal + vertical) | Horizontal only | Horizontal only |
| Depth reconstruction | Embedded in EIs; view synthesis; optical replay | Computed from the plenoptic function using disparity or rendering | Derived from binocular disparity |
| Display mode | Glasses-free autostereoscopic with MLA display | Computationally rendered or specialized display | Requires glasses or headgear |
| Computation overhead | Moderate: decoding and rendering algorithms | High: refocusing and dense depth processing | Low: simple stereo matching |
| Advantages | Real-time 3D; compact and immersive; no eyewear needed | Refocusing ability; digital zoom; multiple-view synthesis | Mature technology; cost-effective; widespread |
| Limitations | Spatial–angular resolution trade-off; calibration needed | Resolution limits; heavy computation; bulky hardware | Visual fatigue; lack of vertical parallax; limited realism |
| Applications | AR/VR; gesture/facial recognition; medical imaging | Computational photography; scientific imaging | VR; 3D cinema; gaming; entertainment |
Table 3. Representative quantitative comparison of 3D imaging technologies.

| Metric | H3D | LFI | Stereoscopic |
| --- | --- | --- | --- |
| Spatial resolution (effective MP) | ~2–5 MP per view (35 MP sensor divided by the MLA) [48] | ~4 MP per view (40 MP Lytro sensor; space–angle trade-off) [49] | 2–8 MP per view (Full HD–4K per eye) |
| Depth accuracy | mm–cm level via disparity/graph-cut depth estimation [33] | mm-level RMSE reported for recent algorithms [50] | ~1% of distance at near range; up to ~9% at far range |
| Depth reconstruction speed | Moderate; near real-time possible with optimized algorithms [33] | GPU-accelerated; some deep networks achieve near real-time [51,52] | Real-time feasible (low computational load) |
| Hardware cost | Moderate: single sensor + MLA [48] | High: custom MLA or multi-camera arrays | Low: consumer stereo or depth modules |
