1. Introduction
Light microscopy has long played an important role in biological research by enabling the observation of a wide range of objects, from cells and other biological samples to macroscopic organisms. Fluorescence microscopy enables the observation of even the smallest structures of organisms in great detail. Individual sections or compartments can be examined by staining them with specific fluorescent labels. This enables the high-resolution acquisition and visualization of multidimensional and multicolor data. Although the raw data in confocal microscopy are 2D images, 3D images can be generated by collecting a sequence of these 2D images and stacking them atop each other. This collection of sections is called a "z-stack" and can be represented as a three-dimensional matrix. The widespread use of fluorescence microscopy has motivated technological improvements and the development of new methods that improve resolution and overcome the limitations imposed by light diffraction. Modern super-resolution microscopy methods can achieve a lateral resolution of less than 200 nm and an axial resolution of less than 600 nm. For example, stimulated emission depletion microscopy (STED) [1] achieves a lateral resolution of about 30 nm. This method uses an additional depletion beam with a longer wavelength than the excitation beam and an annular cross section. The resolution is improved by quenching the fluorescence at the edge of the illuminated area, so that fluorescence occurs only in the unquenched region inside the annulus. In addition to STED, many other high-resolution microscopy techniques have been developed, such as reversible saturable optically linear fluorescence transitions (RESOLFTs) [2], structured illumination microscopy (SIM) [3], stochastic optical reconstruction microscopy (STORM) [4], photoactivated localization microscopy (PALM) [5], and fluorescence photoactivation localization microscopy (FPALM) [6].
The acquisition of high-resolution 3D volumetric data is no longer limited to (scanning) confocal microscopy. Light-sheet fluorescence microscopy (LSFM) is another interesting technology for generating 3D fluorescence data. In this modification of wide-field fluorescence microscopy, the excitation beam passes perpendicular to the detection objective. A thin optical section is obtained from the focal plane; its thickness and the dimensions of the field of view depend on the numerical aperture of the lens used and the depth of focus of the excitation beam. Several planes of the sample are acquired sequentially and can then be combined into a 3D reconstruction. However, LSFM requires special laser-illumination hardware as an external module to the wide-field microscope [7]. Detailed 3D imaging is thus increasingly available but remains technically demanding.
Thick samples, such as cells and tissue sections, can present problems for conventional wide-field fluorescence optics. Strong fluorescent signals from objects outside the focal plane result in a low-contrast image. Procedures for processing 3D volumetric data can help compensate for these effects. For example, deconvolution algorithms are used in both confocal and wide-field fluorescence microscopy. In the latter case, a blurred image is a common problem due to the capture of out-of-focus signal; the blurriness is easily corrected with these algorithms. In wide-field fluorescence microscopy, combining gradual focusing and the acquisition of images from several z-planes with 3D reconstruction from the resulting image sequence allows a resolution close to that of confocal microscopy. Leica Microsystems' recently developed Thunder Imager solution offers a simplified implementation of this software analysis without the need for special external hardware [8]. This software system combines the sequential acquisition of images at different z-positions with the processing of individual images to remove out-of-focus blur through computational cleaning. The results are fast and easily rendered as biologically relevant 3D models. In practice, this means that additional hardware or software solutions for obtaining 3D fluorescence data in wide-field microscopy can be helpful for obtaining 3D biological reconstructions for further analysis.
Modern microscopic techniques, such as laser scanning confocal microscopy (LSCM), yield 3D and multidimensional data (4D and 5D). Unlike in the 2D case, visualizing 3D and multidimensional data can be a difficult task. Computer screens are exclusively two-dimensional, which makes it difficult to visualize the third dimension of the information in an image. The 3D data must be transformed into a single 2D image, and information is always lost during this step. Most light microscopes give us a 2D view of a physical object: we usually observe projections of a three-dimensional physical structure onto a two-dimensional surface. This means that one dimension is lost, which significantly limits our perception of the physical reality. Determining three-dimensional structures from the macro to the atomic level is a key focus of current biochemical research. Basic biological processes, such as DNA metabolism, photosynthesis, and protein synthesis, require the coordinated action of many components. Understanding the three-dimensional organization of these components and their detailed atomic structures is essential to interpret their function. Therefore, the method used to display and visualize the obtained data is now an important part of microscopy. A properly designed visualization of volumetric fluorescence data must allow the data to be displayed at any scale and an array of details to be highlighted. There are many open-source imaging applications originally developed to solve the problem of visualizing three-dimensional images. The commonly used ImageJ [9,10] and Fiji [11] can store multiple channels and visualize them in three dimensions using the "hyperstack" function. The Visualization Toolkit (VTK) [12] does not provide a module for visualizing "more than RGB" channels. Thus, with VTK-dependent software tools (OsiriX [13], 3D Slicer [14,15], etc.), it is difficult to fully visualize multichannel fluorescence microscopy data. The main disadvantage of Fiji is that it does not include an adequate tool for the 3D rendering of multichannel data. FluoRender [16] addresses this shortcoming.
Virtual reality is no longer just a part of the gaming industry but is gradually being utilized for scientific applications and integrated into teaching. It has a place, for example, in medicine in the simulation of operations, the teaching of anatomy, and the imaging and analysis of biological objects. Another potential use is the visualization of 3D multicolor objects obtained by fluorescence microscopy. Conventionally, 3D fluorescence data are visualized on large monitors or using data projectors. VR technology now provides an innovative way to visualize multidimensional image data and models so that a full 3D structure can be understood quickly and intuitively, even at large volumes, with various object parameters and details on or inside the object easily viewable.
A considerable part of the visualization software designed for specialized applications, such as the observation of and work with biological data, is used for more advanced work with the object (observation, analysis). Classical (default) controllers, interacting directly with individual elements of the visualized menu or with the object, are typically used. These controllers are usually displayed in VR in the form of virtual controllers, for example, in Medicalholodeck (medicalholodeck.com; medical virtual reality for 3D surgery planning and education), in CellexalVR [17], a virtual reality platform for the visualization and analysis of single-cell gene expression data, and in the VR neuron tracer [18]. Some other control systems are based on tracking the hands in space [19]. The hands and fingers are then visualized in the virtual environment and can be used for direct interactions with virtual objects. As shown in [20], it is also possible to combine the use of a gamepad (for object manipulation) with hand tracking to control other functions.
In 2018, a new software tool, ConfocalVR [21], was introduced. It uses VR systems to completely immerse the user within a 3D cellular image, which the user can directly manipulate via interactive controls. In this virtual environment, the user can adjust image parameters (contrast, brightness, lighting), work with the histogram and different color channels, rotate the object to observe surface details from different sides, zoom, and even communicate with other users in a shared virtual space (enabling real-time research and discussion). The display process of ConfocalVR consists of several steps. Multidimensional microscope images are first preprocessed in ImageJ and then loaded into ConfocalVR. The user puts on a VR headset and immediately sees the image data as a fully rendered 3D image that can be grabbed and rotated, moved and resized, and visually adjusted using virtual slider controls [21]. Three primary visual rendering modes have been implemented, each providing a unique view of the image structure. The default is a translucent mode. Another, an illuminated mode, uses a virtual external light to create isosurfaces that reflect the boundaries between cellular structures. The last, a cut-range mode, can be used to remove background noise as well as oversaturated voxels from the image. In some cases, this restores important details in the image. ConfocalVR has been superseded by ExMicroVR [22] (free to nonprofits), which offers a number of extensions. These include, in particular, a larger range of visual parameters in the menu and additional menu elements (such as checkboxes). Menu items can now be accessed and controlled using a virtual pointer. ConfocalVR itself, on the other hand, is now available only as a commercial version.
There are several other examples of the use of virtual reality to display microscopic data. Arivis [23], a provider of software for the visualization, analysis, and management of large image data, announced VisionVR [24], which allows users to view 3D and 4D image volumes and surfaces in an immersive environment. It uses direct volume rendering techniques to display image data in virtual reality, allowing each individual data point within the rendered object to be mapped to the original multidimensional voxel image. VisionVR gives users complete control over how their data are rendered in VR. Using VR controllers, users can directly rotate, move, scale, shape, mark, classify, measure, edit, and segment their digital image data. The tool also allows a user to save a so-called 360-degree movie, which can be played on a standard desktop computer using a mouse. The disadvantage of this commercial software is its high purchase price.
FluoRender [16] brings new possibilities as an advanced free and open-source software tool released under the MIT License. It is an interactive tool for the visualization and analysis of multichannel fluorescence microscopy data. FluoRender combines the rendering of multichannel volume data and polygon mesh data, with the properties of each dataset independently and quickly adjustable. It is also able to draw and mix channels in different modes, apply 2D image-space enhancements, play confocal time sequences, extract structures by painting, and visualize polygon models from these extracted structures along with the volumetric data. FluoRender offers two main rendering methods. The direct volume rendering (DVR) method is very computationally intensive but generates realistic 3D images that reflect the physics of light transmission and absorption. The second method, maximum intensity projection (MIP), is much simpler: it takes the maximum signal intensity among the voxels encountered along each viewing ray [16].
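To illustrate the principle, the following is a minimal sketch of MIP along the viewing (z-) axis of a z-stack; the flat-array voxel layout and 8-bit intensities are assumptions for this sketch, not FluoRender's internal representation.

```cpp
#include <vector>
#include <cstdint>
#include <algorithm>

// Maximum intensity projection (MIP) along the z-axis of a z-stack stored
// as a flat array of 8-bit voxels (x fastest, then y, then z).
std::vector<uint8_t> mipAlongZ(const std::vector<uint8_t>& stack,
                               int nx, int ny, int nz) {
    std::vector<uint8_t> projection(nx * ny, 0);
    for (int z = 0; z < nz; ++z)
        for (int y = 0; y < ny; ++y)
            for (int x = 0; x < nx; ++x) {
                uint8_t v = stack[(static_cast<size_t>(z) * ny + y) * nx + x];
                projection[y * nx + x] = std::max(projection[y * nx + x], v);
            }
    return projection;  // one 2D image: the brightest voxel per viewing ray
}
```

DVR, by contrast, accumulates color and opacity along the same rays rather than taking only the maximum, which is why it is far more computationally demanding.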
In addition to the VR visualization itself, there is a need to interface with the virtual environment and objects. Outside of VR, models of varying complexity, such as those created in CAD and other computer-modeling tools, or visualized medical volumetric data, are commonly displayed and controlled on PCs and workstations using common controls (computer mouse, keyboard, and trackball) that allow the manipulation of the object and control of its basic properties. Other functions are usually accessed via software navigation and function tools located in the visualization or modeling software. There are also special desktop controllers, such as the trackball mouse or SpaceMouse [25], that make it easier to work with model controls. In virtual reality, controlling the various parameters of a virtual object is more complicated due to the absence of visual contact with the controls (software controls and usually also hardware controls). Many free visualization programs (e.g., ImageJ, FluoRender) solve the problem by allowing the connection and use of a keyboard, PC mouse, or gamepad as an auxiliary control for basic object manipulation: rotation and translation along the x-, y-, and z-axes or resizing. To control further functions, the user must switch between VR and reality unless the software environment is visualized directly in VR. Most VR sets (e.g., HTC Vive, Oculus One) are equipped with universal controllers designed to handle common uses defined by the entertainment industry. The main controls, such as an index-finger trigger, a small joystick, or a small circular touchpad, are designed for applications in which shooting, grabbing, and releasing objects or moving or teleporting the user is needed. Additional buttons then allow menu launch and menu-item selection to change a virtual object's features, such as its color or size. The main limitation of common VR controls is the need for sufficient space around the person to allow the manipulation of the controllers both in front of and around the user. This is also associated with the need to monitor the position of the controllers in the hands in space with additional hardware, either IR beacons with sensors on the controller (HTC Vive Pro) or a VR headset with a built-in camera system (Oculus One).
When working with virtual objects, a number of operations are performed by "holding" an object in space and moving or rotating it via hand motions. In particular, object size is usually changed by grasping the object with both hands and stretching or pulling the hands apart. Similarly, when changing an object's parameters, it is necessary to first grab the object in space and then select and adjust options from the menu with the help of the controller's additional buttons. In addition, individual controls, such as buttons on the controller, are difficult to locate in virtual reality. Therefore, a tile-type virtual menu visualized in front of the user is more often used to make other controls available; the user can select from the menu using a virtual pointer. However, the need for sufficient space around the user creates a number of constraints (e.g., when observing and manipulating objects in VR in a small microscopy lab). This is especially complicated when working with a multiparametric object, such as a visualized multicolor fluorescent sample. The need to handle fluorescent objects is partly addressed by ConfocalVR, which offers a large virtual menu with a number of different parameter controls in the form of horizontal sliders. However, there is still a need for strenuous manipulation of the object and the menu controls and, therefore, a need to dedicate substantial space. ExMicroVR and the new (but commercial) version of ConfocalVR make menu control more comfortable by using a virtual pointer.
Our goal was to create a control system for virtual multidimensional and multicolor microscopic data without the limitations described above. Our gesture-based VR control system is intended to allow manipulation of the object, handle a number of object parameters, and, last but not least, reduce hand strain during operations and eliminate the need for substantial clear space in front of the user. Our device, a two-handed controller with a large touch disk at the front, is therefore designed to control virtual objects using only the controls on the controller, eliminating the need for the user to move their hands in space. It is therefore possible to control objects while standing or sitting, without the need to move the body in space. We particularly focused on the manipulation of fluorescent objects acquired using confocal microscopy. For this purpose, the controller was connected to the open-source FluoRender system. FluoRender provides access to many other features of the object and processes for manipulating it, including the ability to slice the object and change its transparency, contrast, and surface type. Our solution is advantageous when used directly at the workplace with a confocal microscope system, or anywhere at a desk, for example, by students sharing and viewing virtual objects during lessons or, similarly, at professional conferences. The controller, with a large touch disk (multitouch sensor) for gestures with up to five fingers in combination with classic buttons, serves as an optimal instrument for processing a wide array of control combinations and handling many actions performed with and on virtual objects.
2. Materials and Methods
The introduced control system is designed to make it easy to handle and observe confocal microscopy 3D models projected in virtual reality (see Figure 1).
Although most of the controls in FluoRender are available from the menu around the displayed virtual object (see Figure 2), the control method must account for the user's inability to use these software controls or to look at the physical controls on the controller while observing the model in the virtual environment. Exiting the virtual environment to work with the visualization software or hardware outside VR is extremely inefficient. The control device is designed to allow the easy application of various functions to control a virtual object and its visual parameters without this need.
The designed control system includes controls for the spatial manipulation of the object (spatial movement, rotation, zoom, etc.) to obtain the optimal view and for pointing and positioning to observe details on the object's surface or inside the object, and it can directly control a number of image properties of the virtual 3D objects at the same time (e.g., brightness, contrast, color-channel transparency and color saturation, and switching color channels off and on). It is ergonomically designed to be grasped and held for long periods, allowing the comfortable, detailed observation and analysis of volumetric microscopic data.
To handle a range of operations, the designed device uses a large circular multitouch panel (T201 MTCW, Grayhill, Inc., La Grange, IL, USA), which has several advantages. The touchpad can detect a range of actions performed by one or more fingers, whether single touches, taps, movements, or specific gestures. Suitably positioned in the palm of the hand, together with the ergonomic layout of the other controls, it can also be easily perceived and used even with a VR headset on. The control device is designed to be held in both hands. The back and front of the controller contain separate controls for each hand. The dominant hand is used for active control of the touch panel by means of touches and gestures (see Figure 3). The fingertips of the nondominant hand rest directly on the buttons when the device is held, so it is not necessary to locate the buttons while in VR, which is usually difficult to do with standard controllers. By combining the actions performed on the touchpad and the buttons, the various functions available in the FluoRender environment can be engaged quickly.
The touch panel has a sufficiently large touch area for manipulation and enables the use of up to 5 fingers at the same time, together with detection of the number of touches and tracking of the positions of all 5 fingers in the x–y plane. By using several fingers, it is possible to perform a wide range of gestures or to obtain specific values by continuously moving one or more fingers along the x- and/or y-axis. The controller is wirelessly connected to the VR computer workstation using a receiver connected via the computer's USB port (a receiving dongle). RF communication with the ESP-NOW protocol is used for transmission. ESP-NOW is a P2P protocol developed by Espressif that enables multiple ESP devices to communicate with one another directly via low-power 2.4 GHz wireless connectivity. It is suitable for battery-powered devices, allowing an ESP32-based microcontroller unit (MCU) to be used as the control board. FluoRender is then supplied with data from the USB dongle via serial communication. A custom algorithm implemented in the FluoRender environment interprets the incoming instructions and performs the corresponding actions, which are then projected into the VR environment. The use of the USB dongle as a communication intermediary ensures secure and stable data transfer; the dongle also forwards some supporting initialization settings.
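As an illustration of the controller-to-dongle link, the following is a minimal Arduino-style sketch using the standard ESP-NOW API on the ESP32; the packet layout and peer MAC address are hypothetical, chosen only to show the communication pattern.

```cpp
#include <WiFi.h>
#include <esp_now.h>
#include <string.h>

// Illustrative packet layout; the actual coded sequence of values used by
// the controller is not specified here.
struct ControlPacket {
    uint8_t category;  // active function category (menu tile)
    uint8_t action;    // gesture/button action code
    int16_t dx, dy;    // relative finger movement
    uint8_t fingers;   // number of simultaneous touches
};

// MAC address of the receiving USB dongle (placeholder value).
uint8_t dongleAddress[] = {0x24, 0x6F, 0x28, 0x00, 0x00, 0x01};

void setup() {
    WiFi.mode(WIFI_STA);  // ESP-NOW requires Wi-Fi station mode
    esp_now_init();
    esp_now_peer_info_t peer = {};
    memcpy(peer.peer_addr, dongleAddress, 6);
    peer.channel = 0;     // use the current Wi-Fi channel
    peer.encrypt = false;
    esp_now_add_peer(&peer);
}

void sendPacket(const ControlPacket& pkt) {
    // Transmit the coded instruction directly to the dongle, peer-to-peer.
    esp_now_send(dongleAddress,
                 reinterpret_cast<const uint8_t*>(&pkt), sizeof(pkt));
}

void loop() {}
```

On the dongle side, a matching receive callback would forward each packet over the USB serial port to FluoRender.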
2.1. Navigation and Notifications
In order to systematically access the individual functions of FluoRender, the functions were sorted into categories by type, with switching enabled between the categories. The user must be aware of the current function category and the other categories offered. For this purpose, a tile menu was created, with the icon of the active category highlighted to indicate the current selection. The menu is visualized directly in the FluoRender environment with sufficient contrast to be clearly visible. An example of the designed tile menu, with 4 different categories of functions symbolized by specific icons, is shown in Figure 4. The menu is a single line, but in practice, it can be expanded to any number of rows and columns. The menu contains the following function categories: 1, object manipulation in space (e.g., rotation, centering, moving, resizing); 2, object clipping (clipping planes) in up to 3 separate color channels, or channel switching; 3, image features (e.g., gamma, luminance, HDR setting); and 4, render modes (layered, depth, or composite).
To prevent the menu display from interfering with the display of the 3D object, the menu is positionally variable—the default position (in front of the user) can be changed to the bottom of the user’s field of view. The controller can be used to display and hide the menu as required.
2.2. Task Management
The algorithm developed for our VR controller works with support from the SDK provided by the manufacturer of the touch panel (Grayhill, Inc., La Grange, IL, USA). The algorithm detects the various tasks performed on the circular touch panel (the number of touches and the fingers' x and y positions) and the various actions performed on the buttons (single press, double press, long press). The detected tasks are then evaluated in a specific hierarchical order to determine (1) the function group (category), (2) the action to be performed, and (3) the positional or calculated values. The function category (a set of available functions) is switched by a special action (e.g., pressing the index-finger button). The tasks to be executed in the virtual environment are then selected according to the operations performed on the controls (the touch disk or the buttons), e.g., the detected number of touching fingers together with the distances between individual fingers. During a gesture, the x and y coordinates of the fingers are obtained, and the distances between fingers are calculated. The corresponding instructions are transmitted wirelessly to the dongle as a coded sequence of values and passed via serial communication to FluoRender, where they are executed as specific tasks. Three buttons were chosen because fluorescence data contain 3 color channels (correspondingly, FluoRender allows 3 separate channels to be displayed and manipulated). Each of the buttons can thus be easily used to focus actions on a specific channel (red, green, or blue).
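The following simplified sketch illustrates this hierarchical evaluation order (category, then action, then values); the type names, category cycling, and action handling are hypothetical stand-ins for the actual implementation.

```cpp
// Hypothetical function categories matching the tile menu.
enum class Category { Manipulation, Clipping, ImageFeatures, RenderMode };

// Snapshot of the controls, as reported by the touch-panel SDK and buttons.
struct ControlState {
    int fingers;          // number of simultaneous touches (0-5)
    float x[5], y[5];     // finger positions on the disk
    bool categoryButton;  // special action: switch the function category
};

Category g_category = Category::Manipulation;

// Hierarchical evaluation: (1) category, (2) action, (3) values.
void evaluate(const ControlState& current, const ControlState& previous) {
    // (1) A dedicated button press cycles the function category.
    if (current.categoryButton) {
        g_category = static_cast<Category>(
            (static_cast<int>(g_category) + 1) % 4);
        return;
    }
    // (2) The number of touching fingers selects the action; (3) relative
    // position changes provide the values sent on to FluoRender.
    if (g_category == Category::Manipulation && current.fingers == 1) {
        float dx = current.x[0] - previous.x[0];  // relative x movement
        float dy = current.y[0] - previous.y[0];  // relative y movement
        // encode (category, action, dx, dy) and transmit to the dongle
        (void)dx; (void)dy;
    }
}
```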
A sequence is generated during any action performed using the controls (touching the disk or pressing the buttons). Common instructions associated with operations on the 3D object can be replaced by special actions, such as reset (of position or values) or displaying and hiding the menu. Relative changes in the evaluated finger positions are used with different signs when moving forward and backward on the touch panel. These values are then used to change the value of a specific visual parameter of the virtual object (e.g., adjusting the contrast and transparency levels or switching the color channels on and off). In the spatial manipulation mode, the x and y values are used to rotate, displace, or scale the object. The application of several different touches and related gestures is shown in the examples in Figure 5.
A scheme showing the sequence of the individual operations from Figure 5 for the manipulation of the virtual object (translation, rotation, resizing) is shown in Figure 6. As can be seen, the operations differ in the number of touches. In the first 2 cases (A,B), changes in the x and y positions are applied to interact with the virtual object. When changing the size of the object, a touch with 2 fingers must be performed with the fingers farther apart than a preset minimum distance; with each movement, the distance between the fingers is recalculated and applied to resize the object.
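A minimal sketch of this two-finger resize rule is given below; the threshold value and the use of a multiplicative scale factor are illustrative assumptions.

```cpp
#include <cmath>

// Preset minimum finger distance for a valid resize gesture (illustrative).
const float kMinPinchDistance = 10.0f;

float fingerDistance(float x0, float y0, float x1, float y1) {
    return std::sqrt((x1 - x0) * (x1 - x0) + (y1 - y0) * (y1 - y0));
}

// Returns the multiplicative scale update derived from the change in finger
// distance, or 1.0 when the gesture does not qualify.
float resizeFactor(float prevDistance, float currDistance) {
    if (prevDistance < kMinPinchDistance || currDistance < kMinPinchDistance)
        return 1.0f;                        // fingers too close: ignore
    return currDistance / prevDistance;     // relative change drives resizing
}
```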
2.3. Electronics and Functional Design
The power supply, control, and communications components are integrated on a single-sided, custom-made electronic control board. The core of the control board is an ESP32 MCU (Espressif Systems); this compact system on a chip enables interfacing with the touch panel via a USB host, detection of events from the controls, processing of values, and wireless communication. In addition, it handles various support functions related to battery charging and power management (see Figure 7). A small USB dongle based on the ESP32-PICO-KIT (Espressif Systems) is used in combination with the board to enable ESP-NOW communication.
The touch panel is mounted in a structural sleeve at the front of the controller, to which the rear panel is also attached. The circular electronic control board is mounted behind the touch disk. The indicator LED and the USB-C charging port are located on the side of the board, close to the openings in the back cover of the controller. Another electronic circuit (a button module) with 3 buttons is mounted inside the back cover. The MCU is powered by an integrated LiFePO4 battery cell with a capacity of 600 mAh. The smart power system provides indication of the battery status, power from a connected USB-C power supply, and simultaneous battery charging. A power-saving system is implemented on the device, which is put to sleep when idle to save energy. The device can be switched on by pressing any of the control buttons, rather than using a dedicated on/off switch.
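As an illustration of this behavior, a minimal sketch using the ESP32's deep-sleep wake-on-pin mechanism follows; the pin assignments, active-high button wiring, and idle timeout are hypothetical assumptions for this sketch.

```cpp
#include <Arduino.h>
#include "esp_sleep.h"

// Hypothetical wiring: three control buttons on RTC-capable pins that drive
// their pins high when pressed, so any button can wake the MCU.
const uint64_t kButtonPinMask =
    (1ULL << GPIO_NUM_25) | (1ULL << GPIO_NUM_26) | (1ULL << GPIO_NUM_27);
const uint32_t kIdleTimeoutMs = 5UL * 60UL * 1000UL;  // illustrative timeout

uint32_t g_lastActivityMs = 0;  // updated on every touch or button event

void enterSleepIfIdle() {
    if (millis() - g_lastActivityMs < kIdleTimeoutMs) return;
    // Wake when any of the masked button pins goes high; pressing a control
    // button thus switches the device back on without a dedicated switch.
    esp_sleep_enable_ext1_wakeup(kButtonPinMask, ESP_EXT1_WAKEUP_ANY_HIGH);
    esp_deep_sleep_start();  // execution resumes via reset after wake-up
}
```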
4. Discussion
As shown in the case studies, a classical object manipulation method requires considerable manipulation in space. This mostly involves moving closer to the object, grasping the object after extending the hands, and moving one or two controllers through space. This can be considered a major disadvantage when handling confocal data in a small laboratory, in a small room, or at a desk. Moreover, z-axis rotation cannot be achieved easily, as it is limited by the reduced mobility of the wrist, whereas x- and y-axis rotations can be achieved more easily but only within limited tilt angles. Similar limitations can be expected for hand-tracking systems. This is partly eliminated by ExMicroVR, where only some workspace in front of the user (up to the distance of an outstretched arm) is required to use the controllers properly as laser pointers. Another issue relates to grabbing the controlled object during manipulation: the object needs to be held continuously, which causes small movements of the object (due to hand tremors) and makes it impossible to perform this task for a long time. In contrast, our controller allows the user to alternate between observing the object and manipulating it. Additionally, movement in x, y, and z is simpler and more comfortable; it is limited only by the space on the touch disk. When the movement limit on the disk is reached, the fingers can be repositioned and the action (e.g., x rotation) continued. Many other specifics and limitations of the compared control systems have already been mentioned in the Results section.
One of the most advanced and challenging tasks is cropping objects in three different axes from both sides in each axis. In the ConfocalVR and ExMicroVR software, the individual channels of an object cannot be cropped separately. These programs offer the possibility of cropping the object only as a whole using the so-called Excluder, which allows the user to move into and look around inside the virtual object; some parts of the whole object can then be hidden. The resulting effect differs from object cropping in FluoRender. In FluoRender, clipping planes are used for this purpose, and the object can be clipped along all three (x-, y-, and z-) axes from both sides, and separately in each channel. This brings significant advantages, and our controller is adapted to these features. However, the operation of these functions is complex, and even in our case, the control was not set up to provide comfortable operation, as it consists of several different steps that the user must apply. The current solution needs to be optimized for better handling of these functions in the future.
It should be noted that a gamepad can be used in FluoRender to control the object. However, the gamepad has limited capabilities and only handles spatial manipulation of the object. Even with the potential mapping of additional elements, the number of functions that could be controlled would be limited, or the function buttons would be hard to access when VR glasses are on. Therefore, the gamepad tends to be used in VR specifically with the aim of moving an object.
The presented gesture-based control system has some other benefits. It allows a virtual object to be manipulated independently of additional VR hardware, such as headset cameras or IR emitters and sensors, which are usually used to sense the spatial position of a classic controller. Control instructions are transferred from the gesture-based controller to the FluoRender PC installation with which the device communicates, and FluoRender performs the visualization; the hardware requirements are therefore only those of FluoRender itself. On the other hand, the introduced control system is incompatible with VR systems that require free hands or fingers for special VR actions, because both hands are fully occupied and unavailable for further operations. However, the control system was designed only for the fast and comfortable observation of virtual multidimensional and multicolor objects; it is not intended for further detailed object analysis.