CSVO: Clustered Sparse Voxel Octrees—A Hierarchical Data Structure for Geometry Representation of Voxelized 3D Scenes

Madoš, Branislav; Chovancová, Eva; Chovanec, Martin; Ádám, Norbert

doi:10.3390/sym14102114


Department of Computers and Informatics, Faculty of Electrical Engineering and Informatics, Technical University of Košice, Letná 9/A, 042 00 Košice, Slovakia


Symmetry 2022, 14(10), 2114; https://doi.org/10.3390/sym14102114

Submission received: 29 August 2022 / Revised: 27 September 2022 / Accepted: 2 October 2022 / Published: 12 October 2022

(This article belongs to the Section Computer)


Abstract:

When the geometry of voxelized three-dimensional scenes is represented in a naive, uncompressed form, especially at high voxel resolutions, vast amounts of data may be needed. These can easily exhaust the available memory of the graphics card, the operating memory or even the secondary storage of a computer. A viable solution to this problem is to use domain-specific hierarchical data structures, based on octant trees or directed acyclic graphs, which, among other advantages, provide a compact binary representation that can be regarded as their compressed encoding. These data structures include, inter alia, sparse voxel octrees, sparse voxel directed acyclic graphs and symmetry-aware sparse voxel directed acyclic graphs. This paper proposes a new domain-specific hierarchical data structure: clustered sparse voxel octrees. It is designed to represent the geometry of voxelized three-dimensional scenes and can be constructed using the out-of-core algorithm proposed in the paper. The advantage of the presented data structure lies in its compact binary representation, achieved by omitting a significant number of pointers to child nodes (82.55% in the case of the Angel Lucy model at a resolution of $128^{3}$ voxels) and by using a wider range of child node pointer lengths, including 8b, 16b and 32b. We achieved an encoding 6.57 to 6.82 times more compact than sparse voxel octrees in which all node components were 32b aligned, and 4.11 to 4.27 times more compact than sparse voxel octrees in which not all node components were 32b aligned.

Keywords:

clustered sparse voxel octrees (CSVO); sparse voxel octrees (SVO); sparse voxel directed acyclic graphs (SVDAG); symmetry-aware sparse voxel directed acyclic graphs (SSVDAG); hierarchical data structure; geometry representation; voxelized scenes

1. Introduction

Voxelized data representations and the related voxel-based rendering have been used for decades in science, medicine and industry, as have voxelized three-dimensional (3D) scenes in computer graphics and augmented or virtual reality. In this context, Ref. [1] provides an excellent overview of compact data representation in GPU-based direct volume rendering (DVR). However, encoding the various aspects of voxelized 3D scenes (ranging from their geometry through colours to other material properties of voxels) as high-resolution grids requires massive amounts of memory and, consequently, memory bandwidth.

In its naive, uncompressed form, the geometry of a voxelized 3D scene, formed by a regular 3D voxel grid, may be represented by a regular 3D grid of 1b scalar values. Each voxel of the scene is thus assigned 1 bit of memory (1b/vox). If set to 0, the corresponding bit represents a passive (empty) voxel; if set to 1, it represents an active (occupied) voxel. This encoding may require significant amounts of data. For example, the binary representation of the geometry of a scene consisting of $4096 \times 4096 \times 4096$ (4K$^{3}$) voxels would require 8 GB of memory in its uncompressed form. At an extremely high resolution (such as 128K$^{3}$), this scene geometry representation would require 256 TB. This exceeds the capacity of off-the-shelf hardware resources used in 3D-scene processing and visualisation. Therefore, it is necessary to find a solution allowing a significantly more compact form of encoding.

A popular solution to this issue is the use of domain-specific hierarchical data structures (HDSs), both in the form of octant trees and directed acyclic graphs (DAGs). These HDSs decompose the 3D scene space, halving it in each of its three principal axes, thus creating eight subspaces called octants. Passive octants (homogeneously empty, i.e., containing only passive voxels) or active octants (homogeneously filled, i.e., containing only active voxels) can be encoded frugally, resulting in a more compact representation. Partially active octants—those containing both passive and active voxels—are then recursively decomposed into further octants.

Modern HDSs have focused on representing 3D scenes sparsely populated with active voxels (i.e., the proportion of passive voxels can reach even 99.999%), which is reflected in the use of the sparse attribute in their names. These also divide the available space into octants; however, only the passive ones—composed entirely of passive voxels—are represented in the frugal form. Active octants are those containing at least a single active voxel; these are then further recursively decomposed. Such HDSs include Sparse Voxel Octrees (SVOs)—whose main advantage is the aforementioned capability to provide a compact representation of passive octants—and other HDSs introducing the application of the Common Subtree Merge (CSM) technique—such as Sparse Voxel Directed Acyclic Graphs (SVDAGs)—and complementing the CSM technique with the use of reflective subtree transformation—as in the case of Symmetry-aware Sparse Voxel Directed Acyclic Graphs (SSVDAGs).

By implementing a binary representation of pointers to child nodes, these data structures enable fast traversal. However, pointers can represent a significant portion of the total size of the binary representation of an HDS. Therefore, researchers have been keen to optimize them by diversifying their lengths. This also allowed further optimization using Frequency-Based Compaction (FBC), in which pointers are assigned to child nodes so that the shortest pointers are assigned to the child nodes having the highest numbers of references and vice versa.

Data structures free of such pointers—such as Pointerless Sparse Voxel Octrees (PSVOs) or Pointerless Sparse Voxel Directed Acyclic Graphs (PSVDAGs)—allow for achieving high compression ratios by completely omitting the binary representation of pointers to child nodes. However, this degrades their traversability and makes them more suitable for streaming or archiving purposes. These HDSs are discussed in more detail in Section 2 and Section 3 hereof.

In this paper, we propose a novel hierarchical data structure, the Clustered Sparse Voxel Octrees (CSVOs), based on the structure of SVOs. Without losing fast traversability, CSVOs allow a significant number of pointers to child nodes to be omitted and the remaining ones to be shortened by representing them using 8b, 16b or 32b. These pointers do not represent addresses in the global address space of the data structure or offsets from the beginning of the representation of the respective node level in the tree, as is the case in other HDSs. Instead, the value of such a pointer represents the offset of the position of the child node measured from the end of the pointer array of its parent node. With this modification, our tests yielded a binary representation of CSVOs that was several times more compact than that of SVOs.

The contribution hereof lies in the following:

The design of a domain-specific hierarchical data structure, the CSVO, designed for a compact representation of the geometry of three-dimensional voxelized scenes, sparsely populated with active voxels, employing lossless compression.
The design of a two-step out-of-core algorithm, aimed at constructing a CSVO from an ordered list of active scene voxels, represented by their Morton addresses.

The structure of this paper is as follows:

Section 2 discusses the related works in terms of linearization of multi-dimensional data using Space-Filling Curves (SFCs) and—notably—the representation of the geometry of 3D scenes using domain-specific HDSs. Due to the vast number of papers published in this field, this section of the paper focuses on the works more closely related to its contribution. Section 3 introduces Sparse Voxel Octrees (SVOs), as the domain-specific hierarchical data structure, as a means for representing the geometry of three-dimensional voxelized scenes, along with their pointerless version called Pointerless Sparse Voxel Octrees (PSVOs). Section 4 represents the most important part of the contribution of our work. It presents CSVO, a domain-specific hierarchical data structure proposed herein, designed for representing the geometry of voxelized three-dimensional scenes. Section 5 introduces a two-step out-of-core algorithm for the construction of the aforementioned CSVOs, proposed herein. Section 6 presents the test results. The first part of the section presents three scenes at six different voxel resolutions, whose geometry was stored in multiple SVO versions and in the newly proposed CSVO. This is followed by the evaluation of the achieved results. The last part of the section discusses the compression sources allowing higher data compression rates within the CSVOs (in comparison with the SVOs). Section 7 summarizes the conclusions drawn from the test results described in the preceding section hereof.

2. Related Works

To rearrange—linearize—pixels of a regular 2D grid or voxels of a regular 3D grid, one may use Space Filling Curves (SFCs), introduced back in the 19th century [2]. In computer graphics, Morton Space Filling Curves (MSFCs) [3] are the SFCs frequently used to linearize multi-dimensional data. Hilbert Space Filling Curves (HSFCs) [4] are also often used in computer science, as they preserve the locality of the data better. See Figure 1 for an illustration of MSFCs and HSFCs.
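To illustrate the linearization mentioned above, the following is a minimal sketch of 3D Morton encoding by bit interleaving. The axis order (x in the lowest bit of each 3-bit group) and the default of 21 bits per coordinate are assumptions made for this example; the paper does not fix them in this section.

```python
def morton_encode_3d(x: int, y: int, z: int, bits: int = 21) -> int:
    """Interleave the bits of x, y and z into a single Morton code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)      # x occupies bit 0 of each triple
        code |= ((y >> i) & 1) << (3 * i + 1)  # y occupies bit 1
        code |= ((z >> i) & 1) << (3 * i + 2)  # z occupies bit 2
    return code


def morton_decode_3d(code: int, bits: int = 21) -> tuple:
    """Inverse of morton_encode_3d: recover (x, y, z) from a Morton code."""
    x = y = z = 0
    for i in range(bits):
        x |= ((code >> (3 * i)) & 1) << i
        y |= ((code >> (3 * i + 1)) & 1) << i
        z |= ((code >> (3 * i + 2)) & 1) << i
    return x, y, z
```

With this layout, the lowest three bits of a code select a voxel within its parent octant, and each further group of three bits selects an octant one level higher, which is why a list sorted by Morton code visits an octree in depth-first order.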

In their 2010 works [5,6], Laine and Karras presented Efficient Sparse Voxel Octrees (ESVOs), based on octant trees, along with an efficient ray casting algorithm using this HDS. The main advantage of this structure was the possibility of replacing entire subtrees if contour information could be used instead (or the situation could be interpreted as increasing the geometric resolution of the voxel using this contour information). At the binary level, contour information was represented using 32 bits—of these, 24 were used to store the contour pointer and 8 to store the contour mask. This allowed for increasing the geometric resolution, compressing the binary representation of smooth surfaces and accelerating ray casting.

In their 2013 work [7,8], Baert and Dutré presented a two-step out-of-core algorithm transforming a mesh of triangles into an SVO. In the first step, an intermediate product—a list of active voxels—was created from the mesh of triangles. Each was represented using its Morton coordinate, while the list items were ordered according to this coordinate in an ascending order. In the next (second) step, an SVO was constructed from this intermediate product. The size of the input set of polygons, the intermediate product and even the resulting SVO could exceed the available amount of operating memory by far. In its first step, the aforementioned two-step algorithm produced an intermediate product whose representation could consume large amounts of data. In their 2015 work [9], Pätzold and Kolb presented an algorithm combining parallel voxelization on GPUs with an out-of-core approach not processing an intermediate product (a voxel grid), but rather producing an SVO directly.

In 2013, Kämpe et al. introduced Sparse Voxel Directed Acyclic Graphs (SVDAGs) [10]. Compared to SVOs, this HDS allowed a significant increase in data compression, due to the possibility of using CSM. With the CSM technique, two or more identical subtrees of the HDS could be represented by fully expanding the binary representation of only a single copy of such a subtree, whose root node was then referenced multiple times, by multiple child node pointers from the respective parent node level. Thus, the other copies of the subtree could be omitted from the binary representation of this HDS. Using multiple references to nodes in the data structure also led to a change in terminology, as these data structures were no longer octant trees, but rather directed acyclic graphs (DAGs). All node parts of this data structure, the nodes themselves, and also the entire HDS—were 32-bit aligned. Despite a more compact data representation, SVDAGs could be constructed so that the decompression overhead of their binary representation is identical to that of SVOs.

Besides the use of DAGs for the representation of the geometry of voxelized scenes, efforts are being made to add information about other attributes of voxels, both by integrating information into more complex SVDAGs and by creating separate data structures developed for this purpose. Williams presented Moxel DAG HDS in [11], where an extended High Resolution SVDAG is used in connection with an external data structure called Moxel Table, where the material information is stored. Dado et al. proposed in [12] decoupling of geometry and voxel data, using a novel mapping scheme, to apply the DAG principle to encode the topology, while using a palette-based compression for the voxel attributes. Dolonius et al. presented in [13,14] a novel method for connecting each node in SVDAG to its corresponding colors in a separate 1D array of colors using a small amount of additional information incorporated into the DAG. In connection with DAGs, attention is paid also to their use in the compact representation of voxelized shadows [15,16].

In 2016, Villanueva et al. introduced a hierarchical data structure, Symmetry-aware Sparse Voxel Directed Acyclic Graphs (SSVDAGs) in [17,18]. Like SVDAGs, this data structure also allowed using the CSM technique; however, it added the possibility of common subtree merging even if reflective transformations (i.e., mirroring) were required to make the subtrees identical. These transformations could be implemented independently in each of the principal axes of the represented 3D scene. To achieve this mirroring, an extra 3 bits had to be inserted into the child node pointers. The pointer could be either shorter (16b) or longer (32b). The shorter pointers were assigned to the most frequently referenced nodes, using frequency-based compaction (FBC). Due to the greater number of child node pointers of various lengths, 2-bit Header Tags (HTs) were used to form a 16b Child Node Mask (CHNM); both the HT and the CHNM were thus twice the size of those used in the aforementioned HDSs. In order to achieve a compact representation of the leaf node layer and to minimize the number of child node pointers, the leaf node layer of this HDS was made up of voxel grids having a size of $4^{3}$ voxels. Node components, nodes, grids and the whole data structure were aligned to 16 bits. While the mirroring itself did not increase the decompression overhead during rendering, compacting the child node pointers led to a 15% overhead.

In 2020, Pointerless Sparse Voxel Directed Acyclic Graphs (PSVDAGs) were introduced by Vokorokos et al. [19]. This HDS combined the advantages of PSVOs and SVDAGs. As in the case of PSVOs, this structure omitted child node pointers, and, similarly to SVDAGs, it allowed common subtree merging (CSM). To make this possible without implementing child node pointers, this HDS introduced the concept of labels and callers. Labels denoted subtrees serving as patterns that callers could reference multiple times. In order to achieve a more compact representation of the data structure, both labels and callers had variable lengths, and FBC was applied as well, so that the most frequently referenced subtrees were assigned the shortest labels and callers and vice versa. Due to the absence of pointers to child nodes, this data structure had the same drawback as PSVOs: a limited traversal rate. Therefore, in 2021, Madoš and Ádám presented an algorithm enabling fast transformation of these data structures into SVDAGs [20].

While the aforementioned works focused on lossless compression of scene geometries using hierarchical data structures, attention was also given to lossy compression. In 2020, van der Laan et al. introduced Lossy Sparse Voxel Directed Acyclic Graphs (LSVDAGs), based on SVDAGs [21]. In its construction process, not only absolutely identical subtrees were searched for, but also more rarely occurring subtrees that only required minimal modification to become identical. This increased the number of subtrees to which the CSM technique could be applied. The achieved reduction (compared to SVDAGs) ranged from 10% to 50% when modifying 1% to 5% of the active voxels.

The geometry representation of voxelized scenes using the aforementioned HDSs is suitable for static data. When the geometry is changed, it is necessary to decompress the corresponding HDS and then re-compress it. Therefore, in [22], Careil et al. introduced a new data structure called HashDAG that enables interactive modifications of such compressed voxel geometry without requiring de- and recompression. This data structure is compatible with the attribute compression introduced in [13,14].

HDSs find their application also in the representation of time-variable voxelized scenes. In [23], Kämpe et al. presented a temporal DAG, which stores time-varying voxel data in one DAG, while special attention is also paid to the optimization of pointer lengths. In [24], Martinek et al. proposed the Motion DAG data structure which interleaves a temporal interval binary tree for filtering time consecutive data and a sparse voxel octree (SVO) which simplifies spatially nearby data. Zhang et al. in [25,26,27] dealt with an octree-based motion representation method that can be applied to compress animated geometric data.

3. Octree-Based Hierarchical Data Structures

Domain-specific hierarchical data structures designed to represent 3D scene geometry include octree-based SVOs and PSVOs. This section contains a brief introduction and formalization of these.

3.1. Sparse Voxel Octrees

An SVO represents the geometry of a voxelized 3D scene containing $N \times N \times N = N^{3}$ voxels; $N = 2^{m}$, where $m \geq 1$, $N \in \mathbb{N}$ and $m \in \mathbb{N}$. The nodes of the SVOs are hierarchically arranged into $m$ layers, which can be numbered. The root node, representing the whole 3D scene, shall form layer 0, while the leaf nodes (LNODEs) shall form layer $m - 1$. All nodes in layers 0 to $m - 2$ are internal nodes (INODEs). The nodes represent specific octants of the 3D scene, and their child nodes represent the recursive decomposition of these octants into sub-octants.

Thus, INODEs can potentially have eight child nodes. A suboctant can be either passive, i.e., homogeneously filled with passive voxels (in this case, there is no child node associated with it in the HDS, which is a significant source of compression) or active, i.e., containing at least one active voxel (in this case, a child node exists). Information about the passive and active octants is stored in the node’s Child Node Mask (CHNM), composed of eight Header Tags (HTs), one for each potential child node. If the HT is set to ‘0’, the associated octant is passive, without any corresponding child node. If the HT is set to ‘1’, the octant is active and the child node exists.

Following the CHNM, there is a concatenated array of pointers (PTS) to the active child nodes. As their count (from the range $\langle 1; 8 \rangle$) may vary, so does the total length of the binary representation of the PTS. A pointer (PT) may represent an address within the global address space of the SVO, pointing to the start of the binary representation of the corresponding child node. Alternatively, if each SVO level has its own separate address space, a PT may represent an offset of the start of the binary representation of the corresponding child node from the beginning of that address space.

The order of the PTs in the PTS array corresponds to the order of the HTs in the CHNM. In this paper, we used the Morton-order to determine this order.

The CHNM of LNODEs encodes individual voxels directly (there are no child nodes and therefore no pointers to these child nodes), i.e., HT = ‘0’ represents a passive voxel and HT = ‘1’ represents an active voxel, respectively. Again, their order is consistent with the Morton-order in this paper.

In order to formalize the binary representation of SVOs, we used the Backus–Naur Form (BNF):

SVO ::= (n)<NODE>
NODE ::= <INODE>|<LNODE>
INODE ::= <CHNM>(p)<BIT><PTS>
LNODE ::= <CHNM>(q)<BIT>
CHNM ::= (8)<HT>
PTS ::= (1)*(8)<PT>
PT ::= (r)<BIT>
HT ::= <BIT>
BIT ::= “0” | “1”

where the following applies:

<SYM>—a mandatory non-terminal symbol SYM
“sym”—terminal symbol sym
(n)<SYM>—the SYM symbol, concatenated n-times
(n)*(m)<SYM>—the SYM symbol, concatenated n to m times
|—alternative

In this formal notation, the parameters p and q represent the number of reserved bits appended to the CHNM in the INODE and the LNODE, respectively. They are used to align the CHNM to the desired number of bits. The parameter r is then used to set the desired length of the binary representation of the child node pointers.

If we set the parameters to $p = q = 24$ and $r = 32$, all node parts, the entire nodes and even the whole data structure are aligned to 32 bits. The 8b child node mask (CHNM) is complemented by 24 reserved bits; this applies to both internal (INODE) and leaf (LNODE) nodes. Child node pointers are represented using 32b. For testing purposes, we denoted this version of the data structure SVO$_{1}$.

If we set the parameters to $p = q = 0$ and $r = 32$, the binary representation of SVO is more compact, but not all parts of the data structure are aligned to 32b. For testing purposes, we denoted this version of the data structure SVO$_{2}$.
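The effect of the parameters p, q and r on node size can be read directly from the BNF above: an INODE occupies 8 + p bits for its padded CHNM plus r bits per pointer, and an LNODE occupies 8 + q bits. A small sketch of that calculation follows; the function names are illustrative only.

```python
def svo_inode_bits(active_children: int, p: int, r: int) -> int:
    """Bits used by one SVO INODE: 8b CHNM + p reserved bits + one r-bit PT per child."""
    if not 1 <= active_children <= 8:
        raise ValueError("an INODE has between 1 and 8 active child nodes")
    return 8 + p + active_children * r


def svo_lnode_bits(q: int) -> int:
    """Bits used by one SVO LNODE: 8b CHNM + q reserved bits, no pointers."""
    return 8 + q


# SVO_1 (p = q = 24, r = 32) versus SVO_2 (p = q = 0, r = 32)
for name, p, q, r in (("SVO_1", 24, 24, 32), ("SVO_2", 0, 0, 32)):
    print(name, svo_inode_bits(1, p, r), svo_inode_bits(8, p, r), svo_lnode_bits(q))
```

For SVO$_{1}$ this gives INODE sizes between 64b and 288b and a 32b LNODE; for SVO$_{2}$ the INODE shrinks to between 40b and 264b and the LNODE to 8b.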

An example of encoding a two-dimensional space (for simplicity and greater clarity, therefore using only the lower 4 HTs) into an SVO, with the parameters set to $p = q = 0$ and $r = 8$, is depicted in Figure 2. Here, the root node constructed as an INODE is shown, having three active child nodes, all of them represented as LNODEs. The addresses of the nodes and their components are given below them, in decimal notation.

3.2. Pointerless Sparse Voxel Octrees

In the case of the PSVO data structure, child node pointers are not present in the binary representation of the nodes. The nodes consist exclusively of 8b CHNMs; this applies to both INODEs and LNODEs. In order to represent the relationship between parent and child nodes, the binary representation of the child node is appended immediately after its HT in the parent node.

To formalize the binary representation of PSVOs, we used the BNF:

PSVO ::= <NODE>
NODE ::= <INODE> | <LNODE>
INODE ::= (8)<HT>
HT ::= “0” | “1”<NODE>
LNODE ::= (8)<BIT>
BIT ::= “0” | “1”

For testing purposes, we denoted this version of the data structure PSVO.

An example of encoding a two-dimensional space (for simplicity and greater clarity) into a quadrant tree analogous to PSVO is depicted in Figure 3.
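The PSVO grammar above can be read as a depth-first, bit-level serialization: each HT is written as a single bit, and a ‘1’ in an INODE is immediately followed by the encoding of the corresponding child. The sketch below follows that reading for the 3D case. It takes the active voxels as Morton codes of $3m$ bits; the function name, the bit-string output and the convention that octant 0 is written first are assumptions of this example, not details given in the paper.

```python
def encode_psvo(morton_codes, m: int) -> str:
    """Serialize active voxels (Morton codes, 3*m bits each) as a PSVO bit string."""

    def encode(codes, level):
        # Bits of the code that remain below this node's octant selector.
        shift = 3 * (m - level - 1)
        out = []
        for octant in range(8):
            inside = [c & ((1 << shift) - 1) for c in codes if (c >> shift) == octant]
            if not inside:
                out.append("0")           # passive octant (or passive voxel)
            elif shift == 0:
                out.append("1")           # LNODE: the bit is the voxel itself
            else:
                out.append("1" + encode(inside, level + 1))  # INODE: child follows its HT
        return "".join(out)

    return encode(sorted(set(morton_codes)), 0)


# A 4 x 4 x 4 scene (m = 2) with two active voxels at opposite corners.
bits = encode_psvo([0, 63], m=2)
print(bits, len(bits) // 8, "bytes")
```

Every node contributes exactly eight bits regardless of how many children it has, so the length of the output equals eight times the number of nodes in the tree.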

4. Clustered Sparse Voxel Octrees

The HDS proposed herein, the CSVO, is designed to represent the geometry of a voxelized 3D scene containing $N \times N \times N = N^{3}$ voxels; $N = 2^{m}$, where $m \geq 2$, $N \in \mathbb{N}$ and $m \in \mathbb{N}$. While the nodes of traditional SVOs are hierarchically arranged into $m$ layers, the nodes of the CSVOs are arranged into $m - 1$ layers. The CSVO root node is stored in layer 0 and the CSVO leaf node layer, stored in layer $m - 2$, is equivalent to the last two layers of the SVO. Thus, if $m = 8$, then $N = 2^{8} = 256$ and hence the 3D scene comprises $256^{3}$ voxels. The SVO nodes are then stored in eight levels (numbered 0 to 7) and the CSVO nodes are stored in seven levels (numbered 0 to 6).

The CSVO consists of three kinds of nodes, denoted as follows:

Internal Nodes (INODEs), stored in layers 0 to $m - 4$. Their child node masks, denoted as Long Child Node Masks (LCHNMs), require 16 bits, as each of the eight HTs uses 2 bits. These nodes support 8b, 16b and 32b pointers to child nodes. HT = ‘01’ indicates an 8b pointer, HT = ‘10’ indicates a 16b pointer, and HT = ‘11’ indicates a 32b pointer. HT = ‘00’ indicates that there is no child node and therefore no pointer to this child node.
Pre-Leaf Nodes (PLNODEs), stored in layer $m - 3$. Their CHNMs have 8 bits; each of the HTs has 1 bit. These nodes support 8b pointers to child nodes, whose presence is indicated in the CHNM by setting the HT to ‘1’. HT = ‘0’ indicates that there is no child node and therefore no pointer to this child node.
Leaf Nodes (LNODEs), stored in layer $m - 2$. Their CHNMs have 8 bits; each of the HTs has 1 bit. They do not have pointers to their child nodes. HT = ‘1’ indicates that the corresponding child node (in the form of an 8b CHNM, where each HT represents the activity or passivity of a particular voxel) is appended in an array of child nodes following the CHNM of the LNODE.

If the dimension of the 3D scene is $N \geq 16$, i.e., $m \geq 4$, the CSVO root node is encoded as an INODE; if $N = 8$, i.e., $m = 3$, it is encoded as a PLNODE; and if $N = 4$, i.e., $m = 2$, it is encoded as an LNODE. The child node of an INODE must be either an INODE (if the parent node belongs to levels 0 to $m - 5$) or a PLNODE (if the parent node belongs to level $m - 4$). The child node of a PLNODE must be an LNODE.
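The layer rules above determine the kind of every CSVO node from its layer index and $m$ alone. A minimal sketch of that mapping, with an illustrative function name:

```python
def csvo_node_kind(layer: int, m: int) -> str:
    """Return the node kind stored in a given CSVO layer of a 2^m-per-side scene."""
    if m < 2:
        raise ValueError("a CSVO requires m >= 2")
    if not 0 <= layer <= m - 2:
        raise ValueError("CSVO layers are numbered 0 to m - 2")
    if layer == m - 2:
        return "LNODE"   # covers the last two SVO layers
    if layer == m - 3:
        return "PLNODE"  # parent of LNODEs, 8b pointers only
    return "INODE"       # layers 0 to m - 4


# m = 8 (256^3 voxels): five INODE layers, then one PLNODE and one LNODE layer.
print([csvo_node_kind(layer, 8) for layer in range(7)])
```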

4.1. Internal Node

Each CSVO internal node consists of a 16b “long” child node mask (LCHNM) followed by a concatenated array of child node pointers (PTS). Child nodes are then stored immediately following this parent node, where they are further recursively decomposed to form their own clusters of nodes (for the purposes of this HDS, a cluster is an encoded subtree of the CSVO whose root node is the particular child node). The pointer represents the offset, in bytes, of the start of the child node (and its cluster) from the end of the PTS pointer array of its parent node. This offset is represented by the pointer with the smallest possible number of bits. If the offset value is from the range $\langle 1; 255 \rangle$, it is represented by an 8b pointer and HT = ‘01’ is used in the LCHNM; if it is from the range $\langle 256; 65535 \rangle$, it is represented by a 16b pointer and HT = ‘10’ is used in the LCHNM; and finally, if it is from the range $\langle 65536; 2^{32} - 1 \rangle$, it is represented by a 32b pointer and HT = ‘11’ is used in the LCHNM. The offset of the first child node in the sequence is always 0 and therefore it does not need to have a binary representation in the node, although its HT is set to ‘01’ in the LCHNM. The number of child nodes ranges from 1 to 8; the number of child node pointers in the PTS ranges from 0 to 7.

For example, the internal node depicted in Figure 4 has four active child nodes whose cluster sizes are 27B, 3450B, 72080B and 870B, respectively. Therefore, the first child node cluster has an offset of 0B from the end of the pointer array to the child node cluster and its pointer PT0 is thus omitted from the pointer array (it is indicated in the figure only for illustrative purposes); however, its HT = ‘01’ is present in the LCHNM. The second child node cluster has an offset of 27B, so its pointer PT1 will have an 8b binary representation and its HT set to ‘01’. The third child node cluster has an offset of 3477B (27B + 3450B) and a 16b pointer PT2 with HT = ‘10’. Finally, the fourth child node cluster has an offset of 75557B (27B + 3450B + 72080B), a 32b pointer PT3 and its HT = ‘11’. The size of this node amounts to 9B.
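The pointer selection rule and the worked example above can be checked with a short sketch. Given the cluster sizes of the active child nodes in order, it derives each child's offset, header tag and pointer width, and the resulting size of the INODE itself. The function name and the returned tuple are illustrative choices, not part of the paper.

```python
def layout_inode(cluster_sizes):
    """Return (header_tags, offsets, pointer_bytes, node_size) for a CSVO INODE.

    cluster_sizes: sizes in bytes of the active child clusters, in storage order.
    """
    if not 1 <= len(cluster_sizes) <= 8:
        raise ValueError("an INODE has between 1 and 8 active child nodes")
    tags, offsets, widths = [], [], []
    offset = 0
    for index, size in enumerate(cluster_sizes):
        if size < 1:
            raise ValueError("a cluster occupies at least one byte")
        if index == 0:
            tag, width = "01", 0          # offset 0: tag present, pointer omitted
        elif offset <= 0xFF:
            tag, width = "01", 1          # 8b pointer
        elif offset <= 0xFFFF:
            tag, width = "10", 2          # 16b pointer
        elif offset <= 0xFFFFFFFF:
            tag, width = "11", 4          # 32b pointer
        else:
            raise OverflowError("offset does not fit into a 32b pointer")
        tags.append(tag)
        offsets.append(offset)
        widths.append(width)
        offset += size
    node_size = 2 + sum(widths)           # 16b LCHNM plus the stored pointers
    return tags, offsets, widths, node_size


print(layout_inode([27, 3450, 72080, 870]))
```

For the four clusters of the example, the sketch yields the offsets 0B, 27B, 3477B and 75557B, and a node size of 2B + 1B + 2B + 4B = 9B. Note that the size of the last cluster (870B) does not influence any pointer in this node.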

4.2. Pre-Leaf Node

Each CSVO pre-leaf node consists of an 8b child node mask (CHNM) followed by a concatenated array of child node pointers (PTS). The child nodes are then stored right following this parent node. The pointer represents the offset of the start of the child node from the end of its parent node’s pointer array. This offset is always represented by an 8b pointer and is always assigned HT = ‘1’ in the CHNM (since the maximum size of a child node in the case of PLNODEs cannot exceed 9B and the number of such child nodes is at most 8, the offset of the last child node can be at most 63). By analogy with INODEs, the pointer to the first child node is not encoded, but its HT = ‘1’ is stored in the CHNM. Figure 5 shows an example of a pre-leaf node, having four active child nodes with cluster sizes 5B, 7B, 4B and 9B, respectively. Therefore, the first child node cluster has an offset of 0B from the pointer array to the cluster of child nodes and its pointer is thus not represented in the pointer array; however, its HT is present in the CHNM. The second child node cluster has an offset of 5B; the third child node cluster has an offset of 12B (5B + 7B). Finally, the fourth child node cluster has an offset of 16B (5B + 7B + 4B). The size of this node amounts to 4B.
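A byte-level sketch of the pre-leaf node described above follows. It takes the already encoded LNODE clusters of the active octants and emits the CHNM, the 8b pointers (skipping the first child) and the clusters themselves. Mapping octant i to bit i of the CHNM is an assumption of this example; the paper does not state the bit order here.

```python
def encode_plnode(child_clusters) -> bytes:
    """Encode a CSVO PLNODE together with its child clusters.

    child_clusters: dict mapping octant index (0..7) to the bytes of its LNODE cluster.
    """
    if not child_clusters:
        raise ValueError("a PLNODE has at least one active child node")
    chnm = 0
    pointers = bytearray()
    payload = bytearray()
    for position, octant in enumerate(sorted(child_clusters)):
        cluster = child_clusters[octant]
        if not 0 <= octant <= 7:
            raise ValueError("octant index out of range")
        if not 1 <= len(cluster) <= 9:
            raise ValueError("an LNODE cluster occupies 1 to 9 bytes")
        chnm |= 1 << octant                # assumption: octant i is bit i of the CHNM
        if position > 0:
            pointers.append(len(payload))  # offset from the end of the pointer array
        payload += cluster
    return bytes([chnm]) + bytes(pointers) + bytes(payload)


# Four children with cluster sizes 5B, 7B, 4B and 9B (placeholder contents).
node = encode_plnode({0: bytes(5), 2: bytes(7), 5: bytes(4), 7: bytes(9)})
print(list(node[1:4]), len(node))
```

Since at most seven clusters of at most 9B precede the last child, the largest offset ever appended is 63, so a single byte per pointer always suffices.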

4.3. Leaf Node

Each CSVO leaf node consists of an 8b child node mask (CHNM). It is the equivalent of the CHNM of a node from SVO node layer $m - 2$. However, this CSVO node no longer includes an array of child node pointers (PTS). If an HT is set to ‘1’ in this CHNM, an 8b node is appended after the CHNM; this node already carries information on the geometry of the voxels themselves (i.e., their passivity or activity). The appended node is thus equivalent to the CHNM of a leaf node from SVO node layer $m - 1$. Since these nodes have a constant size of 1B, their offset from the CHNM of an LNODE can be calculated using the CHNM of the particular node itself, without using pointers. Therefore, these pointers are omitted from the LNODEs. If all HTs in the CHNM of an LNODE are set to ‘1’, they indicate that eight further 1B nodes are appended, so the maximum size of an LNODE is 9B.

An example of a CSVO leaf node is depicted in Figure 6, in which the CHNM size is 1B. Four HTs set to ‘1’ indicate the existence of four appended nodes, each 1B long. Thus, in total, the binary representation of this node cluster requires 5B of space.
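Because every appended node is exactly 1B long, the position of a child within an LNODE cluster can be obtained by counting the set HTs that precede it. A minimal sketch of this pointerless lookup, assuming that octant i corresponds to bit i of the CHNM and that children are appended in ascending octant order (both are assumptions of this example):

```python
def lnode_child_offset(chnm: int, octant: int):
    """Byte offset of a child from the start of the LNODE, or None if it is passive."""
    if not (chnm >> octant) & 1:
        return None
    preceding = bin(chnm & ((1 << octant) - 1)).count("1")  # active octants before this one
    return 1 + preceding                                    # skip the 1B CHNM itself


def lnode_cluster_size(chnm: int) -> int:
    """Total size in bytes of an LNODE cluster: the CHNM plus one byte per active HT."""
    return 1 + bin(chnm).count("1")


chnm = 0b01011010  # four active octants: 1, 3, 4 and 6
print(lnode_cluster_size(chnm), [lnode_child_offset(chnm, i) for i in range(8)])
```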

5. Out-of-Core Algorithm for CSVO Creation

The algorithm for constructing CSVOs, proposed herein, allows out-of-core construction of this data structure in two steps. The first step determines CHNMs of CSVO nodes; the second step determines child node pointers of CSVO INODEs.

The input of the algorithm is a list of active voxels read from a file, where each voxel is represented by its Morton address. In this file, the voxels are sorted in ascending order according to the aforementioned Morton address.

The first step of the CSVO construction algorithm is implemented using a modified version of Baert’s algorithm (see [7,8] for details).

Baert’s algorithm compiles SVOs by writing each level of the generated tree into a separate file, in which the nodes are stored in the order in which they can be read from left to right when the graphical representation of the SVO is rendered. Each node is represented by both its child node mask and an array of child node pointers. Its implementation allows the parameters described in Section 3.1 of this paper to be set, i.e., the CHNM alignment for both INODEs and LNODEs and the length of the binary representation of pointers to child nodes, which is constant for the entire HDS in the case of Baert’s algorithm. Using the original Baert’s algorithm, SVO$_{1}$ and SVO$_{2}$ were generated for the test scenes used in this paper (details of the scenes can be seen in Table 1). The obtained sizes of the binary representation of these HDSs can be seen in Table 2.

We made two modifications of Baert’s algorithm. The first modification is that only the second step of Baert’s algorithm is used because the input of this second step is the same as the input of our algorithm. The second modification is that pointers to child nodes are not created and written to the output files because they are determined in the second step of our algorithm.

Through the modification of Baert’s algorithm, only the 8b CHNMs of nodes are determined and written to the output files. This causes a significant reduction in the size of the node’s representation in the output files of intermediate product generated in the first step of our algorithm, compared to the outputs of the classical Baert’s algorithm. For example, in the case of INODEs of SVOs generated by the original Baert’s algorithm, where all node components and thus the entire SVOs are aligned to 32b, the size of the INODEs written to the output file is from 64b (one 8b child node mask aligned to 32b and one 32b pointer to the child node) up to 288b (one 8b child node mask aligned to 32b and eight 32b pointers to the child nodes). Therefore, there is an 8- to 36-fold compression of binary representation for each INODE in our intermediate output files compared to the original Baert’s algorithm. In the case of LNODEs, there are no pointers to child nodes, and the 32b LNODE is replaced by 8b, allowing 4-fold compression.
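The 8- to 36-fold figures follow directly from the node sizes quoted above. The short check below is only a worked calculation under the stated 32b alignment; the constant and function names are illustrative.

```python
CHNM_BITS = 8      # size of a bare child node mask in the intermediate files
ALIGNED_BITS = 32  # CHNM padded to 32b in the fully aligned SVO
POINTER_BITS = 32  # one child node pointer in the fully aligned SVO


def svo_inode_bits(children: int) -> int:
    """Size of an aligned SVO INODE with the given number of child pointers."""
    return ALIGNED_BITS + children * POINTER_BITS


for children in (1, 8):
    bits = svo_inode_bits(children)
    # Ratio against the 8b CHNM that replaces the node in the intermediate file.
    print(children, bits, bits // CHNM_BITS)

# LNODEs carry no pointers, so only the padding is saved.
print(ALIGNED_BITS // CHNM_BITS)
```

The loop prints 64b and 288b with ratios 8 and 36, and the last line prints the 4-fold ratio for LNODEs.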

The reason for using Baert’s algorithm is its out-of-core nature and the ease with which it can be modified so that the generated intermediate product is significantly more compact than the SVO. This intermediate product represents the optimal input for the second step of our algorithm. The size of the binary representation of this product for a specific scene is the same as the size of its PSVO representation, so the obtained results for the test scenes can be viewed in more detail in Table 2. It is possible to compare the size of the resulting product of Baert’s algorithm (SVO$_{1}$ and SVO$_{2}$) and the size of the intermediate output from our modification (equal to the size of PSVO). A comparison of the number and lengths of pointers in the SVO generated by Baert’s algorithm and in our final data structure for each layer of nodes can be seen in Table 3 and Table 4.

The intermediate output of the first step of our algorithm is composed of m node layers, each stored in a separate file and numbered from 0 to m − 1.

The example in Figure 7 shows the generated CHNMs of the SVO nodes, having four node levels. The root node is at level 0. This node and the nodes of levels 1 and 2 are INODEs of the SVO. Level 3 contains the SVO LNODEs. The red arrows, showing the association between the HT of the parent node and the child node, are added to the figure for illustrative purposes only; they are not included in the binary representation of this intermediate product. The relationship of a particular HT of a parent node and a particular child node is determined by the fact that the n-th HT set to 1 (counted across all CHNMs in a particular node layer, from its start) in the parent layer, is associated with the n-th CHNM in the child layer. In the example, level 0 occupies 1B, level 1 occupies 3B, level 2 occupies 4B, and—finally—level 3 occupies 6B. In total, we used 14B.
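The implicit parent-to-child association can be sketched as follows. The layers are held in memory as `bytes` objects, one CHNM per byte, and the mask values are hypothetical: they only reproduce the layer sizes of the example (1B, 3B, 4B and 6B), not the actual contents of Figure 7.

```python
from typing import List, Tuple


def child_ranges(parent_layer: bytes) -> List[Tuple[int, int]]:
    """For each CHNM of the parent layer, return (first_child_index, child_count).

    The n-th HT set to 1, counted across the whole parent layer, refers to
    the n-th CHNM of the child layer, so a running counter is sufficient.
    """
    ranges = []
    next_child = 0
    for chnm in parent_layer:
        count = bin(chnm).count("1")
        ranges.append((next_child, count))
        next_child += count
    return ranges


# Hypothetical masks chosen only to match the layer sizes of the example.
layers = [
    bytes([0b00000111]),                                      # level 0: 1B
    bytes([0b00000001, 0b00000011, 0b00010000]),              # level 1: 3B
    bytes([0b00000011, 0b00000001, 0b10000000, 0b00000110]),  # level 2: 4B
    bytes([0x81, 0x01, 0xFF, 0x10, 0x03, 0x0C]),              # level 3: 6B
]

print(child_ranges(layers[1]))              # [(0, 1), (1, 2), (3, 1)]
print(sum(len(layer) for layer in layers))  # 14
```

No pointer is stored anywhere in this intermediate product; the ranges are recovered purely by counting HTs.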

Step 2 of the algorithm processes the obtained intermediate product in a bottom-up approach and finally generates the CSVO. In its first sub-step, it generates a layer of LNODEs; in the second, a layer of PLNODEs; while, in its third, it generates a layer of INODEs. The third sub-step is then repeated until layer 0 is processed.

In sub-step 1, the leaf node (LNODE) layer of the CSVO is generated. The first CHNM is loaded from node layer $m-2$ (i.e., level 2 in the example depicted in Figure 8). The number of HTs set to ‘1’ in this CHNM determines the number n of nodes from node layer $m-1$ (i.e., level 3 in the example depicted in Figure 8) that will be appended to the loaded CHNM. Then, SIZE is calculated as the size of the resulting LNODE. This value is then written into the result0 output file at 32b (marked in yellow in the example depicted in Figure 8). Then, the CHNM loaded from layer $m-2$ and the n CHNMs from layer $m-1$ are written to the output file. In this way, the algorithm creates the first LNODE. Then, it continues with the next LNODE until the last CHNM from layer $m-2$ is processed. The result of this sub-step is the file result0 (an eponymous file is shown in the example depicted in Figure 8).
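Sub-step 1 can be sketched in a few lines. This is an in-memory illustration rather than the out-of-core implementation: both layers are passed as `bytes`, the 32b SIZE field is written little-endian (an assumption, the byte order is not stated here), and the function name and example masks are hypothetical.

```python
import struct


def build_lnode_records(layer_m2: bytes, layer_m1: bytes) -> bytes:
    """Return the content of result0: one SIZE-prefixed LNODE per CHNM of layer m-2."""
    out = bytearray()
    next_child = 0
    for chnm in layer_m2:
        n = bin(chnm).count("1")                  # HTs set to 1
        children = layer_m1[next_child:next_child + n]
        if len(children) != n:
            raise ValueError("layer m-1 is shorter than the HTs of layer m-2 require")
        next_child += n
        size = 1 + n                              # 1B CHNM plus n appended 1B nodes
        out += struct.pack("<I", size)            # 32b SIZE (little-endian assumed)
        out.append(chnm)
        out += children
    return bytes(out)


# Hypothetical layers: four CHNMs in layer m-2 referring to six CHNMs in layer m-1.
layer_m2 = bytes([0b00000011, 0b00000001, 0b10000000, 0b00000110])
layer_m1 = bytes([0x81, 0x01, 0xFF, 0x10, 0x03, 0x0C])
result0 = build_lnode_records(layer_m2, layer_m1)
print(len(result0))  # 26
```

The 26B consist of four 4B SIZE fields plus the 10B of LNODE payload; the SIZE fields are temporary bookkeeping that the next sub-step consumes.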

In sub-step 2, a pre-leaf node (PLNODE) layer of the CSVO is generated. The first CHNM is loaded from node layer $m-3$ (i.e., level 1 in the example depicted in Figure 9). The number of HTs set to ‘1’ in this CHNM determines the number n of clusters of result0 (also labeled result0 in the example depicted in Figure 9) that will be appended to the loaded CHNM. From each such cluster, the values of SIZE are retrieved, to calculate the SIZE value of the generated output node cluster. This information is written to the result1 output file (result1 in the example depicted in Figure 9) at 32b; then, the following are written to the output file: the CHNM loaded from layer $m-3$ and the generated child node pointers that can be determined from the cluster SIZEs loaded from result0. Subsequently, the n clusters read from the result0 layer are written to the file (without their SIZE information though). In this way, the first PLNODE cluster is created; the algorithm then continues with the next one, until the last CHNM of layer $m-3$ is processed. The result of this sub-step is the result1 file (an eponymous file appears in the example depicted in Figure 9).
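A rough sketch of sub-step 2 is given below. Several details are assumptions made for illustration only: the data are processed in memory, SIZE is stored as a little-endian 32b value, every pointer is 8b wide (as Table 4 reports for the PLNODE level of the “Angel Lucy” $128^{3}$ scene), the pointer to the first child is omitted, and each remaining pointer holds the byte offset of its child relative to the first child. The exact pointer semantics of the CSVO may differ; all names are hypothetical.

```python
import struct
from typing import Iterator


def read_clusters(result: bytes) -> Iterator[bytes]:
    """Yield the payload of each SIZE-prefixed cluster stored in a result file."""
    pos = 0
    while pos < len(result):
        (size,) = struct.unpack_from("<I", result, pos)
        pos += 4
        yield result[pos:pos + size]
        pos += size


def build_plnode_records(layer_m3: bytes, result0: bytes) -> bytes:
    """Return the content of result1: one SIZE-prefixed PLNODE cluster per CHNM."""
    clusters = read_clusters(result0)
    out = bytearray()
    for chnm in layer_m3:
        n = bin(chnm).count("1")
        children = [next(clusters) for _ in range(n)]
        pointers = bytearray()
        offset = 0
        for child in children[:-1]:
            offset += len(child)
            pointers.append(offset)   # assumed 8b offset of the following child
        size = 1 + len(pointers) + sum(len(c) for c in children)
        out += struct.pack("<I", size)
        out.append(chnm)
        out += pointers
        for child in children:
            out += child              # child clusters without their SIZE fields
    return bytes(out)


# Hypothetical input: four LNODE clusters and three CHNMs of layer m-3.
lnodes = [b"\x03\x81\x01", b"\x01\xff", b"\x80\x10", b"\x06\x03\x0c"]
result0 = b"".join(struct.pack("<I", len(c)) + c for c in lnodes)
layer_m3 = bytes([0b00000001, 0b00000011, 0b00010000])
result1 = build_plnode_records(layer_m3, result0)
print([c.hex() for c in read_clusters(result1)])
```

Because an LNODE is at most 9B long, the offsets of up to seven preceding siblings always fit into 8b under these assumptions.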

In sub-step 3, the internal node (INODE) layer of the CSVO is generated. The first CHNM is loaded from node layer $m-4$ (i.e., level 0 in the example depicted in Figure 10). From the number of HTs set to ‘1’, n is calculated as the number of clusters of result1 (also labeled result1 in the example depicted in Figure 10) that will be appended to the loaded CHNM. From each such cluster, the values of SIZE are retrieved, to calculate the SIZE of the output node cluster. This information is written to the result2 output file (result2 in the example depicted in Figure 10) at 32b. Subsequently, the LCHNM is generated from the loaded CHNM and written to the output file; then, the child node pointers are generated and written to the output file. Subsequently, the n clusters read from the result1 layer are appended to the file (without their SIZE information, though). In this way, the first INODE and its cluster are created; the algorithm then continues with the next one, until the last CHNM of layer $m-4$ is processed. The result of this sub-step is the result2 file (an eponymous file appears in the example depicted in Figure 10).

If the intermediate file processed in sub-step 3 represented level 0 (i.e., it contained the root node), no SIZE information is added to the generated result file and the obtained result file contains the final CSVO. If the intermediate file containing the root node has not been processed, the algorithm repeats sub-step 3 to iteratively process the next node layer of the intermediate product, along with the last generated result file, which leads to the generation of another result file.
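The order in which step 2 consumes the intermediate layers can be summarized with a small planning sketch. The naming of result files beyond result2 (result3, result4, ...) and the restriction to at least four layers are assumptions of this sketch; the function name is hypothetical.

```python
from typing import List, Optional, Tuple

PlanEntry = Tuple[int, List[int], Optional[str], str]


def step2_schedule(m: int) -> List[PlanEntry]:
    """Return (sub_step, layers_read, result_read, result_written) for m node layers."""
    if m < 4:
        raise ValueError("this sketch only covers trees with at least four node layers")
    plan: List[PlanEntry] = [
        (1, [m - 2, m - 1], None, "result0"),   # LNODE layer
        (2, [m - 3], "result0", "result1"),     # PLNODE layer
    ]
    k = 1
    for layer in range(m - 4, -1, -1):           # INODE layers, down to the root
        plan.append((3, [layer], f"result{k}", f"result{k + 1}"))
        k += 1
    return plan


for entry in step2_schedule(4):
    print(entry)
```

For the four-layer example, sub-step 3 runs once and result2 is the final CSVO; for a seven-layer tree such as a $128^{3}$ scene, it would run four times under the same scheme.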

6. Results and Discussion

This section summarizes the results of the comparison of the proposed CSVO (compiled using the algorithm proposed herein) with two SVO versions and a single PSVO version. In the first part of this section, we present the testing datasets we used to obtain the results shown in the next part of the section; then, in the final part of the section, we discuss sources of increase of the compression ratio within CSVOs, compared to other HDSs.

6.1. Datasets

The three-dimensional test scenes were created from 3D polygonal models, originally saved in the Wavefront Technologies OBJ geometry definition file format. These models include “Angel Lucy”, consisting of 488,880 triangles; “Skull”, containing 80,016 triangles; and “Porsche”, containing 22,011 triangles. These models were embedded into scenes and these were then voxelized to various resolutions, ranging from $128^{3}$ to $4096^{3}$ (4K$^{3}$) voxels. This resulted in 18 voxelized scenes.

Subsequently, we created separate geometry representations of each and every scene involved. Every representation had the form of a regular 3D grid of scalar values, with the same grid dimensions as the corresponding voxelized scene, using a scalar value size of 1b. Thus, in this uncompressed form, describing the geometry of the scenes required 1b/vox. Passive (empty) voxels were represented as 0s, while active (filled) voxels were represented as 1s.
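To give a sense of scale, the size of such an uncompressed 1b/vox grid can be computed directly from the resolution. The function name is illustrative.

```python
def raw_grid_bytes(resolution: int) -> int:
    """Size in bytes of a cubic grid that stores one bit per voxel."""
    return resolution ** 3 // 8


for resolution in (128, 4096):
    print(resolution, raw_grid_bytes(resolution))
```

The smallest test resolution needs 262,144 B (256 KB with K = 1024), while the $4096^{3}$ grid needs 8 × 1024$^{3}$ B regardless of how few voxels are active.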

The proportion of active voxels in the test scenes ranged from 3.53% (in the case of the “Skull” model, voxelized to a resolution of $128^{3}$) to 0.03% (in the case of the “Angel Lucy” model, voxelized to a resolution of $4096^{3}$, i.e., 4K$^{3}$). In contrast, the absolute number of active voxels was the smallest at the lowest resolution (the “Angel Lucy” model, voxelized to a $128^{3}$ resolution, consisted of $22.48 \times 10^{3}$ active voxels) and the largest in the case of the “Skull” model, voxelized to a resolution of $4096^{3}$ (4K$^{3}$), consisting of $64.61 \times 10^{6}$ active voxels.

We used the Morton Space Filling Curve (MSFC) to linearize the data.

The detailed parameters of the respective 3D scenes are shown in Table 1; their visualizations are depicted in Figure 11.

Then, for each active voxel of the particular scene—using its x, y, and z coordinates—we calculated its Morton coordinate, representing its location in the scene, as shown in the example depicted in Figure 12: here, a 24b Morton coordinate is constructed from three 8b coordinates. All active voxels of the scene were then sorted in ascending order and stored in a file with the extension *.pts.
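The construction of a Morton coordinate by bit interleaving can be sketched as follows. The bit order within each triple (x in the lowest position, then y, then z) is an assumption; Figure 12 may use a different order. The function name and the sample voxels are hypothetical.

```python
def morton_encode(x: int, y: int, z: int, bits: int = 8) -> int:
    """Interleave the low `bits` bits of x, y and z into one Morton coordinate."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)       # assumed: x is the lowest bit of a triple
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code


# Hypothetical active voxels, sorted in ascending Morton order as the input file requires.
voxels = [(3, 0, 0), (0, 0, 1), (1, 1, 0), (0, 1, 0)]
ordered = sorted(voxels, key=lambda v: morton_encode(*v))
print(ordered)
```

Three 8b coordinates yield a 24b code, so the largest possible value is $2^{24} - 1$.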

6.2. Test Results

The test datasets represent the geometry of the respective scenes as lists consisting of only active voxels, represented as their 64b Morton addresses (constructed as described in Section 6.1). The voxels in the dataset were sorted in ascending order, according to the value of this address. Using the algorithm proposed by Baert et al., SVO$_{1}$ and SVO$_{2}$ were constructed for each scene; later, the PSVO structure was also created, as described in Section 3.1 and Section 3.2 hereof, respectively. The sizes of the binary representations of these HDSs were then compared with the size of the binary representation of the CSVO structure, described in Section 4 and compiled using the algorithm described in Section 5. The tests were performed on a computer with an Intel(R) Core(TM) i5-3470 CPU @ 3.20 GHz and 8 GB RAM, running Debian Linux version 4.19.0-6 and gcc version 8.3.0.

The achieved results, i.e., the size of the binary representation of the aforementioned HDSs and the achieved relative compression ratios between these HDSs and the CSVO proposed herein, are summarized in Table 2.

As is evident from Table 2, the size of the binary representation of the CSVO data structure exceeds that of the PSVO data structure. The relative compression ratio (CR), measured as the ratio of the sizes of the PSVO and CSVO data structures (denoted as PSVO/CSVO CR in Table 2), ranges from 0.82 to 0.85. However, it should be noted that the PSVO data structure, due to the absence of child node pointers, is not easily traversable. Compared to SVO$_{1}$, having all of its parts aligned to 32b, the CSVO data structure was 6.57 to 6.82 times more compact. Compared to SVO$_{2}$, not having all of its parts aligned to 32b, the CSVO data structure was 4.11 to 4.27 times more compact.
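As a worked example of how the relative compression ratios are obtained, the snippet below recomputes them for the “Angel Lucy” $128^{3}$ scene from the sizes listed in Table 2. The dictionary keys are illustrative labels.

```python
# Sizes in KB for "Angel Lucy" at 128^3, copied from Table 2.
sizes_kb = {"PSVO": 6.67, "SVO1": 53.38, "SVO2": 33.36, "CSVO": 7.96}

ratios = {
    name: round(size / sizes_kb["CSVO"], 2)
    for name, size in sizes_kb.items()
    if name != "CSVO"
}
print(ratios)
```

A ratio below 1 (PSVO) means the CSVO is larger than the compared structure, whereas ratios above 1 (both SVO variants) mean it is more compact.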

With the increasing voxel resolution of the model, the compression ratio of the CSVO, compared to the other HDSs, gradually decreased. This is due to the increasing volume of the binary representation of the CSVO and the associated increasing offset of the child nodes from their parent nodes at higher levels of the HDS (i.e., closer to the root). The binary representations of the pointers to these child nodes are longer in this case. A deeper analysis of the number of pointers of various lengths in the respective levels of the tree of SVO$_{1}$, SVO$_{2}$ and CSVO for the “Angel Lucy” model voxelized to a resolution of $128^{3}$ is shown in Table 3, Table 4 and Table 5.

6.3. Compression Gains

The sources of compression of the binary representation of CSVO that allow for outperforming the compared SVOs include:

compression of the child node mask representation;
omitting a significant number of child node pointers;
shortening a significant number of 32b child node pointers to 8b and 16b.

One of the most significant sources of compression (in terms of binary representation) when using the CSVO instead of SVO$_{1}$ is the removal of reserved bits appended to the CHNMs, since in both the INODE and LNODE of this SVO there are up to 24 reserved bits appended to the 8b CHNMs. In the CSVO, both the LNODEs and the PLNODEs contain only 8b CHNMs, which allows for achieving up to 4-fold compression of the representation of this part of the nodes. The INODE of the CSVO uses a 16b LCHNM, which leads to a 2-fold compression of this part of the node. Compared to SVO$_{1}$, SVO$_{2}$ is more compact, precisely because it already includes this optimization by omitting the reserved bits. In this respect, however, the CSVO (using 16b LCHNMs) loses against SVO$_{2}$ (using 8b CHNMs).

The CSVO LNODE design allows the encoding of the parent node’s CHNM and the associated CHNMs of the child nodes of the last two SVO levels while omitting the pointers to these child nodes. In the case of the “Angel Lucy” $128^{3}$ model, this allowed omitting no fewer than 5291 of all 6832 pointers (77.44%) from the binary representation of the HDS.

In CSVO, the design of the INODEs and PLNODEs omits the binary representation of the pointer to the first child node in the sequence. Since each of these nodes must have at least one child node, a significant number of child node pointers can be omitted in this way. In the case of the “Angel Lucy” $128^{3}$ model, this allowed for omitting an additional 349 pointers (5.11% of all pointers).

Finally, the increase in the range of lengths of the binary representation of child node pointers is also an important source of compression: in CSVO, the 32b pointers of SVO nodes can be represented not only as 32b pointers, but also as 8b and 16b pointers. In the case of the “Angel Lucy” $128^{3}$ model, no fewer than 1192 pointers (17.44%) have been replaced by shorter pointers: in CSVO, 1150 were represented as 8b pointers and 42 as 16b pointers. Due to the low resolution of this model, 32b pointers were not used at all.
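The pointer counts quoted in this section can be cross-checked against Table 3 and Table 4 with a short calculation. The variable names are illustrative; the numbers are those reported for the “Angel Lucy” $128^{3}$ scene.

```python
total_pointers = 6832        # child node pointers in the SVO (Table 3)
omitted_by_lnodes = 5291     # pointers made unnecessary by the LNODE design
omitted_first_child = 349    # first-child pointers omitted in INODEs and PLNODEs
pointers_8b = 1150           # pointers stored as 8b in the CSVO (Table 4)
pointers_16b = 42            # pointers stored as 16b in the CSVO (Table 4)

# Every SVO pointer is either omitted or stored in a shorter form.
accounted = omitted_by_lnodes + omitted_first_child + pointers_8b + pointers_16b

svo_pointer_bytes = total_pointers * 4                  # all pointers are 32b in the SVO
csvo_pointer_bytes = pointers_8b * 1 + pointers_16b * 2

omitted_share = 100 * (omitted_by_lnodes + omitted_first_child) / total_pointers

print(accounted == total_pointers)
print(svo_pointer_bytes, csvo_pointer_bytes)
print(round(omitted_share, 2))
```

The pointer storage shrinks from 27,328 B to 1234 B, and the combined share of omitted pointers matches the 82.55% stated in the abstract.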

7. Conclusions

This paper discussed domain-specific hierarchical data structures designed for representing the geometry of voxelized 3D scenes, sparsely populated with active voxels. The aim of the paper was to investigate the potential of using the information on the distance of the child nodes of a hierarchical data structure from their parent nodes when linearizing the structure and encoding this information into the child node pointers. This, together with optimizing the count and length of the binary representation of these pointers, allowed us to design a new way of HDS encoding—CSVO—together with a new out-of-core construction algorithm.

Compared to SVO$_{1}$, having all of its parts aligned to 32b, the CSVO data structure was 6.57 to 6.82 times more compact. Compared to SVO$_{2}$, not having all of its parts aligned to 32b, the CSVO data structure was 4.11 to 4.27 times more compact. We got significantly closer to the size of the PSVO, which does not implement any child node pointers, and compared to which the CSVO was larger by only 17% to 22% (the relative compression ratio ranged from 0.82 to 0.85) in the tests performed using our testing datasets. With the increasing voxel resolution of the model, the compression ratio of the CSVO, compared to PSVO and SVO, gradually decreased slightly.

In the context of the proposed HDS, the potential of common subtree merging has not yet been explored. In future research, applying it together with the principles presented herein could allow the construction of an even more compact HDS in the form of a directed acyclic graph.

Author Contributions

Supervision, B.M.; Conceptualization, B.M.; Methodology, B.M.; Investigation, N.Á.; Software, B.M. and N.Á.; Formal Analysis, B.M.; Validation, E.C.; Data Curation, M.C.; Resources, E.C.; Visualization, B.M. and E.C.; Writing—Original Draft, B.M.; Writing—Review and Editing, N.Á.; Funding Acquisition, M.C.; Project Administration, E.C. All authors have read and agreed to the published version of the manuscript.

Funding

This publication has been published with the support of the Operational Program Integrated Infrastructure within project: Research in the SANET Network and Possibilities of Its Further Use and Development (ITMS code: 313011W988), co-financed by the ERDF.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data and program implementation of respective algorithms supporting reported results can be found on the site hds.madosonline.sk.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:

2D	2-Dimensional
2DTE	2-Dimensional Template-based Encoding
3D	3-Dimensional
b/vox	bits per voxel
BNF	Backus–Naur Form
CR	Compression Ratio
CSM	Common Subtree Merge
CSM-Quadtree	Common Subtree Merge Quadtree
CSVO	Clustered Sparse Voxel Octrees
DAG	Directed Acyclic Graphs
DVR	Direct Volume Rendering
ESVO	Efficient Sparse Voxel Octrees
FBC	Frequency Based Compaction
GPU	Graphics Processing Unit
HDS	Hierarchical Data Structure
HSFC	Hilbert Space Filling Curve
HT	Header Tag
CHNM	Child Node Mask
INODE	Internal Node
K	Kilo (1024)
LCHNM	Long Child Node Mask
LNODE	Leaf Node
LSVDAG	Lossy Sparse Voxel Directed Acyclic Graphs
MSFC	Morton Space Filling Curve
OBJ	Object
PLNODE	Pre-Leaf Node
PSVDAG	Pointerless Sparse Voxel Directed Acyclic Graphs
PSVO	Pointerless Sparse Voxel Octrees
PT	Pointer
PTS	Pointers
SFC	Space Filling Curve
SSVDAG	Symmetry-aware Sparse Voxel Directed Acyclic Graphs
SVDAG	Sparse Voxel Directed Acyclic Graphs
SVO	Sparse Voxel Octrees
SYM	Symbol

References

Balsa Rodríguez, M.; Gobetti, E.; Iglesias Guitián, J.A.; Makhinya, M.; Marton, F.; Pajarola, R.; Suter, S.K. State-of-the-Art in Compressed GPU-Based Direct Volume Rendering. Comput. Graph. Forum 2014, 33, 77–100.
Sagan, H. Space-Filling Curves; Springer-Science+Business Media, LLC: New York, NY, USA, 1994; p. 194.
Morton, G.M. A Computer Oriented Geodetic Data Base and a New Technique in File Sequencing; Research Report; International Business Machines Corporation (IBM): Ottawa, ON, Canada, 1966; p. 20. Available online: dominoweb.draco.res.ibm.com/reports/Morton1966.pdf (accessed on 15 August 2022).
Hilbert, D. Via the Continuous Mapping of a Line onto a Patch of Area (Über die stetige Abbildung einer Linie auf ein Flächenstück). Dritter Band: Analysis Grundlagen der Mathematik Physik Verschiedenes; Springer: Berlin/Heidelberg, Germany, 1935.
Laine, S.; Karras, T. Efficient Sparse Voxel Octrees. In Proceedings of the 2010 ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games (I3D ’10), Redmond, WA, USA, 19–21 February 2010; pp. 55–63.
Laine, S.; Karras, T. Efficient Sparse Voxel Octrees—Analysis, Extensions, and Implementation; NVIDIA Technical Report NVR-2010-001; NVIDIA Corporation: Santa Clara, CA, USA, 2010; p. 30.
Baert, J.; Lagae, A.; Dutré, P. Out-of-Core Construction of Sparse Voxel Octrees. In Proceedings of the 5th High-Performance Graphics Conference (HPG ’13), Anaheim, CA, USA, 19–21 July 2013; pp. 27–32.
Baert, J.; Lagae, A.; Dutré, P. Out-of-Core Construction of Sparse Voxel Octrees. Comput. Graph. Forum 2014, 33, 220–227.
Pätzold, M.; Kolb, A. Grid-free out-of-core voxelization to sparse voxel octrees on GPU. In Proceedings of the 7th Conference on High-Performance Graphics (HPG ’15), Los Angeles, CA, USA, 7–9 August 2015; pp. 95–103.
Kämpe, V.; Sintorn, E.; Assarsson, U. High Resolution Sparse Voxel DAGs. ACM Trans. Graph. 2013, 32, 1–13.
Williams, R.B. Moxel DAGs: Connecting Material Information to High Resolution Sparse Voxel DAGs. Master’s Thesis, California Polytechnic State University, San Luis Obispo, CA, USA, 2015.
Dado, B.; Timothy, R.K.; Bauszat, P.; Thiery, J.-M.; Eisemann, E. Geometry and Attribute Compression for Voxel Scenes. In Proceedings of the 37th Annual Conference of the European Association for Computer Graphics, Lisbon, Portugal, 9–13 May 2016; pp. 397–407.
Dolonius, D.; Sintorn, E.; Kämpe, V.; Assarsson, U. Compressing Color Data for Voxelized Surface Geometry. In Proceedings of the 21st ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games, San Francisco, CA, USA, 25–27 February 2017; p. 10.
Dolonius, D.; Sintorn, E.; Kämpe, V.; Assarsson, U. Compressing Color Data for Voxelized Surface Geometry. IEEE Trans. Vis. Comput. Graph. 2019, 25, 1270–1282.
Sintorn, E.; Kämpe, V.; Olsson, O.; Assarsson, U. Compact precomputed voxelized shadows. ACM Trans. Graph. 2014, 33, 8.
Kämpe, V.; Sintorn, E.; Assarsson, U. Fast, Memory-Efficient Construction of Voxelized Shadows. In Proceedings of the 19th Symposium on Interactive 3D Graphics and Games, San Francisco, CA, USA, 27 February–1 March 2015; pp. 25–30.
Villanueva, A.J.; Marton, F.; Gobetti, E. Symmetry-aware Sparse Voxel DAGs. In Proceedings of the 20th ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games (I3D ’16), Redmond, WA, USA, 27–28 February 2016; pp. 7–14.
Villanueva, A.J.; Marton, F.; Gobetti, E. Symmetry-aware Sparse Voxel DAGs (SSVDAGs) for compression-domain tracing of high-resolution geometric scene. J. Comput. Graph. Tech. (JCGT) 2017, 6, 30.
Vokorokos, L.; Madoš, B.; Bilanová, Z. PSVDAG: Compact Voxelized Representation of 3D Scenes Using Pointerless Sparse Voxel Directed Acyclic Graphs. Comput. Inform. 2020, 39, 587–616.
Madoš, B.; Ádám, N. Transforming Hierarchical Data Structures—A PSVDAG—SVDAG Conversion Algorithm. Acta Polytech. Hung. 2021, 18, 47–66.
van der Laan, R.; Scandolo, L.; Eisemann, E. Lossy Geometry Compression for High Resolution Voxel Scenes. Proc. ACM Comput. Graph. Interact. Tech. 2020, 3, 13.
Careil, V.; Billeter, M.; Eisemann, E. Interactively Modifying Compressed Sparse Voxel Representations. Comput. Graph. Forum 2020, 39, 111–119.
Kämpe, V.; Rasmuson, S.; Billeter, M.; Sintorn, E.; Assarsson, U. Exploiting Coherence in Time-Varying Voxel Data. In Proceedings of the 20th ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games, Redmond, WA, USA, 27–28 February 2016; pp. 15–21.
Martinek, M.; Thiemann, P.; Stamminger, M. Spatio-temporal filtered motion DAGs for path-tracing. Comput. Graph. 2021, 99, 224–233.
Zhang, J.; Owen, C.B. Octree-Based Animated Geometry Compression. In Proceedings of the Conference on Data Compression, Washington, DC, USA, 23–25 March 2004; p. 508.
Zhang, J.; Xu, J. Optimizing Octree Motion Representation for 3D Animation ACM-SE 44. In Proceedings of the 44th annual Southeast Regional Conference, Melbourne, FL, USA, 10–12 March 2006; pp. 50–55.
Zhang, J.; Xu, J.; Yu, H. Octree-Based 3D Animation Compression with Motion Vector Sharing. In Proceedings of the 2007 4th International Conference on Information Technology New Generations, Las Vegas, NV, USA, 2–4 April 2007; pp. 202–207.

Figure 1. Examples of SFCs: (a) 1-level MSFC; (b) 2-level MSFC; (c) 1-level HSFC; (d) 2-level HSFC.

Figure 2. An example of encoding a 2D space into an SVO, with the parameters set to $p = q = 0$ and $r = 8$, where (a) is a $4 \times 4$ pixel 2D-scene; (b) is a binary representation of the SVO with addresses marked in red and represented in decimal notation for simplicity and better visualization.

Figure 3. A 2D-space discretized into $4 \times 4$ pixels: (a) with the geometry marked, with active pixels marked in red; (b) depicted as a quadrant tree; (c) depicted in its binary representation, with the decomposition of the respective quadrants indicated using parentheses for better visualization; (d) the final binary representation of the quadrant tree.

Figure 4. Example of a CSVO internal node with four clusters of child nodes.

Figure 5. Example of a CSVO pre-leaf node with four clusters of child nodes.

Figure 6. Example of a CSVO leaf node.

Figure 7. Example of an intermediate product, generated by the first step of the CSVO compilation algorithm.

Figure 8. Example implementation of sub-step 1 of step 2 of the CSVO-generation algorithm.

Figure 9. Example implementation of sub-step 2 of step 2 of the CSVO-generation algorithm.

Figure 10. Example implementation of sub-step 3 of step 2 of the CSVO-generation algorithm.

Figure 11. Visualization of the voxelized scenes used for testing purposes: (a) “Angel Lucy” at $512^{3}$; (b) “Skull” at $512^{3}$; (c) “Porsche” at $512^{3}$; (d) detail of “Angel Lucy” at $256^{3}$; (e) detail of “Angel Lucy” at $512^{3}$; (f) detail of “Angel Lucy” at $1024^{3}$ (1K$^{3}$).

Figure 12. Transformation of the 8b x, y, and z voxel coordinates (representing the location of the voxel in the scene) to the Morton coordinate m represented using 24b.

Table 1. Characteristics of the 3D scenes created by embedding polygonal surface models stored in the WaveFront Technologies OBJ geometry definition file format into these scenes and then voxelizing them to various resolutions. This table summarizes the total number of voxels in the scenes, the number of active voxels and their percentage considering the total number of voxels in the scenes, for each model and resolution.

Resolution
Resolution	128 $^{3}$	256 $^{3}$	512 $^{3}$	1024 $^{3}$	2048 $^{3}$	4096 $^{3}$
Voxels [10 $^{6}$ ]	2	16	128	1024	8192	65,536
Angel Lucy—488,880 triangles
Active voxels [10 $^{3}$ ]	22.48	91.52	366.58	1453.10	5685.86	21,656.43
[%]	1.07	0.55	0.27	0.14	0.07	0.03
Skull—80,016 triangles
Active voxels [10 $^{3}$ ]	74.10	298.85	1192.04	4688.08	17,958.71	64,608.51
[%]	3.53	1.78	0.89	0.44	0.21	0.09
Porsche—22,011 triangles
Active voxels [10 $^{3}$ ]	54.20	233.04	969.11	3938.35	15,539.54	58,673.98
[%]	2.58	1.39	0.72	0.37	0.18	0.09

Table 2. The size of each hierarchical data structure and the relative compression ratios between these data structures for the respective models and resolutions, to which the scenes were voxelized.

Angel Lucy
Resolution [vox]	128 $^{3}$	256 $^{3}$	512 $^{3}$	1024 $^{3}$	2048 $^{3}$	4096 $^{3}$
PSVO [KB]	6.67	28.63	118.00	475.99	1895.03	7447.63
SVO $_{1}$ [KB]	53.38	229.01	943.99	3807.90	15,160.25	59,581.02
SVO $_{2}$ [KB]	33.36	143.13	589.99	2379.94	9475.16	37,238.14
CSVO [KB]	7.96	34.31	142.25	575.39	2295.20	9038.44
PSVO/CSVO CR	0.84	0.83	0.83	0.83	0.83	0.82
SVO $_{1}$ /CSVO CR	6.71	6.67	6.64	6.62	6.61	6.59
SVO $_{2}$ /CSVO CR	4.19	4.17	4.15	4.14	4.13	4.12
Skull
Resolution [vox]	128 $^{3}$	256 $^{3}$	512 $^{3}$	1024 $^{3}$	2048 $^{3}$	4096 $^{3}$
PSVO [KB]	23.25	95.62	387.46	1551.56	6129.77	23,667.58
SVO $_{1}$ [KB]	186.00	764.93	3099.65	12,412.50	49,038.14	189,340.60
SVO $_{2}$ [KB]	116.25	478.08	1937.28	7757.81	30,648.84	118,337.87
CSVO [KB]	27.94	115.29	468.07	1877.54	7432.99	28,803.50
PSVO/CSVO CR	0.83	0.83	0.83	0.83	0.82	0.82
SVO $_{1}$ /CSVO CR	6.66	6.63	6.62	6.61	6.60	6.57
SVO $_{2}$ /CSVO CR	4.16	4.15	4.14	4.13	4.12	4.11
Porsche
Resolution [vox]	$128^{3}$	$256^{3}$	$512^{3}$	$1024^{3}$	$2048^{3}$	$4096^{3}$
PSVO [KB]	14.60	67.53	295.11	1241.51	5087.56	20,262.89
SVO $_{1}$ [KB]	116.79	540.25	2360.88	9932.07	40,700.44	162,103.10
SVO $_{2}$ [KB]	72.99	337.66	1475.55	6207.54	25,437.77	101,314.43
CSVO [KB]	17.11	80.24	352.78	1492.16	6135.46	24,539.74
PSVO/CSVO CR	0.85	0.84	0.84	0.83	0.83	0.83
SVO $_{1}$ /CSVO CR	6.82	6.73	6.69	6.66	6.63	6.61
SVO $_{2}$ /CSVO CR	4.27	4.21	4.18	4.16	4.15	4.13

Table 3. Parameters of SVO$_{1}$ and SVO$_{2}$ in the case of the “Angel Lucy” model voxelized to a $128^{3}$ resolution, with the details of the number and total size (in bytes) of child node masks (CHNMs) and child node pointers (PTs) for each level of the tree.

Lucy $128^{3}$		SVO $_{1}$			SVO $_{2}$
Lucy $128^{3}$		CHNM	PT	Sum [B]	CHNM	PT	Sum [B]
SVO level 0	number	1	3		1	3
SVO level 0	size [B]	4	12	16	1	12	13
SVO level 1	number	3	16		3	16
SVO level 1	size [B]	12	64	76	3	64	67
SVO level 2	number	16	62		16	62
SVO level 2	size [B]	64	248	312	16	248	264
SVO level 3	number	62	267		62	267
SVO level 3	size [B]	248	1068	1316	62	1068	1130
SVO level 4	number	267	1193		267	1193
SVO level 4	size [B]	1068	4772	5840	267	4772	5039
SVO level 5	number	1193	5291		1193	5291
SVO level 5	size [B]	4772	21,164	25,936	1193	21,164	22,357
SVO level 6	number	5291	0		5291	0
SVO level 6	size [B]	21,164	0	21,164	5291	0	5291
	Sum [B]	27,332	27,328	54,660	6833	27,328	34,161

Table 4. Parameters of CSVO for the “Angel Lucy” model voxelized to a $128^{3}$ resolution, with the details of the number and total size of child node masks (CHNMs) and child node pointers (PTs) of various lengths for each level of the tree. Column “0b PT” indicates the number of omitted pointers. CSVO level 5 is equivalent to two SVO levels, i.e., SVO levels 5 and 6.

Lucy $128^{3}$		CSVO
Lucy $128^{3}$		CHNM	0b PT	8b PT	16b PT	32b PT	Sum [B]
CSVO level 0	number	1	1	0	2	0
CSVO level 0	size [B]	2	0	0	4	0	6
CSVO level 1	number	3	3	1	12	0
CSVO level 1	size [B]	6	0	1	24	0	31
CSVO level 2	number	16	16	18	28	0
CSVO level 2	size [B]	32	0	18	56	0	106
CSVO level 3	number	62	62	205	0	0
CSVO level 3	size [B]	124	0	205	0	0	329
CSVO level 4	number	267	267	926	0	0
CSVO level 4	size [B]	267	0	926	0	0	1193
CSVO level 5 (SVO level 5)	number	1193	0	0	0	0
CSVO level 5 (SVO level 5)	size [B]	1193	0	0	0	0
CSVO level 5 (SVO level 6)	number	5291	0	0	0	0
CSVO level 5 (SVO level 6)	size [B]	5291	0	0	0	0	6484
	Sum [B]	6915	0	1150	84	0	8149
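
The per-level sums of Table 4 follow from the counts and the pointer widths. The sketch below is illustrative rather than the authors' code. The mask widths (2 B on CSVO levels 0 to 3, 1 B on levels 4 and 5) are inferred by dividing the CHNM sizes in the table by the CHNM counts, and the names `CSVO_LEVELS` and `csvo_level_size` are hypothetical. The last part estimates the share of omitted pointers under one further assumption that the table does not state explicitly: the 5291 pointers from SVO level 5 to SVO level 6 are counted as omitted, because CSVO level 5 stores both levels in one cluster.

```python
# Per-level counts copied from Table 4 ("Angel Lucy", 128^3).
# Each row: (mask count, mask width in bytes, {pointer width in bits: count}).
# Mask widths are inferred from the table (CHNM size / CHNM number).
CSVO_LEVELS = [
    (1, 2, {0: 1, 8: 0, 16: 2, 32: 0}),
    (3, 2, {0: 3, 8: 1, 16: 12, 32: 0}),
    (16, 2, {0: 16, 8: 18, 16: 28, 32: 0}),
    (62, 2, {0: 62, 8: 205, 16: 0, 32: 0}),
    (267, 1, {0: 267, 8: 926, 16: 0, 32: 0}),
    # CSVO level 5 clusters SVO levels 5 and 6 and stores no pointers.
    (1193 + 5291, 1, {}),
]


def csvo_level_size(mask_count, mask_bytes, pointers):
    """Size in bytes of one CSVO level: masks plus pointers of all widths."""
    pointer_bytes = sum(bits // 8 * count for bits, count in pointers.items())
    return mask_count * mask_bytes + pointer_bytes


level_sizes = [csvo_level_size(*row) for row in CSVO_LEVELS]
print(level_sizes)       # [6, 31, 106, 329, 1193, 6484]
print(sum(level_sizes))  # 8149

# Share of omitted pointers relative to the 6832 pointers of the plain SVO
# (27,328 B / 4 B in Table 3). Assumption: the 5291 intra-cluster pointers
# of CSVO level 5 count as omitted in addition to the "0b PT" column.
omitted_in_table = sum(row[2].get(0, 0) for row in CSVO_LEVELS)
omitted_fraction = (omitted_in_table + 5291) / 6832
print(round(100 * omitted_fraction, 2))  # 82.55
```

Under that assumption the result agrees with the 82.55% of omitted pointers quoted in the abstract for this model and resolution.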

Table 5. The sizes (in bytes) of the binary representations of SVO$_{1}$, SVO$_{2}$ and CSVO for each node level (CSVO level 5 is equivalent to SVO levels 5 and 6) and the relative compression ratios (CR) for each node level, obtained by dividing the size of the node level in SVO$_{1}$ and SVO$_{2}$, respectively, by the size of the corresponding node level in CSVO.

Lucy $128^{3}$	Level 0	Level 1	Level 2	Level 3	Level 4	Level 5	Level 6
SVO $_{1}$	16	76	312	1316	5840	25,936	21,164
SVO $_{2}$	13	67	264	1130	5039	22,357	5291
CSVO	6	31	106	329	1193	6484
SVO $_{1}$ /CSVO CR	2.67	2.45	2.94	4.00	4.90	7.26
SVO $_{2}$ /CSVO CR	2.17	2.16	2.49	3.43	4.22	4.26
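
The compression ratios in Table 5 are plain quotients of the per-level sizes, with SVO levels 5 and 6 added together before dividing by the size of CSVO level 5. A short sketch of this arithmetic is given below; the function names `merge_last_two` and `compression_ratios` are illustrative and do not come from the paper.

```python
# Per-level sizes in bytes, copied from Table 5 ("Angel Lucy", 128^3).
SVO1 = [16, 76, 312, 1316, 5840, 25936, 21164]
SVO2 = [13, 67, 264, 1130, 5039, 22357, 5291]
CSVO = [6, 31, 106, 329, 1193, 6484]


def merge_last_two(sizes):
    """Add the last two SVO levels so they line up with CSVO level 5."""
    return sizes[:-2] + [sizes[-2] + sizes[-1]]


def compression_ratios(reference, compressed):
    """Per-level ratio reference / compressed, rounded to two decimals."""
    merged = merge_last_two(reference)
    return [round(ref / comp, 2) for ref, comp in zip(merged, compressed)]


print(compression_ratios(SVO1, CSVO))  # [2.67, 2.45, 2.94, 4.0, 4.9, 7.26]
print(compression_ratios(SVO2, CSVO))  # [2.17, 2.16, 2.49, 3.43, 4.22, 4.26]
```

The ratios grow toward the lower levels of the tree, where most of the nodes lie and where CSVO replaces 32-bit pointers with 8-bit pointers or omits them altogether.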

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
