A Scoping Review of Voxel-Model Applications to Enable Multi-Domain Data Integration in Architectural Design and Urban Planning

Although voxel models have been applied to address diverse problems in computer-aided design processes, their role in multi-domain data integration in digital architecture and planning has not been extensively studied. The primary objective of this study is to map the current state of the art and to identify open questions concerning data structuring, integration, and modeling and design of multi-scale objects and systems in architecture. Focus is placed on types of voxel models that are linked with computer-aided design models. This study utilizes a semi-systematic literature review methodology that combines scoping and narrative methodology to examine different types and uses of voxel models. This is done across a range of disciplines, including architecture, spatial planning, computer vision, geomatics, geosciences, manufacturing, and mechanical and civil engineering. Voxel-model applications can be found in studies addressing generative design, geomatics, material science and computational morphogenesis. A targeted convergence of these approaches can lead to integrative, holistic, data-driven design approaches. We present (1) a summary and systematization of the research results reported in the literature in a novel manner, (2) the identification of research gaps concerning voxel-based data structures for multi-domain and trans-scalar data integration in architectural design and urban planning, and (3) further research questions.


Introduction
Computer-aided design (CAD) emerged in the 1950s at the intersection of the computer and engineering sciences. Today, it bears central importance in the disciplines of engineering informatics and architectural design and urban planning. CAD is defined as "the use of computers to aid in the creation, modification, analysis, or optimization of a design" (Lalit Narayan et al., 2013, p. 3) [1].
Voxel models emerged in the computer science field in the 1960s and their initial applications in the field of CAD were studied in the late 1980s by Granholm (Granholm et al., 1987) and Jense (Jense et al., 1989) [2,3]. Voxel models are referred to as "spatial-knowledge representation schemata" (Srihari, 1981) [4], implying that they can serve as spatial data structures to encode the knowledge utilized in knowledge-based design processes. This review examines the existing applications of voxel models in the fields of architecture, spatial planning, computer vision, geomatics, geosciences, manufacturing, and mechanical and civil engineering, to identify their possible role in interdisciplinary and knowledge-based design processes.

The objectives of this review are as follows:
• Identification of the scope of existing voxel model applications in the context of CAD and linked fields, based on existing interdisciplinary approaches and categorization of the identified approaches based on the dominant sub-discipline related to the interdisciplinary field of CAD;
• Analysis of each identified category to identify the existing discipline-specific applications of voxel models that can offer a key utility to the field of knowledge-based computational methods and tools in architecture and urban planning;
• Discussion of novel approaches to voxel models as spatial-knowledge-representation schemata in the context of computational architectural design and urban planning;
• Identification of further research questions based on the outcomes of the semi-systematic literature review.
The methods selected for this review reflect these objectives. The first objective is addressed through a narrative literature review methodology. The emergence of voxel models is traced by searching for the earliest voxel definition. A reference-tracing strategy is used to create an initial understanding of the scope of voxel model applications. Scoping review methodology is then used to identify the scope of the research addressing the application of voxel models in the CAD context. This step concludes with the definition of thematic categories expressed as clusters in the bibliographic network. The resulting categorization is further studied and synthesized by utilizing the narrative review methodology, thereby addressing the third research objective.

Emergence of the "Voxel" Term
This section examines the definition of, and the theory related to, voxel models, through a literature study, to identify the earliest publications mentioning or referencing voxel models. The term "voxel" emerged in the 1970s in the field of computer science to describe methods for volume rendering and early experiments in the 3D visualization of medical images. Early attempts to work with 3D grids containing data can be traced back to the time preceding the wide availability of computers. Efforts to generate 3D visualizations of datasets constructed by utilizing medical imaging were published as early as 1970 [13]. In this context, terms such as "three-dimensional image" (Greenleaf et al., 1970) [13], "three-dimensional array" (Artzy et al., 1980) [14], and "volume rendering" (Drebin et al., 1988) [15] are often interchangeably used with the term "voxel". In the 1980s, a series of works [14,16,17] were published that sought to systematize concepts related to 3D arrays and the introduction of voxels as a mathematical concept.
Srihari [4] explained that "the term voxel is short for 'volume element' analogous to pixel for 'picture element' in two dimensions". He also pointed toward the potential interdisciplinary application of voxels, "ranging from organs interior to the human body to rock microstructures (...)" [4]. The growing availability of computers led to a convergence of the theoretical concepts related to voxels and volumetric rendering techniques. Arie Kaufman [18][19][20] explained that "each voxel is a unit of volume and has a numeric value (or values) associated with it that represents some measurable properties or independent variables of a real object or phenomenon" [21]. In the CAD field, early voxel applications were studied by Jense and Huijsmans and initially related to 3D-object reconstruction and visualization based on multiple two-dimensional (2D) sections [22]. Jense and Huijsmans presented a literature review that outlined pioneering voxel applications in the CAD context [3].
At the beginning of the 1990s, the term "voxel" was widely recognized in the field of computer graphics and primarily linked to solid modeling and spatial-partitioning representation [23] (p. 549). Subsequently, voxels were recognized as standalone concepts in the field of computer graphics related to the field of volumetric models [24] (p. 349). Finally, a shift from the analytical to the representational character of voxels occurred, whereby "voxels have gone in and out of favor for rendering, especially in entertainment" [24] (p. 349). However, the early definition of voxel models as "spatial-knowledge representation schemata" [4] is of particular interest for this literature review. Voxel-based spatial data integration and further abstraction toward spatial knowledge representation might indicate a possibility for further development of voxel models that could lead toward knowledge-based and data-integrated design and multi-domain decision support for architectural design and urban planning. To gather detailed insights, an in-depth literature review is needed to understand the diverse disciplinary approaches that can contribute to using voxel models as knowledge representation schemata in the CAD context.

Contemporary Voxel Applications in the CAD Field
Voxel models are used to study the properties of constructed objects ranging in scale from the physical properties of a building material to the environmental properties of urban neighborhoods. They also serve to integrate different spatial data representations. This approach can be instrumentalized for knowledge-based and data-integrated design and multi-domain decision support in architectural design and urban planning. To prepare the grounds for this, a scoping review encompassing a selected range of disciplines related to the CAD field is needed, to derive the possible future directions for interdisciplinary applications of voxel models in architectural design and urban planning.
Kaufman et al. explained that a "voxel is a unit of volume and has a numeric value (or values) associated with it that represents some measurable properties or independent variables of a real object or phenomenon" [21]. Srihari, who primarily works in the fields of pattern recognition, machine learning (ML), and computational forensics, stated that "developing systems for processing and displaying these [3D] images has revealed the need for developing new data structures, and more generally, for developing spatial-knowledge representation schemata" [4]. The term "spatial knowledge" is used in the contexts of both cognitive science and artificial intelligence (AI). Galton [25] offered a detailed elaboration of spatial knowledge representation in the AI context. In the CAD context, Jense stated that "it is useful to note (...) the duality that exists between the interpretation of voxel models as sets of cuboid volume cells, or as sets of 3D points, each representing a discretized point-sample, taken from some continuous space" [3]. The early voxel definitions differ from the common understanding of voxels as collections of boxes arranged in a 3D grid, which is related to the cuboid representation of voxels used, for instance, in computer games. In general, voxel models containing numeric variables describing the properties and variables of real objects or phenomena are data structures that encode spatial knowledge.
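As a minimal illustration of the definitions quoted above, the following Python sketch stores several named numeric properties per voxel in a sparse grid and exposes both readings noted by Jense: the voxel as a cuboid cell and the voxel as a point sample located at the cell center. The class name `VoxelGrid`, the property names, and the 0.5 m cell size are illustrative assumptions and are not taken from the reviewed literature.

```python
from dataclasses import dataclass, field


@dataclass
class VoxelGrid:
    """Sparse voxel grid: integer cell index -> dict of numeric properties."""
    origin: tuple
    cell_size: float  # edge length of one cubic voxel (assumed unit: metres)
    cells: dict = field(default_factory=dict)

    def index_of(self, point):
        """Cuboid reading: index of the cell that contains a continuous point."""
        return tuple(
            int((p - o) // self.cell_size) for p, o in zip(point, self.origin)
        )

    def center_of(self, index):
        """Point-sample reading: the location a voxel value refers to."""
        return tuple(
            o + (i + 0.5) * self.cell_size for i, o in zip(index, self.origin)
        )

    def set_value(self, point, name, value):
        """Attach a named numeric property to the voxel containing the point."""
        self.cells.setdefault(self.index_of(point), {})[name] = value

    def get_value(self, point, name, default=None):
        """Read a named property from the voxel containing the point."""
        return self.cells.get(self.index_of(point), {}).get(name, default)


# Hypothetical example: two properties from different domains in one voxel.
grid = VoxelGrid(origin=(0.0, 0.0, 0.0), cell_size=0.5)
grid.set_value((1.2, 0.3, 2.9), "air_temperature_c", 21.5)
grid.set_value((1.2, 0.3, 2.9), "air_density_kg_m3", 1.2)
```

Any point that falls into the same cell returns the same stored values, which is the sense in which a single voxel can integrate data from several domains at one spatial location.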

Identification of Key Parameters and Suitable Review Methodology
A general distinction between systematic, semi-systematic and integrative literature reviews was introduced by Snyder [26]. Semi-systematic literature reviews can be conducted "when wanting to study a broader topic that has been conceptualized differently and studied within diverse disciplines, [which] can hinder a full systematic review process" [26] (p. 334). This methodology addresses the practical constraint where "to review every single article that could be relevant to the topic is simply not possible" [26] (p. 335). Snyder elucidated that "a potential contribution [of a semi-systematic literature review] could be, for example, the ability to map a field of research, synthesize the state of knowledge, and create an agenda for further research or the ability to provide an historical overview or timeline of a specific topic" [26] (p. 335). According to Snyder, semi-systematic literature reviews require adaptation and the development of customized approaches for each study. Transparency of the process and the appropriate coverage of literature can be achieved through the development of individual standards and detailed research plans. As a result, such a method can effectively provide answers to research questions addressing a widely defined research scope and overcome the limitations of the more narrowly defined systematic literature reviews [26] (p. 336). Methods applied in semi-systematic literature reviews "often have similarities to approaches used in qualitative research in general (...) [and are] usually followed by a qualitative analysis" [26] (p. 335). Based on this description, this literature review can be classified as a semi-systematic review. Regarding the choice of methods, we considered Paré and Kitsiou [27] (p. 169), who further distinguish literature review methods. For the quantitative part of this semi-systematic literature review, we chose the scoping review method. Qualitative analysis was covered by the narrative literature review method.
Paré and Kitsiou [27] (p. 169) provided an overview of the methodological requirements for scoping and narrative literature reviews. However, example literature reviews implementing this methodology in the field of architectural design and urban planning are sparse. A notable exception is the study of Ullah [28], who proposed a "simplistic yet reproducible" method for systematic reviews, based on the PRISMA guidelines [29], to construct a conceptual framework for studies of the built environment. Table 1 compares the critical parameters of six literature reviews [30][31][32][33][34][35] that are similar in scope and address different CAD-related domains, to identify the methodological state of the art in the field. The table shows that simple keyword search strategies are used, and the time span often covers multiple decades. Methods such as keyword co-occurrence analysis and the yearly publication trend are frequently utilized. At the same time, the inclusion of more than one data source, screening, and deduplication is rarely reported. Echchakoui [36] suggested using both Scopus (Sco) and Web of Science (WoS) databases, while Liberati et al. [29] advocated the inclusion of gray literature, referring to preprints and other publications not indexed in the most popular databases. The method used in this study adopted the PRISMA guidelines for the scoping-literature-review methodology addressing the interdisciplinary field of CAD and the identified shortcomings. To decide on the time span for this literature review, key publications were identified and mapped onto the timeline shown in Figure 1.
Figure 1. Timeline comparing the time periods of literature reviews from similar disciplines [30][31][32][33][34][35]. The reviews listed in the table are related to the interdisciplinary field of study covered by this publication, although none of the reviews address the topic of voxel models directly. The time span for the scoping review part in this study is chosen based on this comparison. The periods for the narrative parts of this review are chosen based on the initial research summarized in the voxel-model-development timeline at the top [3,4,8,13,21,23,24]. This figure is available in Supplementary Materials (Figure S1) as a high-resolution, full-page illustration.

Outline of the Review Scope
While a range of discipline-specific reviews have been published, no scoping review investigating the possible intersections among disciplinary approaches to voxels exists. The existing methods related to the literature reviews in different CAD-related domains were studied, and are summarized in Table 1. Based on the listed references, a keyword co-occurrence analysis was undertaken by utilizing VOSViewer as a software tool for "constructing and viewing bibliometric maps" [37]. This analysis was performed to understand the knowledge components and structure and research trends [38], and to map the trends in the research field development [39].
The initial screening showed that 82% of the 56,052 publications related to voxels were published in the field of medicine. To address this issue, an ML and natural language processing (NLP)-based screening method was developed to identify the publications relevant to the scope of this study. The method builds on the algorithms implemented in state-of-the-art open-source software developed for bibliometric analysis and systematic literature reviews. The applicability of existing open-source literature-review tools, such as revtools [40] and ASReview [41], was investigated. However, their application in this interdisciplinary scoping review was unsuccessful because they operate on the assumption that reviewers start the review process with a priori knowledge of the exact scope of the study. For example, in ASReview, reviewers are asked to select some papers that are within the scope of the study and some that are outside it. This selection is then used to suggest records for reviewer classification in the next stages. While such a strategy can be useful in systematic reviews, the initial paper pre-selection can increase the risk of bias. In this scoping literature review, the exact definition of the scope is the result of the study, not an a priori assumption made by the reviewers. The topic-modeling-based classification method implemented in revtools was selected in this context. Two constraints were identified: the manual choice of the topic count and the dataset size. The topic-modeling algorithm implemented in revtools requires a choice of topic count, which directly affects the result quality. In the context of this scoping literature review, this arbitrarily defined parameter can increase the risk of introducing bias. Therefore, the iterative coherence score method of choosing the optimal number of topics was identified in the literature [42].
The dataset used in this scoping literature review comprised 117,908 records, and was two-to-three orders of magnitude larger than the datasets conventionally used in the systematic literature reviews conducted with revtools. The iterative character of the coherence score method and the long processing times of this implementation currently limit the applicability of revtools in similar scoping literature reviews. To address these issues, the initial classification was derived from a widely recognized literature database and iteratively validated by the reviewers, supported by the computational techniques implemented in ASReview and revtools. The initial classification resulted in literature collections dominated by medicine-related publications, which were unsuitable for quantitative analysis. Based on the published scientific description of the algorithms implemented in the ASReview software, the functionalities needed for this study were implemented as described in the Materials and Methods Section. The scoping literature review was complemented with the elements of a narrative literature study to gain an initial understanding of the scope of the voxel model applications and the existing disciplinary approaches. The narrative literature review methodology was also used for a detailed study of the clusters generated by the keyword co-occurrence analysis. These clusters group the disciplinary applications of voxel models and were used to initiate an in-depth analysis of the possible contributions of discipline-specific voxel model applications to CAD design and urban planning.

Data Source Description
The dataset used herein was created from the Web of Science Core Collection and Elsevier Scopus databases, which were searched for all publications containing the word "voxel" in the title, keywords, or abstract. Gray literature was retrieved from the CORE database [43], using the same search criteria. The database search resulted in a dataset containing papers published between 1981 and 2021. A subset of the Sco dataset containing classification data was used in the training step of the active-learning (AL)-based record screening process. In the next step, a dataset for the keyword co-occurrence analysis was created by classifying and merging the complete Sco and WoS datasets. Finally, the CORE dataset was classified, and the relevant publications were added to the final dataset used in the expert evaluation phase. Figure 2 presents detailed information about the size of individual datasets and the publication counts used in the final dataset.
The eligibility criteria were defined after the dataset retrieval, based on the reviewers' manual evaluation of the dataset quality. The initial inclusion criteria were set to limit the publications to quantitative study types (i.e., journal and conference papers, books and book chapters, and review and data papers), considering the quantitative character of the keyword co-occurrence analysis. Publications containing incomplete metadata, particularly in the keyword and publication-date fields, were rejected due to the requirements of the keyword co-occurrence analysis. Publications without publication dates (e.g., preprints) and other types of gray literature were reintroduced into the study by merging the CORE database after the keyword co-occurrence analysis step. In the last step, reviewers were required to summarize the identified clusters based on the detailed study of the relevant publications; hence, the dataset was limited to English publications.
The preliminary study showed that 82% of the research publications concerning voxels were related to medical sciences. This was calculated based on the Scopus All Science Journal Classification Codes (ASJC Codes), which were not available for all publications in the Scopus database. This limited the method's applicability to 77% of the records from the Scopus database. The ASJC Codes assigned multiple research disciplines to each publication, thereby allowing the preliminary exclusion of publications related to medical studies. Figure 3 shows the results of this preliminary study.

Narrative Review Methodology
The elements of a narrative literature study were introduced herein to extend the timeline of the reviewed literature and trace back the emergence of publications on voxels and the initial concepts that drove the voxel model development. This was performed to facilitate the search for the early and interdisciplinary definitions of the voxel models. Narrative literature-review methods combined with the reference-tracking method were applied to identify the publications that would otherwise not be found through a systematic database query. Furthermore, narrative elements were used to conduct a detailed analysis of the clusters generated by the keyword co-occurrence analysis.

Scoping Review Methodology
Reviewing extensive collections of publications retrieved from publication databases poses a challenge for reviewers, which is related to the inclusion of records irrelevant to the target question [40] (p. 609) and overlapping content [44] (p. 2). To address some of these challenges, a semi-automated deduplication AsySD algorithm is currently being developed for extensive literature collections [44]. Active-learning (AL) algorithms are applied in literature review studies for interactive sorting and publication filtering [41]. Topic-modeling algorithms, such as Latent Dirichlet allocation (LDA), are also applied to assist reviewers in screening extensive literature collections [40]. The open-source implementations of AL algorithms, LDA-based topic modeling, and the semi-automated deduplication algorithm were evaluated in this review. Deduplication with the AsySD algorithm was only possible after the initial screening, due to the size of the literature collection. A dedicated implementation was developed after the initial experiments (Figure 4).
This entailed relating the steps required by the PRISMA method to the abovementioned algorithms. The implementation prepared for this review utilized the scikit-learn implementation of the MultinomialNB algorithm [45] and the SciBERT transformer model [46] for the semi-automated publication screening. The probabilities predicted by the two independent ML models were combined by applying the Query-by-Committee approach based on consensus entropy [47]. This strategy was taken to mitigate bias and generate datasets for the user-driven LDA-based topic modeling. The LDA component uses Gensim [48] and Mallet [49] libraries for optimal topic number calculation based on the coherence score [42]. Software implementation was developed in Python, utilizing the widely adopted ML and NLP libraries, such as huggingface [50], PyTorch [51], and SpaCy NLP [52]. The development method required considerable computation time, partially on specialized ML hardware. A mobile workstation equipped with Intel i7-9750H CPU, 16 GB RAM, and NVIDIA RTX 2070 with 8 GB GPU memory was used. More advanced ML workloads utilized a single cloud instance equipped with NVIDIA Tesla P100 with 16 GB GPU memory. We report the hardware specifications of the two platforms used in this study and refer to them in the subsequent paragraphs for future reference and to secure research reproducibility.
Figure 4. General workflow describing the NLP-based screening method applied in this study for the initial screening, followed by the keyword co-occurrence network analysis and a detailed study of the clusters. This figure is available in Supplementary Materials (Figure S3) in an alternative, horizontal layout.

Figure 2 (Section 2.1) depicts the publication counts retrieved from each database. Due to the interdisciplinary character of this review, the conventional methods for defining the screening criteria were unsuccessful. A manual screening of the whole dataset was practically impossible, and unexpected challenges regarding the keyword-based filtering criteria were observed. For example, the same abbreviation might refer to different concepts, depending on the disciplinary context: the abbreviation "CAD" refers to both computer-aided design and computer-aided diagnosis. Conversely, methods used in medicine, such as CT, are also applied in the CAD context (e.g., imaging techniques in additive manufacturing processes). Each filtering attempt was evaluated by the reviewers through a manual check of individual records. The initial attempts to generate a keyword co-occurrence network based on conventional screening approaches were unsuccessful: they resulted in a keyword co-occurrence diagram in which most of the CAD-related terms were rejected due to the higher occurrence of medicine-related terms. Accordingly, a multi-step screening strategy was developed to address this problem (Figure 4). This method utilized the partially incorrect classification derived from the Scopus ASJC Codes and the NLP-based screening method to distinguish the papers related to the scope of this study.

General Description of the Method Implementation
The Scopus, WoS, and CORE databases were queried. The resulting datasets were pre-processed to unify the bibliographic data formatting. The pre-processing step included the unification of field names and their contents based on the RIS file format specification and the generation of an internal record index for consistency validation in the subsequent processing steps. In the second step, an AL-based method was introduced to assist the reviewers in screening the collected datasets. The classification data derived from the ASJC codes were used to train the SciBERT and MultinomialNB ML models. The ML models were used to classify the remaining datasets. Following the AL principles, the iterative process of the reviewers' validation and classification was based on consensus entropy and repeated training. The reviewers validated the outcomes by a manual classification of the LDA topics derived from individual publications. The iterative validation procedure was applied in both the training and classification steps to ensure that the final results produced by the presented method were validated by the reviewers. The datasets were then merged and deduplicated using the AsySD tool [44]. The datasets for the keyword co-occurrence analysis in VOSViewer [37] and for the manual reviewers' evaluation were prepared. The VOSViewer dataset preparation required keyword processing and file format conversion. The dataset for the reviewers' evaluation was created by merging and deduplicating the previously described dataset with the records from the CORE database. Finally, the keyword co-occurrence analysis was executed. The resulting clusters were evaluated by the reviewers. The reviewers manually browsed the dataset for relevant publications based on the keyword co-occurrence analysis and the resulting assignment of individual keywords to the thematic clusters. Finally, the reviewers analyzed and described the clusters, based on their expert knowledge.
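The consensus-entropy step mentioned above can be sketched as follows. In the common formulation of Query-by-Committee, the class probabilities predicted by the committee members are averaged, and the entropy of this averaged distribution serves as the uncertainty score of a record. The sketch assumes a two-member committee and a binary relevant/irrelevant classification; the probability values and variable names are hypothetical, and the code is not the implementation used in the study.

```python
import math


def consensus_entropy(committee_probs):
    """Entropy of the committee's averaged class-probability distribution.

    committee_probs holds one probability vector per committee member, all
    over the same classes, e.g. [P(irrelevant), P(relevant)].
    """
    n_members = len(committee_probs)
    n_classes = len(committee_probs[0])
    consensus = [
        sum(member[c] for member in committee_probs) / n_members
        for c in range(n_classes)
    ]
    # Classes with zero consensus probability contribute nothing.
    return -sum(p * math.log(p) for p in consensus if p > 0.0)


# Hypothetical predictions of the two models for three records.
naive_bayes = [[0.95, 0.05], [0.60, 0.40], [0.10, 0.90]]
transformer = [[0.90, 0.10], [0.30, 0.70], [0.20, 0.80]]

scores = [
    consensus_entropy([nb, tr]) for nb, tr in zip(naive_bayes, transformer)
]
# Records with the highest score are the most uncertain ones.
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

In this toy example the second record, on which the two models disagree most, receives the highest score and would be presented to the reviewers first.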

Description of the AL-Based Classification Component
Undertaking AL required several stages (Figure 5). First, publications, abstracts, and a preliminary classification were used to train SciBERT on the cloud instance and MultinomialNB models on the mobile workstation, to classify papers relevant to the scope of the study. Each training iteration required 4 h of computation on the cloud instance, excluding the time required for additional data processing and transfers between the cloud infrastructure and the local system.
In the training step, the updated publication classification was used to train a new iteration of SciBERT on the cloud instance and MultinomialNB models. This process was repeated until the reviewers no longer reported any misclassified LDA topic. This stopping criterion was met after three iterations. The last iteration of the ML models was used to classify the remaining papers retrieved from the WoS and CORE databases. This final classification step was conducted on the mobile workstation because the SciBERT prediction step requires fewer GPU resources than the SciBERT fine-tuning (training) procedure. In the classification step, the updated publication classification was used to update the dataset partitioning and iterate over the LDA-based reviewer evaluation procedure until the stopping criterion was reached.
Figure 5 is available in Supplementary Materials (Figure S4) in an alternative, horizontal layout.

Description of the Reviewer Evaluation Component
The following step was concerned with the pool-based sampling and the LDA-based reviewer validation procedure (Figure 6). The reviewers used the LDA-based topic-modeling method implemented on the mobile workstation to review papers whose uncertainty was larger than the 75th-percentile threshold. The Mallet implementation of the LDA algorithm for topic modeling was used. The optimal number of topics was calculated based on the coherence score method [42]. In this method, the LDA procedure was iteratively run for different topic numbers, and the coherence score was recorded.
The relation between the topic number and the coherence score was plotted in a manner similar to the "elbow method" widely used with the k-Means algorithm. The number of topics was chosen at the last value of the coherence score after which the coherence score started to decrease linearly. This required running 35 iterations of the LDA algorithm and resulted in 6 h of computation on the mobile workstation, excluding the time required for additional data processing and user interaction. The 75th-percentile threshold was chosen based on the manual quality assessment, and resulted in 40,372 records requiring manual validation. Thus, the reviewers classified the LDA topics instead of individual publications, and the updated topic classification was extrapolated to the individual publications.
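One possible reading of this selection rule is sketched below. It assumes that coherence scores have already been recorded for each candidate topic number and picks the last topic number after which the scores only decrease. The function name and the example scores are hypothetical; the study combined the plot with the reviewers' qualitative judgment, which this sketch does not reproduce.

```python
def pick_topic_number(coherence_by_k):
    """Return the last topic number after which coherence only decreases."""
    ks = sorted(coherence_by_k)
    i = len(ks) - 1
    # Walk backwards from the largest topic number while the scores keep falling.
    while i > 0 and coherence_by_k[ks[i - 1]] > coherence_by_k[ks[i]]:
        i -= 1
    return ks[i]


# Hypothetical coherence scores recorded for different topic numbers.
scores = {5: 0.40, 10: 0.48, 15: 0.52, 20: 0.55, 25: 0.53, 30: 0.50, 35: 0.47}
best_k = pick_topic_number(scores)
```

If the scores never decrease, the sketch returns the largest topic number tested, which would indicate that the tested range was too narrow.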

Integrating Results of the AL-Based Screening with the Keyword Co-Occurrence Analysis
At this stage, the number of publications was reduced from 79,830 to 10,119, and semi-automated deduplication with AsySD became feasible. The Scopus and WoS datasets containing keywords were merged and exported for the semi-automatic deduplication with AsySD [44]. This procedure identified 877 duplicates, which was 12% of the whole dataset. This quantity of duplicated entries can directly affect the results of the qualitative keyword co-occurrence analysis because the keywords of the duplicated records would be counted multiple times. To address this issue, the dataset was carefully deduplicated and validated. The resulting deduplicated dataset was converted for the keyword co-occurrence analysis with VOSViewer [37]. Different spellings of the same keyword can negatively influence the results of the keyword co-occurrence analysis. For example, in this study, multiple keywords containing the words "three-dimensional", "3d", "3D", and "3-dimensional" were identified, and their spelling was unified into the "3D" form. VOSViewer allows users to supplement the analysis with a thesaurus file. This file maps each keyword to its unified form, and must be manually created.
The dataset used in this step initially contained 35,173 unique keywords, and a manual generation of the thesaurus file was not feasible. Therefore, the keywords were manually filtered, selectively processed using purpose-written Python regular expressions, and selectively lemmatized using the spaCy NLP Python library [52]. To assess the quality of this process, the reviewers selectively validated the keyword list. In this iterative process, the keywords were sorted alphabetically and by occurrence. The list was updated after each regex operation. In the next step, the dataset was exported to VOSViewer, and the keywords were visually evaluated in the co-occurrence network. This step was completed when the reviewers no longer found duplicate keywords, either in the VOSViewer network or when selectively checking the keyword list. The resulting dataset was used to generate the keyword co-occurrence network with VOSViewer (Results Section). The keyword co-occurrence analysis generated thematic clusters and revealed patterns related to individual keywords. In the following step, the gray literature from the pre-processed CORE dataset was merged with the processed Scopus and WoS datasets, resulting in the final combined dataset. The reviewers manually searched all the metadata contained in this dataset and identified relevant clusters and related keywords in relation to the research aim. In the next step, the reviewers identified the key literature and analyzed the individual clusters, based on the final combined dataset and their expert knowledge.
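The regex-based unification of spelling variants can be illustrated with a minimal sketch for the "3D" example given above. The pattern and the function names are hypothetical and are not the expressions used in the study. The output layout assumes a tab-separated thesaurus file with the columns "label" and "replace by"; this should be checked against the VOSViewer manual before use. Lemmatization with spaCy is not shown.

```python
import re

# Hypothetical pattern covering "three-dimensional", "3-dimensional", "3d", "3D", "3-d".
THREE_D = re.compile(
    r"\b(?:three[- ]dimensional|3[- ]?dimensional|3[- ]?d)\b", re.IGNORECASE
)


def normalize_keyword(keyword):
    """Unify the spelling variants of '3D' and collapse repeated whitespace."""
    unified = THREE_D.sub("3D", keyword.strip())
    return re.sub(r"\s+", " ", unified)


def build_thesaurus(keywords):
    """Return thesaurus rows mapping each changed keyword to its unified form."""
    rows = ["label\treplace by"]  # assumed header layout
    for keyword in sorted(set(keywords)):
        unified = normalize_keyword(keyword)
        if unified != keyword:
            rows.append(f"{keyword}\t{unified}")
    return "\n".join(rows)


thesaurus = build_thesaurus(["3d printing", "3D printing", "voxel"])
```

Only keywords that actually change are written to the thesaurus, which keeps the file short enough for the selective manual validation described above.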

Evaluation Metrics and Manual Validation of the AL Component
Confusion matrices and the improvement in the accuracy score for each new generation of the ML models were used to evaluate the results of the NLP-based screening method (Figure 7). Compared with the initial classification of the preclassified part of the Scopus dataset, this process identified 790 relevant publications that would otherwise have been excluded from the scope, and 2,027 irrelevant publications that would have negatively affected the quality of the keyword co-occurrence analysis. A total of 40,372 publications were screened by the reviewers using the topic-screening method. However, the wide application of the described method is currently limited, given the technical complexity, cumulative computation time, and reviewers' effort required for the result validation. A detailed description of the computational method is not the aim of this study, and will be considered for a separate publication. A further adaptation of this method for future literature reviews is possible, given the growing availability of computational resources.
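For readers unfamiliar with these metrics, the following minimal sketch shows how a binary confusion matrix and the accuracy score can be derived from reviewer-validated labels and model predictions. The labels are hypothetical (1 = relevant, 0 = irrelevant) and are not taken from the study's data.

```python
from collections import Counter


def confusion_counts(reviewer_labels, model_labels):
    """Count the four outcomes of a binary relevance classification."""
    counts = Counter(zip(reviewer_labels, model_labels))
    return {
        "tp": counts[(1, 1)],  # relevant, predicted relevant
        "fn": counts[(1, 0)],  # relevant, predicted irrelevant
        "fp": counts[(0, 1)],  # irrelevant, predicted relevant
        "tn": counts[(0, 0)],  # irrelevant, predicted irrelevant
    }


def accuracy(counts):
    """Share of records on which the model agrees with the reviewers."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total if total else 0.0


# Hypothetical labels for six records.
matrix = confusion_counts([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0])
score = accuracy(matrix)
```

In a screening context, the "fn" cell corresponds to relevant publications that would be missed, and the "fp" cell to irrelevant publications that would remain in the dataset.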
Paré and Kitsiou pointed out that scoping and narrative literature-review methodologies do not require formal statements of bias-mitigation strategies [27] (p. 170). The PRISMA methodology developed for the systematic literature reviews in the field of medical sciences is still attracting wide recognition and use. Consequently, literature reviews in the engineering sciences are often required to comply with the PRISMA methodology. The strategies applied to minimize the risk of bias in this scoping literature review are illustrated in Figure 8 and reported below, to support the transparency and reproducibility of this research. First, the WoS, Scopus, and CORE datasets were pre-processed to match the bibliographic data formatting. The bibliometric data consistency was manually validated by the reviewers. A custom index aligned with the partial dataset lengths was introduced to prevent duplicated or missing records and possible data-processing errors in the next steps. The reviewers manually validated the custom index values after each processing step.
Next, the reviewers iteratively validated the AL process results with the LDA-based topic-modeling method (Section 2.3). In this step, the potential risks are related to (1) the bias inherent in the chosen ML models and the choice of the method to combine the ML predictions, and (2) the method for validating the outcomes of this algorithmic procedure. Two state-of-the-art ML models, chosen to be as different as possible, were identified, and the Committee Voting strategy was applied. The AL strategy utilizing topic modeling for reviewer validation was introduced. The inherent bias of the LDA algorithm and the bias introduced by the choice of the topic number were considered. The available implementations of the LDA algorithm were also tested for this dataset. The reviewers qualitatively evaluated the resulting topics and the recorded coherence scores for the different topic counts generated by the LDA algorithm. As a result, the Mallet implementation of the LDA algorithm [49] was chosen, and the coherence score [42] method combined with the selective reviewers' evaluation was applied to select the optimal topic number.
Deduplication and keyword processing for the keyword co-occurrence analysis were subsequently undertaken. The semi-automatic AsySD deduplication procedure assigned 91 publication pairs for manual screening. This step showed that the duplicates accounted for 12% of the whole dataset, directly affecting the qualitative keyword co-occurrence analysis. In relation to the keyword processing, the procedure resulted in a 17% reduction of the total keyword count. Finally, the quality of the keyword co-occurrence network was analyzed by the reviewers.
Figure 9 shows the steps taken in the AL-based screening procedure. In the first step, 79,830 records were identified from the databases, while 38,078 gray-literature items were identified from other sources (e.g., CORE database). Deduplication was conducted in the later step. Based on the inclusion criteria, 5,373 records were removed from the WoS and Scopus datasets, due to the publication type and the language criteria. Accordingly, 17,559 records were removed from the CORE dataset, due to the language criteria. In the second step, 94,979 records were screened with the NLP-based method. As a result, 92,410 records were classified as irrelevant in the AL-based screening.

Results
In the next step, the results of the AL-based screening were combined with the keyword co-occurrence analysis by the reviewers. A dataset describing the keyword cluster assignment and the weighted importance of individual keywords within each cluster was created using VOSViewer software [37]. This dataset was visualized as a keyword co-occurrence network diagram (Figure 10) and used by the reviewers in the next steps. Further record exclusion and knowledge synthesis required the systematic combination of the keyword dataset with the literature collection. The reviewers separately selected the studies to be included in the next step for each cluster. The filtered literature collection was queried by the reviewers, based on the selected keywords derived from the weighted keyword occurrence in the studied cluster. All publications containing the chosen keyword in the title, abstract, or keywords were recorded, resulting in a dataset containing 2,569 records. The reviewers manually screened all records in this dataset based on titles, keywords, and abstracts and marked 487 publications for full-text retrieval. They then studied the retrieved publications, to summarize the cluster descriptions. Each cluster description contained a table, in which all the publications listed in the description were further categorized based on the keyword used in the retrieval process. Particular attention was given to the existing literature reviews. The reviewers were then asked to identify the literature reviews related to the studied cluster and to commence the cluster description with the overview of the existing literature reviews.

Results of the Keyword Co-Occurrence Network Analysis
The merged dataset containing records from the Scopus and WoS datasets was used to create the keyword co-occurrence network (Figures 10 and 11). The different colors in Figure 10 represent the thematic clusters generated with the unified mapping and clustering approach implemented in VOSViewer. The proximity between two keywords reflects how closely the two terms are related, even if the terms are assigned to different clusters. The color scale in Figure 11 shows the average publication year assigned to each keyword. The average publication year analysis presents information similar to that of a keyword burst analysis, in which the development of certain concepts in relation to the studied topic can be matched with a particular time. The most recent trends can be traced back to individual keywords and thematic clusters.
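The two quantities underlying Figures 10 and 11 can be illustrated with a minimal sketch that counts how often two keywords appear in the same record and averages the publication years per keyword. The record structure and the example values are hypothetical, and the normalization, mapping, and clustering steps performed by VOSViewer are not reproduced.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence(records):
    """Return pairwise keyword co-occurrence counts and average year per keyword."""
    pair_counts = Counter()
    years = defaultdict(list)
    for record in records:
        # Each keyword is counted once per record, regardless of repetitions.
        keywords = sorted(set(record["keywords"]))
        for keyword in keywords:
            years[keyword].append(record["year"])
        pair_counts.update(combinations(keywords, 2))
    avg_year = {kw: sum(ys) / len(ys) for kw, ys in years.items()}
    return pair_counts, avg_year


# Hypothetical bibliographic records.
records = [
    {"keywords": ["voxel", "lidar"], "year": 2018},
    {"keywords": ["voxel", "bim", "voxel"], "year": 2020},
]
pairs, avg_year = cooccurrence(records)
```

The sketch also shows why undetected duplicate records matter: a record that appears twice would add its keyword pairs twice and shift the average publication year of its keywords.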
The node distribution in the network illustrated in Figure 10 is balanced, and the six clusters can easily be identified. This network diagram does not contain keywords related to medical sciences; excluding such keywords was the main aim of the NLP-based screening step. VOSViewer software enables users to exclude the most frequent keywords from the network visualization, allowing for informed decisions on excluding selected keywords to achieve fine-grained and balanced clustering and mapping results. For example, the first and second clusters contained the "computer graphics" and "computer vision" terms, respectively, which suppressed most of the keywords in the respective clusters. Hence, they were excluded from the visualization and chosen as the cluster names to reflect their relevance in their respective clusters. The names of the remaining clusters were chosen based on the reviewers' expert knowledge, matching the name of the scientific discipline with the keywords in the cluster. The nodes of the sixth cluster were scattered. This cluster was assigned to the general concepts related to the voxels present in a wide range of disciplines. The clusters that emerged from the keyword co-occurrence analysis were assigned for analysis by the reviewers.

First Cluster-Intersections between CAD and Computer Graphics
The first cluster represents research on computer graphics (Table 2). Keywords such as "visualization" and "virtual reality" describe the technologies relevant to digital design and planning. Figure 11 shows that the keywords in this cluster have the lowest average publication year. Applications in which voxel models serve as spatial data structures for encoding knowledge for knowledge-based design processes were not identified in this cluster. Therefore, the discussion of this cluster was limited to the role of voxel models in dedicated visualization techniques. Most contributions in this cluster were related to voxel-model applications for visualizing large datasets describing buildings [53] and large territories [54]. Experiments with preliminary design exercises both on-screen [55][56][57] and in virtual reality [58,59] exist. Voxel-based generative-design interfaces have also been proposed [60,61].

Second Cluster-Intersections among CAD, Computer Vision, and Urban Planning
The second cluster, related to the field of computer vision, shows multiple overlaps with the computer graphics cluster. Keywords with a higher average publication year are less related to the computer graphics cluster. The most recent research is related to point cloud classification and semantic segmentation, autonomous vehicles, and multimodal data fusion. Table 3 lists the representative publications from this cluster. Different methods of real-time 3D mapping and ML-based scene understanding are present in this cluster, as indicated by keywords such as "machine learning", "convolutional neural networks", "object recognition", "intelligent robot", and "stereo vision". Other keywords occurring in this cluster are related to data acquisition and integration methods, such as "multimodal data fusion", "RGB D", and "stereo vision". The intersection between this cluster and the remote sensing cluster contains terms describing urban environments, such as "roads and streets", "trees", "transportation", and "urban planning". Xu, Tong, and Stilla [8] recently reviewed voxel-based point cloud representations for their potential role in the construction industry. Their review presented a detailed overview of the algorithmic approaches addressing point cloud pre-processing, registration, segmentation, classification, and modeling. They primarily focused on the datasets acquired through laser scanning and stereo vision applied for 3D urban-scene mapping, which aligned well with the cluster described in this paragraph. Table 3 summarizes the publications that went beyond the scope of the review published by Xu, Tong, and Stilla [8] and focuses on the voxel model application in digital architecture and planning. In this context, the voxel models were applied to quantify green space [62], estimate the local landscape index [63], and communicate the importance of urban green volume to non-expert stakeholders using digital fabrication technologies and different visualization techniques [64].
Voxel-based methods can be applied to distinguish individual trees in urban locations [65], predict individual tree species [66], and estimate individual tree volumes and the amount of carbon stored in a single tree [67]. Voxel models were also applied in an urban context to study visibility in complex-terrain conditions [68] and extend Lynch's isovist theory into quantifiable, 3D metrics describing urban landscapes [69].
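The basic operation shared by many of these applications, converting a point cloud into occupied voxels and deriving a volume estimate from the voxel count, can be illustrated with a minimal sketch. It is a generic illustration and not the method of any of the cited studies; the point coordinates and the voxel size are hypothetical.

```python
import math


def voxelize(points, voxel_size):
    """Map each 3D point to the integer index of the voxel that contains it."""
    return {
        tuple(math.floor(coord / voxel_size) for coord in point)
        for point in points
    }


def occupied_volume(points, voxel_size):
    """Estimate volume as the number of occupied voxels times the voxel volume."""
    return len(voxelize(points, voxel_size)) * voxel_size ** 3


# Hypothetical point coordinates in meters, e.g., laser-scan returns from vegetation.
points = [(0.1, 0.1, 0.1), (0.4, 0.2, 0.3), (1.2, 0.1, 0.1), (-0.2, 0.0, 0.0)]
volume = occupied_volume(points, voxel_size=0.5)
```

The estimate depends strongly on the chosen voxel size: larger voxels overestimate sparse structures, while smaller voxels leave gaps where the point density is low.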

Third Cluster-Intersections among CAD, Geomatics, and Architectural and Spatial Planning
The third cluster summarized in Table 4 overlapped with the computer vision cluster. The spatial data acquisition through autonomous vehicles, an understanding of the urban scene, and the application of these concepts to urban planning were identified in the second cluster. A similar synergy among data acquisition, the generation of information, and the knowledge applied to design and planning was visible in the third cluster, extending toward non-urban environments. The keyword "architectural design" is assigned to this cluster. The application of voxel models and generative adversarial networks (GANs) in architectural form design [70] was identified. Furthermore, voxel models were applied to design hospital layouts [71], connect the voxel-based simulation with the network analysis for building-evacuation modeling [72], and integrate pathfinding and heat transfer for the building-performance simulation [73]. The integration of the building-information modeling (BIM) and voxel-based modeling approaches is gaining popularity. Combined BIM and voxel environments allow the automatic monitoring of the daily construction site progress [74] and crowd-behavior simulation during fire and toxic-gas expansion [75]. The voxelization of BIM models for cell-based path planning [76] and the automatic annotation of exterior building elements [77] were recently studied. The conversion of 3D scans to BIM objects can also utilize voxel models [78]. In addition, photogrammetric 3D scans of lattice structures can be automatically converted into line-based 3D skeleton models used for structural analysis [79].
For building interiors, voxel models are used to reconstruct the semantic labels of the building interior from the data collected with the 3D scanning sensor of a mobile augmented-reality device [80]. The curvilinear walls, irregular slabs, stairs, and ramps were successfully classified in the abovementioned example. Internal doors and windows can also be reconstructed from incomplete point clouds using a voxel-based approach [81]. A similar approach was applied to building facades, where voxel models were used to reconstruct the building facade geometry and directly use the results in structural analysis software [82]. GANs could be utilized with voxelized facade models to generate the fragments of the facade that are missing in the acquired datasets [83]. Voxel models were also used to design building envelopes based on simulated solar radiation [84].
A voxel-based solar analysis was applied in the urban planning context [85] and at a fine scale through single-laser scanner acquisition [86]. In the context of geographic information systems (GIS), voxel models are widely utilized when 3D data and temporal change must be introduced [87,88]. The identified applications spanned marine environments [89], volumetric recording of archaeological sites [90], urban planning support in relation to 3D geological modeling [91], and underground energy storage [92]. In the GIS field, the generative capabilities of voxel models were utilized by introducing voxel-based geographic automata [93,94] applied, for example, to simulate the dispersal of airborne pollutants [95]. The application of the voxel automata was recently reviewed in relation to the spatio-temporal change of a built-fabric 3D density in urban contexts [96].
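The principle of a voxel automaton can be illustrated with a minimal sketch in which the state of each voxel is updated from its face neighbors at every time step. The sketch is a generic, mass-conserving diffusion rule assumed for illustration only; it is not the geographic automata model of the cited studies, and the grid size, the transfer rate, and the function name are hypothetical.

```python
# Offsets to the six face neighbors of a voxel.
NEIGHBORS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def step(concentration, shape, rate=0.1):
    """One time step: each voxel passes `rate` of its mass to every in-bounds face neighbor.

    `rate` should not exceed 1/6, otherwise a voxel would give away more than it holds.
    """
    updated = {}
    for (x, y, z), mass in concentration.items():
        kept = mass
        for dx, dy, dz in NEIGHBORS:
            neighbor = (x + dx, y + dy, z + dz)
            if all(0 <= neighbor[i] < shape[i] for i in range(3)):
                share = mass * rate
                updated[neighbor] = updated.get(neighbor, 0.0) + share
                kept -= share
        updated[(x, y, z)] = updated.get((x, y, z), 0.0) + kept
    return updated


# Hypothetical release of one unit of pollutant in the center of a 3 x 3 x 3 grid.
state = step({(1, 1, 1): 1.0}, shape=(3, 3, 3))
```

Because the state is stored as a sparse dictionary, only voxels that actually hold mass are processed, which is one reason voxel automata scale to large territories.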

Fourth Cluster-Intersections among CAD, Materials Science, and Geosciences
The fourth cluster focused on the internal structure of the Earth's surface and on studying the processes happening on this surface (Table 5). Knowledge regarding structure and processes was applied in CAD when planning terrain modifications and in large-scale planning. The fourth cluster contained keywords such as "porosity", "permeability", "flow simulation", "erosion", and "microstructure". In this context, voxel models were applied to simulate and visualize the spatio-temporal change driven by natural processes and model the multi-layered structure of the Earth's surface. Dedicated voxel-model visualization techniques contained stack-based terrain representations [97] related to geotechnical-modeling applications [98]. A voxel-based earthwork modeling methodology incorporating the geotechnical properties integrated with the BIM processes was proposed [99]. Different natural processes can be modeled, analyzed, and visualized through space-time cube representations [100] and collaborative, tangible interfaces [101]. Digital rock physics (DRP) is a methodology for studying the petroleum reservoir structure with a focus on the porosity of the Earth's subsurface layers in relation to pore interconnectivity and fluid-rock interaction on multiple scales. Voxel models were applied to model the porosity of laboratory samples based on CT scans and integrate large-scale data describing the geological structure of the studied territory. Integrating 3D printing and DRP [102] allowed the physical manufacture of tangible samples of the digital rock voxel models with different materials. These digital rock twins could be tested using the same laboratory procedures as real rock samples. Moreover, voxel models were applied to study the relations between various soil properties and plant roots [103]. Semi-automated root vectorization techniques were also developed for CT scan-based voxel models [104]. A similar approach was taken to study the 3D spatial distribution and relations among hydrological, geochemical, and microbiological processes [105].
Finally, voxel models were applied to study the relations among land-use patterns, habitat classification, and their use by animals. These methods can support conservation and management planning in urban parks [106]. The application of voxel-based probabilistic space-time prisms (STP) [107] can further advance studies that address urban habitat-use patterns at high-resolution temporal scales [108]. Voxel-based STPs utilized GPS tracking data to map and predict the probability that the tracked agent (animal) can be found at a specific location at a given time. This information can be overlaid with land-use data to uncover otherwise unobserved daily use patterns related to urban habitats [109]. The temporal range can cover a few days [108] to multiple months [109], depending on the resolution of the tracking data.

Fifth Cluster-Intersections between CAD and Computer-Aided Manufacturing
The fifth cluster summarized in Table 6 contained terms such as "3D printing", "finite element method", "topology optimization", "concrete", and "thermal conductivity". Bacciaglia et al. [7] published a systematic review addressing voxelization in additive manufacturing. Momeni et al. [6] reviewed 4D-printing processes, addressing the design and fabrication of shape-changing 4D-printed structures. In the structural analysis context, Schillinger et al. [5] reviewed the finite cell method (FCM) for the structural analysis of CAD and image-based geometric models. Table 6 summarizes the representative publications from this cluster. In subtractive manufacturing, voxel models were applied in combination with ML to automatically identify the conventional machining processes from CAD models [110], predict the cutting force [111] and resulting deformations [112], and generate efficient toolpaths [113]. In additive manufacturing, voxel models and ML were applied to predict the 3D-printed-shape accuracy [114], enforce manufacturing constraints on topology optimization [115], and design 3D-printed, self-organizing, and functionally graded materials [116]. Material performance and failure can also be studied with voxel models. Mechanical damage to concrete can be studied at the micro scale [117], as well as by using 3D-imaging techniques and autonomous platforms to monitor buildings [118,119] and other civil engineering structures [120]. The voxel model application for the structural analysis of existing buildings is studied in the context of heritage preservation, considering that the 3D scanning of historical monuments is widely practiced. However, the conventional cloud-to-BIM-to-FEM workflows [121] require sophisticated 3D modeling techniques and expert knowledge. The direct structural analysis of voxelized point clouds with the FCM methods is currently under study [122]. Accordingly, a semi-automated voxelization method was developed to assess the structural stability of a partially collapsed heritage building [123]. Aside from the structural performance, the thermal conductivity of building materials [124,125] can also be studied with voxel-based methods.
In the context of digital architecture and planning, voxel modeling approaches were applied to conventional building materials. An integrated voxel-based workflow addressing digital design, structural analysis, and 3D printing with concrete was recently proposed [126]. Non-cuboid voxel models were used to propose a reconfigurable slip formwork system for materializing continuous, modular concrete structures [127] and constructing voxel-based aggregation structures materialized as stacked MDF units connected by tenon and mortise joints [128]. Furthermore, voxel models were used to integrate topology optimization into the digital design and fabrication process by utilizing concrete and customized foam molds [129]. A similar approach accommodated voxel-based design methods to construct structurally optimized and highly specified tectonic configurations of wooden modules [130], combining multi-material topology optimization, robotic fabrication, and the encoding of design properties in individual voxel cells for the complex multi-material configuration of the proposed voxel assembly.
Progressing beyond the materials widely adopted in architecture, Michalatos and Payne [131] observed that the surface modeling paradigm currently predominant in 3D architectural modeling does not permit incorporating multi-scalar material properties and fine-grained material performance. Their contribution resulted in a software prototype that made designing within the volumetric paradigm possible, and incorporated the internal complexity of solid objects related to the hierarchical complexity of physical materials [132]. A series of deformable physical objects were fabricated to explore the potential of multi-material 3D printing based on the analysis components implemented in their software. The applicability of this modeling approach can be extended because voxel models were applied to study the performance of complex materials, such as 3D woven composites [133] or open-cell metal foams [134], in relation to their unique structural and thermal performances.
The advantages of combining voxel models, genetic algorithms (GA), and finite-element analysis in the context of topological optimization are well studied [135]. Voxel-based design processes utilizing GA can be combined with computational fluid dynamics algorithms to simulate thermal and airflow performances. These concepts were recently applied to design an organically shaped heat exchanger by utilizing a single-objective GA for optimizing pressure drop and heat transfer, consequently combining two opposing objectives [136]. Voxel-based generative processes can be used to design optimized, fail-safe structures [137] and functionally graded and resource-efficient building components [138]. The growing availability of computation resources has led to a point where generative voxel-based morphogenesis processes can reach giga-voxel resolutions and be applied to generate objects incorporating structural details in scales ranging from millimeters to tens of meters. Aage et al. [139] described this generative-material optimization process, whereby a complete structure of a plane wing emerged from a computational process constrained by the typical aerodynamic load cases and a 3D outline of a typical plane wing. The generated multi-scalar structure showed similarities with the structural patterns observed in bird bone structures. Aage et al. stated that this "methodology (...) is directly applicable to similar morphogenesis problems in other engineering disciplines, as well as in architecture and industrial design" [139] (p. 86).

Discussion
The chronological structure of this study is informed by the timeline of the voxel model development shown in Figure 1 (see page 5). The choice of methodologies follows this structure, and the discussion is organized accordingly. The main findings of this study are listed below to initiate the discussion on novel applications of voxel models in the context of architectural design and spatial planning:

1. A useful starting point for developing novel applications of voxel models is the observation that the widely adopted definition of a voxel model as "the 3D conceptual counterpart of a 2D pixel in an image" [21] should be seen in its original context and be complemented with the definition of a voxel model as "spatial-knowledge representation schemata" [4].

2. Various applications of voxel models in architectural design developed over time, shifting from human-computer interaction studies towards computational experiments that reflect the generative dynamics of natural systems.

3. The growing availability of high-resolution, 3D data capturing urban scenes and large territories has been instrumentalized in spatial planning, where voxel models are used to integrate and enrich the raw data with the outcomes of analysis and simulation.

4. In various disciplines, spatio-temporal dynamics of the natural and man-made environment are studied using voxel-based methods. Design approaches addressing the challenges of climate change and sustainable development can benefit from the application of identified voxel-based approaches.

5. Applications of voxel models addressing all architectural project phases have been identified. In urban planning projects, the identified applications of voxel models cover the initial design phases.

Existing Voxel Model Definitions and Their Relevance for Future Research
The most widely adopted definition of a voxel describes it as "the 3D conceptual counterpart of a 2D pixel in an image" [21]. From today's perspective, this definition invites a misconception by suggesting an analogy between a static 2D digital image and a voxel model as its 3D equivalent. Voxel models emerged at a time when digital photography was not as omnipresent as it is today, and the majority of digital 2D images were computationally generated or acquired through specialized scientific equipment, such as early CT and MRI devices. At that time, the analytical character of a digital image was more evident, and the analogy between 2D pixels and 3D voxels conveyed a different meaning. This initial understanding of voxel models can be re-established by foregrounding their analytical character, which extends beyond 3D geometry to integrate diverse datasets created through 3D mapping, analysis, and simulation, as well as through the design process itself. The growing availability of computational resources makes it possible to encode multidimensional datasets in voxel models and to represent the spatio-temporal changes of various parameters for datasets of increasing scale and resolution.
By extension, one of the interesting ways in which voxels can be used is to encode different datasets, which leads to new insights and allows voxels to be considered as "spatial-knowledge representation schemata" [4]. Nelson and Stolterman [140] posited that design is inquiry for action. In this context, voxel models can be utilized to encode spatial knowledge so that it becomes actionable in the context of a design-driven inquiry. This voxel model application is particularly suitable for interdisciplinary design environments, where different disciplinary datasets must be spatialized and integrated to support an interdisciplinary design process. Therefore, we foreground herein the integrative approaches that use voxels as knowledge representation schemata. This review outlined several research topics in which voxel models were utilized to structure disciplinary datasets and link them with a discrete geometric representation. In the context of architectural design, voxel models are used for design and analysis at the scales of architectural objects and urban systems. Advancing into a wider scope of computer-aided design, contributions from material science, mechanical engineering, robotics, 3D scanning, and automated object classification were identified. This extended scope outlines the possible directions for future research, where the identified approaches can be utilized to inform the design process. Designers are confronted with the growing complexity of both the object of their design and the environment in which their design exists. In the context of the built environment, the urgent need for innovative sustainable design and construction practices drives the need for novel approaches to informed design methods. The potential of voxel models, relating to their interdisciplinary character and the possibility of integrating multi-scalar data, was identified in this review.
The multi-scalar character of voxel models allows the integration of diverse knowledge domains and the incorporation of temporal change within one composite, spatial model. Composite voxel models understood as "spatial-knowledge representation schemata" [4] could be used to advance spatial information into spatial knowledge, following the line of argumentation proposed by Srihari [4]. These models could be combined with expert knowledge to tackle contemporary design challenges, such as climate change and sustainable development. Based on this, targeted multi-domain decision-support systems can be developed and utilized to support designers and decision makers.
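To make the idea of a composite voxel model more tangible, the following minimal sketch shows one possible way of storing several domain attributes and their change over time in a single sparse voxel grid. It is an illustration only and is not taken from any of the reviewed studies: the class name `VoxelModel`, the attribute names (`occupied`, `air_temp_c`), the metre-based cell size, and the integer time steps are all assumptions made for this example.

```python
import math
from collections import defaultdict


class VoxelModel:
    """Hypothetical sparse voxel grid holding time-indexed attributes per cell."""

    def __init__(self, cell_size):
        self.cell_size = cell_size  # voxel edge length (assumed to be in metres)
        # Layout: cells[(i, j, k)][attribute][time_step] = value
        self.cells = defaultdict(lambda: defaultdict(dict))

    def index(self, x, y, z):
        """Map a world coordinate to the integer index of its voxel."""
        s = self.cell_size
        return (math.floor(x / s), math.floor(y / s), math.floor(z / s))

    def set(self, point, attribute, value, t=0):
        """Store a value for one attribute of the voxel containing `point`."""
        self.cells[self.index(*point)][attribute][t] = value

    def get(self, point, attribute, t=0):
        """Read a value; returns None if nothing was stored (no cell is created)."""
        cell = self.cells.get(self.index(*point))
        if cell is None:
            return None
        return cell.get(attribute, {}).get(t)


# Two domains (geometry and a simulated temperature) share the same voxel.
model = VoxelModel(cell_size=0.5)
model.set((1.2, 0.3, 2.9), "occupied", True)
model.set((1.2, 0.3, 2.9), "air_temp_c", 21.5, t=0)
model.set((1.2, 0.3, 2.9), "air_temp_c", 24.0, t=1)

# A different point inside the same 0.5 m cell retrieves the same record.
print(model.get((1.4, 0.1, 2.6), "air_temp_c", t=1))
```

A dictionary keyed by voxel index was chosen here only because it keeps empty space implicit; dense arrays or octrees would be equally valid choices for the same concept.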

Existing Voxel Model Applications in Computer-Aided-Design Studies
The early voxel model applications in architectural design utilized basic human-computer interfaces, and were studied in pedagogical contexts [55,56]. Over time, they evolved toward generative voxel-based design environments, such as Zellkalkül [60] and Emergent Reefs [61]. These applications integrated computational logics with natural growth processes, and initially involved links with manufacturing processes and environmental simulations. This line of inquiry more recently developed into a combined application of voxel models and GANs in architectural design [70]. Such experiments are often conducted in an abstract design space that disregards the spatial context inherent to architectural interventions.
Advanced voxel-based methods currently existing in the field of CAD can be used to integrate a wide range of physically measured or simulated properties directly related to the affordances of the designed objects. Affordances incorporate relations among the object, its user, and the environment. Voxel models can describe both occupied (the object) and empty (the environment) spaces on a wide range of scales; hence, they can be applied to design objects based on their desired affordances. This could be achieved by utilizing ML-based affordance-detection algorithms to learn function-to-form mapping and generate objects by combining the desired affordances [141]. Alternatively, non-geometric design knowledge formalized as a SysML model can be used to represent the spatial conflicts across multiple design domains in a voxel model space [142]. Spatial conflicts can be addressed through the computational definition of the intended empty spaces related to the design requirements defined by multi-domain design stakeholders that might change throughout an object's lifecycle. These emerging voxel model applications demonstrate the differences between the understanding of voxel models as the 3D equivalents of pixels versus the analytical character of spatial-knowledge-encoding voxel cells that can be harnessed in computational-design processes.
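The notion of a spatial conflict between occupied voxels and intentionally empty voxels can be illustrated with a small sketch. It does not reproduce the SysML-based method of [142]; it only assumes, for illustration, that each design domain contributes either a set of occupied voxel indices or a set of voxels that must stay clear. The domain names (`structure`, `hvac`, `maintenance`) and the helper functions `box` and `find_conflicts` are hypothetical.

```python
from itertools import product


def box(lo, hi):
    """Voxel indices of an axis-aligned box (lo inclusive, hi exclusive)."""
    return set(product(*(range(a, b) for a, b in zip(lo, hi))))


def find_conflicts(occupied, keep_clear):
    """Return {(occupying_domain, clearance_domain): set of conflicting voxels}."""
    conflicts = {}
    for occ_name, occ_voxels in occupied.items():
        for clr_name, clr_voxels in keep_clear.items():
            if occ_name == clr_name:
                continue  # a domain does not conflict with its own clearance
            overlap = occ_voxels & clr_voxels
            if overlap:
                conflicts[(occ_name, clr_name)] = overlap
    return conflicts


# Assumed example: two domains occupy space, one requires empty space.
occupied = {
    "structure": box((0, 0, 0), (4, 1, 3)),
    "hvac": box((2, 0, 2), (6, 1, 3)),
}
keep_clear = {
    "maintenance": box((5, 0, 0), (6, 1, 3)),
}

conflicts = find_conflicts(occupied, keep_clear)
for (occ, clr), voxels in conflicts.items():
    print(f"{occ} intrudes into the space reserved by {clr}: {sorted(voxels)}")
```

Because the intended empty space is represented explicitly, changing a stakeholder requirement over an object's lifecycle amounts to replacing one set and re-running the check.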
The findings described above indicate the strength of voxel models in providing approachable and playful interfaces for spatial interactions. It is possible to extend the capacity of such voxel-based design interfaces to incorporate informed, generative-design environments. Moreover, abstract concepts, such as affordances and spatial conflicts, can be expressed in geometric terms. Lastly, multi-stakeholder perspectives and temporal change can be introduced into voxel interfaces. On the other hand, the weaknesses of the existing voxel-based design experiments are related to the fact that they often operate in an empty, abstract space, disregarding the constraints of pre-existing geometry and environmental conditions. Moreover, the introduction of abstract concepts, such as affordances or spatial conflicts, requires highly specialized approaches, and has so far been tested only with small-scale objects. From here, the following research gap can be derived. The direct integration of interactive voxel-based environments with data-driven, generative-design processes has not been extensively studied in the field. In particular, concepts such as affordances and spatial conflicts have not been considered in the context of voxel-based methods in architectural design. Lastly, it is important to consider the role of different stakeholders and temporal change when developing new voxel-based design approaches. Based on this, the following further research questions arise. How can user-centered and data-driven, multi-temporal, voxel-based design processes converge to support architectural design processes? What are the challenges of incorporating affordances and spatial constraints into such voxel-based design approaches? How can different stakeholder perspectives be instrumentalized in such a design process?

Existing Voxel-Model Applications in Spatial-Planning Studies
By contrast, voxel models are ideally suited for integrating 3D scanned data representing urban scenes and large territories. Xu, Ting, and Stilla [8] extensively reviewed this topic, showing that individual objects can be segmented, semantically classified, and converted to geometric representations directly usable in the CAD context, through diverse ML-based methods. Looking beyond the ML-based urban-scene understanding and analysis, voxel models can be applied to integrate acquired and simulated geospatial data [143] to support generative, performance-oriented design processes in non-urban contexts [144].
In urban contexts, conventional spatial-analysis methods, such as isovists, can be extended to their 4D counterparts through voxel-based methods [69]. At the same time, urban trees can be located [65], identified [66], and quantified in terms of carbon storage [67]. Similar concepts can be applied at a larger scale to quantify the green space in cities [62,63] and analyze and communicate the importance of urban green spaces through digital visualizations and tangible models [64].
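As a rough illustration of how a classified point cloud can be turned into a voxel-based quantity such as green volume, the sketch below bins labelled points into voxels and sums the volume of cells whose majority label is vegetation. This is a simplified assumption-based example, not the method of the cited studies: the point format `(x, y, z, label)`, the labels, the majority rule, and the function names `voxelize` and `class_volume` are all hypothetical, and no carbon-storage model is included.

```python
import math


def voxelize(points, cell_size):
    """Group labelled points (x, y, z, label) by the voxel that contains them."""
    grid = {}
    for x, y, z, label in points:
        key = (
            math.floor(x / cell_size),
            math.floor(y / cell_size),
            math.floor(z / cell_size),
        )
        grid.setdefault(key, []).append(label)
    return grid


def class_volume(grid, cell_size, label):
    """Total volume of voxels whose majority label equals `label`.

    Ties are resolved alphabetically so that the result is deterministic.
    """
    count = sum(
        1
        for labels in grid.values()
        if max(sorted(set(labels)), key=labels.count) == label
    )
    return count * cell_size ** 3


# Assumed toy scan: five classified points, 0.5 m voxels.
points = [
    (0.2, 0.2, 0.2, "vegetation"),
    (0.4, 0.1, 0.3, "vegetation"),
    (0.3, 0.3, 0.1, "building"),
    (1.5, 0.2, 0.2, "building"),
    (0.5, 0.5, 1.5, "vegetation"),
]
grid = voxelize(points, cell_size=0.5)
print(class_volume(grid, 0.5, "vegetation"))  # two vegetation voxels of 0.125 m^3 each
```

The result depends directly on the chosen cell size, which is one reason why sensor resolution and acquisition constraints matter when such data enter a design process.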
The strengths of voxel-model applications in this context are related to the capacity to represent objects found in the physical world as objects in the abstract space of the voxel model. These voxel-based representations can encode diverse datasets describing the physical properties of the captured objects. Weaknesses of such approaches can be derived from the technological constraints of the devices used to capture the 3D scan data. The physical resolution of the sensors, acquisition constraints, and in-process errors limit the direct use of such data in voxel-based design processes. Moreover, computational techniques to process the acquired data and augment them with additional information are actively developed, and an in-depth understanding of the techniques is required to integrate voxel models in the design processes. This leads to the emergence of a research gap, formulated as follows. The convergence of information-rich, voxel-based representations of our physical environment and data-driven, architectural-design processes has not been extensively studied. An understanding and the continuous development of technological processes are needed for the convergence of voxel-based methods and architectural-design processes. Based on this, the following research questions arise. What are the open challenges for the integration of information-rich, voxel-modeling approaches in introducing physical environment constraints into the architectural-design process? How can the development and dissemination of knowledge required for the integration of voxel modeling and architectural design be accelerated?

Existing Voxel-Model Applications from the Interdisciplinary Perspective
The identified voxel-model applications in urban contexts bring together natural and man-made elements. Voxel-model applications for ecological modeling and urban-habitat characterization [106,108] were also identified. These spatio-temporal analysis methods cover a wide range of scales and knowledge domains [145]. However, the voxel-based integration of these methods in computational design processes that integrate urban and architectural design with natural elements and urban ecology has not yet been fully explored. Some research projects focusing on this integration are underway [146,147]. Natural growth processes can also be modeled with voxel models, both in a design context [61] and to study plant root growth [104]. The convergence of these two approaches is observed in the field of Baubotanik, where voxel models were recently applied to reconstruct a skeleton model from 3D scanned examples of living architectures [148]. The integration of living architecture in urban contexts could be facilitated through a voxel model to leverage the data integration potential, combining different scales, disciplinary datasets, and methods.
The unique quality of voxel-model applications in this context is the ability to spatialize expert data coming from other disciplinary contexts and possibly enable the data for integration with voxel-based design processes in architectural design. The discussed studies contribute insights that can expand the impact that architectural design and the spatial-planning profession can have on addressing the current sustainability challenges. The perceived weaknesses of the discussed methods are the complexity of individual disciplinary voxel-modeling approaches, the requirements for expert data input, and the interdisciplinary expertise required for the validation of modeling results. Moreover, existing voxel-model applications often operate at different spatio-temporal scales and resolutions, the application of which in architectural-design processes is conceptually challenging. From there, the need to understand and integrate voxel-modeling approaches coming from the field of earth sciences within future design and planning activities can be seen as a possible research gap. Moreover, an understanding of the spatio-temporal scales and resolutions utilized in other disciplines is required to succeed in interdisciplinary design and planning approaches. Future research can focus on the following questions. What are the constraints limiting the possible integration of voxel-modeling approaches from the fields of earth sciences and ecology into design and planning activities? How can interdisciplinary, voxel-based data relating to different scales and resolutions be integrated into and made an integral part of a design process?
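One elementary aspect of the resolution problem mentioned above can be sketched as follows: a fine-resolution voxel dataset is aggregated to a coarser grid so that it can be compared with a dataset from another discipline. The example is an assumption-laden illustration rather than a method from the reviewed literature. It assumes sparse grids stored as dictionaries, an integer resolution ratio (`factor`), aligned grid origins, and aggregation by the mean of the stored cells only; the function name `coarsen` is hypothetical.

```python
from collections import defaultdict


def coarsen(fine, factor):
    """Aggregate a sparse grid {(i, j, k): value} to cells `factor` times larger.

    Each coarse value is the mean of the fine cells that were actually stored;
    missing fine cells are ignored rather than treated as zero.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for (i, j, k), value in fine.items():
        # Floor division keeps negative indices in the correct coarse cell.
        key = (i // factor, j // factor, k // factor)
        sums[key] += value
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Assumed example: a 1 m dataset aggregated to a 2 m grid.
fine = {
    (0, 0, 0): 10.0,
    (1, 0, 0): 20.0,
    (2, 0, 0): 30.0,
    (-1, 0, 0): 5.0,
}
coarse = coarsen(fine, factor=2)
print(coarse)
```

Whether a mean, a maximum, or a majority vote is the appropriate aggregation depends on what the attribute represents, which is precisely where disciplinary expertise is needed.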

Distribution of the Identified Voxel-Model Applications across AIA Project Phases
Voxel models can be linked with different architectural and urban-planning stages to address sustainable development. Table 7 assigns the identified voxel-model applications to the categories derived from the project phases, as described in this paragraph. The division of the architectural planning process into phases is conventionally standardized by national bodies, such as the Royal Institute of British Architects (RIBA), the American Institute of Architects (AIA), and other national equivalents. The RIBA and AIA project phases were established to standardize contracts signed between practices and clients and to specify the project deliverables that architectural offices must submit at the end of each phase. The context of this study extends beyond the understanding of the design process as a procedure to plan and construct a building, by encompassing the relations between natural and man-made environments. This is done to enable holistic design processes toward sustainable development. As clear indicators show, the architectural-design profession recognizes the need for more holistic design processes to address, for instance, the relations between buildings and the environment (e.g., through building life-cycle analysis). This study establishes four categories derived from the building life-cycle [149] (p. 14) and design phases [149] (p. 22) established by the AIA. The first category combines "Pre-design" activities and building "Use and Maintenance" to underline the fact that each constructed building becomes a part of the building stock. The term "building stock" describes a group of buildings, while each constructed building is seen as a stock of raw materials that can be adapted or recycled throughout its lifecycle. From this perspective, the "Use and Maintenance" activities naturally blend with the "Pre-design" activities in the process of the constant change of buildings and cities.
The first category collects the voxel-model applications that can be used to capture, quantify, and analyze the objects constituting man-made and natural environments. The following two categories refer to the "Schematic Design" and "Design Development" AIA project phases. Table 7 presents the voxel-model applications related to the design activities assigned to the two project phases listed in those categories. The fourth category contains the contributions related to the voxel-model applications addressing the physical aspects of the architectural design process, including building construction and the constraints related to materials and manufacturing.
The first category in Table 7 contains the voxel-model applications related to both architectural and urban design. Nearly all voxel-model applications in the urban context fall into this category, which can be explained by the adoption of diverse 3D-data-acquisition and analysis techniques utilizing voxel-based representations. The applications related to architectural design are related to the similar techniques applied to both building exteriors and interiors. The category related to the "Schematic Design" phase involves different approaches for design experimentation, utilizing voxel representations. The "Design Development" category contains approaches addressing the analysis and optimization of internal building organization and those for combining generative-design processes with voxel-based representations. The last category identifies the voxel-model applications related to digital manufacturing processes and material properties. The listed publications show growing interest in incorporating material-performance and manufacturing-process constraints in the design processes.

Table 7. Selected voxel model applications assigned to categories based on the project phases derived from the AIA Guide to Building Life Cycle Assessment in Practice [149].

Project Phases Architectural Design Urban Planning
Pre-design/Use and Maintenance Liu et al. [53] Deidda [76] Hübner et al. [80] Previtali et al. [81] Truong-Hong et al. [82] Chen et al. [83] Orengo [90] Shoaib Khan et al. [99] Taraben and Morgenthal [118] Yang et al. [119]

In reference to Table 7, the strength of the voxel-model applications in architectural design can be attributed both to the large number of contributions related to the last design phase and to the good coverage of all project phases. The advantage of voxel-modeling approaches in spatial planning can be seen in the strong concentration in the initial design phase. At the same time, the large fragmentation of architectural-design approaches can be seen as a weakness, since the individual studies are conventionally perceived as disjointed approaches that instrumentalize voxel models to solve the problem at hand, instead of constituting a larger picture of the voxel-modeling approaches in architectural design. While the voxel-model applications in spatial planning are more concentrated, the lack of voxel-model applications in the later project phases can be seen as a disadvantage. The above-mentioned observations can constitute a research gap, expressed as a need to further systematize the voxel-model applications in each of the architectural project phases. Moreover, the transition of voxel-modeling approaches found in the later project phases of architectural design into the domain of spatial planning might lead to new findings. These observations suggest new research questions. In reference to each of the AIA project phases, what are the existing voxel-modeling approaches, and how can they be extended? How can the applications of voxel models from different AIA project phases and across the architectural design and urban-planning activities be reapplied in different phases and activities?

Summary of New Questions and Possible New Research Steps
Finally, the main findings of this study derive from the definition of voxel models as "spatial-knowledge representation schemata" [4]. Research gaps emerging from the findings and the resulting research questions have been presented for each of the main findings. This suggests the following further research steps:

• Focus needs to be placed on the investigation of the possible convergence of user-centered and data-driven, multi-temporal, voxel-based design processes in the context of architectural design. This includes the role of affordances and spatial conflicts and ways of expressing them in a voxelized design space, incorporating stakeholder interactions.

• A second line of inquiry needs to focus on the integration of data-driven, voxel-modeling approaches that incorporate physical-environment constraints into the architectural-design process. This can serve to underpin the development and dissemination of expert knowledge related to the data-driven voxel-modeling approaches in architectural design.

• Further focus needs to be placed on the promotion of interdisciplinary collaboration between the disciplines of architectural design, spatial planning, earth sciences and ecology, through the development of interoperable voxel-modeling approaches and the instrumentalization of disciplinary datasets ranging in scale and resolution.

• Finally, it will be useful to undertake systematic studies of voxel-modeling approaches in architectural design and urban planning, addressing each of the AIA project phases and possible innovations emerging from the application of identified methods in different project phases or design activities.

Conclusions
This paper presented a semi-systematic literature review with the aim of uncovering and discussing the possible intersections of diverse disciplinary methods related to voxel models regarding their possible contribution to digital architecture and planning. This study used scoping and narrative literature-review methods to map and summarize the findings and trace the development of voxel models over time. The first part of the review concluded with a keyword co-occurrence analysis. The analysis of the keywords contained in the clusters revealed numerous voxel-model applications, and covered a wide range of topics studied in computer-aided design. This analysis revealed the gap in examining how voxel models could serve as data structures for multi-domain and trans-scalar data-integrated workflows. A detailed examination was conducted to identify the existing and emerging research directions, based on the reviewers' expert knowledge. According to Snyder [26], a semi-systematic literature review aims to identify the scope of topics encompassing a particular knowledge domain. The resulting description of the possible research directions is not meant to be fully exhaustive, but aims instead to enable the research community to examine the outcomes of the scoping study and revisit different parts in separate, systematic literature reviews. The discussion initiated in this review concluded with the observation of numerous voxel-model applications understood as "spatial-knowledge representation schemata" [4] in computer-aided design. However, attempts to integrate this type of voxel model and architectural design are sparse and fragmented. Notable exceptions can be found in generative design [61], geomatics [121], material science [131], and computational morphogenesis [139] (p. 86).
Nevertheless, the exploration of the full potential of interdisciplinary, integrative, and holistic design approaches that address sustainable design challenges based on voxel models is only starting. The possible future research directions identified in this review include the voxel-model application for the data-driven design approaches, leveraging analysis and acquisition methods from the field of geomatics. These processes might incorporate the identified generative-design elements and be executed in both urban and non-urban contexts. The identified environmental-modeling methods addressing the field of urban ecology often utilize spatio-temporal, voxel-based representations. The application of such approaches in the context of integrated design and planning processes will be further studied.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/architecture3020010/s1, Figure S1: Timeline comparing the time periods of similar disciplinary literature reviews (high-resolution, full-page illustration), Figure S2: Disciplinary distribution of the voxel-related papers related to the year of publication based on the Scopus All Science Journal Classification Codes (high-resolution, full-page illustration), Figure S3: General workflow describing the NLP-based screening method applied in this study for the initial screening, followed by the keyword co-occurrence network analysis and a detailed study of the clusters (high-resolution, full-page illustration), Figure S4: Flow chart describing the algorithmic implementation of the active learning component (high-resolution, full-page illustration), Figure S5: Flow chart describing the algorithmic implementation of the pool-based sampling and the topic modeling-based reviewer validation component (high-resolution, full-page illustration), Figure S6: Role of the reviewers in validating the outcomes of the NLP-based screening method to minimize the risk of bias (high-resolution, full-page illustration).

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.