Article

From Motion to Form: Systematizing Motion-Data Processing for Architectural Generative Design

1
Department of Smart Convergence Architecture, Ajou University, Suwon 16499, Republic of Korea
2
Department of Architecture, Ajou University, Suwon 16499, Republic of Korea
*
Author to whom correspondence should be addressed.
Buildings 2026, 16(8), 1492; https://doi.org/10.3390/buildings16081492
Submission received: 13 February 2026 / Revised: 25 March 2026 / Accepted: 4 April 2026 / Published: 10 April 2026
(This article belongs to the Section Architectural Design, Urban Science, and Real Estate)

Abstract

This study systematizes the form generation process using machine learning-driven motion-tracking data and investigates the interrelationships between the characteristics of generated data and forms generated according to data-processing methods. Through the vision-based machine learning motion estimation (VideoPose3D) algorithm, 3D motion data are extracted from 2D video and categorized into point (joint), curve (bone), and boundary (range of motion) types. Furthermore, this study analyzes the form generation characteristics and limitations associated with each type of motion-tracking data derived from dynamic-to-dynamic physical activities with postural transitions. A data-processing methodology based on artistic practice from prior research is applied. The characteristics of generated data and the morphological characteristics of generated forms are then analyzed according to non-processed and processed methods. Results suggest a potential correlative tendency between the characteristics and generated forms of each type of motion data value information. A bidirectional complementary relationship exists between non-processed and processed motion-tracking data. The data-based form generation methodology demonstrates potential applicability in architectural design. This study expands design possibilities by supporting decisions early in the architectural design process and immediately generating diverse alternatives; it also proposes a standardized framework for a universal data-centric design process applicable to diverse data types, including motion data.

1. Introduction

Driven by the Fourth Industrial Revolution and advancements in information and communication technology, information has grown from a simple statistical resource to a strategic asset that generates core value across contemporary industries [1,2,3,4]. With the growing importance of data, the utilization of information technologies has expanded across diverse fields, including industry, society, and healthcare, highlighting the need for methodologies capable of effectively selecting, classifying, extracting, and utilizing rapidly increasing value data [5,6]. Digital technologies—big data, data mining, and machine learning (ML)—serve as key mediums that extend and advance data collection methods as well as tools that enhance the qualitative value of the collected data [7,8]. In contemporary architecture, digital technologies are more than tools of procedural efficiency; they are used as instruments for expanding design potential. New design methods using information technologies have been developed in recent times, mediating the integration of information technology and architecture in the fields of urban planning, energy efficiency, equipment planning, and related simulations [9,10,11,12].
Innovation in digital technologies has enabled the quantification of human bodily movement into numeric motion-tracking data through ML, sensors, and optical technologies. Throughout history, the relationship between the human body and architecture has been tightly interwoven. Humans’ perception and use of space through their bodies is a fundamental concern of architecture [13]. Changes in how the body has been understood across historical periods have revealed new relationships between the body and space (architecture) and have continuously influenced the formation of sociocultural contexts and spatial design concepts [14]. In particular, quantitative digital data analyses of bodily movement are differentiated from aesthetic investigation methods [15] that use bodily proportions based on static records (drawings, pictures) or from design generation methods that use partial depictions of movements [16,17]. Quantitative digital data immediately reflect continuous changes to the body during human movement, providing fluid data that are available for analysis, visualization, and utilization. Because diverse types of value data can be extracted according to the type of human activity (static, dynamic, etc.) and analytical elements of movement (joint, affordance, vector, etc.), motion-tracking data can operate as a multifaceted parameter that expands the potential for form generation.
The convergence of architecture and motion-tracking technology, mediated by digital technology, presents the potential to broaden the architectural design domain based on the inherent potential of bodily movement data. In particular, methodologies for deriving data with detailed attributes, such as force and location from human activities or movements, classifying and systematizing the data thus derived, and extracting diverse value data not only present new possibilities for form generation in architectural design but also highlight the necessity of conducting further related research.
However, in the field of architecture, the use of information technology has predominantly remained at an analysis-oriented level or as a means of intuitive data visualization. Prior studies have been limited to functional optimization or generating aesthetic patterns [18], leaving a critical gap in research regarding quantitative evaluation criteria and systematized frameworks for translating bodily motion-tracking data into architectural geometries.
To address this gap, this study aims to systematize a parameter-based form generation methodology utilizing machine learning-driven motion-tracking data (VideoPose3D). The original contribution of this research lies in establishing a structured framework that minimizes subjective design intervention by quantitatively evaluating the morphological and architectural trade-offs between non-processed and processed data.
Specifically, this study addresses the following core research questions:
  • How can dynamic bodily movements be systematically extracted via ML and translated into quantitative architectural parameters?
  • What are the morphological and architectural impacts (e.g., volumetric efficiency, ground contact ratio, effective occupancy ratio) of computational data-processing strategies (extension, division, clumping) compared to non-processed data?
  • How do these processing methods compensate for the inherent limitations of raw motion-tracking data and contribute to a potential data-centric generative design process?
The remainder of this paper is organized as follows. Section 2 outlines the research methodology. Section 3 and Section 4 examine the theoretical considerations and interrelationships between the human body, activities, and motion capture technology in architecture. Section 5 presents the typification and organization of body motion-tracking technologies, focusing on their specific characteristics. Section 6 details the form-generation process utilizing motion-tracking-based 3D data. Section 7 reports the results, specifically analyzing the design process and the characteristics of forms derived from both non-processed and processed motion data. Finally, Section 8 discusses the findings, centering on the analysis and systematization of the characteristics and interrelationships of the generated forms, and Section 9 presents the conclusion.

2. Research Methodology

First, this study theoretically examines the concept of the human body in architecture and motion capture technology as a foundation for systematizing form generation processes based on motion-tracking data. It investigates the characteristics of each type of value data (movement data, activity data) derived from the interrelationship between digital technologies and motion-tracking data and analyzes the principles underlying the various methods used to extract data. It then analyzes the categorization and characteristics of human activities and compares a sensor-based system with a vision-based ML motion estimation system in the process of extracting motion data. Next, it theoretically examines the detailed components of motion-tracking data and the value data derived from the vision-based ML motion estimation system. In analyzing the characteristics of motion-tracking data by type, this study categorizes the detailed motion data used in architectural design through literature reviews. The characteristics of non-processed data that can be derived through the VideoPose3D algorithm are analyzed as well. These non-processed data play three roles: they are foundational material of the kind that can also be obtained through sensor-based motion capture, they are the input to the data-processing stage in this study, and they are an intuitive representation of movement. This study subsequently classifies motion data-processing methods and examines the characteristics resulting from their application, along with their distinctions from and interrelationships with non-processed data. In building the form generation methodology, this study produces 3D forms by developing a generation method appropriate for the characteristics of each type of value data in the motion-tracking data derived using machine learning-based motion capture technology.
For design process analysis, this study distinguishes the literature-derived processing taxonomy from the actual experimental scope, focusing on selected processing strategies. It analyzes the correlation between the data characteristics and the generated forms, while objectively evaluating the morphological outcomes through quantitative architectural metrics (e.g., volumetric efficiency, ground contact ratio, and effective occupancy ratio). Finally, this study analyzes the mutuality and characteristics of forms generated using non-processed value data and value data transformed through data processing. Through this series of steps, this study systematizes an architectural form generation methodology based on the types of value data from machine learning-based tracking data (Figure 1).

3. Interrelationship Between the Human Body and Architecture

3.1. Theoretical Consideration of the Human Body in Architecture

In architecture, the human body has functioned as a fundamental measure for spatial proportion and perception since antiquity. While the physical dimensions of the human body were applied as static design standards in the past (e.g., Vitruvius, Le Corbusier) (Figure 2), contemporary architecture has expanded the human body into an active medium that shapes the sensory experience of space (Steven Holl, Peter Zumthor) [19,20,21,22].
With recent advancements in digital technologies such as sensors and machine learning, physical movement has been converted into quantitative 3D data, transcending simple proportions [13,17,23,24]. This reconceptualizes the human body from a mere subject of spatial perception into an active medium of form generation, capable of deriving non-linear geometric space.
Therefore, this study systematizes a methodology within a digital environment that uses the dynamic motion of the human body as a quantitative parameter for data-driven architectural form generation, and it assesses the technical validity and potential of that methodology.

3.2. Classification of Human Body Activities

To utilize body movement as a design parameter, previous studies classify human activities into static, dynamic, and activities with postural transitions [25,26,27,28]. Static and simple dynamic activities are straightforward for data extraction, but securing morphological diversity is challenging due to low data displacement. Conversely, the ‘dynamic to dynamic’ transition among activities involving postural transitions entails significant movement overlap, requiring high technical precision for data extraction and segmentation (Table 1).
However, such data complexity also offers technical potential for deriving diverse geometric variations within the form generation process. Therefore, to validate the effectiveness of the proposed methodology under the most rigorous conditions, the dynamic postural transition activity with the highest data complexity (figure skating) is adopted as the core experimental variable.

4. Body Motion-Tracking Technologies (Motion Capture Systems)

4.1. Theoretical Consideration of Motion Capture Technology

Motion capture technology is a methodology that algorithmically extracts quantitative information, such as position and velocity, from movement data [29]. Originating from chronophotography research (Figure 3), this technology has evolved through integration with modern digital technology into a precision data-based analytical tool and a core medium across all industrial sectors [30,31,32,33,34].
From an architectural perspective, 3D motion-tracking data provides pure movement information that excludes individual physical characteristics [35]. This serves as a practical foundation for quantitatively analyzing the interaction between physical actions and space, providing a technical basis for utilizing movement data as a quantitative parameter for generating non-linear geometry, transcending conventional static design methods.

4.2. Utilization of Body Motion-Tracking Information in Architectural and Artistic Works

In contemporary design and architecture, human movement has been treated as an experiential and sensory medium, and research utilizing motion-tracking data as a parameter for form generation has continued steadily [20,36,37]. In this study, the data utilization framework is divided into a non-processed method, which directly projects raw data, and a processed method, which mathematically reconstructs the data.
The non-processed method involves the direct mapping of extracted joint and skeletal data into form generation algorithms. Based on a literature review and system output structures (Table 2), non-processed data is categorized into three quantitative parameters: point (joint coordinates) [38], curve (skeletal structures) [39], and boundary (spatial limits of the range of motion, RoM) [40].
Conversely, the processed method overcomes the limitations of morphological diversity and variable control inherent in the direct application of raw data, expanding geometric possibilities by reconstructing mathematical relationships between the data. This study draws its conceptual framework from the artistic practice (Triadic Ballet) of Oskar Schlemmer. To apply this to digital architecture, quantitative transformation rules are required.
Analysis of contemporary architectural cases (Table 3) showed that data-processing methods fall into five types: extension, division, clumping, deformation, and vector weight. Although the processing procedure partially refines the dynamism of the raw data, it is an essential geometric optimization step for ensuring architectural applicability in accordance with design purposes. This study clearly distinguishes between the literature-derived taxonomy and what is verified experimentally: the experimental scope is strictly limited to extension, division, and clumping, as these three methods provide clearly defined mathematical operational rules and control mechanisms.
The data types and characteristics derived from the non-processed and processed methods are defined in Table 4. By comparatively analyzing the form generation principles and the geometric and structural characteristics of the outcomes from both approaches, this study proposes a form generation framework that allows designers to selectively apply data processing methods according to their design purposes.

5. Typification and Organization of Motion Capture Technology

5.1. Classification of Motion Capture Technologies

Modern motion capture technology is expanding into diverse research areas by integrating with machine learning (ML) algorithms, moving beyond simple movement tracking [41]. Recent studies utilize motion-tracking data across multifaceted application fields (or adjacent industrial fields) such as virtual reality (VR) integration, robotics control, and athletic performance analysis based on full-body movement records (Table 5).
According to a literature review (Table 5), contemporary motion capture technologies are classified into sensor-based and vision-based machine learning (Vision-based ML) methods according to their data collection and processing approaches (Table 6). Sensor-based methods (e.g., inertial, mechanical, optical) enable precise joint tracking; however, they require expensive specialized equipment and controlled experimental environments, imposing physical and economic constraints on their universal application in general architectural design environments.
In contrast, vision-based machine learning technology estimates the 3D joint positions and skeleton using only 2D video from a single camera without specialized equipment, offering high flexibility in data extraction. Considering the practical purpose of immediately utilizing dynamic movement data as a parameter during the early architectural design phase, this study selects vision-based 3D pose estimation technology as the core tool, as it facilitates marker-less data extraction while effectively eliminating physical constraints.

5.2. VideoPose3D (ML Motion Estimation Based on Key Point of Image Feature Data)

The VideoPose3D technology used in this study to visualize 3D bodily movements is part of the vision-based ML motion estimation/3D pose estimation/monocular 3D regression category and was developed by the Facebook AI Research team. VideoPose3D begins with predicted 2D key points for an unlabeled video, then estimates 3D poses, and finally back-projects to the input 2D key points. Additionally, in a supervised setting, the fully convolutional model achieves an error reduction rate of 11% compared with existing state-of-the-art methods [43]. As such, VideoPose3D, based on ML (supervised learning), converts movements in videos (2D) into reliable quantified motion-tracking data (3D) by estimating, extracting, and analyzing key points. Key points are a category of data extracted through image feature data (local feature data, Scale-Invariant Feature Transform, and Speeded Up Robust Features) [49], and they function as critical components in generating motion data, such as joint positions, coordinates, and skeletons, from frame-by-frame images within the input video (Figure 4).
VideoPose3D is a motion estimation algorithm with a high level of accuracy. However, because it relies on supervised learning, this accuracy depends on the quality of the dataset used for training. Therefore, this study adopts the Human3.6M dataset as the movement dataset for the supervised learning process of VideoPose3D. Human3.6M contains 3.6 million video frames for 11 subjects, of which seven are annotated with 3D poses. Each subject performs 15 actions that are recorded using four synchronized cameras at 50 Hz [50], which makes it a well-suited training source for motion estimation.
In the field of architecture, most studies have focused on the utilization of body movement data based on sensor-based motion capture technology, including optical, mechanical, magnetic, and inertial methods [29]. Applying sensor-based motion capture technology in research involves a substantial budget [34]. The sensor-based motion capture method also has limitations. It requires considerable time and has low economic efficiency, as errors occurring during data acquisition due to user mistakes necessitate repeated data collection, and the user must perform numerous activities while wearing sensors to accumulate diverse movement data. However, vision-based ML motion estimation employs video footage as input data and eliminates the need for sensors or specialized cameras, offering a high degree of economic efficiency. Errors that occur during video recording can be addressed through simple editing processes, and diverse movement datasets can be accumulated by sourcing movement videos through basic online searches and applying them to the estimation algorithm. Given such technological efficiency, this study employs VideoPose3D, a vision-based ML motion estimation method, rather than conventional sensor-based motion capture technology. This study thus proposes a new architectural utilization of bodily movement data as a 3D form generation mediator.

6. Form Generation Process with Motion-Tracking Technology-Based 3D Data

A review of the literature presented in Table 2, Table 3 and Table 5 and of other prior studies, covering both the body movement data used in design processes and trends in the use of motion capture technology, reveals three key issues. First, studies in the architectural field employing body movement data are generally limited to the sensor-based motion capture method. Considering economic feasibility and technological efficiency, there is a need for architectural design research that adopts vision-based ML motion estimation. Second, some studies employing the vision-based ML motion estimation method have used only the non-processing approach for the extracted motion-tracking data. It is necessary to explore the diversity of data utilization methods through processing-based approaches that focus on the relationships among motion-tracking data. Third, most architectural form generation methodologies employing motion-tracking data present design outcomes by intuitively visualizing the data without further processing. Research is therefore needed that uses motion data as parameters to explore new morphological possibilities. Considering these issues, the current study aims to systematize a form generation methodology that utilizes bodily motion-tracking data.
Specifically, this study aims to systematize a form generation methodology that employs body motion-tracking data derived from VideoPose3D, a vision-based ML motion estimation system. The activity type selected for analysis is dynamic to dynamic, which poses the greatest challenge in motion-tracking data extraction. The utilization of motion-tracking data is categorized into motion data non-processing and motion data processing, with detailed data setup. Within motion data processing, the relationship information between motion data includes parameters such as extension, division, deformation, clumping, and vector weighting. This study places particular emphasis on extension and division, as these operations were most prominently employed in the artistic works of Oskar Schlemmer (Bauhaus), as shown in Table 7.
In this study, the bodily movement videos used as sources for motion data extraction were adjusted to 50 frames per second (FPS) using video editing software and a Python (3.7.9) frame-adjustment script based on ffmpeg. This adjustment was performed to ensure stable algorithmic execution, as the Human3.6M dataset used for training VideoPose3D is based on videos recorded at 50 FPS. The 3D coordinates of bodily movements were estimated from the FPS-adjusted 2D videos in a Visual Studio-based development environment, which is well suited to Python (3.7.9), C, and JavaScript Object Notation (JSON). Video materials were input into the algorithm in MP4 format, and the extracted 3D motion coordinate data were classified and organized into a tabular format using NumPy and subsequently converted into Excel files through pandas. The motion-tracking data stored in Excel format were then transformed into Grasshopper data using the Grasshopper plug-in TT Tools. Accordingly, this study employed Rhinoceros and Grasshopper as digital tools for motion data-based form generation. Depending on the specific requirements of the form generation process, additional Python functions and Grasshopper plug-ins were incorporated according to relevant reference materials. The form generation process with motion-tracking technology-based 3D data consists of video conversion, data extraction, data processing, and form generation using data (Figure 5). The form generation experiments analyze the correlations between data characteristics and generated forms, and the associated limitations, for both the non-processing method and the processing method, which classifies, expands, and reinterprets the extracted data. On this basis, they present and systematize a methodology for using motion data in architecture.
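The frame-rate adjustment step can be sketched as follows. This is a minimal illustration, assuming ffmpeg is installed and on the system path; the function names and file names are hypothetical, and the exact ffmpeg options used in the study are not stated in the text. The sketch uses ffmpeg's `fps` video filter to resample a clip to a constant 50 FPS.

```python
import subprocess
from typing import List


def build_fps_command(src: str, dst: str, fps: int = 50) -> List[str]:
    """Return an ffmpeg command that resamples `src` to a constant frame rate."""
    # -y overwrites dst if it already exists.
    # The fps video filter duplicates or drops frames to reach the target rate.
    return ["ffmpeg", "-y", "-i", src, "-filter:v", f"fps={fps}", dst]


def convert_fps(src: str, dst: str, fps: int = 50) -> None:
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_fps_command(src, dst, fps), check=True)


# Hypothetical file names, shown only to illustrate the call.
print(" ".join(build_fps_command("skating_raw.mp4", "skating_50fps.mp4")))
```

Building the command separately from running it makes the resampling settings easy to log alongside the other experimental conditions.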
To ensure methodological reproducibility, the specific experimental conditions, parameter settings, and software environments utilized in this study are detailed in Table 8. This structured protocol enables the replication of the entire workflow, from 2D video data extraction to 3D form generation.

6.1. Characteristics and Form Generation Using Non-Processed Motion-Tracking Data

Non-processed motion-tracking data are the most fundamental element that can be obtained through ML-based extraction and generation of bodily movement data. Data clusters are formed by translating the positions and relationships of the joints that constitute each moment of bodily movement in each video frame into coordinates. As such, the non-processed motion-tracking method (VideoPose3D) is used as a mediating mechanism that converts the dynamic characteristics in 2D videos, such as movement, motion, and activity, into a 3D mathematical representation. Using this method, each frame of the video yields 17 body joint points: the head (point nos. 9 and 10), body (point nos. 0, 7, and 8), arms (point nos. 11, 12, 13, 14, 15, and 16), and legs (point nos. 1, 2, 3, 4, 5, and 6). Beyond providing 3D positional data for joints (coordinates), the non-processed motion-tracking method additionally provides skeletal data. However, the skeleton is output only as a visualization, so an additional step is required to generate actual skeletal data from the joint coordinates. The resulting skeletal data supply physical information on body constituents, such as the position and length of bones. The RoM can be derived by calculating the maximum spatial extent that encompasses all joints and bones of the body. Non-processed motion-tracking data thus provide a foundational database that is convenient to process for form generation, and they enable the dynamic data inherent in the abstract concept of movement to be quantified and used as a parameter for form.
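The step from joint coordinates to skeletal data can be sketched as below. The joint groups follow the point numbers given in the text. The parent list is an assumption: it follows the 17-joint layout commonly used with Human3.6M and is not specified in this article, and the function name is hypothetical.

```python
import numpy as np

# Joint groups as listed in the text (17 points per frame).
JOINT_GROUPS = {
    "head": [9, 10],
    "body": [0, 7, 8],
    "arm": [11, 12, 13, 14, 15, 16],
    "leg": [1, 2, 3, 4, 5, 6],
}

# ASSUMPTION: parent index of each joint (-1 marks the root), following the
# 17-joint Human3.6M layout commonly used in 3D pose estimation.
PARENTS = [-1, 0, 1, 2, 0, 4, 5, 0, 7, 8, 9, 8, 11, 12, 8, 14, 15]


def bones_from_joints(pose):
    """Convert one (17, 3) pose into a (16, 2, 3) array of bone segments.

    Each segment stores the parent joint position, then the child joint position.
    """
    pose = np.asarray(pose, dtype=float)
    child = [j for j, p in enumerate(PARENTS) if p >= 0]
    parent = [PARENTS[j] for j in child]
    return np.stack([pose[parent], pose[child]], axis=1)


# Synthetic pose, used only to show the output shape.
pose = np.random.default_rng(0).normal(size=(17, 3))
print(bones_from_joints(pose).shape)  # (16, 2, 3)
```

Each segment can then be drawn as a line in the modeling environment, giving the curve (bone) data type for that frame.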
As previously described, motion-tracking data can be obtained, broadly, through sensor-based motion capture or vision-based ML motion estimation and, more specifically, through various digital tools. This study uses VideoPose3D, which operates within a Visual Studio environment based on Python (version 3.7.9), C, JSON, and Compute Unified Device Architecture (CUDA). When the non-processed motion-tracking data generated through this approach are applied to the form generation process, the data are converted into Excel files and subsequently imported into Grasshopper, an add-on for Rhino. However, when the data are converted to Excel format using NumPy and pandas, the researcher must align the joint index numbers with their 3D coordinates.
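One way to keep joint index numbers aligned with their coordinates is to flatten the pose array into one row per frame and joint before export. The sketch below assumes the estimated poses are held in an array of shape (frames, 17, 3); the function name and column names are hypothetical and are not taken from the study.

```python
import numpy as np


def poses_to_rows(poses, fps=50.0):
    """Flatten a (frames, joints, 3) array into one record per frame and joint."""
    poses = np.asarray(poses, dtype=float)
    n_frames, n_joints, _ = poses.shape
    rows = []
    for f in range(n_frames):
        for j in range(n_joints):
            x, y, z = poses[f, j]
            rows.append({
                "frame": f,
                "time_s": f / fps,   # timestamp implied by the frame rate
                "joint": j,          # joint index number, 0 to 16
                "x": float(x), "y": float(y), "z": float(z),
            })
    return rows


# Synthetic data: 3 frames of 17 joints.
rows = poses_to_rows(np.zeros((3, 17, 3)))
print(len(rows), rows[17]["frame"], rows[17]["joint"])  # 51 1 0
```

The records can be passed to `pandas.DataFrame(rows)` and written with `DataFrame.to_excel("motion.xlsx", index=False)`, which needs an Excel writer engine such as openpyxl to be installed.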
When visualizing the motion data information applied to Grasshopper on a coordinate system, 17 data points are generated per frame. Given that the duration of the video used in this study is 37.06 s (1853 frames), the total number of generated coordinate points is 31,501 (37.06 s × 50 FPS × 17 data points). While large datasets expand diversity in the process of form generation, an excessive volume of data may increase computational complexity and the time required for the visualization process. Therefore, to efficiently utilize the extracted data, the segment from 36 s to 37.06 s was excluded from the total 37.06 s (1853 frames) video, simplifying the dataset to 30,600 data points. The original data at 50 FPS were converted to 0.5 FPS, 1 FPS, and 2 FPS, simplifying the data to 306 (36 s × 0.5 FPS × 17 data points), 612 (36 s × 1 FPS × 17 data points), and 1224 (36 s × 2 FPS × 17 data points) data points, respectively (Figure 6).
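The data-point counts follow directly from frames × 17 joints, and the reduction to lower frame rates can be sketched as regular frame subsampling. This is an illustrative assumption about how the conversion could be done (keeping every n-th frame); the function names are hypothetical.

```python
import numpy as np

JOINTS = 17
SOURCE_FPS = 50


def point_count(n_frames, joints=JOINTS):
    """Number of coordinate points for a given number of frames."""
    return n_frames * joints


def downsample(poses, target_fps, source_fps=SOURCE_FPS):
    """Keep every (source_fps / target_fps)-th frame of a (frames, joints, 3) array."""
    step = int(round(source_fps / target_fps))
    return np.asarray(poses)[::step]


full = np.zeros((1853, JOINTS, 3))     # 37.06 s at 50 FPS
trimmed = full[: 36 * SOURCE_FPS]      # first 36 s, i.e. 1800 frames

print(point_count(len(full)))          # 31501
print(point_count(len(trimmed)))       # 30600
for fps in (0.5, 1, 2):
    print(fps, point_count(len(downsample(trimmed, fps))))
# 0.5 306
# 1 612
# 2 1224
```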
In 2D video-based dynamic motion estimation, camera panning and zoom effects inherent in the source footage distort the global spatial reference during 3D coordinate projection. This results in a geometric artifact where the estimated 3D poses are inaccurately concentrated within a localized region. To correct this spatial distortion, this study implemented an algorithmic compensation method, termed the data expansion process, based on anatomical bone-length constancy. Since the actual physiological distance between the pelvis (point no. 7) and the thorax (point no. 8) remains constant during movement, any variation in their extracted 2D pixel distance across frames is a direct function of camera zooming or depth translation.
By computing the inverse ratio of this frame-by-frame variation, the relative scale factor of the camera was calculated. Concurrently, the trajectory of the root joint was utilized to compute the camera’s panning velocity. These algorithmically derived parameters were then applied to mathematically offset the camera’s movement, accurately re-projecting the local 3D joint coordinates into a stable global virtual environment. Consequently, this data expansion process effectively controls distortion along the camera’s viewing vector axis and depth ambiguity, yielding highly reliable spatial arrangement data consistent with the physical environment [51].
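The two quantities described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the study's implementation: the 2D key points are assumed to be an array of shape (frames, joints, 2) in pixels, frame 0 is taken as the reference frame, joint 0 is assumed to be the root, and the function names are hypothetical. The re-projection into a global coordinate system is not shown.

```python
import numpy as np


def relative_scale(kp2d, joint_a=7, joint_b=8):
    """Per-frame scale factor from the pixel length of a reference bone.

    The factor is the inverse ratio of each frame's bone length to the
    length in frame 0, so a 2x zoom-in yields a factor of 0.5.
    Assumes the reference bone never has zero pixel length.
    """
    kp2d = np.asarray(kp2d, dtype=float)
    length = np.linalg.norm(kp2d[:, joint_a] - kp2d[:, joint_b], axis=1)
    return length[0] / length


def root_velocity(kp2d, root=0, fps=50.0):
    """Frame-to-frame pixel velocity of the root joint, shape (frames - 1, 2)."""
    kp2d = np.asarray(kp2d, dtype=float)
    return np.diff(kp2d[:, root], axis=0) * fps


# Synthetic check: the second frame is the first frame zoomed in by 2x.
rng = np.random.default_rng(1)
frame0 = rng.uniform(100.0, 200.0, size=(17, 2))
kp = np.stack([frame0, 2.0 * frame0])
print(relative_scale(kp))
print(root_velocity(kp).shape)  # (1, 2)
```

Multiplying each frame's key points by its scale factor normalizes the apparent body size across frames, which is the precondition for placing the poses in a common space.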
As illustrated in Figure 6, the application of non-processed motion-tracking data to the form generation process necessitates a form generation method appropriate for the characteristics of each type (point (joint), curve (bone), boundary (RoM)) of motion data element. In the conventional architecture field, numerous architectural design studies have used portions of non-processed motion data as parameters for form generation processes through sensor-based motion capture methods rather than through vision-based ML motion estimation, such as VideoPose3D. Accordingly, diverse types of designs and form generation methods (e.g., Grasshopper plug-in) for implementing them in a digital environment are emerging. This study classified form generation methods by each motion data element, with a focus on forms in which the characteristics of motion data, used as parameters from the various designs and form generation methods, are intuitively and visually revealed. Moreover, by analyzing the characteristics of each motion data element, this study systematized the interrelationships between the data and generated forms.
The characteristics of point (joint), curve (bone), and boundary (RoM), which are motion data elements that can be derived through motion capture technologies, can be analyzed in relation to the human body as follows. Point data are data generated with the combination of x, y, and z coordinate values for each joint, constituting fundamental information that can be extracted using motion capture technology. Point data are an essential element in generating other motion data elements or value data. The human body creates movement through relationships between each joint and its adjacent joints. The forms generated with point data in prior studies also show design aspects that embody a generation principle based on the relationships between joints. As point data vary according to bodily movement over time, and the degree of variation depends on each joint, they can function as weighted values—a parameter for form generation. In the case of point data extracted from video, temporal coordinate changes can be interpreted as variations in positional states with changes in frames. Furthermore, point data are categorized numerically according to actual joint types within the human body. This signifies that the magnitude of force from movements differs by joint and that joints differ in importance for bodily activities or the maintenance of bodily form. Therefore, the interrelationships between force magnitudes of each joint and their corresponding importance may also be utilized as incidental parameters for form generation.
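The idea that joints can act as weighted values according to how much they vary can be illustrated with a small sketch. Here the weight of a joint is defined as its share of the total path length travelled across frames; this particular definition and the function name are assumptions made for illustration and are not taken from the study.

```python
import numpy as np


def joint_weights(poses):
    """Return per-joint weights that sum to 1 for a (frames, joints, 3) array.

    Each weight is the joint's share of the total path length over all frames.
    """
    poses = np.asarray(poses, dtype=float)
    # Distance moved by each joint between consecutive frames: (frames - 1, joints)
    step = np.linalg.norm(np.diff(poses, axis=0), axis=2)
    path = step.sum(axis=0)
    total = path.sum()
    if total == 0:
        # No movement at all: fall back to uniform weights.
        return np.full(poses.shape[1], 1.0 / poses.shape[1])
    return path / total


# Synthetic data: only joints 13 and 16 move, joint 13 three times as far.
poses = np.zeros((3, 17, 3))
poses[:, 16, 0] = [0.0, 1.0, 2.0]
poses[:, 13, 0] = [0.0, 3.0, 6.0]
w = joint_weights(poses)
print(w[13], w[16])  # 0.75 0.25
```

Such weights could drive, for example, the radius or attraction strength assigned to each joint in a parametric definition.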
Curve data can be generated only through point data obtained from motion capture technologies. Therefore, curve data exhibit a high degree of dependency on point data, and their values vary whenever the point data vary. The human skeletal structure is composed of linear elements, and while they shift in response to joint movement, their linearity remains constant. Therefore, as motion varies over time and across video frames, only the position and angle of the skeletal elements change, not their linear nature. The continuous variation in linear elements forms a sequential flow, and the overlap of curve data (linear elements) creates planar designs based on this flow. Unlike point data, curve data include detailed information such as length and directionality, which can serve as incidental parameters for form generation. Additionally, the degree of variation in curve data is more pronounced than that in point data. Consequently, forms generated using curve data as parameters often show dynamic patterns. However, skeletal movement is limited, as bones cannot rotate through all angles. The human body cannot rotate joints beyond a defined RoM, and the measured values vary depending on the flexibility of the individual. Therefore, curve data are limited in that they are highly influenced by the physical abilities of the measured subject.
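The dependency of curve data on point data can be illustrated with a minimal sketch that derives one bone (length and direction) per non-root joint from a single frame of joint coordinates. The parent table below is an assumption modeled on a commonly used 17-joint Human3.6M-style layout; it is not taken from this study, and the function name and sample frame are hypothetical.

```python
import math

# Assumed parent index for each of the 17 joints (-1 marks the root).
# Illustrative only: the study does not list its skeleton topology.
PARENTS = [-1, 0, 1, 2, 0, 4, 5, 0, 7, 8, 9, 8, 11, 12, 8, 14, 15]


def bones_from_joints(joints):
    """Return one (child, parent, length, unit_direction) tuple per bone."""
    bones = []
    for child, parent in enumerate(PARENTS):
        if parent < 0:
            continue  # the root joint has no bone leading into it
        vec = [c - p for c, p in zip(joints[child], joints[parent])]
        length = math.sqrt(sum(v * v for v in vec))
        direction = [v / length for v in vec] if length > 0 else [0.0, 0.0, 0.0]
        bones.append((child, parent, length, direction))
    return bones


# Hypothetical frame: 17 joints spaced one unit apart along the z-axis.
frame = [(0.0, 0.0, float(i)) for i in range(17)]
bones = bones_from_joints(frame)
print(len(bones))  # 17 joints give 16 bones
```

Because every bone is computed from two joint positions, any change in point data propagates directly into the length and direction values, which is the dependency described above.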
Boundary data refer to the RoM and are characterized by a high degree of dependency on point and curve data. Boundary data embody both linear and areal elements simultaneously, including curve data such as length, as well as unique data such as the area of a region or inclination within a region. Under typical conditions in which the body does not produce extreme movements, boundary data exhibit gradual, frame-by-frame variations. These continuous and moderate ranges of change generate a coherent flow. Overlapping data create a natural volume, which can be utilized as incidental parameters for form generation. In prior studies, RoM data have been applied primarily to the delineation of spatial boundaries or in investigations into spatial sizes appropriate for bodily activity. Specifically, linear and areal data are overlapped to define spatial volumes or determine optimal spatial forms that can cover regions generated from overlaps. Forms generated using boundary data as parameters exhibit natural and organic characteristics with strong functionality, but they are also limited in achieving design diversity. Form diversity is determined more by the dynamism of activity than by physical capabilities. As in this study, if activities that produce dramatic motion variations are to be used as the data inference target, diverse forms may be attained, but there will be constraints such as the inability to ensure diversity vis-à-vis activities with lower dynamism. Therefore, there is a need to explore methods to address these limitations.

6.2. Characteristics of and Form Generation Using Processed Motion-Tracking Data

The processed motion-tracking data approach is a data-processing methodology that expands motion-based design possibilities. It partially transforms non-processed motion-tracking data, originally obtained by extracting and generating bodily movement information with ML, through standardized procedures of data processing, transformation, editing, and manipulation. In doing so, it preserves and reveals a diverse range of fine-grained information embedded within a single motion-tracking dataset. In this study, the processed motion-tracking data approach was used to partially compensate for the common limitations seen in various form generation methods that use non-processed motion-tracking data. Conventional non-processed motion-tracking methods accumulate data by capturing individual frames or instantaneous moments within a movement sequence. Therefore, the extractable information at each frame or moment is limited to 17 joint coordinate data points and the skeletal and RoM data derived from these coordinates. Consequently, new motion data or more detailed information can be obtained only by inputting new video to ML models or attaching sensors to human subjects. As such, this approach is limited in securing the diversity of motion data and detailed information. Applying processed motion-tracking data as form generation parameters offsets these limitations. In non-processed motion tracking, motion data extracted from a 10-min video at 2 FPS yield 34 data elements per frame (17 point + 16 curve + 1 boundary = 34), resulting in a total of 40,800 data elements (10 min × 60 s/min × 2 FPS × 34 data elements). In contrast, processed motion tracking involves additional stages of data processing and yields 40,800 × n data elements, where n is the number of processing types, thereby enlarging the pool of data available as formal parameters.
Processed motion-tracking data, as parameters of form generation, thus contribute to a methodology that expands the diversity of motion data elements and broadens formal possibilities. They also show potential as design tools that compensate for the limitations in form generation methods using non-processed motion-tracking data.
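The data-volume arithmetic stated above can be checked with a short sketch. The function names are hypothetical; the per-frame counts (17 point, 16 curve, 1 boundary) and the 10-min, 2 FPS example come from the text, and taking n = 3 for the extension, division, and clumping methods is an illustrative assumption.

```python
POINTS_PER_FRAME = 17      # joint coordinates
CURVES_PER_FRAME = 16      # bones derived from the joints
BOUNDARIES_PER_FRAME = 1   # range-of-motion region


def non_processed_count(minutes, fps):
    """Data elements obtained from a video without any processing."""
    frames = minutes * 60 * fps
    per_frame = POINTS_PER_FRAME + CURVES_PER_FRAME + BOUNDARIES_PER_FRAME
    return frames * per_frame


def processed_count(minutes, fps, n_processing_types):
    """Data elements after applying n processing types, as stated in the text."""
    return non_processed_count(minutes, fps) * n_processing_types


print(non_processed_count(10, 2))   # 40800
print(processed_count(10, 2, 3))    # 122400 for three processing types
```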
The motion data-processing methods employed in this study are categorized according to prior studies. In particular, they draw on the artistic work of Oskar Schlemmer at the Bauhaus, in which bodily activities were used to transform and reconstruct the body, either in whole or in part, based on the interrelationships between body parts. The methods are categorized into extension, division, clumping, deformation, and vector weight; this study focuses primarily on the extension, division, and clumping methods in analyzing the interrelationships between the characteristics of each processing method and the forms it generates. This study also systematizes a form generation design process based on processed motion-tracking data.
In data processing, the extension method is a method of bodily reinterpretation best revealed in Oskar Schlemmer’s artistic work titled “Slat Dance,” where body parts are exaggeratedly extended or enlarged. When applied to motion data, diverse forms of reconfiguration are possible, such as extending parts of the data, setting different hierarchies among specific data by assigning weights, or changing the scale while maintaining the existing data system. The division and clumping methods are body reinterpretations best revealed in Oskar Schlemmer’s artistic work “Triadic Ballet.” The division method is revealed through sketches, and the clumping method is revealed through the process of costume composition. The division method distinguishes the body into components according to the designer’s subjective standards, producing a classification system that is different from biological classifications. When such processing methods are applied to motion data, they either generate a classification system for each data point by assigning weights based on the distance between elements according to specific standards, or enable the segmentation and simplification of data by applying a new classification system on motion data based on the conventional biological classification of the body. The clumping method simplifies complex body structures according to the rules established by the designer. When such a processing method is applied to motion data, elements located within a set distance from key body joints are integrated as one element, and the data volume is reduced by simplifying the complex skeletal and joint structures or by minimizing variations in body form from movement with reference to the body’s core (Table 9).
Furthermore, the deformation and vector weight data-processing methods that were not included as parameters in the form generation process of this study are also means of body reinterpretation that are best represented in “Triadic Ballet.” The deformation method is revealed by transforming bodily form, while the vector weight method is revealed through variations from costume movement. The deformation method modifies specific body parts to generate a combination of geometric forms, thereby seeking novel forms that depart from conventional biological body forms. If such a processing method is applied to motion data, body part elements are expressed not in point or curve data but in a completely different geometric form, producing data that differ fundamentally from the original motion data and enabling the generation of diverse formal parameters. The vector weight method reveals the abstract and invisible physical force relationships inherent in bodies by visualizing them. If such a processing method is applied to motion data, new forms are generated by differentiating weights according to the magnitude of forces applied within a body, expanding the distance in specific directions based on the space between joints or skeletal elements, or by applying moment values.
As shown in Table 9, when processed motion-tracking data generated through data-processing methods are applied to the form generation process, it is anticipated that a diverse range of forms, distinct from those produced using non-processed motion-tracking data, can be derived depending on the specific processing method employed. Therefore, this study applies categorized processing methods to detailed motion data to analyze variations in both the motion data and the resulting forms. Furthermore, this study examines the limitations of utilizing existing non-processed motion-tracking data, the interrelationships between data-processing methods, and the characteristics of motion data generated through processing to systematize the interrelationships between them and the resulting forms. The characteristics of the extension, division, and clumping processing methods, which can be derived through data processing, are analyzed as follows in relation to variations in motion data and the forms generated from them.
The extension approach (extension-based data transformation) selectively expands parts of detailed motion data (point, curve, boundary) to emphasize the importance of specific bodily elements. For point (joint) data, differentiated weights and hierarchy are assigned to each joint according to the biological classification standard. High weights are assigned to the head, neck (clavicle), both hands, the core, and both feet; medium weights are assigned to the shoulders and pelvis; and low weights are assigned to the elbows and knees. This differentiated weighting provides the same joint coordinates as non-processed data, while allowing the abstract concept of biological classification standards to form. Two approaches are possible for curve (bone) data modification: altering existing data by elongating certain body parts and transforming linear data into planar data through extension and overlaps. This study adopts the former approach, enabling more dramatic and dynamic form generation than that achieved through non-processed data utilization by modifying the length information of selected data. Through this process, designers can reflect personalized priorities in the form by exaggerating body parts deemed significant during bodily activities. Boundary (RoM) data modify value data by expanding spatial regions toward joints or body parts with biologically high significance, or by changing the scale of the region based on the body’s core (point no. 7). This study expands spatial regions around specific joints according to the designer’s priorities. While this approach provides the same detailed regional data as non-processed data, it reconfigures spatial regions around key bodily elements to dynamically transform the previously gentle model or make modifications to subdue the forms with large variations of activities across frames in accordance with the designer’s intent.
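A minimal sketch of the extension approach follows, assuming that tiered joint weights and bone elongation are applied as simple per-joint operations. The tier membership mirrors the text; the numeric weights (3, 2, 1), the joint labels, and the function names are hypothetical and are not mapped to the study's 17 joint indices.

```python
# Numeric values are placeholders; only the high/medium/low grouping is from the text.
TIER_WEIGHT = {"high": 3.0, "medium": 2.0, "low": 1.0}

JOINT_TIER = {
    "head": "high", "neck": "high", "core": "high",
    "left_hand": "high", "right_hand": "high",
    "left_foot": "high", "right_foot": "high",
    "left_shoulder": "medium", "right_shoulder": "medium", "pelvis": "medium",
    "left_elbow": "low", "right_elbow": "low",
    "left_knee": "low", "right_knee": "low",
}


def joint_weight(name):
    """Look up the hierarchy weight attached to a joint label."""
    return TIER_WEIGHT[JOINT_TIER[name]]


def extend_bone(parent, child, factor):
    """Move the child joint along the bone so that its length scales by factor."""
    return tuple(p + factor * (c - p) for p, c in zip(parent, child))


# Doubling a hypothetical forearm: the elbow stays fixed, the hand moves outward.
elbow, hand = (0.0, 0.0, 0.0), (1.0, 2.0, 2.0)
print(extend_bone(elbow, hand, 2.0))
print(joint_weight("head"), joint_weight("pelvis"), joint_weight("left_knee"))
```

The weights leave the joint coordinates untouched, whereas `extend_bone` alters the length information of a selected bone, which corresponds to the two behaviors the paragraph distinguishes for point and curve data.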
The division approach (division-based data segmentation) enables independent utilization of specific body parts through the segmentation of motion data. Point (joint) data assign joint-specific weights by calculating the distance between each joint and the bodily core (point no. 7). At the same time, positive and negative weights are assigned after distinguishing the upper part from the lower part and the left from the right of the body. Although this approach provides the same joint coordinates as non-processed data, it means that inter-joint relationships can be reconfigured as weights, centered on specific points or joints set by the designer. This approach also modifies motion data to reflect the abstract relationship between joints in the form. Curve (bone) data create a segmented data cluster by segmenting the body as follows: the head area, including the neck; both arms, including the shoulders and elbows; the spine, including the clavicle and core; and both legs, including the knees and pelvis. While they provide the same curve data as non-processed data, their reorganization into different clusters of elements presents the possibility of form generation using only selective elements. Unlike other data-processing methods that emphasize or hierarchize data elements according to designer-defined priorities, this approach extracts only selected segments by excluding the continuity inherent in bodily connectivity, thereby isolating the detailed components of bodily movement data as independent generative elements. Boundary (RoM) data extract regions generated by specific body parts and convert a single regional data element into multiple regional data points, unlike existing methods that explore the region encompassing the entire body. 
Although the same spatial information is retained, the selective extraction of partial regions rather than an all-encompassing whole enhances both the flexibility and diversity of motion data when employed as parameters for form generation.
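The distance-based, signed weighting described for point data under the division approach might be sketched as follows. The use of joint index 7 as the core follows the text; the choice of which coordinate axis is vertical or lateral, the rule of attaching the sign to the distance value, and the function name are assumptions made for illustration.

```python
import math

CORE_INDEX = 7  # the text names point no. 7 as the bodily core


def division_weights(joints, up_axis=2, side_axis=0):
    """Return (vertical, lateral) signed distance weights for every joint.

    The magnitude is the distance to the core; the sign separates upper from
    lower and one side from the other (axis choice is an assumption).
    """
    core = joints[CORE_INDEX]
    weights = []
    for joint in joints:
        dist = math.dist(joint, core)
        vertical = dist if joint[up_axis] >= core[up_axis] else -dist
        lateral = dist if joint[side_axis] >= core[side_axis] else -dist
        weights.append((vertical, lateral))
    return weights


# Hypothetical frame: joints laid out on a diagonal through the core.
frame = [(float(i - 7), 0.0, float(i - 7)) for i in range(17)]
weights = division_weights(frame)
print(weights[0], weights[7], weights[10])
```

The coordinates themselves are unchanged; only the relationship of each joint to the chosen reference point is re-expressed as a weight, consistent with the statement that the approach provides the same joint coordinates as non-processed data.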
The clumping approach (clumping-based data simplification) simplifies complex motion data through aggregation. For point (joint) data, a group is formed from the joints located within a defined distance of a specific body point, the joints highly relevant to that point, or the n joints closest to it; the group is then replaced by its central point, thereby simplifying the 17 data points per frame. This approach provides completely different coordinate data from those obtained through non-processed data, thereby securing parametric diversity. It also restructures the data such that the form reflects the simplified outcome based on abstract inter-joint relations rather than simple coordinates. For curve (bone) data, the body structure formed by complex relationships among joints and bones is simplified and converted into a single, continuous curve generated between terminal joints. Multiple curve data elements are unified into a single closed non-uniform rational B-spline (NURBS) curve by connecting joints associated with a specific body part from a biological perspective. Thus, the body structure is reconstituted into a new, simplified type of data element that differs from the existing data, which metaphorically express body structure. Boundary (RoM) data are aggregated according to vector values weighted on the basis of the average distance from the joints constituting the body, considering which regions were generated. This reduces sudden changes by partially eliminating the characteristics of existing data, which show abrupt variations according to the level of dynamic bodily movements. Furthermore, data are modified to partially eliminate distortions that occur following motions.
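For point data, the distance-based variant of clumping can be sketched as merging every joint within a set radius of a key joint into one central point. The function name, the radius value, and the sample coordinates are hypothetical, and the centroid is only one possible definition of the central point.

```python
import math


def clump(joints, key_index, radius):
    """Merge all joints within radius of the key joint into their centroid.

    Returns the centroid, the indices that were merged, and the joints left over.
    """
    key = joints[key_index]
    members = [i for i, joint in enumerate(joints) if math.dist(joint, key) <= radius]
    centroid = tuple(
        sum(joints[i][axis] for i in members) / len(members) for axis in range(3)
    )
    remaining = [joint for i, joint in enumerate(joints) if i not in members]
    return centroid, members, remaining


# Hypothetical mini-skeleton: three joints close together and one far away.
joints = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (10, 0, 0)]
centroid, members, remaining = clump(joints, key_index=1, radius=1.0)
print(centroid, members, remaining)
```

The merged point generally does not coincide with any original joint, which matches the observation that clumping yields coordinate data different from the non-processed data while reducing the number of elements per frame.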

7. Results

7.1. Systematization of Form Generation Characteristics by Type of Non-Processed Motion-Tracking Data

Table 10 presents the results derived from a form generation process based on non-processed motion-tracking data. From these findings, this study identifies correlations between the previously analyzed typological characteristics and the inherent properties of non-processed motion-tracking data, as well as the forms generated through experimental results. Further, this study explores the potential of motion data to serve as parameters for form generation and aims to systematize the application methods for each type of motion data, focusing on their respective characteristics and limitations.
  • Point data serve as a fundamental element for extracting and generating other body movement data by combining x, y, and z coordinate values for each joint and converting them into a 3D digital language. Generated through the quantification of physical positional data, point data form an organic network based on the abstract interrelationships inherent between joints. Forms utilizing them exhibit complex relationship-centered configurations driven by the interrelationships and organic directionality between points (joint coordinates). While the massive volume of diverse data generated according to video length, activity type, FPS values, and frame-specific motion forms can be utilized as parameters to expand morphological possibilities, it presents a limitation by inducing computational complexity and requiring excessive processing time for form generation. Furthermore, point data provide no additional information beyond simple physical coordinate data and can be utilized as positional data only during form generation. Consequently, a limitation exists wherein the subjective judgment of the designer and additional parameters are required to induce diversity in the generated forms.
  • Curve data generate linear elements that fill the gaps between joints, serving as a digital language and data resource that embodies the concept of a skeletal framework. While the linear topology remains constant regardless of changes in body movement, data diversity is generated solely through variations in detailed attributes such as angle and position. Forms utilizing this method create surface configurations through the overlapping of linear elements, driven by the organic flow of body movement. Morphological variety is secured on the basis of detailed data such as inherent length, directionality, and position; moreover, these data-driven forms, which change dramatically according to activity type, induce dynamic configurations. However, because curve data result from intuitively converting physical activity into data, a limitation emerges: the range of acquirable data within a physical environment is restricted by the specific physical capabilities of the individual subject.
  • Boundary data represent the RoM. They detect the area encompassing point and curve data and convert it into digital data. As they represent the area generated by the physical body itself, changes in the data appear relatively minimal when the activity is performed by the same body or across different body movements. However, similar to curve data, they include various detailed attributes, which support data variety. Forms utilizing this approach create volumes and continuous flows with gradual changes, generated through the overlapping of surface elements. Variety in configuration is secured through detailed data such as inherent length, directionality, and position. However, a limitation exists: excluding activities that induce dramatic motion changes, most forms generated from standard activities result in natural and organic designs with strong functional characteristics. Furthermore, forms derived from similar activities tend to be similar (Table 11).

7.2. Systematization of Form Generation Characteristics by Type of Processed Motion-Tracking Data

Table 12 presents the forms generated using processed motion-tracking data via data-processing methods. These results build on the systematization of the interrelationships between the characteristics of non-processed motion-tracking data, used as parameters, and the resulting form generation process and design.
Detailed data from non-processed motion tracking possess positive characteristics for form generation, depending on the type, but clear limitations also appear concurrently. These limitations can be partially offset, mitigated, or overcome through data processing. By generating new forms distinct from existing ones, this process expands morphological variety and possibilities. Furthermore, it proposes the potential for form generation to reflect design concepts driven by the designer’s intent and strategy via data processing. As shown in Table 13 and Table 14, processed motion-tracking data generated through data-processing methods provide additional parameters, thereby expanding both morphological variety and the patterns of form generation.
  • The three primary limitations of point data can be summarized as follows. First, the processing time for form generation is excessive, and digital computation becomes complex due to the explosive volume of data. Second, there is an influx of parameters unrelated to body movement during form generation, which results from the provision of only simple physical coordinate data. Third, because the amount of detailed point data provided regarding motion data is scarce, the criteria for evaluating the aesthetics of the generated forms are confined to the designer’s subjective judgment. These limitations can be addressed by motion-tracking data-processing methods.
  • The limitation of explosive data volume is overcome by the clumping method. Unlike the arbitrary adjustment of video FPS for data extraction used in this study, the clumping method compensates for this limitation through data classification based on specific systems and principles rather than arbitrary rules. The absence of variety in detailed data is overcome by the extension method. By establishing differential hierarchies for each joint and point, it provides additional data that are highly related to motion data and can be utilized as secondary parameters for form generation. The subjective aesthetic evaluation criteria are partially overcome by the division method. It converts aesthetic evaluation criteria into objective numerical values based on the degree of separation between reference points and forms or based on the arrangement data of forms centered on reference points.
  • The three main limitations of curve data can be summarized as follows. First, the dynamic patterns of surface forms generated by the overlapping of linear data present difficulties regarding practicality and ease of construction. Second, the acquirable data are limited by the physical capabilities of the figure in the video being measured. Third, because forms generated from data visualizing the skeletal framework involve the intersection of multiple surfaces, their application is centered on the subdivided spaces or forms created by these intersections rather than on the flow of body movement itself. These limitations can be addressed by motion-tracking data-processing methods.
  • The reduction in practical applicability caused by dynamic designs generated from overlapping curve data is resolved through clumping-based processing. By simplifying the visualization of skeletal linear data, this method presents results that are more concise and organized than existing structures. The limitation on the types of data acquirable, which stems from a figure’s physical capabilities, is overcome through extension-based processing, which provides motion data that cannot emerge from actual physical capabilities or environments. The limitation whereby designs generated using curve data are confined to subdivided spaces and forms is addressed through division-based processing, which secures variety. Utilizing only a portion of the data rather than the entire set enables the generation of refined structures centered on flow.
  • The three primary limitations of boundary data can be summarized as follows. First, there is difficulty in securing design variety due to monotonous forms with strong functional aspects. Second, the activity type is crucial during the data extraction process, and the data obtainable from identical activities are limited. Third, in the case of vigorous activities, as only dynamic forms are generated, there are difficulties in generating functional forms. These limitations can be addressed by motion-tracking data-processing methods.
  • The limitation regarding securing design variety is resolved by the extension method. This method provides form variety by transforming area configurations around specific joints to reflect the designer’s intent. Limitations concerning restricted data based on dynamics and similarity in activity type are addressed by the division method. By subdividing areas to derive multiple datasets from partial motions rather than from the entire body, this approach weakens the relationship between the activity type, dynamics, and acquired data. The limitation wherein vigorous activities generate only extreme forms is resolved by the clumping method. By aggregating data using weighted vectors derived from the average distance of joints from the center, this method produces a gradual area, thus securing form functionality while mitigating design dynamics (Table 15).

7.3. Quantitative and Architectural Evaluation of Generated Forms

To verify whether the extracted geometric forms possess practical architectural utility beyond visual depiction, this study quantified the geometric characteristics of each data type at a pavilion scale with a height of 10 m. Morphological success and architectural validity were objectively evaluated using three primary metrics: volumetric efficiency (%), ground contact ratio (%), and effective occupancy ratio (%) (Table 16).
The quantitative analysis revealed that processed data did not yield uniform performance improvements over non-processed data; rather, they exhibited clear architectural trade-offs. First, in the case of the division processing method, organic surface complexity was achieved, but volumetric efficiency decreased. For instance, when division was applied to curve data (Type 1), volumetric efficiency dropped from 25.48% to 21.02%. Similarly, boundary data (Type 1) saw a decrease from 37.29% to 29.70%, indicating an inverse relationship between aesthetic morphological complexity and spatial volume efficiency. From a design perspective, this reduction in volume may cautiously be interpreted as a potential increase in architectural voids or spatial porosity.
Conversely, certain processing methods significantly altered structural stability and space utilization. When the extension processing method was applied to boundary data (Type 2), the effective occupancy ratio increased from 22.16% to 32.00%. Although this expansion caused a sharp decline in the ground contact ratio (from 40.95% to 22.33%), such geometric distortion may present potential opportunities for dynamic spatial configurations, such as cantilever-like forms. On the other hand, the clumping method applied to point data (Type 3) notably improved volumetric efficiency (from 22.84% to 49.41%) and structural stability (from 41.05% to 53.59%). Nevertheless, prioritizing such extreme efficiency may concurrently lead to a reduction in morphological diversity.
Consequently, these quantitative changes empirically demonstrate that the proposed data-driven form generation methodology is not a random visualization but a parameter-driven framework that induces predictable geometric, spatial, and structural variations depending on the selected processing strategy.
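The trade-offs reported in this section can be compared side by side with a short sketch that computes the change in percentage points and the relative change for each before/after pair. Only pairs whose metric is one of the three named metrics are listed; the values are copied from the text, while the variable and function names are hypothetical.

```python
# (processing method, data type, metric, before %, after %) as reported in the text.
REPORTED = [
    ("division", "curve (Type 1)", "volumetric efficiency", 25.48, 21.02),
    ("division", "boundary (Type 1)", "volumetric efficiency", 37.29, 29.70),
    ("extension", "boundary (Type 2)", "effective occupancy ratio", 22.16, 32.00),
    ("extension", "boundary (Type 2)", "ground contact ratio", 40.95, 22.33),
    ("clumping", "point (Type 3)", "volumetric efficiency", 22.84, 49.41),
]


def summarize(rows):
    """Return the percentage-point change and relative change (%) for each row."""
    summary = []
    for method, data_type, metric, before, after in rows:
        delta_pp = round(after - before, 2)
        relative = round(100.0 * (after - before) / before, 1)
        summary.append((method, data_type, metric, delta_pp, relative))
    return summary


summary = summarize(REPORTED)
for row in summary:
    print(row)
```

Listing the changes this way makes the direction of each trade-off explicit: division lowers volumetric efficiency in both reported cases, extension raises occupancy while lowering ground contact, and clumping raises volumetric efficiency.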

8. Discussion

This study focuses on the form generation methodology based on motion-tracking data utilizing ML [52,53]. The characteristics by data type, interrelationships, and applicability appearing among data, between data and forms, and according to data-processing methods can be analyzed and systematized into three patterns.
First, the interrelationship between the characteristics of motion data by meaningful information type and the generated forms is examined. Movement data containing distinct characteristics are derived as digital data according to the meaningful information type extracted via ML-based motion capture technology. Even within the same meaningful information type of motion data, the derived meaningful information type and inherent characteristics appear differently depending on the input body movement and activity material (2D video), or depending on the method of processing the extracted meaningful information from its non-processed state. Utilizing the inherent characteristics by type of motion data and non-processing or processing-based meaningful information as form generation parameters allows for the derivation of various form results. Furthermore, it verifies that 3D design generation is possible through the interrelationship between digital technologies and various motion capture technologies, including ML for the conversion of body movement into data.
Second, the complementarity between non-processed and processed motion-tracking data compensates for the limitations of motion data by meaningful information type. Non-processed motion-tracking data, constituting the basic meaningful information of motion data, possess positive characteristics depending on information type, but some limitations arise in the form generation process. These limitations are partially compensated for by the complementary relationship between motion data information and processing methods, and their combination generates new body movement information and forms. This complementary relationship is bidirectional. The limitations of non-processed motion data are resolved through motion data-processing methods; conversely, because motion data-processing methods are fundamental systems that classify, transform, and reconstruct data, they rely on non-processed motion data to provide the criteria for their application target, purpose, and utilization. Forms generated through the complementary relationship between meaningful information and motion data processing differ in shape from forms generated from existing meaningful information (non-processed motion-tracking data), and morphological possibilities are expanded accordingly.
Third, the architectural applicability of the 3D form generation methodology utilizing motion-tracking data is addressed. The advancement of digital technology has led to the development of various technologies that convert body movement data into motion-tracking data, including the ML-based movement inference method utilized in this study, resulting in an explosive increase in various data types beyond body movement data. The form generation methodology based on such data can produce various forms in proportion to the data volume, qualitative value, and type; thus, the generated forms can be utilized to propose various alternatives to designers during the initial form analysis (mass study) and concept construction stages of the architectural design process.
The proposed data-centric framework extends beyond the conceptual generation phase, demonstrating potential for application in building-scale design, termed “Architectural Spatialization.” The algorithmically generated geometries can be conceptualized as tangible architectural components, such as free-form building envelopes, kinetic facades, or spatial massing derived from human circulation. Furthermore, the formalization of spatial and structural metrics, including volumetric efficiency and ground contact ratio, establishes a systematic parameter-control environment. Consequently, it supports a structured transition from motion-tracking data to feasible architectural geometries.
Previous research in the architectural field utilizing body movement data has mostly proposed indiscriminate designs by intuitively utilizing non-processed motion data, which constitutes the basic data of body movement. Research has focused primarily on functional perspectives, such as tools for exploring appropriate spatial scales required for specific physical activities or efficient circulation systems for evacuation routes; from a design perspective, referencing physical proportions and reflecting them in designs constitutes most investigations. This study systematized the relationship between the characteristics of non-processed motion data, which can increase exponentially depending on body movements or activities, and the forms generated through them. It explored measures to address the limitations in motion data-processing methods and systematized the related series of processes. This study not only presents a new perspective for existing research on proportion-centered body data utilization design but also suggests a new direction by proposing a design-centered motion data utilization methodology for research based on functional perspectives. Furthermore, regarding the intuitive utilization of data appearing in motion data-centered design experiments, although form generation is identical, a difference appears in that this study proposes a 3D form generation methodology and a systematized process to utilize motion data more effectively and efficiently. It achieves this by analyzing the characteristics of acquired data and proposing data classification, reconstruction, and transformation methods accordingly, rather than utilizing intuitive data indiscriminately to compensate for existing limitations. 
By proposing and systematizing data classification, reconstruction, and transformation methods suited to the rapidly growing range of data types and characteristics in data-based design research, the study responds to the demands of the current data-driven design paradigm.
The form generation process, which relies on digital language and data, is organized around the characteristics of each data type. It generates new forms in a practical environment by building on two relationships: the complementary relationship between meaningful information and data-processing methods, and the interrelationship between generated forms and that information. In this respect it resembles digital generative design technologies that instantly produce numerous alternatives from random variables. However, it moves beyond a limitation of those technologies, whose results derive from abstract and subjective aesthetics. Because its results follow from the inherent characteristics of the data, the processing methods that reinforce them, and the interrelationships among forms, it can be interpreted as a methodology for instantaneously generating logical, causality-based forms. The findings indicate that digital technology can serve not merely as a tool for form generation but as a catalyst that establishes a causal relationship between diverse data and architectural design and form. Building on this systematization, future research may establish form generation methodologies for other kinds of data or analyze the relationship between data characteristics and generated forms from a logical perspective.
While this study establishes a quantitative framework for motion-driven form generation, certain methodological boundaries outline directions for future research. First, the current experimental scope focused on a highly dynamic activity (i.e., figure skating) to maximize data complexity. Validating this framework against static or low-dynamic everyday activities is necessary to generalize applicability. Second, the reliance on monocular 2D vision-based estimation (VideoPose3D) introduces inherent depth ambiguity. While the proposed data expansion process controls localized spatial distortions, incorporating multi-view geometric validation would further enhance projection accuracy. Finally, the generated geometries currently exist at a conceptual massing stage. Translating these forms into built architecture necessitates subsequent interdisciplinary research integrating structural load analysis and material fabrication constraints.

9. Conclusions

Advancements in digital and computing technologies have transformed data into a core source of value creation in modern industries. As data collection methods such as big data and data mining have become more sophisticated, research on methodologies for the selection, classification, and utilization of explosively increasing data has emerged as a key task in the digital paradigm shift. However, in the field of architecture, data technology has been utilized primarily as a tool for functional optimization or intuitive form generation (diversity expansion) lacking causality. There has been a dominant tendency to perceive data not as a primary element of design but as a secondary means for securing diversity. This study analyzed the characteristics of body movement data acquired through ML and, by combining this approach with a motion-tracking data-based form generation methodology, explored the potential correlations between data and form. It analyzed the characteristics of data-processing methods intended to address the limitations of body movement data and systematized the relationship between transformed forms and processed data.
The core of this study, which differentiates it from existing design methodologies, lies in the proposal of a systematic methodology covering the processes of efficient classification, selection, reconstruction, transformation, and utilization of data, extending beyond mere intuitive application. The workflow that systematizes the overall process through purpose-specific data analysis, categorization of processing methods to overcome data limitations, and the projection of processed data onto design is not confined to the specific category of motion data. This system is universally applicable to various types of data usable in architectural design. It provides foundational materials for establishing a “data-centric architectural design process” based on predictable tendencies and parametric relationships between data rather than on the designer’s subjective inspiration or intuition.
A limitation of this study is that the designer’s subjective judgment can intervene during data processing. Although the processing methods were categorized with reference to the artistic work of Oskar Schlemmer, follow-up research on additional, more objective data-processing types is required. Moreover, the results derived through this process are closer to initial geometric designs, or forms in a draft state. Converting the derived forms into actual architectural spaces requires refinement of the forms together with in-depth evaluation of their structural feasibility and of their functional suitability for the spatial program. The primary purpose of this study is to systematize the form generation process through processing based on data characteristics, moving beyond existing methods that use data intuitively; the conversion of generated forms into substantial architectural spaces requires additional exploration. Future research should therefore categorize further data-processing methods according to objective standards and develop concrete methodologies for the “architectural spatialization” of the generated designs. Applying data types beyond motion data to this workflow would allow the versatility and architectural applicability of the proposed system to be verified further.
The significance of this study lies in providing a standard framework for data technology-based form generation methodologies, thereby laying the foundation for establishing design process principles and systems centered on the diverse data emerging in the modern architectural design environment.

Author Contributions

Conceptualization, H.-S.A. and S.-W.K.; methodology, H.-S.A. and S.-W.K.; software, H.-S.A.; validation, H.-S.A. and S.-W.K.; formal analysis, H.-S.A., S.-W.K. and N.Y.; investigation, H.-S.A.; resources, H.-S.A.; data curation, S.-W.K. and H.-S.A.; writing—original draft preparation, H.-S.A.; writing—review and editing, H.-S.A. and S.-W.K.; visualization, H.-S.A.; supervision, S.-W.K. and N.Y.; project administration, S.-W.K.; funding acquisition, S.-W.K. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (No. RS-2024-00355601).

Data Availability Statement

The data supporting the findings of this study are available from the corresponding authors upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AI: Artificial intelligence
RoM: Range of motion
VR: Virtual reality
FPS: Frames per second
RGB: Red, green, and blue
RGB-D: Red, green, and blue plus depth
IMU: Inertial measurement unit
ML: Machine learning
JSON: JavaScript Object Notation

References

  1. Schwab, K. The Fourth Industrial Revolution; Crown Business: New York, NY, USA, 2017.
  2. Anshari, M.; Syafrudin, M.; Fitriyani, N.L. Fourth Industrial Revolution between knowledge management and digital humanities. Information 2022, 13, 292.
  3. Lee, M.; Yun, J.J.; Pyka, A.; Won, D.; Kodama, F.; Schiuma, G.; Park, H.; Jeon, J.; Park, K.; Jung, K.; et al. How to respond to the Fourth Industrial Revolution, or the Second Information Technology Revolution? Dynamic new combinations between technology, market, and society through open innovation. J. Open Innov. Technol. Mark. Complex. 2018, 4, 21.
  4. Jeon, J.; Suh, Y. Analyzing the major issues of the 4th Industrial Revolution. Asian J. Innov. Policy 2017, 6, 262–273.
  5. Philip Chen, C.L.; Zhang, C.-Y. Data-intensive applications, challenges, techniques and technologies: A survey on big data. Inf. Sci. 2014, 275, 314–347.
  6. Siefkes, C.; Siniakov, P. An Overview and Classification of Adaptive Approaches to Information Extraction. In Journal on Data Semantics IV; Lecture Notes in Computer Science; Spaccapietra, S., Ed.; Springer: Berlin/Heidelberg, Germany, 2005; Volume 3730, pp. 172–212.
  7. Pan, Y.; Zhang, L. A BIM-data mining integrated digital twin framework for advanced project management. Autom. Constr. 2021, 124, 103564.
  8. Janiesch, C.; Zschech, P.; Heinrich, K. Machine learning and deep learning. Electron. Mark. 2021, 31, 685–695.
  9. Pfitzner, F.; Braun, A.; Borrmann, A. From data to knowledge: Construction process analysis through continuous image capturing, object detection, and knowledge graph creation. Autom. Constr. 2024, 164, 105451.
  10. Tanwar, S.; Popat, A.; Bhattacharya, P.; Gupta, R.; Kumar, N. A taxonomy of energy optimization techniques for smart cities: Architecture and future directions. Expert Syst. 2022, 39, e12703.
  11. Ciardiello, A.; Rosso, F.; Dell’Olmo, J.; Ciancio, V.; Ferrero, M.; Salata, F. Multi-objective approach to the optimization of shape and envelope in building energy design. Appl. Energy 2020, 280, 115984.
  12. Pilechiha, P.; Mahdavinejad, M.; Pour Rahimian, F.; Carnemolla, P.; Seyedzadeh, S. Multi-objective optimisation framework for designing office windows: Quality of view, daylight, and energy efficiency. Appl. Energy 2020, 261, 114356.
  13. Pallasmaa, J. The Eyes of the Skin: Architecture and the Senses, 2nd ed.; Academy Press: Chichester, UK, 2005.
  14. Oh, H.S. A Study on the Tendency of the Expression of Contemporary Architecture from the View Point of ‘Body’. Master’s Thesis, Kookmin University, Seoul, Republic of Korea, 2003.
  15. Vitruvius. The Ten Books on Architecture; Morgan, M.H., Translator; Dover Publications: New York, NY, USA, 1960.
  16. Tschumi, B. The Manhattan Transcripts, 2nd ed.; Academy Editions: London, UK, 1994.
  17. Lynn, G. Animate Form; Princeton Architectural Press: New York, NY, USA, 1999.
  18. Li, S.; Liu, L.; Peng, C. A review of performance-oriented architectural design and optimization in the context of sustainability: Dividends and challenges. Sustainability 2020, 12, 1427.
  19. Shin, S.J. A Study on the Relationship between Meaning of Extended Body and Digital Fabrication in Contemporary Architecture: Focused on Small Scale Pavilion. Master’s Thesis, Ajou University, Suwon, Republic of Korea, 2014.
  20. Kim, S.C. A Study on the Visualization Method of Body Motion Data in Architecture Digital Technology. Master’s Thesis, Ajou University, Suwon, Republic of Korea, 2023.
  21. Lim, G.T. Phenomenology and Architecture Theory; Spacetime (Sigongmunhwasa): Seoul, Republic of Korea, 2014.
  22. Kim, E.Y. A Study on the Expression of Space Design by Physical Perception: Focusing on the Organic Phenomenon of Modern Architecture. Master’s Thesis, Chosun University, Gwangju, Republic of Korea, 2005.
  23. Nogueira, P. Motion Capture Fundamentals. In Proceedings of the Doctoral Symposium in Informatics Engineering, Porto, Portugal, 26–27 January 2012; Volume 303.
  24. Topuz, H.H. Motion-Based Dynamic Form Generation to Contribute to the Kinetic Design Diversity. Ph.D. Thesis, Istanbul Technical University, Istanbul, Turkey, 2021.
  25. Ingleby, T.; Orlando, S. Translating movement into architectural form. Nexus Netw. J. 2021, 23, 1017–1037.
  26. Ankalaki, S.; Thippeswamy, M.N. Static and dynamic human activity detection using multi CNN-ELM approach. In Emerging Research in Computing, Information, Communication and Applications; Lecture Notes in Electrical Engineering; Shetty, N.R., Patnaik, L.M., Nagaraj, H.C., Hamsavath, P.N., Nalini, N., Eds.; Springer: Singapore, 2022; Volume 789.
  27. Bi, H.; Perello-Nieto, M.; Santos-Rodriguez, R.; Flach, P. Human activity recognition based on dynamic active learning. IEEE J. Biomed. Health Inform. 2021, 25, 922–934.
  28. Das Antar, A.; Ahmed, M.; Ahad, M.A.R. Challenges in Sensor-Based Human Activity Recognition and a Comparative Analysis of Benchmark Datasets: A Review. In Proceedings of the 2019 Joint 8th International Conference on Informatics, Electronics & Vision (ICIEV) and 2019 3rd International Conference on Imaging, Vision & Pattern Recognition (icIVPR), Spokane, WA, USA, 30 May–2 June 2019; pp. 134–139.
  29. Menolotto, M.; Komaris, D.S.; Tedesco, S.; O’Flynn, B.; Walsh, M. Motion capture technology in industrial applications: A systematic review. Sensors 2020, 20, 5687.
  30. Kim, J.J.; Kim, J.Y. Typological characteristics of methods in formalization process of body movement. J. Korean Inst. Inter. Des. 2006, 15, 28–35.
  31. Kern, S. Anatomy and Destiny: A Cultural History of the Human Body; Bobbs-Merrill: Indianapolis, IN, USA, 1975; p. 66.
  32. Kim, W.G. Metropolis; Open Books: Paju, Republic of Korea, 2002.
  33. Jensenius, A.R. Action-Sound: Developing Methods and Tools to Study Music-Related Body Movement. Ph.D. Thesis, University of Oslo, Oslo, Norway, 2007.
  34. Ha, E.; Byeon, G.; Yu, S. Full-body motion capture-based virtual reality multi-remote collaboration system. Appl. Sci. 2022, 12, 5862.
  35. Stathopoulou, D. From Dance Movement to Architectural Form. Ph.D. Thesis, University of Bath, Bath, UK, 2011.
  36. Kim, J.J. (Ed.) Bodyscape; Damdi Publishing Company: Seoul, Republic of Korea, 2007.
  37. Crolla, K. Choreographed Architecture: Body-Spatial Exploration. In Learning, Prototyping and Adapting, Proceedings of the 23rd International Conference on Computer-Aided Architectural Design Research in Asia (CAADRIA 2018), Beijing, China, 17–19 May 2018; The Association for Computer-Aided Architectural Design Research in Asia (CAADRIA): Hong Kong, China, 2018; Volume 1, pp. 101–110.
  38. Ruescas-Nicolau, A.V.; Medina-Ripoll, E.J.; Parrilla Bernabé, E.; de Rosario Martínez, H. Multimodal human motion dataset of 3D anatomical landmarks and pose keypoints. Data Brief 2024, 53, 110157.
  39. Desmarais, Y.; Mottet, D.; Slangen, P.; Montesinos, P. A review of 3D human pose estimation algorithms for markerless motion capture. Comput. Vis. Image Underst. 2021, 212, 103275.
  40. Wang, X.M.; Smith, D.T.; Zhu, Q. A webcam-based machine learning approach for three-dimensional range of motion evaluation. PLoS ONE 2023, 18, e0293178.
  41. Menache, A. Understanding Motion Capture for Computer Animation and Video Games; Morgan Kaufmann: San Francisco, CA, USA, 2000.
  42. Feng, B.; Zhang, X.; Zhao, H. The research of motion capture technology based on inertial measurement. In Proceedings of the 2013 IEEE 11th International Conference on Dependable, Autonomic and Secure Computing (DASC), Chengdu, China, 21–22 December 2013; pp. 238–243.
  43. Pavllo, D.; Feichtenhofer, C.; Grangier, D.; Auli, M. 3D human pose estimation in video with temporal convolutions and semi-supervised training. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019; pp. 7753–7762.
  44. Kocabas, M.; Athanasiou, N.; Black, M.J. VIBE: Video inference for human body pose and shape estimation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 5253–5263.
  45. Kim, S.-J.; Lee, Y.-J.; Park, G.-M. Real-time joint animation production and expression system using deep learning model and Kinect camera. J. Broadcast. Eng. 2021, 26, 269–282.
  46. Maji, D.; Nagori, S.; Mathew, M.; Poddar, D. YOLO-pose: Enhancing YOLO for multi person pose estimation using object keypoint similarity loss. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 19–24 June 2022; pp. 2637–2646.
  47. Suo, X.; Tang, W.; Li, Z. Motion capture technology in sports scenarios: A survey. Sensors 2024, 24, 2947.
  48. Rao, Y.; Perez-Pellitero, E.; Zhou, Y.; Song, J. Reality’s canvas, language’s brush: Crafting 3D avatars from monocular video. arXiv 2023, arXiv:2312.04784.
  49. An, H.-S.; Jeon, Y.-C.; Kim, S.-W. Digital architectural form generation through pixel system-driven image feature information. Buildings 2024, 14, 3635.
  50. Ionescu, C.; Papava, D.; Olaru, V.; Sminchisescu, C. Human3.6M: Large scale datasets and predictive methods for 3D human sensing in natural environments. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 36, 1325–1339.
  51. Zhang, S.; Wang, C.; Dong, W.; Fan, B. A survey on depth ambiguity of 3D human pose estimation. Appl. Sci. 2022, 12, 10591.
  52. Huang, Y.; Zhang, Z.; Su, P.; Li, T.; Zhang, Y.; He, X.; Li, H. Performance-Driven Generative Design in Buildings: A Systematic Review. Buildings 2025, 15, 4556.
  53. Wiese, H.; Drude, J.P.; Becker, M. Capturing Motion, Tracing Developments, Creating Space: An Exploration about Motion Capture Methodologies in Architecture. In ACADIA 2024: Designing Change; Nahmad-Vazquez, A., Johnson, J., Taron, J., Rhee, J., Hapton, D., Eds.; ACADIA: Calgary, Canada, 2024; Volume 2, pp. 29–40. Available online: https://papers.cumincad.org/cgi-bin/works/Show?acadia24_v2_38 (accessed on 7 January 2026).
Figure 1. Flow of the study.
Figure 2. Examples of utilizing the human body in architecture: (a) Vitruvian Man by Leonardo da Vinci, after Vitruvius, human body-based proportional aesthetics, and (b) the Modulor by Le Corbusier, human body-based proportional aesthetics.
Figure 3. Precursors to modern motion capture systems: (a) Motion capture suit by Marey, (b) Eadweard Muybridge’s timelapse photographs, and (c) Max Fleischer’s rotoscope projection technique.
Figure 4. The computational pipeline for 3D pose estimation utilizing VideoPose3D: (a) input of the source video sequence, (b) frame-by-frame detection of 2D anatomical key points, and (c) reconstruction of quantitative 3D coordinates and spatial visualization.
Figure 5. Fundamentals of the form generation process.
Figure 6. Form generation process based on motion data non-processing.
Table 1. Classification of human body activities and movements.
| Typology of Human Body Activities | Transition | Example |
|---|---|---|
| Static Activities | - | Lying, Sitting, Standing, etc. |
| Dynamic Activities | - | Walking, Running, etc. |
| Activities with Postural Transitions | Static to Static | Sitting to Standing |
| Activities with Postural Transitions | Static to Dynamic | Sitting to Walking |
| Activities with Postural Transitions | Dynamic to Static | Walking to Standing |
| Activities with Postural Transitions | Dynamic to Dynamic | Walking to Running |
Table 2. Overview of design processes based on motion data non-processing.
| Era | Title | Designer | Feature | Data Type |
|---|---|---|---|---|
| Modern | Picasso Drawing with Light | Gjon Mili/Picasso | Captures bodily motion as light points to translate temporal movement trajectories into spatialized visual records. | Point |
| Modern | Modulor | Le Corbusier | Systematizes human proportions into a scalar metric to establish harmonic order and functional standardization in architectural design. | Point/Curve |
| Modern | Unique Forms of Continuity in Space | Umberto Boccioni | Synthesizes dynamic bodily curves to visualize the fluid continuity and formal integration between the figure and its surrounding environment. | Curve |
| Modern | Nude Descending a Staircase | Marcel Duchamp | Projects sequential kinetic curves onto a spatial plane to represent the temporal progression of the body through overlapping geometric fragments. | Curve |
| Modern | Bauentwurfslehre | Ernst Neufert | Establishes functional design standards by quantifying ergonomic dimensions and operational boundaries of the human body for architectural optimization. | Curve/Boundary |
| Modern | Kinesphere | Rudolf Von Laban | Defines a movement-centric spatial domain by mapping the three-dimensional reach boundaries of the human body. | Boundary |
| Contemporary | Choreographing Space | Elli Athanasiou/Dimitra Gougoudi | Translates choreographic movement sequences into a generative architectural language that dictates spatial morphology and flow. | Point/Curve |
| Contemporary | Choreographed Architecture | Enrica Fung/Kristof Crolla | Synthesizes continuous kinetic trajectories with the Kinesphere concept to generate free-form architectural surfaces that define the building’s exterior envelopes. | Curve |
| Contemporary | Space MO | Jong Jin Kim | Constructs organic spatial volumes by layering time-sequential motion capture data into 3D surfaces that reflect both positional shifts and postural changes of the body. | Curve/Boundary |
| Contemporary | Embryological House | Greg Lynn | Generates flexible living space boundaries that respond to inhabitant requirements by utilizing variable curve parameters based on biological growth principles, thereby departing from the constraints of fixed modules. | Boundary |
| Contemporary | The Evolving Room: Inhabiting Zero Wasted Space | Stavros Gargaretas | Optimizes spatial efficiency by layering the temporal boundaries of daily activities to create a zero-wasted living environment tailored to individual movement. | Boundary |
Table 3. Overview of design processes based on motion data processing.
| Year | Title | Designer | Feature | Processing Type |
|---|---|---|---|---|
| 1921 | Ambulant Architecture | Oskar Schlemmer | Transforms the biological body into cubic and abstract spatial forms based on “The Laws of Surrounding Cubical Space,” redefining the performer as a dynamic architectural unit that delineates spatial boundaries. | Point/Curve Extension |
| 1921 | The Marionette | Oskar Schlemmer | Converts organic human anatomy into a quasi-machine by substituting joints with spheres and limbs with geometric shapes according to the “Functional Laws of the Body,” shifting design focus from natural motion to mechanical abstraction. | Point/Curve Clumping |
| 1921 | A Technical Organism | Oskar Schlemmer | Simplifies kinetic trajectories such as rotation, intersection, and direction into geometric motifs (spirals, disks, and vortices) based on the “Laws of Motion,” translating the body’s movement path into a structural design element within space. | Boundary Clumping |
| 1921 | Dematerialization | Oskar Schlemmer | Fragments the physical body into metaphysical symbols (stars, crosses) through the “Symbolization of Limbs,” implementing a design process where the bodily presence is dissolved and integrated into a higher mathematical order. | Division Deformation |
| 1922 | Triadic Ballet | Oskar Schlemmer | Transforms the biological body into a volumetric “Art Figure” composed of geometric modules like spheres and cylinders through the “Clothes-Priority” principle, implementing a design process where the sculptural form and material of the costume dictate the kinetic trajectories and directly generate the architectural logic of the performance space. | Part Extension |
| 1927 | Slat Dance | Oskar Schlemmer | Based on the “law of spatial extension,” rods were attached to the body’s extremities to extend the human body’s range of motion into linear vector data. The trajectory of this extended body functions as a design language that pierces and partitions the depth and boundaries of three-dimensional space in real time. | Curve Extension |
| 2012 | Alloplastic Architecture | Behnaz Farahi | Extracts user proximity and gesture data via Kinect sensors. These data points are mapped to Shape Memory Alloy (SMA) actuators to drive a kinetic tensegrity system. This transforms static architectural space into a responsive, “alloplastic” entity that establishes an empathetic interaction between the user and the environment. | Point Extension |
| 2018 | MoSculp | MIT CSAIL | Estimates 3D human poses and meshes from 2D video sequences. Temporal motion trajectories are transformed into 4D volumetric swept surfaces. By materializing transient dynamics into physical “motion sculptures,” the system provides a tangible visualization of the aesthetic essence of human movement. | Curve Division |
| 2018 | ANYΠAKOH (Disobedience) | Studio INI | Utilizes real-time pedestrian load and location data as a trigger for a mechanical framework. The system reconfigures an elastic skin to expand or contract along the walking path. This creates a “disobedient” kinetic pavilion where human action actively redefines and disrupts traditional architectural boundaries. | Vector Weight |
| 2019 | Urban Imprint | Studio INI | Captures vertical load and pressure data from footsteps. A mechanical pulley-and-cable system inverts floor depression into the formation of an upward ceiling dome. This reinterprets rigid urban infrastructure as a responsive “augmented materiality” that expands and contracts in direct response to individual presence. | Vector Weight |
Table 4. Systematization of design methodologies: distinguishing between motion data processing and non-processing approaches.
| Utilization of Motion-Tracking Data | Designer | Quantified Segment Data | Types of Data Extracted |
|---|---|---|---|
| Non-processing | Joint | Point | 3D Coordinate System (x, y, z axis) |
| Non-processing | Bone | Curve | Interpolate Curve/Poly Line (Length, Tangent, Division Point, etc.) |
| Non-processing | Range of Motion (RoM) | Boundary | Closed Interpolate Curve/Closed NURBS Curve (Area, Curvature, Control Point, etc.) |
| Processing | Relationship Information Between Motion Data | Extension | Direction, Vector, Value, etc. |
| Processing | Relationship Information Between Motion Data | Division | Criteria, Number of Splits, etc. |
| Processing | Relationship Information Between Motion Data | Deformation | Shape Types, Parameter of Shape, etc. |
| Processing | Relationship Information Between Motion Data | Clumping | Distance between Points, etc. |
| Processing | Relationship Information Between Motion Data | Vector Weight | Moment, Threshold, Direction, etc. |
| Processing | Relationship Information Between Motion Data | etc. | |
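The non-processing rows of Table 4 pair each motion data type with measurable attributes (coordinates for points, length for curves, area for boundaries). The following is a minimal sketch of how two of those attributes could be computed from joint coordinates with NumPy. It is not the authors' Grasshopper workflow: the function names, the sample coordinates, and the choice to measure the boundary area on an XY projection of a closed polyline are assumptions made for illustration only.

```python
import numpy as np


def bone_polyline_length(joints):
    """Curve (bone): total length of the polyline through ordered joints, shape (n, 3)."""
    joints = np.asarray(joints, dtype=float)
    segments = np.diff(joints, axis=0)          # vectors between consecutive joints
    return float(np.linalg.norm(segments, axis=1).sum())


def boundary_area_xy(points):
    """Boundary (RoM): area of the closed polygon through the points,
    projected onto the XY plane (shoelace formula)."""
    points = np.asarray(points, dtype=float)
    x, y = points[:, 0], points[:, 1]
    return float(0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1))))


# Hypothetical joint chain (hip -> knee -> ankle), units arbitrary.
leg = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 0.5], [0.0, 0.3, 0.1]])

# Hypothetical range-of-motion outline: a unit square traced at constant height.
outline = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 1.0], [0.0, 1.0, 1.0]])

if __name__ == "__main__":
    print("bone length:", bone_polyline_length(leg))
    print("boundary area:", boundary_area_xy(outline))
```

An interpolated or NURBS curve, as listed in the table, would give slightly different lengths and areas than this straight-segment approximation.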
Table 5. Schematic overview of the design processes utilizing motion capture systems.
| No. | Author | Title | Motion Capture Method | Segment Method |
|---|---|---|---|---|
| 1 | Bo Feng et al. (2013) [42] | The Research of Motion Capture Technology Based on Inertial Measurement | Optical Motion Capture; Non-optical Motion Capture | Inertia; Electromagnetic; Marker-based; etc. |
| 2 | Dario Pavllo et al. (2019) [43] | 3D human pose estimation in video with temporal convolutions and semi-supervised training | 3D Pose Estimation | VideoPose3D |
| 3 | Kocabas et al. (2020) [44] | VIBE: Video Inference for Human Body Pose and Shape Estimation | 3D Pose Estimation | VIBE; SOTA Method |
| 4 | Sang-Joon Kim et al. (2021) [45] | Real-Time Joint Animation Production and Expression System using Deep Learning Model and Kinect Camera | Optical Motion Capture | Marker-less; RGB-D Camera; Kinect Camera; etc. |
| 5 | Debapriya Maji et al. (2022) [46] | YOLO-Pose: Enhancing YOLO for Multi Person Pose Estimation Using Object Keypoint Similarity Loss | 2D Pose Estimation | YOLO-Pose; YOLO v5 |
| 6 | Xiang Suo et al. (2024) [47] | Motion Capture Technology in Sports Scenarios: A Survey | Optical Motion Capture; Non-optical Motion Capture; 2D Pose Estimation; 3D Pose Estimation | Mechanical; Inertial; Magnetic; etc. |
| 7 | Yuchen Rao et al. (2024) [48] | Reality’s Canvas, Language’s Brush: Crafting 3D Avatars from Monocular Video | 2D Pose Estimation; 3D Pose Estimation | EasyMocap |
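Table 6. Taxonomy of motion capture systems and pose estimation techniques.
| Motion Capture Method | Operational Modality | Segment Method | Example Method |
|---|---|---|---|
| Sensor-Based Motion Capture | Optical Motion Capture | Marker-Based | Vicon and OptiTrack |
| Sensor-Based Motion Capture | Optical Motion Capture | Marker-less (Multi-view RGB) | CMU Panoptic |
| Sensor-Based Motion Capture | Optical Motion Capture | Marker-less (RGB-D Camera) | Kinect and Intel RealSense |
| Sensor-Based Motion Capture | Non-Optical Motion Capture | Mechanical | Gypsy7 |
| Sensor-Based Motion Capture | Non-Optical Motion Capture | Magnetic | Nest of Birds and Polhemus |
| Sensor-Based Motion Capture | Non-Optical Motion Capture | Inertial (IMU) | Xsens and Smartsuit Pro II |
| Sensor-Based Motion Capture | Non-Optical Motion Capture | Acoustic | Academic Research |
| Vision-Based Machine Learning Motion Estimation | 2D Pose Estimation | Top-Down | YOLO-Pose |
| Vision-Based Machine Learning Motion Estimation | 2D Pose Estimation | Bottom-Up | OpenPose and AlphaPose |
| Vision-Based Machine Learning Motion Estimation | 3D Pose Estimation | Monocular 3D Regression | VideoPose3D and PoseFormer |
| Vision-Based Machine Learning Motion Estimation | 3D Pose Estimation | Mesh Recovery | HMR, VIBE, and SPIN |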
Table 7. Systematization of the form generation methodology based on motion data processing and non-processing strategies.
| Type of Activities | Utilization of Motion-Tracking Data: Motion Data Non-processing | Utilization of Motion-Tracking Data: Motion Data Processing | Motion Capture Method |
|---|---|---|---|
| Activities with Postural Transitions: Dynamic to Dynamic | Joint; Bone; Range of Motion (RoM) | Relational Attributes of Motion Data | Vision-Based Machine Learning Motion Estimation, 3D Pose Estimation, Monocular 3D Regression: VideoPose3D |
Table 8. Experimental setup and parameter specifications for motion-driven form generation.
| Category | Specification/Parameter |
|---|---|
| Source Video | Figure skating (selected for high-dynamic transitions) |
| Data Volume | 37.06 s/50 FPS (total 1853 frames) |
| Data Points | 31,501 coordinate points (refined to 30,600) |
| Frame Sampling Rates | 0.5 FPS (306 pts), 1 FPS (621 pts), 2 FPS (1224 pts) |
| Software Environment | Python 3.7.9, Visual Studio, NumPy, pandas |
| Form Generation | Rhinoceros 7 and 8, Grasshopper (geometric sculpting) |
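The counts in Table 8 can be cross-checked with a short calculation. The sketch below assumes that each frame contributes one coordinate point per joint and that downsampling keeps every k-th frame; neither assumption is stated in the table, and the function names are hypothetical. Under these assumptions the raw totals imply 17 joints per frame and 1800 refined frames, and the 0.5 FPS and 2 FPS counts reproduce the listed 306 and 1224 points. The 1 FPS case yields 612 points, which differs from the 621 listed; the table does not explain the difference.

```python
import numpy as np

SOURCE_FPS = 50           # frame rate listed in Table 8
TOTAL_FRAMES = 1853       # 37.06 s at 50 FPS
RAW_POINTS = 31_501       # coordinate points before refinement
REFINED_POINTS = 30_600   # coordinate points after refinement

# Assumption: one coordinate point per joint per frame.
joints_per_frame = RAW_POINTS // TOTAL_FRAMES
refined_frames = REFINED_POINTS // joints_per_frame


def sample_frames(n_frames, source_fps, target_fps):
    """Indices of the frames kept when downsampling by keeping every k-th frame."""
    step = int(round(source_fps / target_fps))
    return np.arange(0, n_frames, step)


def sampled_point_count(target_fps):
    """Number of joint points remaining after sampling at target_fps."""
    kept = sample_frames(refined_frames, SOURCE_FPS, target_fps)
    return len(kept) * joints_per_frame


if __name__ == "__main__":
    print("joints per frame:", joints_per_frame)
    print("refined frames:", refined_frames)
    for rate in (0.5, 1, 2):
        print(rate, "FPS ->", sampled_point_count(rate), "points")
```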
Table 9. Taxonomy of computational processing strategies for motion-tracking data.
[Image table: rows are the motion data types Point (Joint), Curve (Bone), and Boundary (RoM); columns are Non-Processed, Extension, Division, and Clumping. Each cell is an illustration (images i001 to i012).]
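To make the processing strategies named in Table 9 more concrete, the sketch below applies Extension, Division, and Clumping to point data with NumPy. The table does not define these operations algorithmically, so the interpretations here are assumptions drawn from the parameters listed in Table 4 (direction and value for Extension, number of splits for Division, distance between points for Clumping); the function names and the greedy clustering rule are hypothetical and are not the authors' Grasshopper definitions.

```python
import numpy as np


def extend(points, direction, value):
    """Extension (assumed): offset every point by `value` along a unit `direction`."""
    points = np.asarray(points, dtype=float)
    d = np.asarray(direction, dtype=float)
    return points + value * d / np.linalg.norm(d)


def divide(start, end, n_splits):
    """Division (assumed): split the segment start -> end into n_splits equal
    parts and return the n_splits + 1 division points."""
    start = np.asarray(start, dtype=float)
    end = np.asarray(end, dtype=float)
    t = np.linspace(0.0, 1.0, n_splits + 1)[:, None]
    return (1.0 - t) * start + t * end


def clump(points, distance):
    """Clumping (assumed): greedily group points lying within `distance` of a
    seed point and replace each group by its centroid."""
    points = np.asarray(points, dtype=float)
    remaining = list(range(len(points)))
    centroids = []
    while remaining:
        seed = remaining[0]
        near = [i for i in remaining
                if np.linalg.norm(points[i] - points[seed]) <= distance]
        centroids.append(points[near].mean(axis=0))
        remaining = [i for i in remaining if i not in near]
    return np.array(centroids)


if __name__ == "__main__":
    joints = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [5.0, 0.0, 0.0]])
    print(extend(joints, direction=[0, 0, 1], value=2.0))
    print(divide(joints[0], joints[2], n_splits=4))
    print(clump(joints, distance=0.5))
```

The same three functions could be applied to bone endpoints or boundary control points, since both reduce to arrays of coordinates.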
Table 10. Morphological outcomes generated through the motion data non-processing strategy.
[Image table: rows are the motion data types Point (Joint), Curve (Bone), and Boundary (RoM); columns are Form Type No. 1, Form Type No. 2, and Form Type No. 3. Each cell is an illustration (images i013 to i021).]
Table 11. Systematic evaluation of morphological outcomes derived from the motion data non-processing strategy.

| Motion Data Type | Characteristics of Motion Data Non-Processing | Properties as Parameters for Form Generation | Morphological Limits |
| --- | --- | --- | --- |
| Point (Joint) | Basic elements of motion-tracking data; the abstract interrelationship inherent within the joints | Complex relationship-centered form through mutual interrelationships and organic directionality between points (joint coordinates) | High computational complexity and temporal costs for large-scale data; reliance on designer intervention due to lack of non-coordinate data |
| Curve (Bone) | Linear elements filling the gaps between joints (Bone); changes in detailed elements (angle, position) while maintaining linear topology | Organic flow based on bodily movement generates surface forms through the layering of linear elements; a data-driven dynamic form that changes dramatically depending on the activity type | Constraints inherent to anatomical structure and physiological RoM; data scope limited by subject’s physical performance and environment |
| Boundary (RoM) | Area element including Range of Motion (RoM); data containing various detail elements (points, lengths, tangents, regions, etc.) | Overlapping area elements create a continuous flow of gradual changes, forming volumetric shapes | Low formal differentiation due to omission of internal dynamics; limited formal originality within common or identical activity groups |

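
The three data types in Table 11 can all be derived from the same joint coordinates. The following is a minimal sketch under stated assumptions, not the authors' pipeline: the 5-joint chain in `EDGES` is a hypothetical skeleton, and the boundary is approximated by a per-joint axis-aligned extent over time, which is only a simple stand-in for a Range of Motion (RoM) region.

```python
import numpy as np

# Hypothetical skeleton: a simple 5-joint chain (real skeletons branch).
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4)]


def joints_to_bones(frame, edges=EDGES):
    """Curve (bone) data: start and end point of each bone, shape (n_edges, 2, 3)."""
    return np.stack([frame[[a, b]] for a, b in edges])


def range_of_motion(motion):
    """Boundary stand-in: per-joint min/max corner over all frames, shape (n_joints, 2, 3)."""
    return np.stack([motion.min(axis=0), motion.max(axis=0)], axis=1)


# Point (joint) data: 2 frames x 5 joints x xyz, filled with dummy values.
motion = np.arange(2 * 5 * 3, dtype=float).reshape(2, 5, 3)

bones = joints_to_bones(motion[0])  # linear elements between joints
rom = range_of_motion(motion)       # area-like extent swept by each joint
print(bones.shape, rom.shape)
```

The sketch also shows why boundary data loses internal dynamics: the extent keeps only the extreme positions of each joint and discards the order in which they were visited.
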
Table 12. Morphological outcomes generated through the motion data processing strategy.

| Motion Data Type | Extension | Division | Clumping |
| --- | --- | --- | --- |
| Point (Joint) | [Image i022] Type 1 | [Image i023] Type 2 | [Image i024] Type 3 |
| Curve (Bone) | [Image i025] Type 2 | [Image i026] Type 1 | [Image i027] Type 3 |
| Boundary (RoM) | [Image i028] Type 2 | [Image i029] Type 1 | [Image i030] Type 3 |

Table 13. Workflow of the form generation process based on motion-tracking data.

| Preprocessing for Form Generation Process | Form Generation Using Data Features | Analyzing and systematizing relationships between motion data and form |
| --- | --- | --- |
| [Image i031] | | |

Table 14. Example of morphological outcomes resulting from the interplay of motion-tracking data processing strategies.

| [Image i032] | [Image i033] | [Image i034] |
| Pt + Di | Crv + Di + Cl | Bo + Di |
| [Image i035] | [Image i036] | [Image i037] |
| Pt + Cl | Pt + Di | Bo + Ex |

Table 15. Systematic evaluation of morphological outcomes derived from the motion data processing strategy.

| Motion Data Processing Type | Point | Curve | Boundary |
| --- | --- | --- | --- |
| Extension | Emphasizes joint hierarchy by applying differential weighting to enhance data and expand morphological possibilities | Extends body segments to generate extreme dynamic data beyond existing physical motion capabilities | Transforms uniform boundary forms through selective expansion centered on specific joints, enabling designer-intended dynamic adjustments |
| Division | Establishes relational weighting based on distance from the core, creating aesthetic evaluation and criteria-driven classification systems (Metric) | Divides overall body data into segmented clusters, enabling refined forms through the selective use of body parts | Subdivides a unified boundary into multi-segmented data, enabling multiple boundary extractions driven by partial motion that weaken activity-data dependency |
| Clumping | Simplifies data by clustering proximate joints based on specific criteria, reducing computational complexity while maintaining morphological integrity | Simplifies complex skeletal data by consolidating multiple curves into unified forms, generating restrained morphologies with reduced dynamic variability | Aggregates boundaries using weighted vector values based on distance from joints, mitigating abrupt changes and torsional distortions inherent in dynamic motion |

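
The three processing strategies in Table 15 can be illustrated for point (joint) data. The following is a minimal sketch, not the authors' Grasshopper definition: the function names `extend`, `divide`, and `clump`, the use of a body `center` as the core, the distance `threshold`, and the grid-cell rule for proximity are all assumptions made for illustration.

```python
import numpy as np


def extend(points, center, weights):
    """Extension: scale each joint's offset from the core by a per-joint weight."""
    return center + (points - center) * weights[:, None]


def divide(points, center, threshold):
    """Division: split joints into near and far groups by distance from the core."""
    dist = np.linalg.norm(points - center, axis=1)
    return points[dist <= threshold], points[dist > threshold]


def clump(points, cell):
    """Clumping: replace joints that share a grid cell of size `cell` with their centroid."""
    keys = np.floor(points / cell).astype(int)
    _, inverse = np.unique(keys, axis=0, return_inverse=True)
    inverse = inverse.ravel()
    n_groups = inverse.max() + 1
    sums = np.zeros((n_groups, points.shape[1]))
    np.add.at(sums, inverse, points)  # accumulate coordinates per group
    counts = np.bincount(inverse, minlength=n_groups)
    return sums / counts[:, None]


points = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [2.0, 0.0, 0.0], [2.0, 2.0, 0.0]])
center = np.zeros(3)

extended = extend(points, center, np.array([1.0, 2.0, 1.0, 1.0]))
near, far = divide(points, center, threshold=1.0)
clumped = clump(points, cell=1.0)
print(extended[1], len(near), len(far), len(clumped))
```

In this toy input, clumping merges the two joints near the origin into one point, which reduces the point count while keeping the overall layout. This mirrors the stated aim of reducing computational complexity while maintaining morphological integrity.
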
Table 16. Quantitative comparison of non-processed and processed motion-tracking data.
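
| Data Type | Form Type | Processing | Volumetric Efficiency (%) | Ground Contact Ratio (%) | Effective Occupancy Ratio (%) |
| --- | --- | --- | --- | --- | --- |
| Point | Type 1 | Non-Processed | 25.31 | 35.01 | 10.43 |
| Point | Type 1 | Extension | 25.61 | 38.91 | 10.65 |
| Point | Type 2 | Non-Processed | 15.57 | 19.85 | 66.07 |
| Point | Type 2 | Division | 17.21 | 35.45 | 58.38 |
| Point | Type 3 | Non-Processed | 22.84 | 41.05 | 44.48 |
| Point | Type 3 | Clumping | 49.41 | 53.59 | 73.01 |
| Curve | Type 1 | Non-Processed | 25.48 | 54.41 | 9.72 |
| Curve | Type 1 | Extension | 21.02 | 54.41 | 21.37 |
| Curve | Type 2 | Non-Processed | 28.71 | 36.95 | 94.41 |
| Curve | Type 2 | Division | 28.21 | 28.97 | 95.69 |
| Curve | Type 3 | Non-Processed | 18.95 | 7.63 | 8.07 |
| Curve | Type 3 | Clumping | 22.87 | 10.90 | 40.38 |
| Boundary | Type 1 | Non-Processed | 37.29 | 40.96 | 10.87 |
| Boundary | Type 1 | Extension | 29.70 | 42.79 | 15.79 |
| Boundary | Type 2 | Non-Processed | 36.74 | 40.95 | 22.16 |
| Boundary | Type 2 | Division | 25.95 | 22.33 | 32.00 |
| Boundary | Type 3 | Non-Processed | 29.07 | 68.55 | 74.13 |
| Boundary | Type 3 | Clumping | 28.15 | 57.53 | 87.42 |
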
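
The paired rows in Table 16 are easier to compare as differences between each processed form and its non-processed counterpart. The sketch below is an illustrative calculation, not part of the authors' analysis: the values are transcribed from Table 16, and the `deltas` helper is a hypothetical name. Differences are in percentage points.

```python
# Each entry: (non-processed, processed) triples of
# (volumetric efficiency, ground contact ratio, effective occupancy ratio), in percent.
TABLE_16 = {
    ("Point", "Extension"): ((25.31, 35.01, 10.43), (25.61, 38.91, 10.65)),
    ("Point", "Division"): ((15.57, 19.85, 66.07), (17.21, 35.45, 58.38)),
    ("Point", "Clumping"): ((22.84, 41.05, 44.48), (49.41, 53.59, 73.01)),
    ("Curve", "Extension"): ((25.48, 54.41, 9.72), (21.02, 54.41, 21.37)),
    ("Curve", "Division"): ((28.71, 36.95, 94.41), (28.21, 28.97, 95.69)),
    ("Curve", "Clumping"): ((18.95, 7.63, 8.07), (22.87, 10.90, 40.38)),
    ("Boundary", "Extension"): ((37.29, 40.96, 10.87), (29.70, 42.79, 15.79)),
    ("Boundary", "Division"): ((36.74, 40.95, 22.16), (25.95, 22.33, 32.00)),
    ("Boundary", "Clumping"): ((29.07, 68.55, 74.13), (28.15, 57.53, 87.42)),
}


def deltas(table):
    """Processed minus non-processed, per metric, rounded to two decimals."""
    return {
        key: tuple(round(p - n, 2) for n, p in zip(non_processed, processed))
        for key, (non_processed, processed) in table.items()
    }


changes = deltas(TABLE_16)
for (data_type, strategy), change in changes.items():
    print(f"{data_type:<9}{strategy:<10}" + "".join(f"{v:+8.2f}" for v in change))
```

For example, clumping applied to point data changes the three metrics by +26.57, +12.54, and +28.53 percentage points, while extension applied to curve data leaves the ground contact ratio unchanged.
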
MDPI and ACS Style

An, H.-S.; Yoon, N.; Kim, S.-W. From Motion to Form: Systematizing Motion-Data Processing for Architectural Generative Design. Buildings 2026, 16, 1492. https://doi.org/10.3390/buildings16081492
