Linear Regression vs. Deep Learning: A Simple Yet Effective Baseline for Human Body Measurement

We propose a linear regression model for the estimation of human body measurements. The input to the model consists only of information that a person can self-estimate, such as height and weight. We evaluate our model against state-of-the-art approaches for body measurement from point clouds and images, demonstrate performance comparable to the best methods, and even outperform several deep learning models on public datasets. The simplicity of the proposed regression model makes it well suited as a baseline, in addition to being convenient for applications such as virtual try-on. To improve the repeatability of the results of our baseline and the competing methods, we provide guidelines toward standardized body measurement estimation.


Introduction
Body measurement or anthropometry is a study of the numerical description of human body segments and overall shape. Anthropometry is important for health applications [1], virtual try-on and the fashion industry [2], fitness [3], and ergonomics [4]. Advances in human body measurement and shape estimation have been significantly driven by statistical models and deep learning. The statistical models such as SMPL [5] describe the population of human bodies using pose and shape parameters. The parameters correspond to specifically posed and shaped template meshes. The template meshes are particularly useful for body measurement because each mesh contains a fixed number of vertices, and their semantics are common for the entire population of meshes. Therefore, the procedure for body measurement from the template meshes for a given statistical model, e.g., SMPL, can be standardized. We describe the procedure for extracting 15 body measurements from SMPL templates in Section 3 and use these measurements to compare the linear regression model to the state-of-the-art.
Statistical models enable human body mesh regression, which can be described as fitting point cloud (3D) or image (2D) features to the pose and shape parameters of a given template mesh of the statistical model to obtain the best correspondence between the estimated mesh and the input human body. The 3D point cloud-based approaches [6][7][8] are generally more accurate than their image-based counterparts [9], but they require 3D scans as input, making them impractical for most anthropometric applications. Image-based methods greatly simplify the data acquisition stage. A very popular research topic is pose and shape estimation using in-the-wild images of clothed people in various difficult poses [10][11][12][13][14][15][16][17][18][19][20]. Even though these models recover a person's pose remarkably well, less attention is dedicated to the accurate estimation of shape parameters. Finally, there are several image-based methods that estimate the shape parameters based on the extracted silhouette(s) of a person [21][22][23][24][25][26][27][28][29]. Silhouette-based methods achieve state-of-the-art results. The contributions of this paper are as follows:
1. We propose a linear regression model, which uses self-estimated height and weight, to be used as a baseline for human body measurement estimation;
2. We demonstrate that the baseline performs strongly against state-of-the-art methods for body measurement estimation and analyze its performance in detail;
3. We publish the source code, the demo for obtaining the body measurements given height and weight as input, and the protocol for extracting the standard body measurements from the SMPL mesh.

Related Work
Body measurements can be obtained using traditional anthropometric techniques and tools, from 3D scanning data such as point clouds or meshes (non-parametric approaches), or from features extracted from 3D or 2D data that are then used to estimate the parameters of a statistical model (parametric approaches). Body measurements obtained using traditional tools and 3D scanning are usually more accurate than those obtained with feature-based, parametric approaches, but the former are more time consuming and relatively expensive.
Traditional Anthropometry. Traditional body measurement involves the use of tools such as calipers and tape measures [2]. Measurements taken by an expert are considered the gold standard and are generally used as the ground truth [31]. The public anthropometric databases such as ANSUR are collected by the expert measurers. However, even measurements taken by the experts are not perfectly accurate [32]. Therefore, the values for the allowable or "expert" error [30] for each measurement are defined in the standard ISO:7250 [33,34]. Based on the public databases, several works propose linear regression models for the estimation of measurements [35][36][37] and other body characteristics such as skeletal muscle mass [38]. We extend these analyses by focusing specifically on using self-estimated height and weight as input as well as on using the statistical body models.
Non-Parametric Approaches. With the advances in 3D scanning technology, more automatic approaches to body measurement have been proposed [7,39,40]. Most 3D-based body measurement methods use landmarks to determine distances and calculate measurements such as arm and leg length, shoulder-to-crotch, etc. Circumferences can be obtained by slicing the point cloud with a vertical plane and summing up distances between the nearest points [39]. The 3D-based methods are generally the most accurate among the (semi-)automatic body measurement methods. However, their main drawback is that they require 3D scanning, which is cumbersome and relatively expensive.
There are a number of image-based (2D) non-parametric models [20,[41][42][43] that freely deform meshes to best fit the input features. To improve convergence, they start with the template human mesh. However, the final deformed mesh does not necessarily retain the original vertex semantics. This property makes current image-based non-parametric models less suitable for body measurements. On the other hand, compared to their parametric counterparts, non-parametric models might have the advantage of better fitting out-of-distribution samples.
Parametric Approaches. Apart from direct body measurements from point clouds, 3D scans are usually used to create the statistical models [5,[44][45][46]. The initial template mesh at rest (or T-position) is deformed (registered) to each 3D scan of the dataset [47-49] using an optimization algorithm, such as L-BFGS [50]. To parameterize body poses, a kinematic model of the human body is applied. To describe the shape variation in the dataset, principal component analysis (PCA) is applied to the vertices of the deformed meshes [5]. The first ten principal components (shape parameters) are usually used for the mesh regression. Tsoli et al. [6] first register template meshes to the 3D scans and additionally learn features for body measurement prediction. BUFF [51] addresses the problem of 3D body shape estimation under clothing by allowing the mesh to deviate from the template but regularizes the optimization to satisfy the anthropometric constraints. Similar to non-parametric approaches, 3D parametric approaches are generally more accurate than image-based approaches, but they also require 3D scans as input.

Method
In this section, we describe the proposed linear regression model and the extraction of body measurements from a template SMPL mesh. We fit separate regression models for males and females. The purpose of the proposed method is to demonstrate that body measurements can be determined using only the information that each person typically knows about themselves, i.e., height and weight, without making any further measurement, which makes it suitable as a baseline.

Linear Regression Model
The method is shown in Figure 1. Each human body sample i is defined by 10 shape parameters, θ_Si, and 63 pose parameters, θ_Pi. The height and the 15 body measurements of the model are extracted from the template mesh. The input to the model consists of height (h) and weight (w). Weight should be available in the dataset; otherwise, it can be estimated from the calculated mesh volume. The output consists of the 15 other body measurements (A-O). The measurement names are listed in Table 1, and their extraction is described in Section 3.2.

The linear model for the j-th measurement is then described as

y_j = x^T a_j + b_j,

where x^T ∈ R^2 is a row vector consisting of the height and weight of a sample (the independent variables), a_j ∈ R^2 is a column vector of the linear coefficients (the slope), b_j ∈ R is the intercept of the regression model, and y_j ∈ R is the output measurement (the dependent variable). These equations can be written more compactly in matrix form as

Y_j = X A_j,

where X ∈ R^{N×3} contains the heights and weights of the N samples from the training dataset, with a column of ones appended to account for b_j, which is now included in A_j ∈ R^{3×1}, representing all the model parameters, and Y_j ∈ R^{N×1} contains their output measurements. Therefore, the least-squares closed-form solution is

A_j = (X^T X)^{-1} X^T Y_j.

Linear regression has four assumptions: linearity, homoscedasticity, independence, and normality. Linearity assumes that the relationship between X and the mean of Y is linear. Homoscedasticity assumes that the variance of the residuals is the same for any value of X. Independence assumes that the observations are independent of each other. Finally, normality assumes that for any fixed value of X, Y is normally distributed. The latter three assumptions concern the residuals, and we verify them in Section 5.1. For further details on linear regression, we refer readers to the relevant literature [68].
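The closed-form fit above can be sketched in a few lines of NumPy (the function names are illustrative, not taken from our released code):

```python
import numpy as np

def fit_measurement(heights, weights, targets):
    """Least-squares fit of one body measurement from height and weight,
    i.e., A_j = (X^T X)^{-1} X^T Y_j, solved via lstsq for numerical stability."""
    # N x 3 design matrix; the column of ones accounts for the intercept b_j
    X = np.column_stack([heights, weights, np.ones(len(heights))])
    A, *_ = np.linalg.lstsq(X, targets, rcond=None)  # A_j = [a_j; b_j]
    return A

def predict(A, height, weight):
    """Evaluate y_j = x^T a_j + b_j for a single person."""
    return np.array([height, weight, 1.0]) @ A
```

In practice, `np.linalg.lstsq` (or scikit-learn's `LinearRegression`, which our implementation uses) is preferred over forming (X^T X)^{-1} explicitly.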
Note that we also experiment with the interaction terms (h is height, w is weight) w/h² (BMI), wh, w², and h², adding them to the input vector x^T. The model with no interaction terms is called the baseline, and the models with added interaction terms are called the augmented baselines. The augmented baselines are marked in the remainder of the paper as Baseline (I = N), where N is the number of interaction terms. The interaction terms I = 2 correspond to w/h² and wh, and I = 4 corresponds to w/h², wh, w², and h².

In case the weight measurement is not available in the dataset, we estimate it from the extracted volume. The volume is extracted using a standard algorithm [69]. Then, the body weight (w) is estimated based on the human body density, which is approximately ρ = 1 ± 0.005 kg/L [31,70], and the extracted volume (V). To account for the variation in body density w.r.t. weight, we model the volume as a normal stochastic variable:

V = V_extracted + N(μ_V = 0, σ_V = 5) L.

Note that the standard deviation of 5 L applied to the extracted volume propagates to a standard deviation of 5 kg in the estimated weight. Additionally, to account for the variation in self-estimation of height and weight, we model self-estimation using another two stochastic variables, h = h_extracted + N(μ_h = 0, σ_h = 1) cm and w = V · ρ + N(μ_w = 0, σ_w = 1.5) kg.

Figure 1. To account for the errors in self-reporting height and weight, we model height and weight as stochastic variables by adding Gaussian noise to the input. A linear regression model is fitted to each of the 15 body measurements (Table 1).
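The stochastic input model can be simulated as follows (a sketch; the helper name and sample count are illustrative, not part of our released code):

```python
import numpy as np

rng = np.random.default_rng(2021)  # seed used in our implementation

def simulate_inputs(volume_l, height_cm, rho=1.0, n=1000):
    """Draw noisy self-estimated inputs per the stochastic model:
    V = V_extracted + N(0, 5) L, h = h_extracted + N(0, 1) cm,
    w = V * rho + N(0, 1.5) kg, with rho ~= 1 kg/L."""
    V = volume_l + rng.normal(0.0, 5.0, n)   # volume noise (sigma = 5 L)
    h = height_cm + rng.normal(0.0, 1.0, n)  # height self-estimation noise
    w = V * rho + rng.normal(0.0, 1.5, n)    # weight self-estimation noise
    return h, w
```

Note that the 5 L volume noise propagates to roughly 5 kg of weight noise, which dominates the additional 1.5 kg self-estimation term.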

Extraction of Body Measurements
We use a total of 18 body measurements, 15 of which are a standard set of measurements used in previous works [21][22][23][24][25][26][27][28]31,67], and 3 of which are used specifically to compare with the Virtual Caliper [31] (see Table 1). The measurements are either lengths or circumferences and are calculated using their corresponding landmarks. The complete list of landmarks with their corresponding SMPL vertex index is shown in Table 2. To extract the lengths, such as the shoulder width and arm length, we calculate the Euclidean distances between the two respective landmarks. To extract the circumferences, such as the waist or thigh circumference, we slice the mesh with a horizontal or vertical plane at the specified landmark location and sum up the resulting line segments [8,28].
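The circumference step reduces to summing segment lengths along the ordered cross-section polyline. A minimal sketch is given below; in practice the slice points themselves can be obtained by intersecting the mesh with a plane (e.g., with trimesh's `mesh.section`), and the function name here is illustrative:

```python
import numpy as np

def polyline_circumference(points, closed=True):
    """Sum of line-segment lengths along an ordered cross-section polyline
    (n x 3 array). For circumferences, the polyline is closed, so the segment
    from the last point back to the first is included."""
    pts = np.asarray(points, dtype=float)
    if closed:
        pts = np.vstack([pts, pts[:1]])  # close the loop
    return np.linalg.norm(np.diff(pts, axis=0), axis=1).sum()
```

For lengths such as shoulder width, the same `np.linalg.norm` call on the two landmark vertices gives the Euclidean distance directly.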

Evaluation
We evaluate the linear baseline on BODY-fit [28] and ANSUR [30] datasets. More specifically, the BODY-fit dataset is extended by weight estimations obtained from the extracted mesh volumes. We call the extended dataset BODY-fit+W. We compare the baseline on BODY-fit+W with the following state-of-the-art image-based approaches: SMPLify [10], ExPose [14], and Yan et al. [28]. Unfortunately, most of the other methods have not published their source code or data, so we use their reported results on other datasets instead, such as CAESAR [49] and NOMO3D [8] datasets. In total, six datasets are referenced in this work, as listed in Table 3.

Datasets
The BODY-fit dataset contains 1474 male and 2675 female SMPL meshes. The template meshes are obtained by fitting them to the original 3D scans of people, which are not publicly available. In addition, the BODY-fit+W dataset contains weights that are estimated from the corresponding mesh volumes. The distributions of male body measurements on BODY-fit+W, obtained by measuring the template meshes as described in Section 3.2, are shown in Figure 2. The height ranges between 145 and 196 cm for males and between 135 and 190 cm for females, while the weight ranges between 40 and 130 kg for males and between 30 and 130 kg for females. The remaining measurements vary in proportion to their absolute values; for example, ankle circumferences span a narrower range than waist circumferences. The body measurements on BODY-fit+W are similarly diverse to the body measurements from the ANSUR dataset, as shown in Figure 3. The ANSUR dataset is significant, as it represents a realistic human body population, while BODY-fit+W represents a population of SMPL meshes fitted to 3D scans. Note that not all of the body measurements used in the BODY-fit+W dataset exist in the ANSUR dataset. Table 4 specifies the ANSUR attributes and expressions used to obtain the corresponding measurements from BODY-fit+W.

Quantitative Evaluation
The methods are compared quantitatively against the 15 standard body measurements and the three additional measurements used to compare the baseline with the Virtual Caliper [31]. The metrics used for comparison are as follows:
• The mean absolute error, MAE_j = (1/N) Σ_i |y_ij − ŷ_ij|, where i is the sample index, j represents the measurement, and N is the number of samples;
• The mean relative error, MRE_j = (1/N) Σ_i |y_ij − ŷ_ij| / y_ij, with the same notation;
• The percentage of samples within the expert error, %<Expert_j = (100/N) Σ_i 1[|y_ij − ŷ_ij| ≤ e_j], where e_j is the allowable error for measurement j. This metric shows the ratio of samples that are within the expert errors [30,33,34]. The expert errors are shown in Tables 5 and 6 (Expert error rows).
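The three metrics can be sketched as follows (array names are illustrative; `expert_err` stands for the allowable error e_j of a given measurement):

```python
import numpy as np

def mae(y, y_hat):
    """Mean absolute error over the N samples of one measurement."""
    return np.mean(np.abs(y - y_hat))

def mre(y, y_hat):
    """Mean relative error: absolute error divided by the true value."""
    return np.mean(np.abs(y - y_hat) / y)

def pct_within_expert(y, y_hat, expert_err):
    """Percentage of samples whose absolute error is within the allowable
    'expert' error for this measurement [30,33,34]."""
    return 100.0 * np.mean(np.abs(y - y_hat) <= expert_err)
```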
We further compare the baseline:
• against other methods such as the Virtual Caliper [31], which estimates body lengths using a VR headset;
• in more detail against the representative 2D-based [10] and 3D-based [8] methods.
On top of MAE, we also report the mean relative error (MRE) and the percentage of the samples within the expert errors (%<Expert).
Ground Truth Methods. We compare the baseline against the methods that use ground truth information as input. Several previous silhouette-based body measurement estimation methods [21][22][23][25][26][27] report their results using the ground truth silhouettes. To evaluate the baseline, we use the BODY-fit dataset and the volume as input; this way, we also exploit ground truth information instead of estimating the weight. The results are shown in Table 5. Note that several methods [21][22][23] perform within the expert errors. All of these methods, including the baselines, perform comparably, achieving body measurement errors below 1 cm for all body measurements. However, using ground truth silhouettes or body volume is unrealistic for real-world anthropometric applications. We provide these analyses for the sake of completeness, following the previous works.

Table 5. Quantitative comparison to image-based body measurement approaches that use ground truth silhouettes (MAEs, in mm). In this case, we also show the performance of the linear baseline using the volume, height, and weight as deterministic variables (unrealistic). Note that we demonstrate only the performance of the BODY-fit model, as ANSUR does not contain volume measurements. The baseline (I = 4) additionally uses four interaction terms, as described in Section 3. Methods marked with † are evaluated on different, non-public data, and the results are reported in [21]. The best results are shown in bold.

State-of-the-Art Methods. Table 6 shows the performance of state-of-the-art body measurement estimation methods, compared to our baselines fitted on the BODY-fit+W and ANSUR datasets. The baseline fitted on the BODY-fit+W dataset models the volume, height, and weight as stochastic variables. For comparison with SMPLify [10,11] and ExPose [14], we scale their meshes to match the ground truth height; otherwise, their mesh estimates would be significantly degraded by height estimation errors. Our baseline demonstrates comparable performance with the competing methods, even outperforming several popular deep learning approaches such as HMR [16], SMPLify [10,11], and ExPose [14]. Note that the baseline achieves MAEs within the expert errors for neck circumference (B), shoulder-to-crotch (C), and forearm circumference (I). The baseline evaluated on the ANSUR dataset is less accurate on average, but it still shows competitive performance. A more detailed comparison between BODY-fit+W and ANSUR is given in Section 5.3. The discussion on the performance of deep mesh regression approaches is given in Section 5.4.
Other Methods. We also compare to the Virtual Caliper [31], which proposes using a VR headset for body measurement. Compared to our work, the Virtual Caliper evaluates its performance on real subjects and compares it to measurements obtained by their expert, which is more realistic. They also use self-reported weight as input, but they estimate height. For a fair comparison, we predict three additional body measurements, namely arm span, inseam height, and hip width, as shown in the additional part of Table 1. The results are shown in Table 7. The baseline outperforms the measurements obtained using the Virtual Caliper. The significant advantage of the proposed baseline is that it does not require a VR headset.
More Detailed Comparison. To compare the representative 2D- and 3D-based methods to the baseline, we calculate the MAE, MRE, and %<Expert metrics. We choose SMPLify [10] as the representative 2D-based method, even though it does not achieve the best performance among the 2D-based methods (see Table 6); the best-performing method, by Smith et al. [21], does not provide the source code needed to evaluate it on BODY-fit+W. As shown in Table 8, the 3D-based method by Yan et al. [8], evaluated on the NOMO3D dataset, achieves the best overall performance, with the exception of wrist (G) and ankle circumference (N). The baseline fitted to ANSUR achieves MREs below 5% for all body measurements, which is reasonable and useful for anthropometric applications. Interestingly, the measurements for which more than 50% of the samples fall within the expert errors are shoulder-to-crotch distance (C) and wrist circumference (G). In general, the ratios are above 25% for most of the body measurements.

Table 6. Quantitative comparison to image-based body measurement and shape estimation approaches (MAEs in mm). In this case, we show the performance of the linear baseline using self-reported height and weight as stochastic variables (more realistic). The BODY-fit+W and ANSUR datasets are used to present the baseline performance. The baseline uses I = 2 interaction terms, as described in Section 3. The best results are shown in bold.

Table 8. Detailed comparison between 2D-based methods (SMPLify [10]), 3D-based methods (Yan et al. [8]), and the two linear baselines (with I = 2 interaction terms), one fitted to the BODY-fit+W dataset and one fitted to the ANSUR dataset [30]. Note that, to fairly compare with Yan et al., the expert error values are extended according to [8]. The best results in each row for MAEs and %<Expert are shown in bold.

Discussion
The presented baseline demonstrates strong performance on the public datasets. In this section, we analyze the residual hypotheses, p-values, and R 2 scores (Section 5.1), comment on using height and weight for body measurement estimation in more detail (Section 5.2), compare the BODY-fit+W and ANSUR datasets in more detail (Section 5.3), discuss previous image-based mesh regression approaches (Section 5.4), and provide the guidelines for future body measurement estimation methods (Section 5.5).

Residuals, p-Values, and R 2 Scores
In this section, we verify the three assumptions of linear regression regarding residuals (homoscedasticity, independence, and normality) and check the p-values and R² scores for the regression models of each body measurement. Figure 4 shows the residuals for the BODY-fit+W and ANSUR models on the train and test splits. The variance of the residuals is generally constant across the fitted values for both models, which satisfies the homoscedasticity assumption. The values are spread relatively randomly, which satisfies the independence assumption. Finally, as shown on the right sides of the two figures, the residuals have zero mean and are normally distributed, which satisfies normality.

Tables 9 and 10 show the p-values, R² scores, MAEs, and RMSEs for the male and female models, on BODY-fit+W and ANSUR, respectively, with two interaction terms (I = 2). We can observe that the vast majority of p-values are below the 0.05 threshold. For the simplicity of the analyses, we keep all the input variables, even though this might lead to increased variance in the predictions and hence larger RMSEs [73]. Ideally, the R² scores should be as high as possible. Most of the scores for the BODY-fit+W dataset are above or close to 0.8, except for head circumference (A) and shoulder breadth (O). It is reasonable that the head circumference is more difficult to estimate based only on height and weight and their derivative terms. Note that, based on the BMI p-value for shoulder breadth, it would make sense to fit the model without the BMI input term, which may improve the R² score. The ANSUR model has somewhat lower R² scores, particularly for shoulder-to-crotch (C), wrist circumference (G), and ankle circumference (N). Note that the shoulder-to-crotch measure is derived from three manual measurements, as specified in Table 4, which might be part of the reason for the lower score.
Intuitively, wrist and ankle circumferences only weakly correspond to a person's height and weight, which makes them more difficult to estimate. Finally, even though the linear models could be further improved, their current performance is competitive w.r.t. the state-of-the-art 2D- and 3D-based methods, which is the most important observation of these experiments.

Table 9. The linear regression statistics for the BODY-fit+W dataset, for males and females (I = 2). The p-values correspond to the intercept (b), height (h), weight (w), BMI (w/h²), and wh, respectively. In addition, we report adjusted R² scores, MAEs (mm), and RMSEs (mm).

Table 10. The linear regression statistics for the ANSUR dataset, for males and females (I = 2). The p-values correspond to the intercept (b), height (h), weight (w), BMI (w/h²), and wh, respectively. In addition, we report adjusted R² scores, MAEs (mm), and RMSEs (mm).
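The residual diagnostics above can be sketched as follows (an assumed helper, not our exact analysis code): normality is checked via SciPy's D'Agostino-Pearson test, and homoscedasticity via a crude comparison of residual spread over the lower and upper halves of the fitted values:

```python
import numpy as np
from scipy import stats

def residual_checks(y, y_hat):
    """Return (normality p-value, spread ratio) for the residuals y - y_hat.
    A large p-value is consistent with normal residuals; a spread ratio
    near 1 is consistent with homoscedasticity."""
    r = y - y_hat
    _, p_normal = stats.normaltest(r)      # D'Agostino-Pearson normality test
    order = np.argsort(y_hat)              # sort residuals by fitted value
    half = len(r) // 2
    spread_ratio = r[order][:half].std() / r[order][half:].std()
    return p_normal, spread_ratio
```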

Height and Weight for Body Measurement Estimation
It is not completely surprising that height and weight strongly correlate with the body measurements for the statistical model population. Previous works, such as [5,74], analyze the principal components of the SMPL and SCAPE [44] statistical body models, respectively, by separately adding several standard deviations to each principal component of the mean shape. The resulting explained variance for the first 10 components is shown in Figure 5, and the variation in shapes is shown in Figure 6. It can be observed that the first two principal components explain most of the variance in body shapes (Figure 5), which in turn define the extreme shape variations visible in Figure 6, particularly in terms of height and weight. Minor variations are visible for the third and fourth components. The remaining components do not significantly influence body shape and measurements. Whether these linear relationships hold for the general population is currently difficult to verify due to the lack of public data. Still, the statistical models are expected to be built from a diverse set of human bodies; therefore, we consider this linear relationship relevant.

Figure 5. Explained variance for the first 10 principal components of the dataset. The graph is generated on CAESAR-fits data [74] by applying PCA to the given vertices. As expected, the first two components, which most strongly correlate with height and weight, explain most of the variance in the data.

Figure 6. Variation in body shape along the principal components. The image is inspired by [74], originally made using the SCAPE model [44]; we have drawn it using the SMPL model [5].
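The explained-variance computation behind Figure 5 can be sketched as a PCA via SVD on the flattened mesh vertices (array shapes and the function name are illustrative):

```python
import numpy as np

def explained_variance(vertices):
    """PCA on flattened mesh vertices (n_meshes x n_verts x 3); returns the
    explained-variance ratio per principal component."""
    X = vertices.reshape(len(vertices), -1)  # flatten to n_meshes x (n_verts * 3)
    Xc = X - X.mean(axis=0)                  # center the data
    # singular values squared are proportional to per-component variances
    s = np.linalg.svd(Xc, full_matrices=False, compute_uv=False)
    var = s ** 2
    return var / var.sum()
```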

Comparing BODY-Fit and ANSUR Models
The two datasets used for the evaluation of the proposed baseline represent the synthetic, statistical population (BODY-fit+W) and the realistic population (ANSUR). The linear models fitted on the BODY-fit+W and ANSUR datasets perform differently for certain body measurements, such as neck circumference (B), shoulder-to-crotch (C), arm length (J), inside-leg length (K), and ankle circumference (N) (see Table 6).
The differences can be explained by the fact that these body measurements are less dispersed in the height-weight space, as shown in Figure 7, even though the volume, height, and weight are modeled as stochastic variables. This might suggest that certain body measurements of the bodies from the synthetic population retain a linear relationship with height and weight, even though Gaussian noise has been added to the input. Another reason might be that the human body populations of the two datasets are significantly different w.r.t. these measurements. ANSUR represents a population of military personnel, while the original population of BODY-fit subjects is probably different.
In order to verify these hypotheses, at least the actual weight information for the BODY-fit subjects is required. Unfortunately, the weight information is not available. Finally, note that we simplify the analysis of ANSUR by not accounting for the diversity in self-estimation of height and weight, because a certain amount of error already exists in the manual measurements, which is difficult to quantify.

On Image-Based Mesh Regression
Most of the previous image-based methods train deep learning models to regress pose and shape parameters that correspond to the template 3D mesh [10,14,16,[21][22][23]28], which was made possible with the creation of the statistical models. This approach, while simple and powerful, has the problem of estimating the size of the person in an image. Due to 2D-to-3D scale ambiguity, it is impossible to determine absolute size without additional information [75]. Mesh regression approaches [10,14,16] typically ignore this problem by training convolutional networks on images and hoping that the network will fit proper height by exploiting the image context. As a result, they often estimate an incorrect absolute height when not explicitly trained for scale estimation. To allow a fair comparison with these methods, we scale their output meshes to match the ground truth height by multiplying with the ratio between the mesh height and the ground truth height. Nevertheless, errors are propagated to the body measurements, which is probably because the body measurements are distributed differently for different body heights. On the other hand, silhouette-based methods [21][22][23]28] typically render human meshes at a fixed distance from the camera and then train their model to estimate the height. The drawback of this approach is that the model only estimates the height correctly for that particular camera-distance configuration.
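The height rescaling we apply to the SMPLify and ExPose meshes can be sketched as follows (assuming a y-up mesh; the function name is illustrative):

```python
import numpy as np

def scale_to_gt_height(vertices, gt_height, up_axis=1):
    """Uniformly scale a mesh (n_verts x 3) so its extent along the up axis
    matches the ground truth height, i.e., multiply by the ratio between the
    ground truth height and the mesh height."""
    mesh_height = vertices[:, up_axis].max() - vertices[:, up_axis].min()
    return vertices * (gt_height / mesh_height)
```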

Limitations, Assumptions, and Future Guidelines
Limitations. The major limitation of body measurement estimation in general is a lack of large, public, and realistic benchmarks. The public benchmark used in this work, based on BODY-fit [28], contains 3D template meshes fitted to 3D scans. Even though the fitted meshes represent the scanning dataset, the SMPL fits are still an approximation of the original scans. Moreover, the 3D scanning process is also not perfect [9], so the scans themselves do not necessarily represent the original physical human bodies.
General Assumptions. Therefore, the first assumption is that the manual body measurements are comparable to the body measurements obtained from the 3D scan, i.e., from the SMPL template mesh. In addition, not all ANSUR body measurements necessarily physically correspond to the measurements from the SMPL model, such as shoulder-to-crotch or waist circumference, as specified in Section 4.1, but we assume they are highly linearly correlated and thus comparable. The height is measured as the difference between the top head point and the heel point on the y (height) axis. However, most of the datasets' subjects are expected to take approximately the A-pose, which is not fully erect. This might result in height being incorrectly estimated in some cases. We assume that the posture is not significantly affected by breathing, that the subjects wear minimal or tight clothes, and that the body measurements such as height and head circumference are not affected by hair artifacts. The problem with hair artifacts is particularly important for the female subjects. Finally, we model all stochastic variables by adding normal distributions that are uncorrelated with the other variables, which is not realistic for all groups inside the population. For example, underweight people tend to overestimate their weight and vice versa [76].

Future Guidelines. To evaluate our model, as well as other body measurement estimation models, even more reliably, future work should focus on creating more realistic, diverse, and public benchmarks. To avoid estimating body weight from volume, the benchmark and the training dataset should contain measured weights. For the evaluation of future methods, we propose the 15 body measurements as a basic set of measurements for mesh comparison. The body measurements are intuitive and simple to obtain from a template mesh, such as SMPL.
The evaluation of the 15 measurements can be used in addition to the previously used per-vertex error (PVE) metric, which is used to compare between the mesh regression methods [12,61,77]. For more standardized body measurement comparison, we specify the exact vertices used to obtain each body measurement from the SMPL mesh.
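The PVE metric mentioned above can be sketched as follows (array names are illustrative):

```python
import numpy as np

def per_vertex_error(pred_verts, gt_verts):
    """Per-vertex error (PVE): mean Euclidean distance between corresponding
    vertices of the predicted and ground-truth meshes (n_verts x 3 arrays).
    Assumes both meshes share the same vertex semantics, e.g., SMPL templates."""
    return np.linalg.norm(pred_verts - gt_verts, axis=-1).mean()
```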
Measurement Standards. When creating anthropometric datasets, special attention must be paid to body measurement standards [33] and body postures [34]. Even though manual expert measurements are considered to be the gold standard, they will never be perfect, because the human body is not rigid [78]. There are many proposed methods for body measurement error analysis [79][80][81][82], but the conclusion is that no statistical procedure is optimal for manual anthropometry in general [83]. The most common, simple to calculate, and easily interpretable approach, proposed by Gordon et al. [30], includes the allowable or "expert" errors [30], which we mentioned earlier and compare against in Tables 5 and 6. The authors took into account many factors that potentially affect measurement accuracy, such as posture, time of the day (morning, evening), measurement technique and instrument, etc. The allowable errors are the inter-observer values obtained by the expert measurers based on the mean of absolute differences. Those values have been incorporated into the international standards [33,34].
Synthetic Data. Finally, the advantage of using synthetic data and 3D template meshes is that the body measurements can be extracted in a standardized way, which guarantees a standardized measurement. Another advantage is that synthetic data avoid privacy issues and approvals. Therefore, the promising future direction is the creation of large, more realistic, and more diverse synthetic anthropometric benchmarks. The statistical models such as SMPL are currently the best approximation of the overall population and are likely to be used as a tool for generating the body measurement benchmarks in the coming years.

Implementation Details
We implement the model and the experiments in Python 3.8. For linear regression and other models in Section 5, we use the scikit-learn package [84]. For creating mesh objects, calculating volume, slicing with planes, and visualization, we use the trimesh package [85]. For processing 3D meshes as SMPL template meshes and calculating body measurements, we use part of the public SMPL-X source code [11]. For other efficient computations, we use NumPy [86] and SciPy [87]. The seed for generating random numbers is set to 2021. All experiments are done on a single desktop computer, with an Intel Core i7-9700 CPU (3 GHz, 8-core), 16 GB RAM, and an NVidia GeForce RTX 2080 Super, under Ubuntu 20.04 LTS. The source code, demos, and instructions are attached as Supplementary Files.

Conclusions
The presented regression method is a simple yet accurate tool for the convenient and automatic estimation of body measurements, without the need to remove clothes or capture images. Our work demonstrates that self-reported height and weight predict body dimensions as well as the state-of-the-art deep learning methods, or even better. The linear regression based on self-reported height and weight should therefore be used as the baseline for future methods; i.e., any body measurement estimation method should never perform worse. The results reported in this paper mostly serve to show the approximate relative performance of the baseline and the competing methods on the currently available public benchmarks, but for future reference and datasets, we recommend fitting the model to more diverse and realistic data, if applicable. The baseline is well suited for, but not limited to, applications such as virtual try-on, VR, or ergonomics. The next step is to create more realistic, statistical model-based, public benchmarks to further evaluate the relationship between height and weight as well as other anthropometric measurements.

Data Availability Statement:
The benchmark data is available as a part of supplementary materials.