# Deep Spectral Meshes: Multi-Frequency Facial Mesh Processing with Graph Neural Networks


## Abstract


## 1. Introduction

- The introduction of spectral decomposition of meshes into 3D shape representation learning.
- A novel parametric deep face model that enables independent control of high- and low-frequency deformations.
- Enhanced geometric and perceptual quality of generated meshes, achieved by using different representations for deformations at high and low frequencies.
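The spectral split underlying these contributions can be illustrated with a small sketch. This assumes a uniform (combinatorial) graph Laplacian; the paper's exact operator may differ, and the names `uniform_laplacian` and `split_bands` are illustrative, not taken from the paper:

```python
import numpy as np
from scipy.sparse import coo_matrix, diags

def uniform_laplacian(num_verts, edges):
    """Combinatorial graph Laplacian L = D - A built from an undirected edge list."""
    i = np.concatenate([edges[:, 0], edges[:, 1]])
    j = np.concatenate([edges[:, 1], edges[:, 0]])
    A = coo_matrix((np.ones(len(i)), (i, j)), shape=(num_verts, num_verts)).tocsr()
    return diags(np.asarray(A.sum(axis=1)).ravel()) - A

def split_bands(V, L, k):
    """Split vertex positions V (n x 3) into low- and high-frequency bands.
    Large meshes would use ARPACK (scipy.sparse.linalg.eigsh) for the k
    smallest eigenpairs; a dense eigendecomposition suffices for this toy."""
    _, U = np.linalg.eigh(L.toarray())  # eigenvalues ascending
    Uk = U[:, :k]               # k lowest-frequency eigenvectors
    V_low = Uk @ (Uk.T @ V)     # smooth, low-frequency part of the shape
    V_high = V - V_low          # residual high-frequency surface detail
    return V_low, V_high

# Toy "mesh": a 5-vertex chain with random 3D positions.
edges = np.array([[0, 1], [1, 2], [2, 3], [3, 4]])
V = np.random.default_rng(0).normal(size=(5, 3))
V_low, V_high = split_bands(V, uniform_laplacian(5, edges), k=2)
assert np.allclose(V_low + V_high, V)  # the two bands recompose the mesh
```

Because the eigenbasis is orthonormal, the two bands always sum back to the original vertices; the split index $k$ controls how much of the shape is treated as low-frequency.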

## 2. Related Work

#### 2.1. Differential Coordinate-Based 3D Shape Representation

#### 2.2. Spectral Mesh Processing

#### 2.3. Geometric Deep Learning in Spectral Domain

#### 2.4. Parametric Models with Graph Networks

## 3. Method Overview

#### 3.1. Spectral Partitioning and Representation

#### 3.2. Neural Network

#### 3.3. Final Assembly

## 4. Deep Spectral Meshes

#### 4.1. Vertex Feature Representation

#### 4.1.1. Spectral Decomposition

#### 4.1.2. High-Frequency Band

#### 4.1.3. Low-Frequency Band

#### 4.2. Graph Network Architecture

#### 4.3. Network Structure

#### 4.4. Training Process

#### 4.5. Inference

## 5. Conditioning Influence

## 6. Implementation Details

## 7. Applications and Comparisons

#### 7.1. Mesh Reconstruction

#### 7.1.1. Quantitative Evaluation

#### 7.1.2. Qualitative Evaluation

#### 7.2. Mesh Interpolation

#### 7.3. Multi-Frequency Editing

## 8. Summary and Future Work

- This work restricts its application to low- and high-frequency bands. The partition of mesh data into more than two frequency bands and an investigation into the relationship between the learning model and the frequency bands remain unexplored. The limitation to two frequency bands in our approach is linked to the properties of the chosen mesh representations. Standardised Euclidean coordinates represent low-frequency information to ensure high point-wise accuracy of the generated meshes. The normalised deformation representation (DR) encodes high-frequency information for superior perceptual quality of the results. Future work on partitioning mesh data into more than two frequency bands could further explore the benefits of alternative feature representations at different frequency levels.
- Further research could characterise the relationship between the choice of parameter $k$ and the quality of generated meshes. Designing an algorithm to determine the optimal split between frequency bands is a potential avenue for future research. To achieve this, an objective function relating the quality of generated meshes to parameter $k$ could be formulated. The selection of a suitable optimisation method would be crucial to efficiently obtain the optimal split.
- Our proposed method can be extended to address the problem of multi-frequency-based deformation transfer, which has not been investigated in existing research studies. The basic idea of multi-frequency-based deformation transfer involves decomposing source and target meshes into mean, low- and high-frequency parts. The differences between the source model and these parts at two different poses are determined and transferred to the corresponding bands of the target mesh. Subsequently, the graph neural network proposed in this paper can be employed to reconstruct a new shape for the target mesh with the pose of the source mesh.
- While our approach has been applied to deformable facial meshes, future work could extend the proposed method to articulated shapes like hands and bodies. Existing research has proposed various methods to relate articulated shapes to their underlying skeleton. With this extension, articulated shapes can be decomposed into mean, low- and high-frequency parts. Then, the relationships between these parts and the movements of the skeleton of the articulated shapes can be investigated. These relationships can be used to synthesise mean, low- and high-frequency parts of new poses. The graph neural network proposed in this paper can then extract features, reconstruct them, and synthesise new shapes from the reconstructed features.
- It may be possible to apply the approach proposed in this paper to 3D tumour image analysis. First, 3D shapes containing tumours can be reconstructed from medical images. Then, the approach could be employed to disentangle normal deformations and abnormal deformations caused by tumours and extract tumour features.
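The multi-frequency deformation-transfer idea outlined above can be sketched numerically. The function `transfer_pose` and the per-band arithmetic below are assumptions for illustration, not the paper's algorithm; a learned reconstruction network would replace the direct recomposition in practice:

```python
import numpy as np

def band_decompose(V, U, k):
    """Split a mesh V (n x 3) into mean, low- and high-frequency parts
    using a full orthonormal Laplacian eigenbasis U (n x n)."""
    mean = V.mean(axis=0, keepdims=True)
    coeffs = U.T @ (V - mean)        # spectral coefficients of the centred shape
    low = U[:, :k] @ coeffs[:k]      # lowest-frequency k modes
    high = U[:, k:] @ coeffs[k:]     # remaining high-frequency modes
    return mean, low, high           # V == mean + low + high

def transfer_pose(src_a, src_b, tgt, U, k):
    """Move the pose change (src_a -> src_b) onto tgt, band by band."""
    bands_a = band_decompose(src_a, U, k)
    bands_b = band_decompose(src_b, U, k)
    bands_t = band_decompose(tgt, U, k)
    # add each band's source-pose difference to the corresponding target band
    return sum(t + (b - a) for a, b, t in zip(bands_a, bands_b, bands_t))

# Toy check with a random orthonormal basis standing in for the eigenbasis.
rng = np.random.default_rng(1)
U, _ = np.linalg.qr(rng.normal(size=(6, 6)))
src_a, src_b, tgt = rng.normal(size=(3, 6, 3))
out = transfer_pose(src_a, src_b, tgt, U, k=2)
# a target identical to the source rest pose inherits the source pose exactly
assert np.allclose(transfer_pose(src_a, src_b, src_a, U, k=2), src_b)
```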

## Author Contributions

## Funding

## Data Availability Statement

## Conflicts of Interest

## References

- Russo, M. Polygonal Modeling: Basic and Advanced Techniques; Jones & Bartlett Learning: Burlington, MA, USA, 2010. [Google Scholar]
- Feng, X.; Shi, M. Surface representation and processing. In Proceedings of the 2009 8th IEEE International Conference on Cognitive Informatics, Hong Kong, China, 15–17 June 2009; IEEE: Piscataway, NJ, USA, 2009; pp. 542–545. [Google Scholar] [CrossRef]
- Sorkine, O.; Cohen-Or, D.; Lipman, Y.; Alexa, M.; Rössl, C.; Seidel, H.P. Laplacian surface editing. In Proceedings of the 2004 Eurographics/ACM SIGGRAPH symposium on Geometry processing, Nice, France, 8–10 July 2004; pp. 175–184. [Google Scholar]
- Zhang, H.; Kaick, O.V.; Dyer, R. Spectral mesh processing. Comput. Graph. Forum **2010**, 29, 1865–1894. [Google Scholar] [CrossRef]
- Sorkine, O. Laplacian Mesh Processing. In Eurographics (STARs); The Eurographics Association: Eindhoven, The Netherlands, 2005. [Google Scholar]
- Bronstein, M.M.; Bruna, J.; LeCun, Y.; Szlam, A.; Vandergheynst, P. Geometric Deep Learning: Going beyond Euclidean data. IEEE Signal Process. Mag. **2017**, 34, 18–42. [Google Scholar] [CrossRef]
- Egger, B.; Smith, W.; Tewari, A.; Wuhrer, S.; Zollhoefer, M.; Beeler, T.; Bernard, F.; Bolkart, T.; Kortylewski, A.; Romdhani, S.; et al. 3D Morphable Face Models—Past, Present, and Future. ACM Trans. Graph. **2020**, 38, 157. [Google Scholar] [CrossRef]
- Xiao, Y.P.; Lai, Y.K.; Zhang, F.L.; Li, C.; Gao, L. A survey on deep geometry learning: From a representation perspective. Comput. Vis. Media **2020**, 6, 113–133. [Google Scholar] [CrossRef]
- Gao, L.; Lai, Y.K.; Liang, D.; Chen, S.Y.; Xia, S. Efficient and flexible deformation representation for data-driven surface modeling. ACM Trans. Graph. **2016**, 35, 158. [Google Scholar] [CrossRef]
- Gao, L.; Lai, Y.K.; Yang, J.; Ling-Xiao, Z.; Xia, S.; Kobbelt, L. Sparse Data Driven Mesh Deformation. IEEE Trans. Vis. Comput. Graph. **2021**, 27, 2085–2100. [Google Scholar] [CrossRef]
- Tan, Q.; Zhang, L.; Yang, J.; Lai, Y.; Gao, L. Variational Autoencoders for Localized Mesh Deformation Component Analysis. IEEE Trans. Pattern Anal. Mach. Intell. **2022**, 44, 6297–6310. [Google Scholar] [CrossRef]
- Wu, Q.; Zhang, J.; Lai, Y.K.; Zheng, J.; Cai, J. Alive Caricature from 2D to 3D. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 7336–7345. [Google Scholar] [CrossRef]
- Melzi, S.; Rodolà, E.; Castellani, U.; Bronstein, M. Localized Manifold Harmonics for Spectral Shape Analysis. Comput. Graph. Forum **2018**, 37, 20–34. [Google Scholar] [CrossRef]
- Xu, C.; Lin, H.; Hu, H.; He, Y. Fast calculation of Laplace–Beltrami eigenproblems via subdivision linear subspace. Comput. Graph. **2021**, 97, 236–247. [Google Scholar] [CrossRef]
- Lescoat, T.; Liu, H.; Thiery, J.; Jacobson, A.; Boubekeur, T.; Ovsjanikov, M. Spectral Mesh Simplification. Comput. Graph. Forum **2020**, 39, 315–324. [Google Scholar] [CrossRef]
- Wang, H.; Lu, T.; Au, O.; Tai, C. Spectral 3D mesh segmentation with a novel single segmentation field. Graph. Model. **2014**, 76, 440–456. [Google Scholar] [CrossRef]
- Tong, W.; Yang, X.; Pan, M.; Chen, F. Spectral mesh segmentation via $\ell_0$ gradient minimization. IEEE Trans. Vis. Comput. Graph. **2020**, 26, 440–456. [Google Scholar] [CrossRef]
- Bao, X.; Tong, W.; Chen, F. A Spectral Segmentation Method for Large Meshes. Commun. Math. Stat. **2023**, 11, 583–607. [Google Scholar] [CrossRef]
- Jain, V.; Zhang, H. Robust 3D Shape Correspondence in the Spectral Domain. In Proceedings of the IEEE International Conference on Shape Modeling and Applications 2006 (SMI’06), Matsushima, Japan, 14–16 June 2006; pp. 1–12. [Google Scholar] [CrossRef]
- Dubrovina, A.; Kimmel, R. Matching shapes by eigendecomposition of the Laplace–Beltrami operator. In Proceedings of the 5th International Symposium 3D Data Processing, Visualization and Transmission, Paris, France, 17–20 May 2010; pp. 1–8. [Google Scholar]
- Melzi, S.; Ren, J.; Rodolà, E.; Sharma, A.; Wonka, P.; Ovsjanikov, M. ZoomOut: Spectral upsampling for efficient shape correspondence. ACM Trans. Graph. **2019**, 38, 155. [Google Scholar] [CrossRef]
- Wu, Z.; Pan, S.; Chen, F.; Long, G.; Zhang, C.; Yu, P.S. A Comprehensive Survey on Graph Neural Networks. IEEE Trans. Neural Netw. Learn. Syst. **2021**, 32, 4–24. [Google Scholar] [CrossRef]
- Dong, Q.; Wang, Z.; Li, M.; Gao, J.; Chen, S.; Shu, Z.; Xin, S.; Tu, C.; Wang, W. Laplacian2Mesh: Laplacian-Based Mesh Understanding. IEEE Trans. Vis. Comput. Graph. **2023**, 1–13. [Google Scholar] [CrossRef]
- Lemeunier, C.; Denis, F.; Lavoué, G.; Dupont, F. SpecTrHuMS: Spectral transformer for human mesh sequence learning. Comput. Graph. **2023**, 115, 191–203. [Google Scholar] [CrossRef]
- Qiao, Y.; Gao, L.; Yang, J.; Rosin, P.; Lai, Y.; Chen, X. Learning on 3D Meshes With Laplacian Encoding and Pooling. IEEE Trans. Vis. Comput. Graph. **2022**, 28, 1317–1327. [Google Scholar] [CrossRef] [PubMed]
- Nasikun, A.; Hildebrandt, K. The Hierarchical Subspace Iteration Method for Laplace–Beltrami Eigenproblems. ACM Trans. Graph. **2022**, 41, 17. [Google Scholar] [CrossRef]
- Ranjan, A.; Bolkart, T.; Sanyal, S.; Black, M.J. Generating 3D Faces Using Convolutional Mesh Autoencoders. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 725–741. [Google Scholar]
- Defferrard, M.; Bresson, X.; Vandergheynst, P. Convolutional neural networks on graphs with fast localized spectral filtering. In Proceedings of the Advances in Neural Information Processing Systems, Barcelona, Spain, 5–10 December 2016; pp. 3844–3852. [Google Scholar]
- Bouritsas, G.; Bokhnyak, S.; Ploumpis, S.; Zafeiriou, S.; Bronstein, M. Neural 3D morphable models: Spiral convolutional networks for 3D shape representation learning and generation. In Proceedings of the IEEE International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 7212–7221. [Google Scholar] [CrossRef]
- Chen, Z.; Kim, T.K. Learning feature aggregation for deep 3D morphable models. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 13159–13168. [Google Scholar] [CrossRef]
- Gao, Z.; Yan, J.; Zhai, G.; Zhang, J.; Yang, Y.; Yang, X. Learning Local Neighboring Structure for Robust 3D Shape Representation. In Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21), Virtual, 2–9 February 2021; Volume 35, pp. 1397–1405. [Google Scholar]
- Zhou, Y.; Wu, C.; Li, Z.; Cao, C.; Ye, Y.; Saragih, J.; Li, H.; Sheikh, Y. Fully Convolutional Mesh Autoencoder using Efficient Spatially Varying Kernels. Adv. Neural Inf. Process. Syst. **2020**, 33, 9251–9262. [Google Scholar]
- Verma, N.; Boyer, E.; Verbeek, J. FeaStNet: Feature-Steered Graph Convolutions for 3D Shape Analysis. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 2598–2606. [Google Scholar] [CrossRef]
- Cheng, S.; Bronstein, M.; Zhou, Y.; Kotsia, I.; Pantic, M.; Zafeiriou, S. MeshGAN: Non-linear 3D Morphable Models of Faces. arXiv **2019**. [Google Scholar] [CrossRef]
- Zhou, Y.; Deng, J.; Kotsia, I.; Zafeiriou, S. Dense 3D Face Decoding over 2500FPS: Joint Texture & Shape Convolutional Mesh Decoders. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 1097–1106. [Google Scholar]
- Yuan, Y.J.; Lai, Y.K.; Yang, J.; Fu, H.; Gao, L. Mesh Variational Autoencoders with Edge Contraction Pooling. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Long Beach, CA, USA, 15–20 June 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 274–275. [Google Scholar]
- Jiang, Z.H.; Wu, Q.; Chen, K.; Zhang, J. Disentangled representation learning for 3D face shape. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 11949–11958. [Google Scholar] [CrossRef]
- Zheng, X.; Jiang, B.; Zhang, J. Deformation representation based convolutional mesh autoencoder for 3D hand generation. Neurocomputing **2021**, 444, 356–365. [Google Scholar] [CrossRef]
- Baran, I.; Vlasic, D.; Grinspun, E.; Popović, J. Semantic Deformation Transfer. In ACM SIGGRAPH 2009 Papers, Proceedings of the Special Interest Group on Computer Graphics and Interactive Techniques Conference, New Orleans, LA, USA, 3–7 August 2009; ACM: New York, NY, USA, 2009; pp. 1–6. [Google Scholar] [CrossRef]
- Sorkine, O.; Alexa, M. As-Rigid-As-Possible Surface Modeling. In Proceedings of the Symposium on Geometry Processing, Barcelona, Spain, 4–6 July 2007; Belyaev, A., Ed.; The Eurographics Association: Eindhoven, The Netherlands, 2007; Volume 4, pp. 109–116. [Google Scholar]
- Hinton, G.E.; Srivastava, N.; Krizhevsky, A.; Sutskever, I.; Salakhutdinov, R.R. Improving neural networks by preventing co-adaptation of feature detectors. arXiv **2012**. [Google Scholar] [CrossRef]
- Humain Limited. Humain Limited—Research & Development. 2022. Available online: https://www.humain-studios.com/ (accessed on 12 May 2022).
- Yang, H.; Zhu, H.; Wang, Y.; Huang, M.; Shen, Q.; Yang, R.; Cao, X. FaceScape: A Large-scale High Quality 3D Face Dataset and Detailed Riggable 3D Face Prediction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 601–610. [Google Scholar]
- Cao, C.; Weng, Y.; Zhou, S.; Tong, Y.; Zhou, K. FaceWarehouse: A 3D facial expression database for visual computing. IEEE Trans. Vis. Comput. Graph. **2014**, 20, 413–425. [Google Scholar] [CrossRef] [PubMed]
- Lehoucq, R.B.; Sorensen, D.C.; Yang, C. ARPACK Users’ Guide: Solution of Large-Scale Eigenvalue Problems with Implicitly Restarted Arnoldi Methods; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 1998. [Google Scholar]
- Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L. PyTorch: An imperative style, high-performance deep learning library. Adv. Neural Inf. Process. Syst. **2019**, 32, 8026–8037. [Google Scholar]
- Váša, L.; Rus, J. Dihedral Angle Mesh Error: A fast perception correlated distortion measure for fixed connectivity triangle meshes. Eurographics Symp. Geom. Process. **2012**, 31, 1715–1724. [Google Scholar] [CrossRef]
- Corsini, M.; Larabi, M.C.; Lavoué, G.; Petřík, O.; Váša, L.; Wang, K. Perceptual metrics for static and dynamic triangle meshes. Comput. Graph. Forum **2013**, 32, 101–125. [Google Scholar] [CrossRef]
- Gong, S.; Chen, L.; Bronstein, M.; Zafeiriou, S. SpiralNet++: A fast and highly efficient mesh convolution operator. In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshop, ICCVW 2019, Seoul, Republic of Korea, 27–28 October 2019; pp. 4141–4148. [Google Scholar] [CrossRef]
- Hanocka, R.; Fleishman, S.; Hertz, A.; Fish, N.; Giryes, R.; Cohen-Or, D. MeshCNN: A Network with an Edge. ACM Trans. Graph. (TOG) **2019**, 38, 1–12. [Google Scholar] [CrossRef]

**Figure 3.** Comparison of the reconstruction results with our method ($k=500$, $\gamma =1$, $\mathbf{Z}=64$) and with common representations used in other methods: Euclidean coordinates [32,34,50], standardised Euclidean coordinates [27,29,30,31,49] and the normalised deformation representation (DR) [12,37]. Across the Facsimile™ and FaceWarehouse datasets, our method outperforms the alternatives in reconstructing examples from the training set and strikes a favourable balance between perceptual and geometric quality on the Pareto front of optimal solutions. Our method underperforms on the FaceScape [43] dataset, where the benefit of the normalised DR representation for high-frequency information is minuscule compared to the standardised Euclidean representation.

**Figure 4.** Qualitative comparison of the reconstruction results of training data with our method ($k=500$, $\gamma =1$) and with common representations used in other methods: Euclidean coordinates [32,34,50], standardised Euclidean coordinates [27,29,30,31,49] and the normalised deformation representation (DR) [12,37]. The meshes generated by our method achieve superior results compared to other feature representations. Zooming into the digital version is recommended to see the surface artefacts on the results generated with Euclidean and standardised Euclidean representations.

**Figure 5.** Qualitative comparison of the reconstruction results of test data with our method ($k=500$, $\gamma =1$) and with common representations used in other methods: Euclidean coordinates [32,34,50], standardised Euclidean coordinates [27,29,30,31,49] and the normalised deformation representation (DR) [12,37]. The meshes generated by our method have similar surface quality to the outputs with DR while achieving much lower volume loss in the neck, chin, and cheek areas.

**Figure 6.** Visual comparison of the reconstruction results of the Facsimile™ and FaceWarehouse datasets using our method ($k=500$, $\gamma =1$, $\mathbf{Z}=64$) and four other methods: Mesh Autoencoder [32], SpiralNet++ [49], Neural 3DMM [29] and FeaStNet [33]. It is recommended to zoom into the digital version to compare the reconstructed meshes.

**Figure 7.** The outcomes of the user study, which compared the visual similarity to the ground truth of the meshes generated by our method and other methods: Mesh Autoencoder [32], SpiralNet++ [49], Neural 3DMM [29] and FeaStNet [33]. The bars show the percentage of participants who selected the mesh generated by the given method as more similar to the ground truth mesh. The participants were asked to select “Difficult to say” only when they had to guess between the generated models.

**Figure 8.** The results from the user study comparing our method with common representations used in other methods: Euclidean coordinates [32,34,50], standardised Euclidean coordinates (Eucl. Std.) [27,29,30,31,49] and the normalised deformation representation (DR Norm.) [12,37]. The bars show the percentage of participants who selected the mesh generated by the given method as more similar to the ground truth mesh. The participants were asked to select “Difficult to say” only when they had to guess between the generated models.

**Figure 9.** Interpolation of low-frequency and high-frequency latent parameters, $k=500$. Two facial meshes (in green and purple outlines) from the Facsimile™ [42] dataset are encoded. In (**A**), the model is trained with the Conditioning Factor $\gamma =1.0$. In (**B**), the Conditioning Factor $\gamma =0.4$. The meshes arranged in a grid are decoded from interpolated latent parameters. In (**C**), the meshes in green and purple outlines are interpolated in the vertex space.

**Figure 10.** Comparison of latent code editing between the proposed method and Mesh Autoencoder [32]. In (**A**), the editing of low-frequency latent codes of encoded mesh ${\mathbf{P}}_{1}$. In (**B**), the editing of high-frequency latent codes of encoded mesh ${\mathbf{P}}_{2}$. Top and middle rows: the examples decoded using our model with $k=500$ and Conditioning Factors $\gamma =0.4$ and $\gamma =1.0$. Bottom row: the results of editing a subset of latent parameters using the method in [32]. The parameters of our method successfully disentangle high and low frequencies. While subjective, it can be observed that a lower $\gamma$ provides more control and produces more diverse results, whereas altering the parameters of Mesh Autoencoder [32] affects the entire frequency spectrum.

| | FaceWarehouse (11,510 verts) | Facsimile™ (14,921 verts) | FaceScape (26,317 verts) |
|---|---|---|---|
| Computation of $\mathbf{U}$, CPU time [s] | 26.92 | 33.72 | 58.22 |
| Computation of $\mathbf{X}$, CPU time [s] | 2.11 | 7.70 | 9.88 |
| Computation of Equation (9), CPU time [s] | 0.20 | 0.36 | 0.98 |
| Computation of $\mathbf{U}$, CPU memory [MB] | 45.0 | 58.3 | 102.8 |
| Computation of $\mathbf{X}$, CPU memory [GB] | 1.04 | 1.74 | 5.41 |
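The dominant cost in the table above is the eigenbasis $\mathbf{U}$. As a hedged sketch of how such a basis can be computed with ARPACK (the paper's cited eigensolver), the snippet below uses a 1D chain operator as a stand-in for a facial mesh Laplacian; the matrix, sizes and shift value are illustrative assumptions, not the paper's setup:

```python
import time
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import eigsh

# Stand-in for a mesh Laplacian: a symmetric positive-definite chain operator.
n, k = 2000, 100
L = diags([2.0 * np.ones(n), -np.ones(n - 1), -np.ones(n - 1)],
          [0, -1, 1], format="csc")

t0 = time.perf_counter()
# shift-invert around sigma ~ 0 targets the smallest (lowest-frequency) eigenpairs
vals, U = eigsh(L, k=k, sigma=-1e-6, which="LM")
elapsed = time.perf_counter() - t0

print(f"U is {U.shape}, computed in {elapsed:.2f} s")
```

Shift-invert mode pays for one sparse factorisation of $L - \sigma I$ but converges much faster on the smallest eigenvalues than asking ARPACK for `which="SM"` directly.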

| | Training $L_1$ Norm $\times 10^{-3}\downarrow$ | Training DAME $\times 10^{-2}\downarrow$ | Test $L_1$ Norm $\times 10^{-3}\downarrow$ | Test DAME $\times 10^{-2}\downarrow$ |
|---|---|---|---|---|
| **Facsimile™** | | | | |
| Ours | 1.61 | 2.76 | 6.42 | 3.17 |
| Mesh Autoencoder | 2.60 | 6.04 | 8.32 | 5.81 |
| SpiralNet++ | 1.35 | 5.35 | 6.38 | 4.87 |
| Neural 3DMM | 1.71 | 3.84 | 5.95 | 3.81 |
| FeaStNet | 2.02 | 5.30 | 9.07 | 5.35 |
| **FaceWarehouse** | | | | |
| Ours | 0.91 | 1.10 | 6.27 | 1.29 |
| Mesh Autoencoder | 2.54 | 5.27 | 5.33 | 5.50 |
| SpiralNet++ | 1.21 | 6.06 | 4.69 | 5.63 |
| Neural 3DMM | 1.58 | 4.84 | 4.02 | 4.46 |
| FeaStNet | 1.92 | 6.81 | 8.17 | 6.30 |

**Table 3.** Quantitative comparison of the reconstruction results with our method ($k=500$, $\gamma =1$) and with common representations used in other methods: Euclidean coordinates [32,34,50], standardised Euclidean coordinates [27,29,30,31,49] and the normalised deformation representation (DR) [12,37]. To ensure a fair comparison between our method and the other input representations, all are evaluated on the fully convolutional variational graph autoencoder with a single encoder and a single decoder. The encoder is the same as ${E}_{high}$ or ${E}_{low}$, and the decoder is the same as ${D}_{high}$ or ${D}_{low}$, without the dropout layer. All the comparisons encode to a latent space $\mathbf{Z}$ of 64 parameters. Our method outperforms the other representations in reconstructing examples from the training set on most datasets. On the test set, our method strikes a favourable compromise between point-wise ${L}_{1}$ precision and the perceptual DAME metric, whereas the other representations considerably sacrifice one of these in favour of the other.

| | Training $L_1$ Norm $\times 10^{-3}\downarrow$ | Training DAME $\times 10^{-2}\downarrow$ | Test $L_1$ Norm $\times 10^{-3}\downarrow$ | Test DAME $\times 10^{-2}\downarrow$ |
|---|---|---|---|---|
| **Facsimile™** | | | | |
| Ours | 1.61 | 2.76 | 6.42 | 3.17 |
| DR | 4.77 | 3.05 | 9.29 | 3.00 |
| Euclidean Std. | 2.36 | 4.90 | 5.78 | 3.89 |
| Euclidean | 2.60 | 6.04 | 8.32 | 5.81 |
| **FaceWarehouse** | | | | |
| Ours | 0.91 | 1.10 | 6.27 | 1.29 |
| DR | 2.20 | 1.14 | 7.42 | 1.22 |
| Euclidean Std. | 1.11 | 3.23 | 5.33 | 2.53 |
| Euclidean | 2.54 | 5.27 | 5.34 | 5.50 |
| **FaceScape** | | | | |
| Ours | 1.27 | 2.25 | 1.65 | 2.41 |
| DR | 6.06 | 1.64 | 5.92 | 1.61 |
| Euclidean Std. | 0.96 | 1.81 | 1.30 | 1.82 |
| Euclidean | 1.32 | 2.20 | 1.71 | 2.26 |

**Table 4.** Ablation study on the reconstruction task demonstrating the impact of normalisation of the deformation representation (DR) and the standardisation of Euclidean coordinate input features.

| | Training $L_1$ Norm $\times 10^{-3}\downarrow$ | Training DAME $\times 10^{-2}\downarrow$ | Validation $L_1$ Norm $\times 10^{-3}\downarrow$ | Validation DAME $\times 10^{-2}\downarrow$ |
|---|---|---|---|---|
| **Facsimile™** | | | | |
| DR without normalisation | 4.01 | 2.77 | 12.05 | 3.07 |
| DR with normalisation | 4.77 | 3.05 | 9.29 | 3.00 |
| Euclidean without standardisation | 2.60 | 6.07 | 8.32 | 5.81 |
| Euclidean with standardisation | 2.36 | 4.90 | 5.78 | 3.89 |
| **FaceWarehouse** | | | | |
| DR without normalisation | 2.57 | 1.71 | 10.03 | 1.19 |
| DR with normalisation | 2.21 | 1.14 | 7.42 | 1.22 |
| Euclidean without standardisation | 2.54 | 5.27 | 5.33 | 5.50 |
| Euclidean with standardisation | 1.12 | 3.23 | 5.33 | 2.53 |


© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Kosk, R.; Southern, R.; You, L.; Bian, S.; Kokke, W.; Maguire, G.
Deep Spectral Meshes: Multi-Frequency Facial Mesh Processing with Graph Neural Networks. *Electronics* **2024**, *13*, 720.
https://doi.org/10.3390/electronics13040720
