Semi-Supervised Manifold Alignment Using Parallel Deep Autoencoders

Abstract: The aim of manifold learning is to extract low-dimensional manifolds from high-dimensional data. Manifold alignment is a variant of manifold learning that uses two or more datasets that are assumed to be different high-dimensional representations of the same underlying manifold. Manifold alignment can succeed in detecting latent manifolds in cases where one version of the data alone is not sufficient to extract and establish a stable low-dimensional representation. The present study proposes a parallel deep autoencoder (PDAE) neural network architecture for manifold alignment and conducts a series of experiments using a protein-folding benchmark dataset and a suite of new datasets generated by simulating double-pendulum dynamics with underlying manifolds of dimensions 2, 3 and 4. The dimensionality and topological complexity of these latent manifolds exceed those occurring in most previous studies. Our experimental results demonstrate that the parallel deep autoencoder performs better in most cases than the tested traditional methods of semi-supervised manifold alignment; in particular, PDAE was the best-performing method in the experiments using the 2D-3D data. We also show that the parallel deep autoencoder can process datasets of different input domains by aligning the manifolds extracted from kinematics parameters with those obtained from corresponding image data.


Introduction
Circles and lines are one-dimensional manifolds. Two-dimensional manifolds include surfaces such as spheres, tori and pretzels. Higher-dimensional manifolds include curved spaces that locally look like a Euclidean space. The mathematical definition of a topological n-dimensional manifold, or n-manifold, M, requires that M is a second countable Hausdorff space and that each point in M has a neighbourhood that is homeomorphic to an open subset of the Euclidean space R^n [1][2][3].
Manifold learning refers to techniques of non-linear dimensionality reduction that can extract latent low-dimensional manifolds from high-dimensional data [4][5][6][7]. For example, we can consider the frames of a video where each frame can be regarded as a pixel vector of a few thousand dimensions. If the video shows a rotating object or if the camera rotates in a circle, then manifold learning techniques such as isomap [5] can extract a circle from the video frame sequence, which shows that the essential information contained in the high-dimensional data is that of a circular rotation, which can be visualised as a 1-manifold homeomorphic to S^1 in two dimensions [8]. If the object in the image sequence is not only rotated, but also translated, we obtain a manifold homeomorphic to S^1 × I, that is, a cylinder [9]. A torus homeomorphic to S^1 × S^1 was obtained by [10] from an image set of robot heads that turn left and right in one circle or nod up and down in another circle.

Semi-Supervised Manifold Alignment
Ham et al. [14] characterised manifold alignment as semi-supervised when the two input datasets contain subsets of corresponding point pairs whose correspondence information is known. These subsets of corresponding points are employed to establish the alignment of the two latent low-dimensional manifolds, and Ham et al. [14] referred to points with correspondence information as "labelled" points. Among the semi-supervised manifold alignment methods, two approaches can be distinguished [38]: Approach I: Dimensionality reduction is followed by alignment [23]. Approach II: A joint manifold is created to represent the union of the given manifolds, and then the joint manifold is mapped to a lower-dimensional space [39].
For experimental comparison with the new PDAE, we employed the following three methods. Each is an example of one of the two general approaches above and can be applied at the feature level (feat) and at the instance level (inst) [39]:

Method 1 (MAPA), Manifold Alignment using Procrustes Analysis [23]: This method is a version of Approach I and in its second step applies Procrustes analysis to rescale and rotate manifold S_Y to align it with manifold S_X. If Locality Preserving Projections (LPP) [40] are used in the dimensionality reduction step, the result is feature-level manifold alignment, and we refer to the method as MAPA-feat in the following sections. If Laplacian eigenmaps [41] are used in the dimensionality reduction step to obtain instance-level alignment, we refer to the method as MAPA-inst.

Method 2 (MALG), Manifold Alignment preserving Local Geometry [39]: This method is a version of Approach II. First, a joint manifold Z is calculated using the graph Laplacians of the given manifolds. If, in the next step, eigenvalue decomposition of Z provides instance-level alignment, we refer to it as MALG-inst. If generalised eigenvalue decomposition of Z is used for feature-level alignment, we refer to the method as MALG-feat.

Method 3 (MAGG), Manifold Alignment preserving Global Geometry [39]: This method is a version of Approach II. A joint manifold Z is generated using the global distances of corresponding pairs in X ∪ Y. Eigenvalue decomposition of Z provides dimensionality reduction to obtain the aligned low-dimensional manifolds in the case of instance-level alignment (MAGG-inst). Generalised eigenvalue decomposition is used instead in the case of feature-level alignment (MAGG-feat).
In summary, this classification comprises six semi-supervised manifold alignment methods: MAPA-feat, MAPA-inst, MALG-feat, MALG-inst, MAGG-inst, and MAGG-feat. These will be used in the following for comparisons with the proposed PDAE.
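The Procrustes step of MAPA (rescaling and rotating S_Y onto S_X) can be sketched in a few lines of NumPy. The function below is a generic orthogonal Procrustes fit with isotropic scaling, not the authors' implementation; the function and variable names are our own:

```python
import numpy as np

def procrustes_align(sx, sy):
    """Rescale and rotate point set sy onto sx (orthogonal Procrustes).

    sx, sy: (k, d) arrays of corresponding low-dimensional points.
    Returns the transformed copy of sy, aligned to sx.
    """
    mx, my = sx.mean(axis=0), sy.mean(axis=0)
    a, b = sx - mx, sy - my
    # Optimal orthogonal transform from the SVD of the cross-covariance.
    u, s, vt = np.linalg.svd(b.T @ a)
    r = u @ vt
    # Optimal isotropic scale minimising ||a - c * b r||_F.
    scale = s.sum() / (b ** 2).sum()
    return scale * (b @ r) + mx
```

Applied to two embeddings that differ only by rotation, scaling and translation, this recovers an exact alignment; on real embeddings it minimises the residual in the least-squares sense.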

Parallel Deep Autoencoder
Let A : R^m → R^p and B : R^p → R^m be mappings representing the encoder and decoder parts, respectively, of an autoencoder that maps an input vector X ∈ R^m to an output X̂ = B(A(X)) of equal dimension, where the code A(X) at the code layer is p-dimensional with p < m. In a deep autoencoder, both mappings A and B comprise several layers [42].
The autoencoders of the present study use p = 3, so that the code layer activations provide a compressed representation of the input that can be visualised as a 3D point cloud, which can approximate a manifold, for example a 2D surface in R 3 .
In our neural network-based approach to manifold alignment, two deep autoencoders are run in parallel (Figure 1). The two encoders A_X and A_Y embed the two datasets X and Y in 3D coded space as S_X = A_X(X) and S_Y = A_Y(Y), respectively. Then, the decoders B_X and B_Y try to reconstruct the data to X̂ = B_X(A_X(X)) and Ŷ = B_Y(A_Y(Y)) in R^m. The two high-dimensional datasets X and Y include correspondence subsets X_c and Y_c, respectively, and are compressed to low-dimensional manifolds S_X and S_Y, where D_X and D_Y are the low-dimensional representations of X_c and Y_c. The minimisation of E_corr applies regularisation pressure to D_X and D_Y to align the low-dimensional manifolds S_X and S_Y.
If dataset X has n instances or points X_1, ..., X_n and each instance comprises m features, then X is represented as an (n × m)-matrix whose rows correspond to the instances. Training is conducted to minimise the reconstruction errors E(X, X̂) and E(Y, Ŷ). These two error functions are combined to obtain the total reconstruction error:

E_recon = E(X, X̂) + E(Y, Ŷ). (1)

A correspondence error between the two code layers is calculated to regulate the alignment of the compressed data. X_c ⊂ X and Y_c ⊂ Y are defined as the subsets of corresponding points, that is, selected pairs of points from both datasets that are in a one-to-one relationship and refer to the same state of the system. In the simulation experiments, X_c and Y_c comprised a uniform selection of v% of the data of X and Y, respectively. When the dimensionality of X and Y is reduced, X_c and Y_c are also mapped into 3D space, resulting in D_X = A_X(X_c) and D_Y = A_Y(Y_c). The error between the n_c corresponding points in D_X and D_Y is calculated as

E_corr = (1/n_c) ∑_{i=1}^{n_c} ||D_X(i) − D_Y(i)||², (2)

and we refer to it as the correspondence error. The aim of minimising E_corr is to align the manifolds underlying the data closely in coded space so that they can support each other in establishing the joint latent manifold that underlies both datasets. Finally, the total model error is defined as

E = λ E_recon + µ E_corr. (3)

Parameters λ and µ can be tuned to put more emphasis on the dimensionality reduction aspect or the alignment aspect of the model. The parameter setting λ = µ = 1 was sufficient to demonstrate the desired abilities of PDAE for our study; these weight parameters may require careful re-adjustment on other, more complex data. By minimising E_recon, the deep autoencoders try to learn the intrinsic manifolds in X and Y separately.
The simultaneous minimisation of E recon and E corr reduces the distance of the code layer outputs where a joint three-dimensional representation of the datasets is generated, that is the parallel model trained by minimising the total error E extracts the intrinsic manifolds at the code layer and simultaneously aligns them.
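The combination of the error terms can be sketched as follows. The mean-squared-error form of E(X, X̂) and the mean of squared code-layer distances for E_corr are our assumptions about details the text leaves open, and `total_error` is a hypothetical helper, not the authors' code:

```python
import numpy as np

def total_error(x, x_hat, y, y_hat, d_x, d_y, lam=1.0, mu=1.0):
    """Total PDAE training error E = lam * E_recon + mu * E_corr.

    x, y     : original datasets; x_hat, y_hat their reconstructions.
    d_x, d_y : code-layer embeddings of the corresponding subsets,
               row i of d_x corresponding to row i of d_y.
    """
    # Reconstruction error of both autoencoders (assumed to be MSE).
    e_recon = np.mean((x - x_hat) ** 2) + np.mean((y - y_hat) ** 2)
    # Correspondence error: mean squared distance between paired codes.
    e_corr = np.mean(np.sum((d_x - d_y) ** 2, axis=1))
    return lam * e_recon + mu * e_corr
```

With λ = µ = 1, as in the study, both terms contribute equally; increasing µ puts more pressure on aligning the two code-layer manifolds.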

Asymmetric PDAE
We also developed an asymmetric version of PDAE to perform manifold alignment in situations where the two datasets represent two different input modalities (Figure 2). Specifically, we considered the case where dataset X is a sample set of feature vectors and dataset Y comprises images. The framework is the same as in the symmetric case above, with the difference that one of the two autoencoders is replaced by a Convolutional Autoencoder (CAE): a Fully-connected Autoencoder (FAE) was used to reduce the dimensionality of X to S_X, and a CAE was used to reduce the dimensionality of the image dataset Y to S_Y. In our experiments, the code layer of both autoencoders had dimension three, and both datasets were assumed to be different high-dimensional representations of the same latent manifold. The input layer of the CAE had five kernels with a dimension equal to the pixel dimension of the input images in Y. In the encoder part, the CAE had multiple convolutional layers with max-pooling and dropout layers, and the decoder part mirrored this architecture. The errors E_recon, E_corr and E were calculated as in Equations (1)-(3), and the training gradient was applied to the full parameter set of the PDAE.

Figure 2. The left autoencoder is a fully-connected autoencoder, which takes feature vectors from X as input. The right autoencoder is a CAE, which takes images from Y as input. Both autoencoders have a fully-connected layer with three neurons as their code layer.

Performance Evaluation
The main tool for performance evaluation in the present study is the qualitative visual assessment of the resulting manifold visualisations. This is supported by a quantitative evaluation, which takes into account that method-associated scaling effects can affect the distances between corresponding points. To counteract this effect, these distances were normalised by the maximum Euclidean distance between points on the manifolds:

D_i = ||S_X(i) − S_Y(i)|| / max(diam(S_X), diam(S_Y)),

where S_X(i) and S_Y(i) are corresponding points for i = 1, ..., n and diam(S) denotes the maximum Euclidean distance between any two points of S. Note that in the present study, all datasets are constructed so that each point in X has a corresponding point in Y. This allows calculating the matching error ∆ = (∑_{i=1}^{n} D_i)/n as the average of the normalised Euclidean distances D_i, i = 1, ..., n. Smaller distances between corresponding points usually indicate closer alignments. Hence, ∆ reflects the proximity of two aligned manifolds in low-dimensional space, which we considered a measure of the quality of an alignment. In addition, we considered the standard deviation

σ = sqrt((1/n) ∑_{i=1}^{n} (D_i − ∆)²)

as a measure of the smoothness of an alignment. Note that when using more general datasets, other, more general performance measures such as maximum mean discrepancy, KL divergence, or correlation matrices would be required, as is common, for example, in the field of domain adaptation [43].
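The matching error ∆ and smoothness σ can be computed as sketched below. Normalising by the larger of the two manifold "diameters" (maximum pairwise distance) is our reading of the normalisation described above, and `matching_error` is a hypothetical helper:

```python
import numpy as np

def matching_error(s_x, s_y):
    """Alignment quality: mean and std of normalised correspondence distances.

    s_x, s_y: (n, 3) aligned manifolds with row-wise correspondence.
    """
    diff = np.linalg.norm(s_x - s_y, axis=1)

    def diameter(s):
        # Maximum pairwise Euclidean distance within one manifold.
        d = np.linalg.norm(s[:, None, :] - s[None, :, :], axis=-1)
        return d.max()

    scale = max(diameter(s_x), diameter(s_y))
    d_i = diff / scale
    return d_i.mean(), d_i.std()  # (Delta, sigma)
```

The pairwise-distance matrix makes this O(n²) in memory, which is acceptable at the dataset sizes used in the study (up to 20,736 points would need chunking).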

Experiments and Results
In the experiments, the proposed PDAE approach and the six traditional manifold alignment methods discussed above were compared on the task of extracting and aligning pairs of 1-, 2-, 3-, and 4-dimensional manifolds. A well-known benchmark dataset that contains protein structure manifolds in three dimensions [44] was used in experiments that resulted in the extraction and alignment of one-dimensional manifolds. The other datasets were new and generated specifically for our study in simulation. They were the main datasets of our study and comprised high-dimensional motion data representing the dynamics of two double pendulums with different degrees of freedom. These data were used for a series of manifold alignment experiments that resulted in the extraction and alignment of manifolds homeomorphic to S^1 × S^1, S^1 × S^2, and S^2 × S^2, which are 2-, 3-, and 4-manifolds, respectively. For testing the asymmetric PDAE, one of the S^1 × S^1 datasets was replaced by a corresponding image sequence of the pendulum motion. The traditional manifold alignment methods were executed in MATLAB 2016, and the autoencoders were developed, trained, and executed using TensorFlow 1.3.1 with Python 3.5.

1-Manifold Alignment
The dataset stemmed from the Protein Data Bank at Brookhaven National Laboratories [44]. It contains one-dimensional manifolds representing the structure of the glutaredoxin protein PDB-1G7O. The protein structure can be described in three-dimensional space as a chain of amino acids, that is, the structures are 1-manifolds in three-dimensional space. The dataset comprises 21 models of the glutaredoxin protein and provides 215 points in 3D for each model. Models 1 and 21 were previously used for method evaluation in articles on manifold alignment using Procrustes analysis [23] and manifold alignment preserving local geometry [39]. We followed their example and used the same models to test our PDAE. To examine the robustness of the methods, the dataset generated from Model 21 was scaled by a factor of four, as previous publications on manifold alignment did when testing the robustness of their methods [22,23,39,45,46]. Figure 3a shows the three-dimensional graphs of Models 1 and 21, where the x, y, and z coordinates of each model were the columns of the input data matrices X and Y, respectively.
Datasets X and Y were aligned by MAPA, MALG, MAGG, and PDAE. Each autoencoder of PDAE had nine fully-connected layers: the encoder architecture was 3-4-5-4, the code layer had three neurons, and the decoder architecture was 4-5-4-3, mirroring the encoder. The tanh function was applied to the output of each layer as the activation function. The network was trained for 10,000 epochs using the Adam optimiser with a learning rate of 0.001 and stopping condition E ≤ 0.0008.
The graphs of the resulting aligned manifolds in Figure 3b indicate that the outcome of MAPA-feat was not as robust as that of the other methods in Figure 3c-h. The manifolds aligned by the parallel autoencoders in Figure 3h were as well-aligned and as accurate as those of the MALG and MAGG methods from Approach II in Figure 3d-g. We calculated ∆ and σ of the aligned manifolds for each method. The results in Table 1 show that PDAE was the best performer. PDAEs were executed 10 times with 10 different sets of randomly-initialised weights. The standard deviation of ∆ was 0.009 for the basic autoencoder and 0.003 for the deep autoencoder. This showed that the selection of the initial weights did not have much impact on the autoencoder results.

Table 1. Alignment error ∆ of the methods shown in Figure 3. The standard deviation σ is provided in parentheses. PDAE performed the best.

Double Pendulum Datasets
We generated high-dimensional datasets by simulating the motion of double pendulums under three different conditions (Figure 4): 2D-2D, 2D-3D and 3D-3D. The two limbs of the double pendulum are denoted as u_1 and u_2, and the joints are denoted as J_1 and J_2. One end of u_1 is fixed at J_1, and the limb can rotate around this joint. The other end of u_1 is attached to u_2 at joint J_2, around which u_2 can rotate freely. The free end point e of the pendulum is referred to as the "end-effector" and has coordinates e_x, e_y in the two-dimensional case and e_x, e_y, e_z in the three-dimensional version, using a right-handed coordinate system with the origin at joint J_1. The data were generated in simulation for the 2D-2D, 2D-3D and 3D-3D cases as follows (Figure 4):

(i) 2D-2D motion: The pendulum has two Degrees-Of-Freedom (DOF), that is, both limbs u_1 and u_2 rotate in the two-dimensional (x-y)-plane, each of them describing a circle. In Figure 4a, θ_1 and θ_2 are the rotation angles of limbs u_1 and u_2 at joints J_1 and J_2, respectively. Accordingly, the manifold representing the dynamics of the 2D-2D case is the cross-product of two circles, S^1 × S^1, which is homeomorphic to the two-dimensional torus, that is, a 2-manifold.

(ii) 2D-3D motion: The pendulum has three DOFs, where limb u_2 can rotate on a two-dimensional sphere S^2 in three-dimensional space, while u_1 is restricted to rotate on a circle S^1 in a two-dimensional plane. That is, the manifold representing the dynamics of the 2D-3D case is homeomorphic to S^1 × S^2, which is a 3-manifold. As the pendulum moves in 3D space, the end-effector has the 3D coordinates e_x, e_y, e_z. In Figure 4b, θ_y and θ_z are the angles of u_2 with the y and z axes, respectively, and describe the motion on the sphere S^2. θ_1 is the angle between the x-axis and u_1 and describes the two-dimensional rotation of the sphere's centre in the (x-y)-plane.
(iii) 3D-3D motion: In this case, the pendulum has four DOFs, where both limbs can rotate on two-dimensional spheres in 3D space. In Figure 4c, θ_y and θ_z are the angles of u_1 with the y and z axes, respectively, and θ′_y and θ′_z are the angles of u_2 with the y and z axes, respectively. Accordingly, we expected the manifolds representing the dynamics of the 3D-3D case to be homeomorphic to S^2 × S^2, which is a 4-manifold.
For each case, two datasets, X and Y, were generated that represented the motion of two similar double pendulums differing only in their limb lengths and limb-length ratios:

Pendulum X: u_2/u_1 = 0.75/1.25 = 0.60
Pendulum Y: u_2/u_1 = 1.25/1.56 = 0.80

that is, we restricted the experiments to the case u_2 < u_1.
The feature vectors for each sample point (or instance) were calculated from the kinematics at the joints and the coordinates of the end-effector, where the end-effector coordinates were obtained using forward kinematics. The feature vectors for the 2D-2D, 2D-3D, and 3D-3D cases were defined as:

2D-2D: (e_x, e_y, cos θ_1, cos θ_2, sin θ_1, sin θ_2)
2D-3D: (e_x, e_y, e_z, cos θ_1, cos θ_y, cos θ_z, sin θ_1, sin θ_y, sin θ_z)
3D-3D: (e_x, e_y, e_z, cos θ_y, cos θ_z, cos θ′_y, cos θ′_z, sin θ_y, sin θ_z, sin θ′_y, sin θ′_z)

The data points for the 2D double pendulums were generated using the equations of motion and then sampled at angular increments of 10° in θ_1 and θ_2 at both joints. There were (360/10)² = 1296 instances and six features, resulting in a dataset of size 1296 × 6 for each of the pendulums X and Y.
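Generating the 2D-2D dataset can be sketched as follows. The forward kinematics below assume that θ_1 and θ_2 are both measured in the fixed world frame, and the limb lengths are those of pendulum X; both are assumptions about details the text leaves open:

```python
import numpy as np

# Limb lengths of "pendulum X" (u1 = 1.25, u2 = 0.75, so u2/u1 = 0.60).
u1, u2 = 1.25, 0.75

# Sample both joint angles at 10-degree increments: 36 x 36 = 1296 states.
angles = np.deg2rad(np.arange(0, 360, 10))
t1, t2 = np.meshgrid(angles, angles, indexing="ij")
t1, t2 = t1.ravel(), t2.ravel()

# Forward kinematics for the end-effector (world-frame angles assumed).
ex = u1 * np.cos(t1) + u2 * np.cos(t2)
ey = u1 * np.sin(t1) + u2 * np.sin(t2)

# Six-dimensional feature vectors (e_x, e_y, cos t1, cos t2, sin t1, sin t2).
X = np.stack([ex, ey, np.cos(t1), np.cos(t2), np.sin(t1), np.sin(t2)], axis=1)
print(X.shape)  # (1296, 6)
```

Repeating this with pendulum Y's limb lengths over the same angle grid yields the row-wise corresponding dataset Y.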
In the case of the 2D-3D motion, instances were sampled at angular increments of 30° in the three angles θ_1, θ_y and θ_z of the corresponding joints. As a result, the number of instances was (360/30)³ = 1728, and with the nine features, the size of each of the two datasets, X and Y, was 1728 × 9.
In the 3D-3D case, instances were sampled at angular increments of 30° in the rotational angles θ_y, θ_z, θ′_y and θ′_z at the corresponding joints. As a result, the number of instances was (360/30)⁴ = 20,736, and with the 11 features, the size of each of the two datasets X and Y was 20,736 × 11.

In order to challenge the robustness of the different alignment methods and to simulate potential real-world scenarios, two different types of noise were added in separate experiments to the clean datasets X and Y. The first type of noise, which we refer to as "actuator noise", was added to the joint angles to imitate the noise at actuator joints in a real-world system. The range of actuator noise was increased from 0° to [−10°, 10°] in steps of 2°. The second type of noise, which we refer to as "coordinate noise", was added to the end-effector coordinates. This noise could simulate, for example, the jittery motion of robot limbs. In the experiments, the coordinate noise range was increased from 0.0 to [−1.0, 1.0] in steps of size 0.2.
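The two noise types can be sketched as below. Sampling the noise uniformly over the stated ranges is an assumption, as the text specifies only the ranges, and both helper functions are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(42)

def add_actuator_noise(angles_deg, level_deg):
    """Uniform noise in [-level_deg, +level_deg] added to joint angles."""
    return angles_deg + rng.uniform(-level_deg, level_deg, size=angles_deg.shape)

def add_coordinate_noise(coords, level):
    """Uniform noise in [-level, +level] added to end-effector coordinates."""
    return coords + rng.uniform(-level, level, size=coords.shape)

# Noise schedules used in the experiments: 0..10 degrees in steps of 2
# for the actuators, and 0.0..1.0 in steps of 0.2 for the coordinates.
actuator_levels = range(0, 12, 2)
coordinate_levels = [round(0.2 * k, 1) for k in range(6)]
```

Noise is added before the feature vectors are computed (actuator noise) or directly to the coordinate features (coordinate noise), giving one noisy dataset pair per level.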

PDAE Architecture
The deep autoencoders used as part of PDAE had six neurons in the input layer for 2D-2D motion alignment, where the data matrix had six features. Similarly, the input layer had nine neurons for the 2D-3D data and 11 neurons for the 3D-3D motion. The number of neurons was then reduced by one in each consecutive hidden layer until it reached three at the code layer of the deep autoencoders. The decoder had the same layer architecture as the encoder, but in reverse order. In summary, the architectures of the deep autoencoders were:

2D-2D: 6-5-4-3-4-5-6
2D-3D: 9-8-7-6-5-4-3-4-5-6-7-8-9
3D-3D: 11-10-9-8-7-6-5-4-3-4-5-6-7-8-9-10-11

The weights of the autoencoders were randomly initialised within [−1, 1] using a normal distribution. To find the best-performing learning rate for the Adam optimiser, we plotted the network error for learning rates in the range [0.0001, 0.05] over 500 epochs; the best rate was 0.01 for the 2D-2D case and 0.001 for the 2D-3D and 3D-3D cases. The PDAEs were then trained for 10,000 epochs with a stopping criterion of E ≤ 0.001 and tanh as the activation function.
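All three architectures follow a single rule (shrink by one neuron per layer down to a three-neuron code, then mirror), which can be generated programmatically; `pdae_layer_sizes` is a hypothetical helper, not from the paper:

```python
def pdae_layer_sizes(n_features, code_dim=3):
    """Layer widths for one deep autoencoder of the PDAE.

    The width shrinks by one per layer from the input down to the
    code layer; the decoder mirrors the encoder.
    """
    encoder = list(range(n_features, code_dim - 1, -1))
    return encoder + encoder[-2::-1]

print(pdae_layer_sizes(6))   # [6, 5, 4, 3, 4, 5, 6]
print(pdae_layer_sizes(11))  # 11-10-...-3-...-10-11 (17 layers)
```

Such a helper would let the same PDAE-building code serve all three datasets by passing the feature count (6, 9 or 11).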

Results of 2-Manifold Alignment
In the case of the 2D-2D motion data, limb u_1 rotated around joint J_1 in a circle and limb u_2 rotated in another circle around joint J_2. In three-dimensional space, this motion can be represented as a torus S^1 × S^1, that is, a 2-manifold. In the first row of Figure 5, we can see that for zero noise, the visualisations of the results for MAPA-inst, MALG and PDAE resulted in objects that resembled the expected torus-shaped surfaces, while the other methods produced cylinder-like deformations of torus-shaped surfaces. With increasing levels of noise (only three representative levels are displayed), the instance-level alignments were less stable than the feature-level alignments. The addition of noise led all traditional methods to fail by collapsing the resulting manifolds or misaligning the two sets. Visually, the best outcomes were achieved by PDAE, where for all levels of noise, an object resembling a torus-like surface with minor deformations was obtained.

Figure 5. The expected result is a torus S^1 × S^1. However, the outcomes of MAPA, MALG and MAGG tend to collapse into a cylinder or, for noise ranges ≥ ±2°, misalign or otherwise disintegrate, particularly at the instance level. The only exception seems to be MALG-feat at the highest level of actuator noise. Otherwise, the graphs in the rightmost column demonstrate that of all methods tested, PDAE has the best ability to produce the expected torus-like manifold at all levels of noise.
The alignment errors ∆ in Table 2 indicate that MALG-feat produced the closest alignment among the conventional methods, but the ∆ of the autoencoder was lower than that of MALG-feat at higher levels of noise. Table 2 also shows that PDAE had the lowest standard deviation (in parentheses) at high levels of noise, which indicates that PDAE produced smoother alignments than the other methods. It should be noted that low values of ∆ or σ can also occur when a torus manifold cannot be established correctly and uniformly collapses or projects into a simpler form, for example, a cylinder in some cases of MALG-feat. We trained and tested PDAE with five different sets of initial random weights, and the standard deviation of the mean of the resulting alignment errors was about 0.0005 at zero noise. This indicates that the autoencoder results do not depend notably on the selection of the initial random weights.

Results of 3-Manifold Alignment
In the case of the 2D-3D motion data, limb u_1 rotated around joint J_1 in a circle and limb u_2 rotated on a two-dimensional sphere around joint J_2. In 3D space, this motion is represented by a manifold homeomorphic to S^1 × S^2 and can be visualised as a circle of spheres. For clearer visualisation of the quality of the alignments, we plotted only six of the aligned spheres, equally distributed along the circle, in Figure 6. The figures indicate visually (best if enlarged) that already for zero noise, the visualised results of all methods had alignment issues except those of PDAE, which produced a near-perfect alignment. The alignment errors ∆ and σ in Table 3 corroborate the visual assessment and show that PDAE was the best-performing method in the experiments using the 2D-3D data.

Results of 4-Manifold Alignment
The 11-dimensional data of the two 3D double pendulums were described in Section 5.2.1. Each of the two datasets X and Y was represented by a 20,736 × 11 data matrix. In the 3D pendulum motion, the rotation of limb u_2 described a sphere S^2, and the motion of the other limb u_1 described another sphere S^2, so that the 3D pendulum motion resembled S^2 × S^2, which is a 4-manifold. As this was too complex to visualise in full, we took snapshots of the motion around J_1 at 90° steps and of the motion around J_2 at 30° steps. This way, the rotation of u_2 resulted in six spherical shapes that were uniformly distributed on a bigger sphere, which represented the motion of u_1. The visualisations in Figure 7 show that the instance-level methods were not successful in aligning the high-dimensional nonlinear motion data. The manifolds of the instance-level alignments collapsed even without any noise. Only MAGG-feat and PDAE produced the expected visualisation, comprising six spheres that represented the snapshots we selected from the motion data on S^2 × S^2. The outcome of MAGG-feat was also supported by our case study [47].
Figure 7. Alignments of 3D-3D motion manifolds: Each graph visualises a different way of aligning the manifolds underlying datasets X and Y, which are collected from snapshots at 90° steps of u_1 and 30° steps of u_2. All manifolds of the instance-level methods collapsed or misaligned. The outcomes of MAGG-feat and the deep autoencoder show the expected result, that is, six spheres representing snapshots of the pendulum movements on a 4-manifold homeomorphic to S^2 × S^2.
For the remaining experiments, random noise was added to the data in several stages as described in Section 5.2.1. The noisy datasets were aligned using MAPA-feat, MALG-feat and PDAE. All manifolds collapsed after noise addition, and therefore, only visualisations for zero noise are included in Figure 7. ∆ and σ were calculated as described in Section 4. The numerical results in Table 4, together with the qualitative visual evaluations of Figure 7, showed that PDAE performed better than MAGG-feat and the other methods.

Table 4. 3D-3D manifold alignment: Shown is the alignment error ∆ with the standard deviation for each level of noise. The lowest ∆, or closest alignment, of each row is highlighted in bold.

The experiments were executed on the university's high-performance computing grid with 60 GB RAM using two parallel K80 GPUs. The huge speed advantage of inference with the autoencoder was representative of all our data and all comparative simulations we conducted (Table 6). However, these speed results can only be indicative; for precise benchmarking, a standalone high-performance machine or a specialised setup would be required.

Cross-Modality Manifold Alignment
The following pilot case study was included to demonstrate the cross-modality ability of the asymmetric PDAE concept. The 2D-2D kinematics dataset of double pendulum motion as described in Section 5.2.1 was used as dataset X and image frames of a simulated video of the same motion were used as dataset Y. Dataset X had 1296 instances, where each instance was a six-dimensional feature vector. Dataset Y had the same number of instances where each instance was an image with 128 × 128 pixels. The datasets were generated and ordered so that each instance X i of X had a corresponding instance Y i in Y obtained from the same joint angles.
In the asymmetric PDAE, the structure of the fully-connected autoencoder was similar to the autoencoder used to align the 2-manifolds described in Section 5.2.2. The structure of the Convolutional Autoencoder (CAE) is given in Table 5. The asymmetric PDAE was trained for 10,000 epochs using the Adam optimiser with a learning rate of 0.0001.
Manifold alignment experiments using the asymmetric PDAE were conducted in four different ways, each using a differently-sized correspondence subset for the calculation of E_corr. The sizes of the correspondence subsets in the four experiments were 10%, 30%, 50% and 100% of the total number of instances. The results using the four different correspondence subsets are shown in Figure 8. The figures show that the more correspondence pairs were used, the better the alignment.
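Selecting a uniformly spaced correspondence subset of a given size can be sketched as follows; `correspondence_indices` is a hypothetical helper, and evenly spaced indices are our interpretation of the "uniform selection" described earlier:

```python
import numpy as np

def correspondence_indices(n, fraction):
    """Uniformly spaced indices selecting `fraction` of n corresponding pairs.

    With fraction in {0.1, 0.3, 0.5, 1.0}, this reproduces the four
    correspondence-subset sizes used in the experiments.
    """
    k = max(1, int(round(n * fraction)))
    return np.linspace(0, n - 1, k).round().astype(int)

idx = correspondence_indices(1296, 0.10)
# Rows X[idx] and Y[idx] would then enter the correspondence error E_corr.
```

Because the pendulum datasets are ordered by joint angle, evenly spaced indices also spread the correspondence pairs evenly over the underlying manifold.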

Discussion and Conclusions
While previous research had shown that deep autoencoders are capable of manifold learning, this study introduced a parallel deep autoencoder model for manifold alignment. The present study focused on semi-supervised approaches where parts of the data were labelled by correspondence information. Future studies may address the general case of data without correspondence information and a wider range of datasets.
In the new parallel model, two deep autoencoder networks were trained in parallel to minimise the sum of their reconstruction errors and a correspondence error. The minimisation of the reconstruction errors led the deep autoencoders to extract manifolds intrinsic to the datasets. The minimisation of the correspondence error aligned the extracted manifolds at the code layer. The activations at the 3D code layers allowed visualising the aligned manifolds or sections of them in three dimensions.
First, PDAE was evaluated on a well-known protein structure benchmark dataset, where it performed comparably to traditional manifold alignment methods.
Then, in the main part of our study, PDAE was tested on a suite of new 6-dimensional, 9-dimensional, and 11-dimensional datasets that we generated by simulating the non-linear dynamics of two double pendulums and adding various levels of noise. It should be noted that manifold learning and manifold alignment can be computationally very expensive (Table 6). Hence, the experiments of our comparative study were restricted to two pendulums with similar arm-length proportions. With this data, the expected resulting manifolds were of sufficient complexity to challenge the methods, but could still be visualised either in full as two-dimensional tori in three-dimensional space ( Figure 5) or using snapshots or sections of the resulting 3-manifolds ( Figure 6) or 4-manifolds ( Figure 7).
Interestingly, the visualisation of the aligned manifold obtained from the two three-dimensional pendulum motion datasets on the right in Figure 7, together with the numerical evaluation in Table 4, indicated that, even in this unprecedented case of representing a latent 4-manifold, PDAE produced the expected outcome and performed better than all other tested methods. In fact, all methods failed except PDAE and MAGG-feat. The quantitative performance evaluation in Table 4 shows that the alignment using PDAE was closer and more stable than that of MAGG. Moreover, due to the involvement of high-dimensional matrix multiplication, MAGG required a significantly higher execution time than the other methods (Table 6).

Table 6 summarises the execution times for all methods used in our study when applied to the clean and complete versions of our simulated datasets. While training the PDAE took a long time, inference, that is, the execution time using the trained neural model, was significantly faster than running the other methods. It is important to note that if new data points are included in the dataset, the trained PDAE model can still be executed, while the other methods require recalculation.

Table 6. Shown are the execution times of the different manifold alignment methods when processing our data. The dataset sizes were 1296 × 6, 1728 × 9, and 20,736 × 11 for the 2-, 3- and 4-manifold data, respectively. The training times of the PDAE were recorded for 10,000 epochs and averaged over five runs starting from different initial weights. Standard deviations are in parentheses. The other methods did not involve randomness, and their execution times remained the same in repeat experiments.

MAGG calculates the distances between the two datasets at the input, which must therefore have a common fixed dimension. In contrast, PDAE can be designed to take two datasets of different dimensions as inputs.
For example, in the experiments in Section 5.3, PDAE demonstrated successful alignment of data of two different modalities and of different dimensions.

In summary, the results of the reported manifold alignment experiments showed that PDAE performed competitively and in most cases better than the traditional methods MAPA, MALG and MAGG. When synthetic random noise was added to the data at the actuator and the end-effector, the alignment using the new PDAE was often still possible and more stable than that of all other methods that were used for comparison.
The addition of noise is a critical contribution in this type of study using manifolds of dimension larger than one, as in traditional manifold learning, there is a substantial risk that the topology of the resulting manifolds cannot be established or collapses if there are not enough data or if there is any disturbance or noise in the data [11,12].
It is due to these issues that traditional manifold alignment struggled to become popular in applications and that the present study had to confine itself to simulated data to achieve expressive comparative results. Nonetheless, the torus surface S^1 × S^1, the 3-manifold S^1 × S^2 and the 4-manifold S^2 × S^2 are topologically more complex than the manifolds underlying the data of most previous studies on manifold alignment. The experiments of our study showed that the concept of PDAE allowed manifold alignment to result in manifolds of non-trivial topology.
We hope that the new PDAE model, with its fast inference times, its robustness to noise and its ability to process datasets of different dimensions and to extract manifolds of 2, 3, 4 or more dimensions, will open new opportunities for applications of semi-supervised manifold alignment.

Funding: Some of the computational (and/or storage) resources used in this work were enabled by Intersect Australia Limited and partially subsidised by funding from the Australian Research Council through ARC LIEF Grants LE160100002 and LE170100032.