GANs-Based Intracoronary Optical Coherence Tomography Image Augmentation for Improved Plaques Characterization Using Deep Neural Networks

Abstract: Data augmentation using generative adversarial networks (GANs) is vital for creating new instances in imaging modality tasks to improve deep learning classification. In this study, conditional generative adversarial networks (cGANs) were used for the first time on a dataset of optical coherence tomography (OCT)-acquired images of coronary arterial plaques for synthetic data creation, and the results were further validated using a deep learning architecture. A new OCT image dataset of 51 patients, labelled by three professionals, was created. We used cGANs to synthetically populate the coronary arterial plaques dataset by factors of 5×, 10×, 50× and 100× from a limited original dataset to enhance its volume and diversity. The loss functions for the generator and the discriminator were set up to generate synthetic images that closely resemble the originals. The augmented OCT dataset was then used in the training phase of the AlexNet architecture. We used cGANs to create synthetic images and examined the impact of the ratio of real data to synthetic data on classification accuracy. Our experiments showed that augmenting real images with synthetic images by a factor of 50× during training improved the test accuracy of the classification architecture for label prediction by 15.8%. Further, we assessed training time against the number of iterations to identify the optimum time efficiency. Automated plaque detection using our proposed class-conditioning GAN architecture was found to conform with clinical results.


Introduction
Deep neural networks (DNNs) have emerged as a promising solution in medical imaging classification tasks. Deep learning algorithms model high-level abstractions using large neural networks with multiple layers of processing units. These networks use improved training techniques to uncover complex data patterns in big data problems. DNNs learn features incrementally and build feature sets independently, and thus do not require manual feature engineering. Broadly, they are classified as multi-layer perceptrons (MLPs), recurrent neural networks (RNNs) and convolutional neural networks (CNNs) [1]. Generally, CNNs are preferred over other architectures for medical imaging problems and classification purposes.
Data augmentation (DA) techniques assist with the reliable characterization of limited medical image datasets processed through DNNs [2,3]. Limited training data in deep models can be handled by applying DA techniques to scale up the dataset in terms of size and diversity [4,5]. To reduce the generalization error of the classification model, augmented data should focus primarily on generic features to mitigate overfitting [6,7]. Aggregating different augmentations yields substantial data augmentation, but leads to overfitting for limited medical imaging datasets, including OCT-acquired coronary plaque images [8,9]. A new classification algorithm suitable for real-time clinical assessment of coronary arterial plaques has been reported [10,11]; however, the real impact of synthetic data creation has not been thoroughly elaborated. Feature space augmentation is generally not preferred for medical imaging datasets due to interpretation, time and space complexities. Using GANs, we can synthetically generate new medical images, including OCT images, from limited available samples by setting up the generator and the discriminator in competition. This helps resolve the prevalent issues of data volume, diversification and class balance for coronary plaque images acquired through OCT [12]. The availability of public datasets has allowed clinical professionals to draw inter-comparisons between different GANs [13,14]. Standard GANs are being replaced by different variants to achieve better classification results and parameter optimization [15]. These include WGAN, Super GAN, LS-GAN, Bi-GAN and StyleGAN for medical imaging datasets of Alzheimer's and brain and liver tumors, and high-resolution synthesis of sclerosis [16][17][18][19][20][21][22]. Generally, cGANs are favored for creating synthetic images related to cardiac imaging modalities, as they leverage additional information in the form of labelled data [23][24][25][26][27][28][29]. Further, they avoid mode collapse [30] and result in faster convergence by conditioning both the discriminator and the generator on class labels.
To the best of the authors' knowledge, cGANs have not been exploited for the augmentation of coronary images acquired via OCT imaging for the classification of coronary arterial plaques. The aim of this study was to perform data augmentation on a newly created OCT dataset using cGANs. The created images were then appended to the original OCT dataset and validated using the AlexNet classification architecture. Further, we aimed to establish the impact of augmentation on the deep learning classification of coronary arterial plaques. To achieve this, the objective functions of the generator and the discriminator were set in competition against each other. Optimization by gradient descent drove the creation of new instances that closely replicated the original images. Transfer learning was exploited for training purposes: we froze the initial layers of our model and fine-tuned the remaining layers. To establish ground truth, three professional clinicians determined the plaque type present in an A-scan. We examined the impact of the ratio of real data to synthetic data on classification accuracy. Through experiments, we showed how populating real images with created instances during the training phase increased our confidence in the reliable label prediction of coronary plaques. Using cGAN-created images and varying the ratio between created images and real data within each sub-batch resulted in up to a 15.8% improvement in classification accuracy.

Methods
A new dataset of 88 arterial stenoses in 51 patients was developed using a commercially available OCT system. Our target vessels were those with stenosis. However, for the sake of simplicity, serial stenosis and bypass graft stenosis were not taken into consideration. The study was approved by Galway's Clinical Research Ethics Committee (GCREC) with informed consent from the patients. The general flow of the algorithm implemented in the study is presented in Figure 1.
All three clinicians individually labelled the OCT-acquired images, and each final label was decided by mutual consensus. We set up the classification problem by setting each specific label against the rest. Preprocessing steps were applied to the raw OCT-acquired images before feeding them into our model. Vulnerable plaques that met the established fibrous cap thickness criterion were excluded from this study. Measuring the signal intensity relative to the lumen helped characterize plaques as either lipid (low signal intensity) or calcified (high signal intensity). In the original dataset, 27% of images were labeled as "calcified", 23% as "lipid plaque", 21% as "mixed plaque" and 29% as "no plaque".
Figure 1. Execution flow of the phases performed to classify different coronary arterial plaques using cGANs.

Medical Images Augmentation Using GANs
The conditional GAN transformed data from a lower- to a higher-dimensional image space using a random noise vector [31,32]. A vector of features derived from an OCT image was added along with the noise input; the discriminator's task was to assess the similarity between the real and fake images and to assign fake images to their input labels. Conditioning was accomplished by feeding a one-hot vector into the input of the generator [33]. A high-level view of our cGAN is presented in Figure 2. The generator and the discriminator operated with multiple convolution layers, batch normalization layers and ReLU functions. The class encoding was a one-hot vector with a length equal to the number of classes (4 in our case), in which the position of the nth class was one and the rest were set to zero. In our cGAN network, we passed in OCT images along with the class information to which each image belonged. We approximated an unknown data distribution through a generator that attempted to fit samples to a known prior distribution. The generator (G) competed against the discriminator (D) to reach the desired convergence point.
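The class conditioning described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the class ordering, the noise dimension of 100, and the choice to concatenate the one-hot vector with the noise vector are all assumptions made for the example.

```python
import random

# Hypothetical class order; the paper names four classes but not their indices.
CLASSES = ["no plaque", "calcified", "lipid plaque", "mixed plaque"]


def one_hot(class_index, num_classes=len(CLASSES)):
    """Return a vector with 1.0 at class_index and 0.0 elsewhere."""
    if not 0 <= class_index < num_classes:
        raise ValueError("class_index out of range")
    return [1.0 if i == class_index else 0.0 for i in range(num_classes)]


def generator_input(class_index, noise_dim=100, rng=None):
    """Build a conditioned generator input: Gaussian noise followed by the label.

    noise_dim=100 and concatenation are assumptions for illustration only.
    """
    rng = rng or random.Random()
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return noise + one_hot(class_index)


z = generator_input(CLASSES.index("lipid plaque"), rng=random.Random(0))
print(len(z), z[-4:])  # total length, then the trailing one-hot label
```

Because the label travels with every noise sample, the same generator can be asked for any of the four classes by changing only the class index.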

The generator created new instances using the algorithm presented in Figure 3. The discriminator's task was to differentiate between the real and the real-like created instances, as illustrated in Figure 2. If a cGAN-produced image was judged indistinguishable from the original images, it was appended to the corresponding labelled class of images.
Our discriminator assigned the incoming real and fake images to their specific classes. The loss function for the discriminator was the sum of the fake-image loss and the real-image loss. Our aim was to minimize the error in classifying the real images of the dataset and the generated fake images [34,35].
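The "sum of fake and real image loss" can be illustrated with a small pure-Python sketch. It assumes binary cross-entropy on the discriminator's real/fake output, with real images targeted at 1 and fake images at 0; the probabilities used below are made-up values, not results from the study.

```python
import math


def binary_cross_entropy(prob, target, eps=1e-12):
    """Cross-entropy between a predicted probability and a 0/1 target."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def discriminator_loss(d_real, d_fake):
    """Sum of the mean loss on real images (target 1) and fake images (target 0)."""
    real_loss = sum(binary_cross_entropy(p, 1.0) for p in d_real) / len(d_real)
    fake_loss = sum(binary_cross_entropy(p, 0.0) for p in d_fake) / len(d_fake)
    return real_loss + fake_loss


# Hypothetical discriminator outputs for a batch of real and a batch of fake images.
confident = discriminator_loss(d_real=[0.9, 0.8], d_fake=[0.1, 0.2])
undecided = discriminator_loss(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])
print(confident < undecided)  # True
```

A discriminator that cannot tell real from fake outputs 0.5 everywhere and pays 2·ln 2 in total, which is the level the loss approaches when the generator's samples become hard to distinguish.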
The discriminator's training flow is presented in Figure 4. In this figure, G is the generator, D represents the discriminator, X̂ indicates the fake images conditioned on Ŷ, Y is the one-hot label for real images and Ŷ is the one-hot label for fake images. Random noise was fed to the generator along with the class-encoded one-hot label Ŷ to produce fake images conditioned on Ŷ. In this cycle, gradients were not passed back to the generator. The generator training process was similar to the discriminator training cycle, as illustrated in Figure 4. Both the discriminator and the generator became proficient enough for the generated data to be close to the original data. During this training phase, the discriminator's gradients were passed on to the generator.
Generator and discriminator loss functions are already reported in the literature [36,37] and are expressed in Equations (1) and (2).
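Equations (1) and (2) are not reproduced in this text. For orientation only, a standard conditional GAN formulation from the literature is shown here; it may differ in notation or detail from the authors' equations. In this sketch, x is a real image, y its class label and z the noise vector (all symbols are assumptions), the discriminator loss sums a real-image term and a fake-image term, and the generator loss falls as the discriminator is fooled:

```latex
\begin{aligned}
\mathcal{L}_D &= -\,\mathbb{E}_{x,y}\big[\log D(x \mid y)\big]
                 - \mathbb{E}_{z,y}\big[\log\big(1 - D(G(z \mid y) \mid y)\big)\big] \\
\mathcal{L}_G &= \mathbb{E}_{z,y}\big[\log\big(1 - D(G(z \mid y) \mid y)\big)\big]
\end{aligned}
```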
As illustrated in Equations (1) and (2), we were dealing with two neural networks, in which the generator began with a random data distribution and attempted to produce close replicas of the real images. The discriminator network improved at distinguishing between real and generated images through successive training iterations. The two networks worked against each other to generate close replicas of the original images.
Figure 5 represents the AlexNet CNN architecture, in which each block signifies the input and output characteristics [38]. The AlexNet architecture consists of eight layers: five convolutional and three fully connected layers. AlexNet was preferred to validate our synthetically created images due to its competitive edge in training the cGANs. Dropout was applied to the first and second fully connected layers. The input image dimensions were 227 × 227 × 3. For down-sampling, a max-pooling operation was implemented with a stride of two between adjacent frames. In the dropout phase, each neuron was disregarded with a probability of 1/2, meaning that a dropped neuron did not contribute to forward propagation or to the back-propagated loss. This ensured that every iteration captured a diverse sample of the model's parameters without overfitting.
To achieve transfer learning, we detached the last layer and used AlexNet as our pre-trained model. For the second pass of the training cycle, we released the 4th and 5th convolutional layers while keeping the first three layers frozen. Fine-tuning was completed by removing the fully connected nodes and inserting new layers. We optimized our classification predictions via the cross-entropy loss between the true label distribution and the predicted label distribution.
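The freezing schedule described above can be written down as a small lookup, shown here as a framework-free sketch. The layer names and the `new_head` placeholder are hypothetical, and the behaviour of the first pass (all pretrained convolutional layers frozen) is an assumption; only the second pass is stated in the text.

```python
# Layer names follow the usual AlexNet layout of five convolutional layers.
# "new_head" is a hypothetical name for the layers inserted in place of the
# removed fully connected layers.
CONV_LAYERS = ["conv1", "conv2", "conv3", "conv4", "conv5"]
NEW_LAYERS = ["new_head"]


def trainable_layers(training_pass):
    """Return {layer_name: is_trainable} for a given training pass.

    Pass 1 (assumed): all pretrained convolutional layers frozen, new layers trained.
    Pass 2 (from the text): conv1-conv3 frozen, conv4 and conv5 released.
    """
    if training_pass == 1:
        released = set()
    elif training_pass == 2:
        released = {"conv4", "conv5"}
    else:
        raise ValueError("training_pass must be 1 or 2")
    plan = {name: name in released for name in CONV_LAYERS}
    plan.update({name: True for name in NEW_LAYERS})
    return plan


for training_pass in (1, 2):
    plan = trainable_layers(training_pass)
    frozen = [name for name, trainable in plan.items() if not trainable]
    print(f"pass {training_pass}: frozen = {frozen}")
```

In a framework such as Keras, a plan like this would typically be applied by setting each layer's `trainable` attribute before compiling the model for the corresponding pass.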

Results and Discussion
We performed simulations on the newly created OCT dataset, which contained 146 original images belonging to four classes that were then augmented using the cGAN with multiplication factors of 5×, 10×, 50× and 100×. Here, × refers to the images acquired via OCT, and the factor indicates the multiple of images generated through augmentation. We used a 70:30 split for training and testing purposes. We used TensorFlow (v2.10.0), an open-source framework, to build and implement the model. We used CUDA to accelerate training with a batch size of 64. For model training, a momentum of 0.9 was used with an initial learning rate of $L_r = 10^{-4}$. This initial value of momentum helped accelerate training and converge the optimization cycle at the end of training. The learning-rate drop factor was 0.1 with a learning-rate drop period of five. The Adam optimizer was used to update the network weights. Our model distinguished the coronary arterial plaques in a time-competitive fashion. The results were evaluated on the testing dataset, whereas the hyper-parameters were tuned using the validation set. Figure 6 exhibits GAN-generated sample images of our four classes: normal, calcified, lipid and mixed arterial plaques.
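The learning-rate settings above imply a step-wise decay, sketched below in plain Python. The unit of the drop period is not stated in the text and is assumed here to be epochs; the split helper applies the 70:30 ratio to the 146 original images purely as an example, with rounding to the nearest image also an assumption.

```python
INITIAL_LR = 1e-4   # initial learning rate, from the text
DROP_FACTOR = 0.1   # learning-rate drop factor, from the text
DROP_PERIOD = 5     # drop period, from the text; the unit (epochs) is an assumption


def learning_rate(epoch):
    """Piecewise-constant schedule: multiply by DROP_FACTOR every DROP_PERIOD epochs."""
    return INITIAL_LR * DROP_FACTOR ** (epoch // DROP_PERIOD)


def split_counts(num_images, train_fraction=0.7):
    """Image counts for a train/test split, rounded to the nearest image."""
    num_train = round(num_images * train_fraction)
    return num_train, num_images - num_train


for epoch in (0, 4, 5, 10):
    print(epoch, f"{learning_rate(epoch):.0e}")

print(split_counts(146))  # (102, 44)
```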
Next, cGAN-generated images were validated using the AlexNet architecture, in which each layer performed multi-level and local feature extraction. The composition of our fully connected layer helped reduce the dimensionality of the training parameters. During the discriminator's training, the same one-hot label was transformed into a tensor and combined with a fake image as input. The discriminator also received real images along with their respective one-hot label tensors. The discriminator determined whether the input was a real image based on binary cross-entropy loss. We minimized the cross-entropy loss L in our algorithm using Equation (3) for multiclass classification. As we already had the target probability distribution for an input class label, our aim was to predict the target distribution with reasonable confidence.
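Equation (3) does not survive in this text. Assuming it is the usual multiclass cross-entropy, which is consistent with the symbol definitions that follow, it would read as below; the class index i is an assumption introduced for the sum.

```latex
L = -\sum_{i=1}^{m} y_i \log\left(Y_i\right) \tag{3}
```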
where m denotes the number of classes, y is the ground truth label and Y represents the softmax normalized model prediction.
We minimized cross-entropy across our training dataset by averaging the cross-entropy over all training images. Our classification problem had four classes; any sample belonged to one of these four classes. The discrete probability distribution had a value of one for the class to which a sample belonged, and zero for the remaining classes.
Table 1 reports the validation accuracy with and without the augmented data. Without synthetic data, the classification accuracy of our model was 82.9%; it reached 98.7% after the synthetic data were merged with the original data. This reduced overfitting and improved the model's generality. The impact of the number of iterations on the validation accuracy is also highlighted in Table 1.
The relationship between the iterations and training time is exhibited in Figure 7 for different augmentation scales. We performed experiments for 5×, 10×, 50× and 100× and recorded the training time in minutes for each set of experiments. These multiplicative factors were chosen for simplicity and by observing the diversity and imbalance present in our original dataset. To observe the pronounced effect of augmentation, we scaled to 50× and 100×. This helped determine the extent to which data could be augmented while still enhancing our classification accuracy. It can be inferred from Figure 7 that 50× was the optimum augmentation scheme for our created dataset. This is attributed to the fact that our original OCT dataset was limited in size; therefore, 5× and 10× did not significantly improve the diversification of the samples. Similarly, 100× led to overfitting of our sample space and added a computational burden. An interesting point that can be deduced from the results presented in Figure 7 is that the increase in the training data was not exactly proportional to the increase in performance. Once a specific threshold was reached, no further improvement in performance was observed. This is attributed to the fact that the diversity of the data could not be further improved.
We further examined the effect of the augmented-to-real ratio on the increase in the classifier's performance, as illustrated in Figure 8. The dotted line indicates the non-augmented baseline. As we increased the augmented-to-real ratio, the dataset grew in size and diversity, allowing the classifier to train more rigorously. This diversification of samples improved the classifier's ability to ascertain the labels correctly. The classifier's accuracy increased until our augmented-to-real ratio reached 50, as indicated in Figure 8. It then flattened due to similar and repeated samples being created. This filled the dataset, but could not further boost the generality of the model. The overall increase in the classifier's performance due to augmented data was 15.8%. In Figure 8, the height of the bar indicates variance and the outer whiskers mark the maximums and minimums.
Figure 8. Impact of augmented-to-real data ratio on the classifier's performance.
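To make the augmented-to-real ratios concrete, the short sketch below tabulates the dataset sizes they imply. It assumes that each factor is the augmented-to-real ratio applied to all 146 original images and that synthetic images are added on top of the real ones; the text does not state whether augmentation was applied before or after the train/test split. The last lines check that the reported 15.8% figure equals the difference between the two accuracies given in the text.

```python
NUM_REAL = 146               # original OCT images, from the text
FACTORS = (5, 10, 50, 100)   # augmentation factors, from the text


def dataset_sizes(num_real, factor):
    """Return (synthetic, total), treating the factor as the augmented-to-real ratio."""
    synthetic = num_real * factor
    return synthetic, synthetic + num_real


for factor in FACTORS:
    synthetic, total = dataset_sizes(NUM_REAL, factor)
    print(f"{factor:>3}x: {synthetic:>6} synthetic, {total:>6} total")

# Accuracy reported in the text: 82.9% without and 98.7% with synthetic data.
gain = round(98.7 - 82.9, 1)
print(gain)  # 15.8
```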
Fundamental augmentation methods generate synthetic medical images relatively easily, whereas DNNs employ cross domains to create new images that achieve more diversity in the creation of new data. The choice of augmentation technique is contingent upon the available dataset and the ultimate objective involved. To the best of the authors' knowledge, there is no published research in which augmentation has been applied directly to coronary arterial plaques data. For the sake of assessment, our proposed augmentation technique was compared with results on other datasets, as illustrated in Table 2. In Table 2, we present the dataset length, the type of dataset and the accuracies achieved both without and with augmentation techniques applied. Only fundamental data creation techniques, including random cropping, distortion, blurring and random erasing, were performed on certain datasets indicated in Table 2. However, Table 2 also includes more sophisticated data creation techniques, such as the one we proposed, to validate the efficacy of artificial data creation via deep-learning-coupled augmentation.

Conclusions and Future Work
We presented an in-depth consideration of using cGANs to generate synthetic images of coronary arterial plaques. Data augmentation proved particularly advantageous for avoiding overfitting in our case, and it is likely to help in other scenarios where limited training data are available. The observed increase in classification accuracy upholds the importance of exploiting cGANs for data augmentation in medical imaging datasets, including ours. However, the ratio of synthetic data to real data used to populate the real OCT dataset was found to be crucial in terms of model overfitting, time competitiveness and confident assessment of the model's performance.
Building a deep learning algorithm for real-time clinical assessments with data privacy intact and aggregating cGANs with other augmentation meta-learning architectures, such as neural style transfers, are imperative areas for future work. We also envisage the possibility of increasing GANs' training speed via concurrent networks for large-scale medical imaging datasets. Furthermore, the practical coupling of data augmentation algorithms into software development tools and the optimization of applications also offer blue sky research avenues to unleash the real potential of data augmentation in automated medical imaging.

Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available due to the basic character of the research.