Style Transfer and Topological Feature Analysis of Text-Based CAPTCHA via Generative Adversarial Networks

Xue, Tao; Guo, Zixuan; Yin, Zehang; Rong, Yu

doi:10.3390/math13111861


¹ School of Information Engineering, China University of Geosciences (Beijing), Beijing 100083, China

² Frontiers Science Center for Deep-Time Digital Earth, China University of Geosciences (Beijing), Beijing 100083, China

³ Faculty of Science, The University of Sydney, Sydney, NSW 2006, Australia

* Author to whom correspondence should be addressed.

Mathematics 2025, 13(11), 1861; https://doi.org/10.3390/math13111861

Submission received: 28 April 2025 / Revised: 26 May 2025 / Accepted: 31 May 2025 / Published: 2 June 2025

(This article belongs to the Special Issue Topological Analysis and Computation of Chemical Graphs and Physical Networks)


Abstract

The design and cracking of text-based CAPTCHAs are important topics in computer security. This study proposes a method for the style transfer of text-based CAPTCHAs using Generative Adversarial Networks (GANs). First, a curated dataset was assembled by combining a text-based CAPTCHA library with image collections in four artistic styles (Van Gogh, Monet, Cézanne, and Ukiyo-e), which served as the basis for generating style-based text CAPTCHA samples. Subsequently, a universal style transfer model and trained CycleGAN models for single and double style transfers (once-transfer and twice-transfer) were employed to generate style-enhanced text-based CAPTCHAs. Traditional methods for evaluating the anti-recognition capability of text-based CAPTCHAs primarily focus on recognition success rates. This study introduces topological feature analysis as a new method for evaluating text-based CAPTCHAs. Initially, the recognition success rates of the three methods across four styles were evaluated using Muggle-OCR. Subsequently, the graph diameter was employed to quantify the differences between text-based CAPTCHA images before and after style transfer. The experimental results demonstrate that the recognition rates of style-enhanced text-based CAPTCHAs are consistently lower than those of the original CAPTCHA, suggesting that style transfer enhances anti-recognition capability. Topological feature analysis indicates that style transfer results in a more compact topological structure, further validating the effectiveness of the GAN-based twice-transfer method in enhancing CAPTCHA complexity and anti-recognition capability.

Keywords:

generative adversarial networks; text-based captcha; style transfer; topological feature analysis; graph; diameter; anti-recognition capability

MSC:

68T45

1. Introduction

CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart [1]. As a fundamental component of human–computer interaction interfaces, CAPTCHA is vital for identity verification and risk mitigation. It plays an indispensable role in online services, such as e-commerce, online ticketing, and email communication [2], where users are often required to submit sensitive personal information, including identity and payment details [3]. Inadequate website security exposes sensitive information to potential attacks, leading to privacy breaches and financial loss [4]. Attacks on website systems can result in the leakage of corporate secrets and disrupt business operations [5]. Among the top 50 websites in the Alexa global ranking, 80% employed CAPTCHA during login, registration, repeated incorrect password attempts, and password-recovery processes to counter automated attacks. Notably, 60% of these websites, including major platforms such as Microsoft, Baidu, and eBay, use text-based CAPTCHA [6]. Text-based CAPTCHA has been widely adopted because of its simple interaction, large password space, and strong adaptability across various scenarios [7]. Therefore, investigating the security of text-based CAPTCHA is essential for enhancing traditional CAPTCHA-generation techniques.

Over the past few years, techniques for cracking text-based CAPTCHA have advanced significantly. Initially, attackers relied on analyzing specific CAPTCHA-generation rules and random seeds, but they now employ Optical Character Recognition (OCR) and deep neural networks [8] to break these systems. Consequently, CAPTCHA attackers utilize an ever-growing array of methods with increasing investments in data and computational resources, leading to higher attack success rates. In addition to the evolution of text-based CAPTCHA cracking techniques, anti-recognition technologies have undergone continuous iterations and enhancements. These advancements have introduced increasingly complex distortions, including character warping, stretching, random interference lines, and intricate background noise, which have progressively bolstered CAPTCHA security [9]. However, the inclusion of these disturbances often makes it difficult for human users to accurately identify characters within CAPTCHA images, thereby diminishing the overall user experience. Given the aforementioned challenges, it is essential to design a CAPTCHA that not only effectively reduces the recognition rate of automated systems but also ensures that users can reliably identify them. One potential solution is to preserve character outlines in CAPTCHA images while altering their stylistic appearances. This approach lowers the recognition rate of automated systems without compromising the user experience [10,11].

Style transfer is the transformation of an image from one domain into another. Specifically, it involves providing a style image and converting any given image to reflect that style while preserving as much of the original content as possible [12]. The earliest methods of image style transfer were based on statistical image techniques, which exploited the statistical differences between the input and target style images to achieve the transfer. A prominent example is the ‘neural style’ method introduced by Gatys et al. [13]. However, each transfer in this method requires multiple iterations on a noisy image, which results in a substantial computational burden [14]. The next phase of development focused on style transfer based on convolutional neural networks (CNNs), leveraging deep learning models for efficient feature extraction and classification. A prominent example is the fast style transfer method introduced by Johnson et al. [15], which uses a pretrained CNN model to rapidly convert an input image into a target style. Currently, the most widely adopted approach is style transfer based on generative adversarial networks (GANs), a notable example being CycleGAN, proposed by Zhu et al. [16]. This method enables style transfer between two distinct domains, produces more realistic and effective style-transferred images, and is increasingly being applied in the style transfer field [14].

To enhance the anti-recognition capability and usability of text CAPTCHAs, this study proposes a method for style transfer of text CAPTCHAs using CycleGAN. Furthermore, a topological feature analysis approach was employed to thoroughly investigate the transferred CAPTCHA images, offering a novel perspective on CAPTCHA security evaluation. Topological feature analysis emphasizes the global structure and preserves key information from the original dataset to a significant extent. In contrast to traditional feature extraction methods, features identified through topological analysis remain unaffected by minor perturbations in data values and can reveal characteristics that traditional methods may overlook [17]. Initially, the datasets required for the experiment were gathered and organized, including the text CAPTCHA dataset generated using the Captcha library and the image sets for the four styles: Van Gogh, Monet, Cézanne, and Ukiyo-e. Subsequently, style-enhanced text CAPTCHAs were generated using the universal style transfer model, the trained CycleGAN model for once-transfer, and the trained CycleGAN model for twice-transfer. Finally, style-enhanced text CAPTCHAs were processed using Muggle-OCR to assess the recognition success rates of the three methods under the four different styles. These results were compared with topological feature analysis outcomes to evaluate the anti-recognition capabilities of each model. The technical approach used in this study is illustrated in Figure 1.

2. Theoretical Basis

2.1. Generative Adversarial Networks

Generative Adversarial Networks (GANs) are deep learning models that generate data resembling real data through competition between two neural networks. GANs primarily consist of a generator and a discriminator. The generator creates synthetic data from random noise, whereas the discriminator classifies the data (Figure 2). The generator continuously adjusts its parameters to deceive the discriminator, whereas the discriminator learns to distinguish between real and fake data. This adversarial process leads to the optimization of both networks, enabling the generator to produce data that closely approximate real-world data [18].
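
The adversarial process described above is commonly written as a minimax objective. The block below is a sketch in the standard notation of the GAN literature; the symbols (generator G, discriminator D, data distribution p_data, noise prior p_z) are assumed here for illustration and are not defined in this article.

```latex
\min_{G} \max_{D} V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_{z}(z)}\left[\log\left(1 - D(G(z))\right)\right]
```

The discriminator maximizes V by assigning high scores to real samples x and low scores to generated samples G(z), whereas the generator minimizes V by making G(z) hard to tell apart from real data.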

2.2. Cycle Generative Adversarial Networks

Unlike standard GANs, CycleGAN introduces a cycle-consistency loss to ensure bidirectional transformation; that is, when an image is converted from one domain to another and then back again, the result should match the original image. CycleGAN enables transformations between two distinct domains, such as horse and zebra images, without the need for paired training data. The core idea of this method is to use two generators and two discriminators to perform image transformation (Figure 3). The first generator converts an image from one domain to the other, whereas the second generator performs the reverse transformation. The two discriminators evaluate whether the generated images are indistinguishable from real images in their respective domains [16].
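
The cycle-consistency idea can be made concrete with the loss as it is usually stated for CycleGAN [16]. The notation below is assumed for illustration: G maps domain X to domain Y, F maps Y back to X, D_X and D_Y are the two discriminators, and λ weights the cycle term.

```latex
\mathcal{L}_{\mathrm{cyc}}(G, F) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\left[\lVert F(G(x)) - x \rVert_{1}\right]
  + \mathbb{E}_{y \sim p_{\mathrm{data}}(y)}\left[\lVert G(F(y)) - y \rVert_{1}\right]

\mathcal{L}(G, F, D_X, D_Y) =
  \mathcal{L}_{\mathrm{GAN}}(G, D_Y, X, Y)
  + \mathcal{L}_{\mathrm{GAN}}(F, D_X, Y, X)
  + \lambda \, \mathcal{L}_{\mathrm{cyc}}(G, F)
```

The first term penalizes an image x that does not return to itself after the round trip X to Y to X, and the second term does the same for the round trip starting in Y.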

2.3. Analysis of Topological Feature of Graphs and Networks

Topological feature analysis aims to understand the function and behavior of a graph or network by examining its structural properties. Common topological parameters include the diameter, fractal dimension, and energy. Among these, the diameter refers to the maximum shortest path between any two nodes in the network and is used to describe the distance characteristics of the graph. A larger diameter indicates a ‘looser’ structure, whereas a smaller diameter indicates a ‘tighter’ structure [19]. The formula is as follows:

\mathrm{Diameter}(G) = \max_{u, v \in V} d(u, v), \qquad (1)

Let G = (V, E) represent a graph, where V is the set of nodes and E is the set of edges. u and v are any two nodes in the graph, and d(u, v) denotes the shortest path length from node u to node v.

During image processing, the original image was first converted to grayscale, and the background noise was removed through binarization (Figure 4), preserving the essential structural information. Each pixel was then treated as a node in the graph, and adjacent pixels were connected by edges, forming a two-dimensional complex graph network. The diameters of the graphs were then calculated.
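
The preprocessing step can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the luminance weights (the common ITU-R BT.601 values), the fixed threshold of 128, and the convention that dark pixels are foreground are all assumptions.

```python
def to_grayscale(rgb_image):
    """Convert rows of (R, G, B) tuples to rows of luminance values in 0..255."""
    # Assumed weights: the widely used BT.601 luma coefficients.
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]


def binarize(gray_image, threshold=128):
    """Mark pixels darker than the threshold as foreground (1), others as 0."""
    # Assumed convention: characters are darker than the background.
    return [[1 if value < threshold else 0 for value in row] for row in gray_image]


# Tiny 1 x 2 image: one white pixel and one black pixel.
tiny_image = [[(255, 255, 255), (0, 0, 0)]]
tiny_gray = to_grayscale(tiny_image)
tiny_binary = binarize(tiny_gray)
```

In practice an adaptive threshold may suit stylized images better than a fixed one; the fixed value here only keeps the sketch short.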

To preserve the structural information of the graph more effectively and facilitate the subsequent calculation of its diameter, adjacent nodes were connected and the Euclidean distance between each pair of nodes was computed to represent the edge weights, resulting in a weighted undirected graph.
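
One way to read this construction is sketched below. It assumes that only foreground pixels of the binarized image become nodes and that "adjacent" means the 8-neighbourhood, so that edge weights are 1 for horizontal or vertical neighbours and √2 for diagonal ones; the article does not state either choice, and the function name is hypothetical.

```python
import math


def pixel_graph_edges(binary_image):
    """Return weighted undirected edges between neighbouring foreground pixels."""
    height = len(binary_image)
    width = len(binary_image[0]) if height else 0
    # Right, below, lower-right, lower-left: visiting only these four
    # directions yields every 8-neighbour pair exactly once.
    offsets = [(0, 1), (1, 0), (1, 1), (1, -1)]
    edges = []
    for row in range(height):
        for col in range(width):
            if not binary_image[row][col]:
                continue  # background pixels are not nodes (assumption)
            for d_row, d_col in offsets:
                n_row, n_col = row + d_row, col + d_col
                inside = 0 <= n_row < height and 0 <= n_col < width
                if inside and binary_image[n_row][n_col]:
                    # Euclidean distance between pixel centres: 1.0 or sqrt(2).
                    weight = math.hypot(d_row, d_col)
                    edges.append(((row, col), (n_row, n_col), weight))
    return edges


# 2 x 2 example with three foreground pixels.
example_edges = pixel_graph_edges([[1, 1], [0, 1]])
```

Each edge is a (node, node, weight) triple, the same shape as the edge list consumed by Algorithm 1.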

Algorithm 1 constructs the graph by utilizing the provided edge list to form a weighted graph, with each edge assigned a weight that serves as the basis for subsequent analysis, including the calculation of the graph’s diameter.

Algorithm 1: Create Weighted Graph
	Input: A list of edges with nodes and weights
	Output: A weighted graph G
	edges = [('a', 'b', 2.8), ('a', 'f', 3.0), …];
	Create an empty graph G;
	foreach edge in edges do
		Add weighted edge to G from edge;
	return G
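
A minimal Python sketch of Algorithm 1 is given below. It stores the graph as a dictionary of dictionaries from the standard library rather than using a graph library, which is a substitution made here for self-containment. Only the two edges that appear in the listing are used; the rest of the edge list is elided in the source and is not reconstructed.

```python
def create_weighted_graph(edges):
    """Build an undirected weighted graph as a dict of dicts: graph[u][v] = weight."""
    graph = {}
    for u, v, weight in edges:
        # Store the edge in both directions because the graph is undirected.
        graph.setdefault(u, {})[v] = weight
        graph.setdefault(v, {})[u] = weight
    return graph


# The two edges shown in Algorithm 1.
edges = [("a", "b", 2.8), ("a", "f", 3.0)]
graph = create_weighted_graph(edges)
```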

Subsequently, the nodes and edges of the graphs were extracted. Dijkstra’s algorithm was then applied from each source node to compute the shortest paths to all other nodes [19]: at every step, the unvisited node with the smallest tentative distance is selected, and its edges are used to extend the current shortest paths. Repeating this for every source node yields the shortest paths between all pairs of nodes, and the largest of these shortest-path lengths is the diameter of the graph.

Algorithm 2 computes the shortest paths between all pairs of nodes in the graph using Dijkstra’s algorithm and returns the maximum value among the shortest paths, which corresponds to the diameter of the graph [20].

Algorithm 2: Calculate Network Diameter
	Input: Graph G, List of edges
	Output: Network diameter
	Calculate all pairs shortest path using Dijkstra’s algorithm;
	foreach node1 in nodes do
		foreach node2 in nodes after node1 do
			Calculate the shortest path distance from node1 to node2;
	Sum the distances for all pairs;
	Store and display each edge weight;
	return The maximum of the shortest paths as the network diameter
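
The diameter computation can be sketched with a heap-based Dijkstra search from the standard library. This is an illustration under stated assumptions, not the authors' code: the graph is a dict of dicts as in the earlier sketch convention (graph[u][v] = weight), weights are non-negative, and pairs of nodes in different connected components are ignored rather than treated as infinitely far apart.

```python
import heapq


def dijkstra(graph, source):
    """Shortest-path distances from source in a dict-of-dicts weighted graph."""
    distances = {source: 0.0}
    queue = [(0.0, 0, source)]  # (distance, tie-breaker, node)
    counter = 1
    while queue:
        dist, _, node = heapq.heappop(queue)
        if dist > distances.get(node, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for neighbour, weight in graph[node].items():
            candidate = dist + weight
            if candidate < distances.get(neighbour, float("inf")):
                distances[neighbour] = candidate
                heapq.heappush(queue, (candidate, counter, neighbour))
                counter += 1
    return distances


def graph_diameter(graph):
    """Largest shortest-path distance over all mutually reachable node pairs."""
    return max(
        (d for source in graph for d in dijkstra(graph, source).values()),
        default=0.0,
    )


# Path graph a - b - c with weights 2 and 3: the farthest pair is (a, c).
demo_graph = {"a": {"b": 2.0}, "b": {"a": 2.0, "c": 3.0}, "c": {"b": 3.0}}
demo_diameter = graph_diameter(demo_graph)
```

Running Dijkstra from every node costs roughly O(|V| · |E| log |V|), which can be heavy when every foreground pixel is a node, so downsampling or restricting the search to the largest component may be worth considering.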

To demonstrate the process of calculating the graph diameter, a portion of the graph structure in the binarized image was enlarged (Figure 5). The locally magnified graph contains nine nodes and 36 node pairs, and the shortest path between each node pair must be calculated. For instance, there are eight node pairs involving node ‘a’. Upon calculation, the path length from node ‘a’ to node ‘e’ passing through nodes ‘g’ and ‘i’ is nine, which is the maximum shortest path length among the node pairs involving ‘a’. This is also the maximum shortest path length across all node pairs, meaning that the diameter of the graph is nine.
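
The pair counts quoted above follow from elementary counting; the short derivation below only restates them for a graph with nine nodes.

```latex
\binom{9}{2} = \frac{9 \cdot 8}{2} = 36 \quad \text{(unordered node pairs)}, \qquad
9 - 1 = 8 \quad \text{(pairs that include a fixed node such as } a\text{)}
```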

3. Methodology

3.1. Enhanced CAPTCHA for Universal Style Transfer Models

The experiment consisted of three steps: generation of a text CAPTCHA dataset; selection of a style image; and application of the arbitrary-image-stylization-v2-256 universal style transfer model to generate style-enhanced CAPTCHAs.

Generation of text CAPTCHA dataset. To account for current anti-crawling technologies and avoid potential copyright issues, the experiment used the Captcha library to generate the required text CAPTCHA dataset. The Captcha library, written in Python 3, is designed to create various types of CAPTCHAs, including numeric, alphabetic, and alphanumeric variations. It is simple to use and highly customizable. A total of 3000 text CAPTCHAs, each 170 px wide, 80 px high, and four characters long, were generated using the Captcha library (Figure 6). The character set consisted of the 10 digits and the 26 English letters in both uppercase and lowercase.
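
A dataset of this shape could be generated roughly as follows. This is a sketch, assuming the `captcha` package from PyPI and its `ImageCaptcha` class; the random seed, the file naming scheme, and the helper names are hypothetical and not taken from the article.

```python
import random
import string

# 10 digits plus 26 lowercase and 26 uppercase letters: 62 symbols.
CHARSET = string.digits + string.ascii_letters


def random_captcha_text(length=4, rng=random):
    """Draw a CAPTCHA string uniformly at random from the 62-character set."""
    return "".join(rng.choice(CHARSET) for _ in range(length))


def write_captcha_dataset(output_dir, count=3000, seed=0):
    """Write `count` CAPTCHA PNGs of 170 x 80 px into an existing directory."""
    # Third-party dependency, imported lazily: pip install captcha
    from captcha.image import ImageCaptcha

    rng = random.Random(seed)  # seed is an assumption, for reproducibility
    generator = ImageCaptcha(width=170, height=80)
    for index in range(count):
        text = random_captcha_text(rng=rng)
        # Hypothetical naming scheme: the label is kept in the file name.
        generator.write(text, f"{output_dir}/{index:04d}_{text}.png")


sample_text = random_captcha_text(rng=random.Random(1))
```

Keeping the ground-truth text in the file name is one simple way to make the labels available later when recognition success rates are computed.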

Selection of the style image set. Considering factors such as the impact of various style images on enhancing CAPTCHA security and clarity, we selected multiple style images for the style transfer process applied to text CAPTCHAs. These images represent four typical artistic styles: Van Gogh, Monet, Cézanne, and Ukiyo-e (Figure 7, row a). A comparison of the style enhancement effects for each style was conducted in later stages.

The TensorFlow Hub repository provides a collection of pretrained universal image style transfer models that can be applied directly to input images. In this experiment, the arbitrary-image-stylization-v2-256 model, which is based on convolutional neural networks, was loaded from TensorFlow Hub. Trained on large datasets of real and style images, this model learns to separate the content and style of input images and to produce images with a new style. Furthermore, the model offers a degree of controllability, allowing the user to adjust certain hyperparameters to modify the style of the generated images. The model was loaded using the hub.load() function, and the text CAPTCHA image to be processed, along with the chosen style image, was passed to the model() function to generate the style-enhanced text CAPTCHA (Figure 7).
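
The loading and calling pattern described above might look like the sketch below. Several points are assumptions: the module handle is left to the caller because the article gives only the model name, the inputs are taken to be float32 image batches of shape [batch, height, width, 3] scaled to [0, 1], and the stylized image is taken to be the first element of the model output, as in the publicly documented arbitrary image stylization modules on TensorFlow Hub.

```python
def load_style_model(handle):
    """Load a TensorFlow Hub style transfer module from a handle or local path."""
    # Third-party dependency, imported lazily: pip install tensorflow tensorflow-hub
    import tensorflow_hub as hub

    return hub.load(handle)


def stylize(model, content_image, style_image):
    """Return the stylized image batch produced by the model.

    Both inputs are assumed to be float32 tensors of shape
    [batch, height, width, 3] with values in [0, 1].
    """
    outputs = model(content_image, style_image)
    return outputs[0]  # assumed: the first output is the stylized image


def fake_model(content, style):
    """Stand-in with the same call shape, used only to exercise the wrapper."""
    return [("stylized", content, style)]


demo_output = stylize(fake_model, "content", "style")
```

Separating `load_style_model` from `stylize` lets the same loaded module be reused for all 3000 CAPTCHA images and all four style images without reloading.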

In Figure 7, row a displays the four style images: Van Gogh, Monet, Cézanne, and Ukiyo-e. Rows b and d show the original CAPTCHA images and their corresponding style-enhanced versions. Rows c and e illustrate the topological maps of the style-enhanced CAPTCHAs based on the four different styles. It is evident that after applying the universal style transfer model, the text content of the CAPTCHA remains unchanged, and interference elements such as character distortions, noise lines, and noise points are effectively preserved. In addition, by utilizing different style images, the generated text CAPTCHAs exhibited distinct styles that aligned with their respective style images. This demonstrates that the arbitrary-image-stylization-v2-256 universal style transfer model successfully generated style-enhanced text CAPTCHAs, ensuring that the content of the original text CAPTCHA was preserved and that the style changed according to the applied style image. It is also evident that the color, saturation, and other attributes of style-enhanced text CAPTCHAs vary according to the style image used. Specifically, the Van Gogh-style image, derived from the classic Sunflowers painting, results in a vibrant deep yellow background for the generated Van Gogh-style text CAPTCHA, whereas the Monet-style text CAPTCHA features a more subdued gray background. In comparison, the Cézanne-style and Ukiyo-e-style text CAPTCHAs exhibit more pronounced styles, with brighter and more vivid background colors and subtle enhancements to elements such as noise points. This indicates that, when focusing on style enhancement, selecting style images with richer colors and more diverse elements often produces more striking results, ultimately leading to a better user experience.

After obtaining style-enhanced text CAPTCHAs corresponding to the four styles, the average diameter of the graph was chosen to quantify the topological properties of the text CAPTCHA images before and after style transfer, as illustrated in the process flow in Figure 8. Initially, the image undergoes essential preprocessing, which includes conversion to grayscale and then binarization to simplify the structural information. Subsequently, the image is transformed into a graph structure, where each pixel is represented as a node and the connections between adjacent pixels are treated as edges [21].

To assess the differences more effectively before and after style transfer, the average diameter of the graph was calculated for 3000 original images and the images after applying the Van Gogh-style transfer, using the Van Gogh style as an example. The calculation results are listed in Table 1.

The calculation results in Table 1 demonstrate a significant difference in the topological features between the images after the universal style transfer and the original images. The average diameter of the graph decreases from 70.357 in the original images to 56.675, representing a reduction of 13.682. This decrease in the average diameter indicates that the topological structure of the image became more compact, shifting from a relatively loose structure to a more tightly knit structure during the universal style transfer process.

3.2. Once-Transfer Based on Generative Adversarial Networks (GANs)

The experimental process for once-transfer based on a GAN introduces a model training phase, in contrast to the style transfer experiment using the universal style transfer model. As a result, it is divided into four steps: the first step is to generate the text CAPTCHA dataset; the second step is to select the style image set for application; the third step is to train the CycleGAN model using the training dataset; and the fourth step is to apply the trained CycleGAN model to generate style-enhanced CAPTCHAs.

Step 1: Generation of text CAPTCHA dataset. The Captcha library was used to generate 3000 original text CAPTCHAs for the training dataset, with an additional 1000 original text CAPTCHAs created for the test dataset (Figure 9).

Step 2: Selection of style image set. The same four styles (Van Gogh, Monet, Cézanne, and Ukiyo-e) were selected for the image set. However, unlike the universal style transfer model, which allows for the selection of any single image, this experiment employed CycleGAN for the style transfer and therefore required a set of images per style. For each of the four styles, 100 images were selected for the training set (Figure 9) and 50 images were selected for the test set.

Step 3: Training the CycleGAN-style transfer model. The CycleGAN model was trained by inputting the pre-generated text CAPTCHA training set and style image training set with the relevant parameters and training epochs specified. This process enables the model to transfer the corresponding style to the original text CAPTCHA images.

Step 4: Application of the trained CycleGAN model to generate style-enhanced CAPTCHAs. The text CAPTCHA images from the test set, along with the style images, were input into the trained CycleGAN model to generate the corresponding style-enhanced text CAPTCHAs (Figure 10).

As shown in Figure 10, after the style transfer using the trained CycleGAN model, the styles of the text CAPTCHA images underwent substantial transformations. The Van Gogh-style text CAPTCHA features a deep yellow background, reminiscent of sunflowers, and a relatively cluttered background. The Monet-style text CAPTCHA has a more subdued appearance, with a light blue and purple background, and is clearer overall. The Cézanne-style text CAPTCHA shares similarities with the Van Gogh-style text CAPTCHA, both of which have a yellow background; however, the Cézanne-style text is lighter and offers a more user-friendly experience. Finally, the Ukiyo-e-style text CAPTCHA exhibits variable colors, but remains relatively subtle and plain, with clear characters that are easy to read and recognize.

In addition, the topological structures presented in rows b and d of Figure 10 reveal significant structural changes in the topological maps of the four different style images after a once-transfer. Compared to the original images, the image after the Van Gogh-style transfer exhibited more irregular structures, with the node distribution becoming more dispersed and complex. The Monet-style transfer results in an image resembling Impressionist works, featuring soft variations of light and shadow, smoothed details, and blurred boundaries. The topological map of the Cézanne-style image shows a more regular node distribution, consistent with the visual characteristics of the Cézanne style. Finally, the topological map of the Ukiyo-e-style image displays a higher node density, indicating that the details of the image have been replaced by more pronounced contour lines and decorative elements.

Overall, the style-enhanced text CAPTCHAs in all four styles did not hinder the user’s ability to recognize and use them. However, owing to the varying styles of the image sets, there were differences in the vibrancy of the colors and richness of the elements, resulting in considerable variation in the styles of the generated text CAPTCHAs. From a topological perspective, style transfer in different styles influences the structural aspects of images. It is also evident that the styles differ in how much they increase the difficulty of recognizing the original text: Van Gogh- and Cézanne-style text CAPTCHAs, with their more intricate backgrounds, are more challenging to recognize than the other two styles.

Similarly, by inputting 3000 original images and 3000 Van Gogh-style images obtained through the GAN-based style transfer, the average graph diameter was calculated. The results are summarized in Table 2.

The results indicate that both the universal style transfer and the GAN-based style transfer lead to a significant decrease in the average graph diameter, suggesting that the nodes within the graph become more tightly connected, and the overall structure is more compact. Moreover, the reduction in the average diameter following the universal style transfer was comparable to that observed after the GAN-based once-transfer, implying that both methods induced a similar degree of topological compression in the image structure.

3.3. Twice-Transfer Based on Generative Adversarial Networks (GANs)

To further investigate the impact of style transfer on text-based CAPTCHAs, a secondary style transfer experiment was designed. The Van Gogh-style CAPTCHAs produced by the first transfer were used as the test set samples and subjected to a second Van Gogh-style transfer. Likewise, the Monet-style CAPTCHAs produced by the first transfer were used as the test set samples for a second Monet-style transfer, and the remaining styles were handled in the same way.

Style-enhanced text-based CAPTCHAs generated via a single round of CycleGAN-based transfer were used as the input for a second identical style transfer. The resulting images are shown in Figures 11–14. Compared to their singly transferred counterparts, all the CAPTCHAs exhibited varying degrees of visual transformation after the second transfer. For the Van Gogh style, the changes were relatively subtle: upon closer inspection, a faint purplish-blue hue was added to the background. In contrast, the Monet-style CAPTCHAs exhibited more pronounced alterations, with additional background colors introduced and slightly faded character contours. The Cézanne-style images showed a noticeable increase in shading and a softening of character intensity. Similarly, for the Ukiyo-e style, the background darkened slightly, whereas the character outlines became less distinct. Despite these stylistic changes, the overall legibility of the CAPTCHAs remained largely unaffected. The clarity of the characters, along with noise elements such as interference lines and dots, was well preserved. Thus, from the perspective of user experience and functional usability, the second style transfer does not cause noticeable degradation.
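
Structurally, the twice-transfer amounts to feeding the generator's output back into the same generator. The sketch below shows only that composition; `generator` stands for any callable mapping an image to a stylized image (for example, a trained CycleGAN generator), and the helper name is hypothetical.

```python
def apply_repeatedly(generator, image, times):
    """Feed the generator's output back into it `times` times."""
    result = image
    for _ in range(times):
        result = generator(result)
    return result


def toy_generator(image):
    """Toy stand-in: brighten every pixel by 10, capped at 255."""
    return [min(pixel + 10, 255) for pixel in image]


once = apply_repeatedly(toy_generator, [0, 250], times=1)
twice = apply_repeatedly(toy_generator, [0, 250], times=2)
```

With `times=1` this reproduces the once-transfer, and with `times=2` the twice-transfer, so both experiments can share one code path.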

A topological feature analysis was conducted on text-based CAPTCHA images subjected to a second round of GAN-based style transfer, focusing on the calculation of the average graph diameter. The results of this analysis are summarized in Table 3.

After the twice-transfer based on GANs, the average diameter of the graph was reduced from 70.357 to 47.933, which is a decrease of 22.424. This indicates that the twice-transfer also results in a more compact graph structure. Compared with the primary style transfer, the twice-transfer significantly enhances the compactness of the image and improves the resistance of text CAPTCHAs to recognition.
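
The reductions quoted in the text can be checked with a few lines of arithmetic. The sketch below uses only the average diameters stated in this article (70.357 for the original images, 56.675 after the universal transfer, and 47.933 after the twice-transfer); the function name is hypothetical.

```python
def diameter_reduction(original, transferred):
    """Return (absolute reduction, relative reduction in percent)."""
    absolute = original - transferred
    return absolute, 100.0 * absolute / original


ORIGINAL_DIAMETER = 70.357
reductions = {
    label: diameter_reduction(ORIGINAL_DIAMETER, value)
    for label, value in [("universal transfer", 56.675), ("twice-transfer", 47.933)]
}
for label, (absolute, percent) in reductions.items():
    print(f"{label}: -{absolute:.3f} ({percent:.2f}%)")
```

For these figures the absolute reductions are 13.682 and 22.424, which correspond to relative reductions of about 19.45% and 31.87% of the original average diameter.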

4. Result Analysis

4.1. Evaluation of the Anti-Recognition Capability

This experiment utilizes the Recognition Success Rate (RSR) metric to evaluate the anti-recognition capability of text CAPTCHAs in a general OCR recognition interface. The formula used is as follows:

\mathrm{RSR} = \frac{C_{\gamma}}{C}, \qquad (2)

In this formula, C_{\gamma} represents the number of correctly recognized text CAPTCHA images, and C denotes the total number of text CAPTCHA images. RSR is a direct measure of the resistance of CAPTCHAs to recognition. A higher RSR indicates lower resistance, meaning that the CAPTCHA is more easily recognized, whereas a lower RSR suggests a stronger anti-recognition capability, as the CAPTCHA is more resistant to automated recognition.
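
Equation (2) translates directly into code. The sketch below assumes that a CAPTCHA counts as correctly recognized only when the predicted string matches the label exactly, including letter case; the article does not state the matching rule, and the sample strings are made up for illustration.

```python
def recognition_success_rate(predictions, labels):
    """RSR = number of exactly matching predictions / total number of images."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("at least one CAPTCHA is required")
    # Exact, case-sensitive string match (assumption).
    correct = sum(1 for pred, truth in zip(predictions, labels) if pred == truth)
    return correct / len(labels)


# Illustrative strings only, not data from the experiments.
demo_rsr = recognition_success_rate(["aB3d", "xxxx"], ["aB3d", "q7Lm"])
```

The predictions would come from the OCR engine under test, so the same function can score the original images and each style-enhanced dataset.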

In the twice-transfer experiment, each dataset from the primary transfer was subjected to a secondary transfer using the same style and the three other styles, resulting in four distinct datasets. The average recognition rate of these four datasets on Muggle-OCR was then used as a metric for evaluating the anti-recognition capability after the secondary transfer. A bar chart comparing the recognition rates of text CAPTCHAs generated by the universal style transfer model, the once-transfer model, and the twice-transfer model is shown in Figure 15.

As can be clearly seen in Figure 15, the recognition rates of text CAPTCHAs enhanced with various styles using the three methods are all lower than the recognition rate of the original text CAPTCHA, which is 50.16% on Muggle-OCR, indicating that the anti-recognition capability of text CAPTCHAs is improved regardless of the style transfer method used.

Once-transfer using the GAN model provides a relatively weaker improvement in the anti-recognition capability of the original text CAPTCHA compared to the universal style transfer model. The gap is particularly large for the Ukiyo-e style. However, after twice-transfer, the improvement in anti-recognition capability was more pronounced: compared with once-transfer, twice-transfer generally leads to a significant reduction in the recognition success rate. Compared with the style-enhanced text CAPTCHAs generated by the universal style transfer model, the twice-transfer CAPTCHAs generated by the GAN model show only minor differences in anti-recognition capability across the various styles.

The universal style transfer model, which uses a pretrained CNN, allows for the rapid application of styles to CAPTCHA images [22]. However, the style of a generated CAPTCHA depends heavily on the selected style image, and so does the improvement in anti-recognition capability. In contrast, the results of GAN-based style transfer were relatively stable: the GAN learns features from the entire set of images in a given style, so altering a single style image, or part of one, has minimal impact on the results. Therefore, GAN-based style transfer for text CAPTCHAs offers better stability than the universal style transfer model.

4.2. Topological Feature Analysis

In the topological feature analysis, the average graph diameters of the CAPTCHA images at the different style transfer stages were calculated, as shown in Figure 16.

Compared to the original CAPTCHA, the topological features of the CAPTCHA image after style transfer underwent significant changes. The diameter of the graph decreased by approximately 31.87% after the twice-transfer (from 70.357 to 47.933), the greatest reduction among the methods. This indicates that while different style transfer methods all compress the graph’s structural features, the extent of this compression varies. The twice-transfer changed the diameter of the graph the most, indicating that this method made the topological structure of the graph the most compact. A comparison clearly revealed the impact of style transfer on the graph structure.

5. Conclusions

This study introduced a pretrained universal style transfer model for the style transfer of text CAPTCHAs. The style of the text CAPTCHA obtained through the transfer was determined by the selected style image. Factors such as usability, character clarity, and noise were unaffected during the transfer process. The style-enhanced text CAPTCHA demonstrated significantly higher anti-recognition capability than the original text CAPTCHA.

The trained CycleGAN model could effectively perform style transfers for text CAPTCHAs without affecting their usability. The anti-recognition capability of the style-enhanced text CAPTCHAs showed varying degrees of improvement compared to the original text CAPTCHA.

Twice-transfer could significantly enhance the anti-recognition capability of text CAPTCHAs obtained after once-transfer without affecting their usability. Compared with style transfer using a universal style transfer model, GAN-based style transfer offers advantages in terms of flexibility, potential for improving anti-recognition capability, and stability while maintaining the usability of CAPTCHAs throughout the process.

By calculating the diameter of the graph, we demonstrated that the compactness of the text CAPTCHA images significantly increased after style transfer. Topological feature analysis provides a more comprehensive understanding of the impact of style transfer on the structure of CAPTCHA images and offers new theoretical insights and methods for CAPTCHA design and security evaluation.

Author Contributions

Conceptualization, T.X.; Data curation, Y.R.; Formal analysis, T.X. and Z.Y.; Investigation, Z.Y.; Methodology, T.X. and Z.G.; Software, Z.G. and Z.Y.; Supervision, T.X.; Validation, Z.G.; Visualization, Z.G.; Writing—original draft, T.X. and Z.G.; Writing—review and editing, T.X. and Z.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research was financially supported by the “Deep-time Digital Earth” Science and Technology Leading Talents Team Funds for the Central Universities for the Frontiers Science Center for Deep-time Digital Earth, China University of Geosciences (Beijing) (grant number: 2652023001), and the Innovation and Entrepreneurship Program for College Students (grant number: X202311415096).

Data Availability Statement

The data presented in this study are openly available at https://github.com/houswy/Style-Transfer-and-Topological-Feature-Analysis.git.

Acknowledgments

We gratefully acknowledge the financial support provided by the “Deep-time Digital Earth” Science and Technology Leading Talents Team Funds for the Central Universities for the Frontiers Science Center for Deep-time Digital Earth, China University of Geosciences (Beijing) (grant number: 2652023001), and Innovation and Entrepreneurship Program for College Students (grant number: X202311415096).

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Abbreviations

The following abbreviations are used in this manuscript:

GANs	Generative adversarial networks
OCR	Optical character recognition
CNNs	Convolutional neural networks
RSR	Recognition success rate


Figure 1. Technology roadmap. (a) This panel shows the CAPTCHA-generation workflow; (b) This panel shows the experimental validation workflow.

Figure 2. GAN structure diagram.

Figure 3. CycleGAN structure diagram.

Figure 4. Original image and binarized image. (a) The original image, displaying the characters “0glB”; (b) The binarized version of the original image, in which the characters “0glB” are rendered in black on a white background and the noise is significantly reduced.

Figure 5. Enlarged view of partial structure of binarized image. (a) This panel shows the binarized image of the CAPTCHA “0glB”; (b) This panel provides an enlarged view of a partial structure extracted from the binarized image. It represents a graph structure where nodes are connected by edges.

Figure 6. Text CAPTCHA generated by the Captcha library.

Figure 7. Style-enhanced CAPTCHA and topology diagram. (a) The four style images used for transfer: Van Gogh, Monet, Cézanne, and Ukiyo-e; (b) The original text-based CAPTCHA “0asG” in the first column, followed by its style-enhanced versions after applying the universal style transfer model in the Van Gogh, Monet, Cézanne, and Ukiyo-e styles in the second to fifth columns, respectively; (c) The corresponding topological graphs of the CAPTCHA “0asG” after universal style transfer in the respective columns; (d) The original text-based CAPTCHA “0glB” in the first column, followed by its style-enhanced versions after applying the universal style transfer model in the Van Gogh, Monet, Cézanne, and Ukiyo-e styles in the second to fifth columns, respectively; (e) The corresponding topological graphs of the CAPTCHA “0glB” after universal style transfer in the respective columns.

Figure 8. Flow chart of topological feature calculation.

Figure 9. Partial diagrams of the training set and the test set.

Figure 10. The image after once-transfer based on GAN and topology diagram.

Figure 11. Van Gogh–Van Gogh-style enhanced CAPTCHA and topological diagram. Rows (a)–(d) show the CAPTCHAs “0asG”, “9xA1”, “cKKI”, and “HRyX”, respectively, through two rounds of Van Gogh style transfer. In each row, the first column is the original CAPTCHA, the second and fourth columns are the style-transferred images, and the third and fifth columns are their corresponding topological graphs.

Figure 12. Monet–Monet-style enhanced CAPTCHA and topological diagram. Rows (a)–(d) show the CAPTCHAs “0asG”, “9xA1”, “cKKI”, and “HRyX”, respectively, through two rounds of Monet style transfer. In each row, the first column is the original CAPTCHA, the second and fourth columns are the style-transferred images, and the third and fifth columns are their corresponding topological graphs.

Figure 13. Cézanne–Cézanne-style enhanced CAPTCHA and topological diagram. Rows (a)–(d) show the CAPTCHAs “0asG”, “9xA1”, “cKKI”, and “HRyX”, respectively, through two rounds of Cézanne style transfer. In each row, the first column is the original CAPTCHA, the second and fourth columns are the style-transferred images, and the third and fifth columns are their corresponding topological graphs.

Figure 14. Ukiyo-e–Ukiyo-e-style enhanced CAPTCHA and topological diagram. Rows (a)–(d) show the CAPTCHAs “0asG”, “9xA1”, “cKKI”, and “HRyX”, respectively, through two rounds of Ukiyo-e style transfer. In each row, the first column is the original CAPTCHA, the second and fourth columns are the style-transferred images, and the third and fifth columns are their corresponding topological graphs.

Figure 15. Recognition rates of different CAPTCHA types.

Figure 16. Comparison of topological features for different categories.

Table 1. Topological analysis after universal style transfer.

Parameter	Original Images	After Universal Style Transfer
Average diameter	70.357	56.675

Table 2. Topological analysis after once-transfer.

Parameter	Original Images	After Universal Style Transfer	After Once-Transfer
Average diameter	70.357	56.675	53.850

Table 3. Topological analysis after twice-transfer.

Parameter	Original Images	After Universal Style Transfer	After Once-Transfer	After Twice-Transfer
Average diameter	70.357	56.675	53.850	47.933

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
