Search Results (42)

Search Parameters:
Keywords = DALL·E

47 pages, 18189 KiB  
Article
Synthetic Scientific Image Generation with VAE, GAN, and Diffusion Model Architectures
by Zineb Sordo, Eric Chagnon, Zixi Hu, Jeffrey J. Donatelli, Peter Andeer, Peter S. Nico, Trent Northen and Daniela Ushizima
J. Imaging 2025, 11(8), 252; https://doi.org/10.3390/jimaging11080252 - 26 Jul 2025
Viewed by 408
Abstract
Generative AI (genAI) has emerged as a powerful tool for synthesizing diverse and complex image data, offering new possibilities for scientific imaging applications. This review presents a comprehensive comparative analysis of leading generative architectures, ranging from Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs) to Diffusion Models, in the context of scientific image synthesis. We examine each model’s foundational principles, recent architectural advancements, and practical trade-offs. Our evaluation, conducted on domain-specific datasets including microCT scans of rocks and composite fibers, as well as high-resolution images of plant roots, integrates both quantitative metrics (SSIM, LPIPS, FID, CLIPScore) and expert-driven qualitative assessments. Results show that GANs, particularly StyleGAN, produce images with high perceptual quality and structural coherence. Diffusion-based models for inpainting and image variation, such as DALL-E 2, delivered high realism and semantic alignment but generally struggled to balance visual fidelity with scientific accuracy. Importantly, our findings reveal limitations of standard quantitative metrics in capturing scientific relevance, underscoring the need for domain-expert validation. We conclude by discussing key challenges such as model interpretability, computational cost, and verification protocols, and outline future directions where generative AI can drive innovation in data augmentation, simulation, and hypothesis generation in scientific research.
(This article belongs to the Special Issue Celebrating the 10th Anniversary of the Journal of Imaging)
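The quantitative metrics named in the abstract above (SSIM, LPIPS, FID, CLIPScore) are standard image-comparison scores. As an illustration only, not the paper's evaluation code, here is a minimal single-window SSIM in NumPy; production evaluations normally use the sliding-window form (e.g. scikit-image's `structural_similarity`):

```python
import numpy as np

def global_ssim(x: np.ndarray, y: np.ndarray, data_range: float = 1.0) -> float:
    """Single-window SSIM computed over the whole image (no sliding window).

    Uses the standard stabilizing constants C1 = (0.01*L)^2, C2 = (0.03*L)^2,
    where L is the dynamic range of the pixel values.
    """
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()  # cross-covariance of the two images
    return float(((2 * mx * my + c1) * (2 * cov + c2))
                 / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2)))
```

Identical images score exactly 1.0, and the score decreases as structure diverges; the windowed variant additionally localizes where the divergence occurs.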

15 pages, 1599 KiB  
Article
Visual Representations in AI: A Study on the Most Discriminatory Algorithmic Biases in Image Generation
by Yazmina Vargas-Veleda, María del Mar Rodríguez-González and Iñigo Marauri-Castillo
Journal. Media 2025, 6(3), 110; https://doi.org/10.3390/journalmedia6030110 - 18 Jul 2025
Viewed by 355
Abstract
This study analyses algorithmic biases in AI-generated images, focusing on aesthetic violence, gender stereotypes, and weight discrimination. By examining images produced by the DALL-E Nature and Flux 1 systems, it becomes evident how these tools reproduce and amplify hegemonic beauty standards, excluding bodily diversity. Likewise, gender representations reinforce traditional roles, sexualising women and limiting the presence of non-normative bodies in positive contexts. The results show that training data and the algorithms used significantly influence these trends, perpetuating exclusionary visual narratives. The research highlights the need to develop more inclusive and ethical AI models, with diverse data that reflect the plurality of bodies and social realities. The study concludes that artificial intelligence (AI), far from being neutral, actively contributes to the reproduction of power structures and inequality, posing an urgent challenge for the development and regulation of these technologies.

16 pages, 759 KiB  
Article
Interpretation of AI-Generated vs. Human-Made Images
by Daniela Velásquez-Salamanca, Miguel Ángel Martín-Pascual and Celia Andreu-Sánchez
J. Imaging 2025, 11(7), 227; https://doi.org/10.3390/jimaging11070227 - 7 Jul 2025
Viewed by 629
Abstract
AI-generated content has grown significantly in recent years. Today, AI-generated and human-made images coexist across various settings, including news media, social platforms, and beyond. However, we still know relatively little about how audiences interpret and evaluate these different types of images. The goal of this study was to examine whether image interpretation is influenced by the origin of the image (AI-generated vs. human-made). Additionally, we aimed to explore whether visual professionalization influences how images are interpreted. To this end, we presented 24 AI-generated images (produced using Midjourney, DALL·E, and Firefly) and 8 human-made images to 161 participants—71 visual professionals and 90 non-professionals. Participants were asked to evaluate each image based on the following: (1) the source they believed the image originated from, (2) the level of realism, and (3) the level of credibility they attributed to it. A total of 5152 responses were collected for each question. Our results reveal that human-made images are more readily recognized as such, whereas AI-generated images are frequently misclassified as human-made. We also find that human-made images are perceived as both more realistic and more credible than AI-generated ones. We conclude that individuals are generally unable to accurately determine the source of an image, which in turn affects their assessment of its credibility.

26 pages, 12914 KiB  
Article
Copy/Past: A Hauntological Approach to the Digital Replication of Destroyed Monuments
by Giovanni Lovisetto
Heritage 2025, 8(7), 255; https://doi.org/10.3390/heritage8070255 - 27 Jun 2025
Viewed by 627
Abstract
This article offers a critical analysis of two ‘replicas’ of monuments destroyed by ISIL in 2015: the Institute for Digital Archaeology’s Arch of Palmyra (2016) and the lamassu from Nimrud, exhibited in the Rinascere dalle Distruzioni exhibition (2016). Drawing on Jacques Derrida’s formulation of hauntology and Umberto Eco’s theory of forgery, this study examines the ontological, ethical, and ideological stakes of digitally mediated replication. Rather than treating digital and physical ‘copies’ as straightforward reproductions of ancient ‘originals’, the essay reframes them as specters: material re-appearances haunted by loss, technological mediation, and political discourses. Through a close analysis of production methods, rhetorical framings, media coverage, and public reception, it argues that presenting such ‘replicas’ as faithful restorations or acts of cultural resurrection collapses a hauntological relationship into a false ontology. The article thus shows how, by concealing the intermediary, spectral role of digital modeling, such framings enable the symbolic use of these ‘replicas’ as instruments of Western technological triumphalism and digital colonialism. This research calls for a critical approach that recognizes the ontological peculiarities of such replicas, foregrounds their reliance on interpretive rather than purely mechanical processes, and acknowledges the ideological weight they carry.
(This article belongs to the Special Issue Past for the Future: Digital Pathways in Cultural Heritage)

23 pages, 3249 KiB  
Article
User Experience and Perceptions of AI-Generated E-Commerce Content: A Survey-Based Evaluation of Functionality, Aesthetics, and Security
by Chrysa Stamkou, Vaggelis Saprikis, George F. Fragulis and Ioannis Antoniadis
Data 2025, 10(6), 89; https://doi.org/10.3390/data10060089 - 17 Jun 2025
Viewed by 1673
Abstract
The integration of generative artificial intelligence (AI) into e-commerce is increasing constantly and in different forms, transforming content creation, yet its impact on user experience remains underexplored. This study examines user perceptions of AI-generated e-commerce content, focusing on functionality, aesthetics, and security. A survey was conducted in which 223 participants were asked to browse the pages of an online store developed using ChatGPT and DALL·E and to evaluate it, providing feedback through a purpose-built questionnaire. The collected data were subjected to descriptive statistical analysis, exploratory factor analysis (EFA), and comparative statistical tests to identify key user experience dimensions and possible demographic variances in satisfaction. Factor analysis extracted two main components influencing user experience: “Service Quality and Security” and “Design and Aesthetics”. Further analysis highlighted a slight variation in user evaluations between male and female participants. Although security-related questions were answered with caution, the remaining findings indicate that AI-generated content was well received and highly rated. Generative AI is thus a valuable tool for businesses, AI developers, and anyone seeking to optimize AI-driven processes to enhance user engagement, and it contributes positively to the development of a functional and aesthetically appealing e-commerce platform.

15 pages, 4095 KiB  
Article
AI-Generated Mnemonic Images Improve Long-Term Retention of Coronary Artery Occlusions in STEMI: A Comparative Study
by Zahraa Alomar, Meize Guo and Tyler Bland
Technologies 2025, 13(6), 217; https://doi.org/10.3390/technologies13060217 - 26 May 2025
Cited by 1 | Viewed by 700
Abstract
Medical students face significant challenges retaining complex information, such as interpreting ECGs for coronary artery occlusions, amidst demanding curricula. While artificial intelligence (AI) is increasingly used for medical image analysis, this study explored using generative AI (DALL·E 3) to create mnemonic-based images to enhance human learning and retention of medical images, in particular, electrocardiograms (ECGs). This study is among the first to investigate generative AI as a tool not for automated diagnosis but as a human-centered educational aid designed to enhance long-term retention in complex visual tasks like ECG interpretation. We conducted a comparative study with 275 first-year medical students across six campuses; an experimental group (n = 40) received a lecture supplemented with AI-generated mnemonic ECG images, while control groups (n = 235) received standard lectures with traditional ECG diagrams. Student achievement and retention were assessed by course examinations, and student preference and engagement were measured using the Situational Interest Survey for Multimedia (SIS-M). Control groups showed a significant decline in scores on the relevant exam question over time, whereas the experimental group’s scores remained stable, indicating improved long-term retention. Experimental students also reported significantly higher situational interest in the mnemonic-based images than in traditional images. AI-generated mnemonic images can effectively improve long-term retention of complex ECG interpretation skills and enhance student engagement and preference, highlighting generative AI’s potential as a valuable cognitive tool in image analysis during medical education.
(This article belongs to the Special Issue Application of Artificial Intelligence in Medical Image Analysis)

16 pages, 2039 KiB  
Article
Ontogenetic Growth Changes in Mercury and Stable Isotope Ratios of Carbon, Nitrogen, and Oxygen in Male and Female Dalli-Type Dall’s Porpoises (Phocoenoides dalli) Stranded in Hokkaido, Japan
by Tetsuya Endo, Osamu Kimura, Masaru Terasaki, Yoshihisa Kato, Yukiko Fujii and Koichi Haraguchi
J. Mar. Sci. Eng. 2025, 13(5), 892; https://doi.org/10.3390/jmse13050892 - 30 Apr 2025
Viewed by 539
Abstract
We investigated the ontogenetic growth changes in total mercury (THg) concentrations, δ¹³C, δ¹⁵N, and δ¹⁸O values, and body length (BL) of dalli-type Dall’s porpoises. THg concentrations in the liver of mature porpoises stranded in Hokkaido, Japan, were markedly higher than those in the muscle. The THg concentrations in the livers of males and females increased sharply when their BLs exceeded approximately 1.9 m and 1.8 m, respectively, the BLs at which they might attain maturity. The asymptotes of the THg increases were close to their maximum BLs of 2.2 m and 2.0 m for males and females, respectively. The δ¹⁵N levels in muscles were higher in the calves than in the weaned porpoises, probably due to the consumption of ¹⁵N-enriched milk, whereas the δ¹³C values in the calves were variable and similar to those in the weaned porpoises. The δ¹⁸O values of male and female muscles increased with increasing BL. Positive correlations were found between the THg concentrations and either the δ¹³C values or the δ¹⁸O values in the weaned animals, but not with the δ¹⁵N values. These results imply a feeding shift towards deeper pelagic areas with growth, as the δ¹³C and δ¹⁸O values and the THg concentrations tend to be higher in these areas.
(This article belongs to the Section Marine Aquaculture)

16 pages, 1664 KiB  
Article
Who Is to Blame for the Bias in Visualizations, ChatGPT or DALL-E?
by Dirk H. R. Spennemann
AI 2025, 6(5), 92; https://doi.org/10.3390/ai6050092 - 29 Apr 2025
Viewed by 1361
Abstract
Due to a range of factors in the development stage, generative artificial intelligence (AI) models cannot be completely free from bias. Some biases are introduced by the quality of training data and by developer influence during both the design and training of the large language models (LLMs), while others are introduced in the text-to-image (T2I) visualization programs. Bias introduced at the interface between LLMs and T2I applications has not been examined to date. This study analyzes 770 images of librarians and curators generated by DALL-E from ChatGPT-4o prompts to investigate the source of gender, ethnicity, and age biases in these visualizations. Comparing prompts generated by ChatGPT-4o with DALL-E’s visual interpretations, the research demonstrates that DALL-E primarily introduces biases when ChatGPT-4o provides non-specific prompts. This highlights the potential for generative AI to perpetuate and amplify harmful stereotypes related to gender, age, and ethnicity in professional roles.
(This article belongs to the Special Issue AI Bias in the Media and Beyond)

23 pages, 37586 KiB  
Article
Revisiting Wölfflin in the Age of AI: A Study of Classical and Baroque Composition in Generative Models
by Adrien Deliege, Maria Giulia Dondero and Enzo D’Armenio
J. Imaging 2025, 11(5), 128; https://doi.org/10.3390/jimaging11050128 - 22 Apr 2025
Cited by 1 | Viewed by 660
Abstract
This study explores how contemporary text-to-image models interpret and generate Classical and Baroque styles under Wölfflin’s framework—two categories that are atemporal and transversal across media. Our goal is to see whether generative AI can replicate the nuanced stylistic cues that art historians attribute to them. We prompted two popular models (DALL·E and Midjourney) using explicit style labels (e.g., “baroque” and “classical”) as well as more implicit cues (e.g., “dynamic”, “static”, or reworked Wölfflin descriptors). We then collected expert ratings and conducted broader qualitative reviews to assess how each output aligned with Wölfflin’s characteristics. Our findings suggest that the term “baroque” usually evokes features recognizable in typically historical Baroque artworks, while “classical” often yields less distinct results, particularly when a specified genre (portrait, still life) imposes a centered, closed-form composition. Removing explicit style labels may produce highly abstract images, revealing that Wölfflin’s descriptors alone may be insufficient to convey Classical or Baroque styles efficiently. Interestingly, the term “dynamic” gives rather chaotic images, yet this chaos is somehow ordered, centered, and has an almost Classical feel. Altogether, these observations highlight the complexity of bridging canonical stylistic frameworks and contemporary AI training biases, underscoring the need to update or refine Wölfflin’s atemporal categories to accommodate how generative models—and modern visual culture—reinterpret Classical and Baroque.
(This article belongs to the Special Issue Celebrating the 10th Anniversary of the Journal of Imaging)

18 pages, 2949 KiB  
Article
Generative Artificial Intelligence as a Catalyst for Change in Higher Education Art Study Programs
by Anna Ansone, Zinta Zālīte-Supe and Linda Daniela
Computers 2025, 14(4), 154; https://doi.org/10.3390/computers14040154 - 20 Apr 2025
Cited by 1 | Viewed by 1954
Abstract
Generative Artificial Intelligence (AI) has emerged as a transformative tool in art education, offering innovative avenues for creativity and learning. However, concerns persist among educators regarding the potential misuse of text-to-image generators as unethical shortcuts. This study explores how bachelor’s-level art students perceive and use generative AI in artistic composition. Ten art students participated in a lecture on composition principles and completed a practical composition task using both traditional methods and generative AI tools. Their interactions were observed, followed by the administration of a questionnaire capturing their reflections. Qualitative analysis of the data revealed that students recognize the potential of generative AI for ideation and conceptual development but find its limitations frustrating for executing nuanced artistic tasks. This study highlights the current utility of generative AI as an inspirational and conceptual mentor rather than a precise artistic tool, and underscores the need for structured training and a balanced integration of generative AI with traditional design methods. Future research should focus on larger participant samples, assess the evolving capabilities of generative AI tools, and explore their potential to teach fundamental art concepts effectively while addressing concerns about academic integrity. Enhancing the functionality of these tools could bridge gaps between creativity and pedagogy in art education.
(This article belongs to the Special Issue Smart Learning Environments)

17 pages, 8996 KiB  
Article
The Impact of Ancient Greek Prompts on Artificial Intelligence Image Generation: A New Educational Paradigm
by Anna Kalargirou, Dimitrios Kotsifakos and Christos Douligeris
AI 2025, 6(4), 81; https://doi.org/10.3390/ai6040081 - 18 Apr 2025
Viewed by 1239
Abstract
Background/Objectives: This article explores the use of Ancient Greek as a prompt language in DALL·E 3, Artificial Intelligence software for image generation. The research investigates three dimensions of Artificial Intelligence’s ability: (a) the sense and visualization of the concept of distance, (b) the mixing of representational as well as mythic contents, and (c) the visualization of emotions. More specifically, the research not only investigates AI’s potentialities in processing and representing Ancient Greek texts but also attempts to assess its interpretative boundaries. The key question is whether AI can faithfully represent the underlying conceptual and narrative structures of ancient literature or whether its representations are superficial and constrained by algorithmic procedures. Methods: This is a mixed-methods experimental research design examining whether a specified Artificial Intelligence software can sense, understand, and graphically represent linguistic and conceptual structures in the Ancient Greek language. Results: The study highlights Artificial Intelligence’s potential in classical language education as well as digital humanities, regarding linguistic complexity versus AI’s power in interpretation.
Conclusions: The research is a step toward a more extensive discussion of Artificial Intelligence in historical linguistics, digital pedagogy, and aesthetic representation by Artificial Intelligence environments.

20 pages, 10573 KiB  
Article
A Validity Analysis of Text-to-Image Generative Artificial Intelligence Models for Craniofacial Anatomy Illustration
by Syed Ali Haider, Srinivasagam Prabha, Cesar A. Gomez-Cabello, Sahar Borna, Sophia M. Pressman, Ariana Genovese, Maissa Trabilsy, Andrea Galvao, Keith T. Aziz, Peter M. Murray, Yogesh Parte, Yunguo Yu, Cui Tao and Antonio Jorge Forte
J. Clin. Med. 2025, 14(7), 2136; https://doi.org/10.3390/jcm14072136 - 21 Mar 2025
Cited by 1 | Viewed by 1665
Abstract
Background: Anatomically accurate illustrations are imperative in medical education, serving as crucial tools to facilitate comprehension of complex anatomical structures. While traditional illustration methods involving human artists remain the gold standard, the rapid advancement of Generative Artificial Intelligence (GAI) models presents a new opportunity to automate and accelerate this process. This study evaluated the potential of GAI models to produce craniofacial anatomy illustrations for educational purposes. Methods: Four GAI models, Midjourney v6.0, DALL-E 3, Gemini Ultra 1.0, and Stable Diffusion 2.0, were used to generate 736 images across multiple views of the surface anatomy, bones, muscles, blood vessels, and nerves of the cranium in both oil painting and realistic photograph styles. Four reviewers evaluated the images for anatomical detail, aesthetic quality, usability, and cost-effectiveness. Inter-rater reliability analysis assessed evaluation consistency. Results: Midjourney v6.0 scored highest for aesthetic quality and cost-effectiveness, and DALL-E 3 performed best for anatomical detail and usability. The inter-rater reliability analysis demonstrated a high level of agreement among reviewers (ICC = 0.858, 95% CI). However, all models showed significant flaws in depicting crucial anatomical details such as foramina, suture lines, muscular origins/insertions, and neurovascular structures. These limitations were further characterized by abstract depictions, mixing of layers, shadowing, abnormal muscle arrangements, and labeling errors. Conclusions: These findings highlight GAI’s potential for rapidly creating craniofacial anatomy illustrations but also its current limitations due to inadequate training data and incomplete understanding of complex anatomy. Refining these models through precise training data and expert feedback is vital. Ethical considerations, such as potential biases, copyright challenges, and the risks of propagating inaccurate information, must also be carefully navigated. Further refinement of GAI models and ethical safeguards are essential for safe use.
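The inter-rater agreement statistic reported above (ICC = 0.858) is an intraclass correlation computed from a subjects × raters rating matrix. The abstract does not state which ICC form was used, so as an illustration only, here is a minimal NumPy sketch of ICC(2,1) (two-way random effects, absolute agreement, single rater):

```python
import numpy as np

def icc2_1(ratings: np.ndarray) -> float:
    """ICC(2,1) from a subjects x raters matrix via two-way ANOVA mean squares."""
    n, k = ratings.shape
    grand = ratings.mean()
    subj_means = ratings.mean(axis=1)
    rater_means = ratings.mean(axis=0)
    ss_total = ((ratings - grand) ** 2).sum()
    ss_subj = k * ((subj_means - grand) ** 2).sum()    # between-subjects
    ss_rater = n * ((rater_means - grand) ** 2).sum()  # between-raters
    ss_err = ss_total - ss_subj - ss_rater             # residual
    ms_subj = ss_subj / (n - 1)
    ms_rater = ss_rater / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return float((ms_subj - ms_err)
                 / (ms_subj + (k - 1) * ms_err + k * (ms_rater - ms_err) / n))
```

Perfect agreement yields 1.0; a constant offset between raters lowers the score because ICC(2,1) measures absolute agreement rather than mere consistency.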

33 pages, 129733 KiB  
Article
Mindful Architecture from Text-to-Image AI Perspectives: A Case Study of DALL-E, Midjourney, and Stable Diffusion
by Chaniporn Thampanichwat, Tarid Wongvorachan, Limpasilp Sirisakdi, Pornteera Chunhajinda, Suphat Bunyarittikit and Rungroj Wongmahasiri
Buildings 2025, 15(6), 972; https://doi.org/10.3390/buildings15060972 - 19 Mar 2025
Cited by 7 | Viewed by 3568
Abstract
Mindful architecture is poised to foster sustainable behavior and simultaneously mitigate the physical and mental health challenges arising from the impacts of global warming. Previous studies demonstrate that a substantial educational gap persists between architecture and mindfulness. However, recent advancements in text-to-image AI have begun to play a significant role in generating conceptual architectural imagery, enabling architects to articulate their ideas better. This study employs DALL-E, Midjourney, and Stable Diffusion—popular tools in the field—to generate imagery of mindful architecture. Subsequently, the architects decoded the architectural characteristics in the images into words. These words were then analyzed using natural language processing techniques, including Word Cloud Generation, Word Frequency Analysis, and Topic Modeling Analysis. Research findings conclude that mindful architecture from text-to-image AI perspectives consistently features structured lines with sharp edges, prioritizes openness with indoor–outdoor spaces, employs both horizontal and vertical movement, utilizes natural lighting and earth-tone colors, incorporates wood, stone, and glass elements, and emphasizes views of serene green spaces—creating environments characterized by gentle natural sounds and calm atmospheric qualities. DALL-E is the text-to-image AI that provides the most detailed representation of mindful architecture.
(This article belongs to the Section Architectural Design, Urban Science, and Real Estate)
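The word-frequency step of the NLP pipeline described above can be sketched with the Python standard library. This is a generic illustration, not the authors' code; the tokenizer and stopword list are placeholder assumptions:

```python
import re
from collections import Counter

# Placeholder stopword list; a real analysis would use a fuller set.
STOPWORDS = frozenset({"the", "and", "with", "of", "a", "in", "to"})

def word_frequencies(descriptions: list[str]) -> Counter:
    """Count content words across a list of free-text image descriptions."""
    counts: Counter = Counter()
    for text in descriptions:
        tokens = re.findall(r"[a-z']+", text.lower())  # lowercase word tokens
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts
```

For example, `word_frequencies(["natural lighting and wood", "wood and stone"]).most_common(1)` returns `[("wood", 2)]`; topic modeling would then run over the same token stream.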

30 pages, 34873 KiB  
Article
Text-Guided Synthesis in Medical Multimedia Retrieval: A Framework for Enhanced Colonoscopy Image Classification and Segmentation
by Ojonugwa Oluwafemi Ejiga Peter, Opeyemi Taiwo Adeniran, Adetokunbo MacGregor John-Otumu, Fahmi Khalifa and Md Mahmudur Rahman
Algorithms 2025, 18(3), 155; https://doi.org/10.3390/a18030155 - 9 Mar 2025
Cited by 1 | Viewed by 1382
Abstract
The lack of extensive, varied, and thoroughly annotated datasets impedes the advancement of artificial intelligence (AI) for medical applications, especially colorectal cancer detection. Models trained with limited diversity often display biases, especially when applied to disadvantaged groups. Generative models (e.g., DALL-E 2 and the Vector-Quantized Generative Adversarial Network (VQ-GAN)) have been used to generate images, but not colonoscopy data, for intelligent data augmentation. This study developed an effective method for producing synthetic colonoscopy image data, which can be used to train advanced medical diagnostic models for robust colorectal cancer detection and treatment. Text-to-image synthesis was performed using fine-tuned visual Large Language Models (LLMs): Stable Diffusion and DreamBooth Low-Rank Adaptation produced authentic-looking images, with an average Inception Score of 2.36 across three datasets. The validation accuracies of the classification models Big Transfer (BiT), Fixed Resolution Residual Next Generation Network (FixResNeXt), and Efficient Neural Network (EfficientNet) were 92%, 91%, and 86%, respectively; Vision Transformer (ViT) and Data-Efficient Image Transformers (DeiT) reached 93%. Secondly, for the segmentation of polyps, ground-truth masks were generated using the Segment Anything Model (SAM), and five segmentation models (U-Net, Pyramid Scene Parsing Network (PSPNet), Feature Pyramid Network (FPN), Link Network (LinkNet), and Multi-scale Attention Network (MANet)) were adopted. FPN produced excellent results, with an Intersection over Union (IoU) of 0.64, an F1 score of 0.78, a recall of 0.75, and a Dice coefficient of 0.77, demonstrating strong performance in both segmentation accuracy and overlap metrics, with particularly robust balanced detection capability as shown by the high F1 score and Dice coefficient. This highlights how AI-generated medical images can improve colonoscopy analysis, which is critical for early colorectal cancer detection.
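The overlap metrics reported above (IoU, Dice) follow directly from the confusion counts of binary segmentation masks. As a generic illustration (not the study's evaluation code), and noting that for binary masks the Dice coefficient equals the F1 score:

```python
import numpy as np

def seg_metrics(pred: np.ndarray, gt: np.ndarray) -> tuple[float, float]:
    """Return (IoU, Dice) for a pair of binary segmentation masks.

    IoU  = TP / (TP + FP + FN)
    Dice = 2*TP / (2*TP + FP + FN)   # identical to F1 for binary masks
    """
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()    # predicted polyp, truly polyp
    fp = np.logical_and(pred, ~gt).sum()   # predicted polyp, actually background
    fn = np.logical_and(~pred, gt).sum()   # missed polyp pixels
    iou = tp / (tp + fp + fn)
    dice = 2 * tp / (2 * tp + fp + fn)
    return float(iou), float(dice)
```

A mask with one shared, one spurious, and one missed pixel gives IoU = 1/3 and Dice = 0.5, matching the pattern in the reported numbers where Dice (0.77) exceeds IoU (0.64).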

17 pages, 9381 KiB  
Article
The Architectural Language of Biophilic Design After Architects Use Text-to-Image AI
by Chaniporn Thampanichwat, Tarid Wongvorachan, Limpasilp Sirisakdi, Panyaphat Somngam, Taksaporn Petlai, Sathirat Singkham, Bhumin Bhutdhakomut and Narongrit Jinjantarawong
Buildings 2025, 15(5), 662; https://doi.org/10.3390/buildings15050662 - 20 Feb 2025
Cited by 3 | Viewed by 2220
Abstract
Biophilic design is an architectural concept that bridges the gap between modern buildings and the innate human longing for nature. In addition, it promotes physical and mental well-being while aligning with several Sustainable Development Goals. Recent research highlights that the architectural language used to describe the attributes of biophilic architecture remains unclear. Previous research has shown that text-to-image AI enhances architects’ ability to articulate their ideas more effectively. Therefore, this study aims to address the following research question: What are the architectural languages of biophilic design after architects use text-to-image AI? The initial step involves generating images of biophilic architecture by using three popular text-to-image AI tools: DALL-E 3, MidJourney, and Stable Diffusion. The 30 selected images were used to help architects develop the architectural language to describe the characteristics of biophilic design across 10 categories: Form, Space, Movement, Light, Color, Material, Object, View, Sound, and Weather. The terms obtained were analyzed using natural language processing (NLP) techniques, including word cloud analysis, frequency analysis, and topic modeling. The results indicate that the architectural language of biophilic design exhibits greater detail and clarity after architects utilize text-to-image AI. Nevertheless, in some instances, the language used to describe biophilic design is also constrained by the images generated by the text-to-image AI that the architects observe.
(This article belongs to the Section Architectural Design, Urban Science, and Real Estate)
