Review

The Role of ChatGPT in Dermatology Diagnostics

1
Department of Dermatology, Rambam Health Care Campus, Haifa 3109601, Israel
2
Bruce Rappaport Faculty of Medicine, Technion Institute of Technology, Haifa 3525433, Israel
3
Ophthalmology Unit, Tzafon Medical Center, Tiberias 1528001, Israel
4
Dermatology Unit, Ziv Medical Center, Safed 13100, Israel
5
Maccabi Healthcare Services, Tel Aviv 6817110, Israel
*
Author to whom correspondence should be addressed.
Diagnostics 2025, 15(12), 1529; https://doi.org/10.3390/diagnostics15121529
Submission received: 10 May 2025 / Revised: 6 June 2025 / Accepted: 13 June 2025 / Published: 16 June 2025
(This article belongs to the Special Issue Machine-Learning-Based Disease Diagnosis and Prediction)

Abstract

Artificial intelligence (AI), especially large language models (LLMs) like ChatGPT, has disrupted different medical disciplines, including dermatology. This review explores the application of ChatGPT in dermatological diagnosis, emphasizing its role in natural language processing (NLP) for clinical data interpretation, differential diagnosis assistance, and patient communication enhancement. ChatGPT can enhance a diagnostic workflow when paired with image analysis tools, such as convolutional neural networks (CNNs), by merging text and image data. While it boasts great capabilities, it still faces some issues, such as its inability to perform any direct image analyses and the risk of inaccurate suggestions. Ethical considerations, including patient data privacy and the responsibilities of the clinician, are discussed. Future perspectives include an integrated multimodal model and AI-assisted framework for diagnosis, which shall improve dermatology practice.

1. Introduction

Artificial intelligence (AI) has become an increasingly pivotal force in medicine, offering transformative potential across diagnostics, treatment planning, and patient care [1]. In recent years, AI applications have significantly expanded, encompassing tools like natural language processing (NLP), image analysis, and machine learning, which enable precise analysis and prediction in diverse medical fields [2]. Large language models (LLMs), like ChatGPT, now propel an important frontier of healthcare innovation. While originally designed for use as conversational agents, these are now adapted for other purposes, such as patient communication, clinical decision support, and research assistance [3].
Dermatology diagnostics is an emerging area in which ChatGPT and similar LLMs show substantial promise. Because dermatology relies predominantly on visual inspection and accurate descriptive text, it is a distinctive context in which an AI-driven tool such as ChatGPT can offer a major benefit. This includes facilitating the interpretation and generation of medical texts and diagnostic image descriptions, which could assist healthcare professionals by generating descriptions or pinpointing characteristic patterns of various skin conditions [4,5]. In addition, NLP capabilities may allow ChatGPT to automatically mine the open access medical literature to help clinicians stay up to date with complex dermatological cases and relevant guidelines [4].
Electronic medical records (EMRs) are essential for dermatology, yet much of the data remains unstructured, making analysis challenging. Natural language processing (NLP) helps streamline EMR data, enabling automated documentation, improved patient history gathering, differential diagnosis suggestions, and integration with AI image analysis. These capabilities enhance patient care, especially in teledermatology, where NLP assists in organizing patient data for remote consultations. With models like ChatGPT-4 gaining traction, understanding their role in dermatology can maximize their potential in diagnostics and clinical support [6,7,8,9].
The objectives of this review are to examine the diverse applications of ChatGPT in dermatology diagnostics, such as interpreting images, natural language processing for synthesizing and summarizing clinical knowledge, integration with machine learning, and programmatic support for dermatological research. By examining these dimensions, we aim to dissect the potential, limitations, and ethical considerations that must be taken into account for responsible implementation within dermatological practice.
We focus on ChatGPT, developed by OpenAI, because it represents a significant step forward in the clinical use of AI, particularly in dermatology. It is one of the most prominent commercial products to intersect with the clinical world, especially for physicians who do not have the luxury of specializing in AI. Its NLP abilities allow it to work with clinical data to support differential diagnosis, interpretation of patient symptoms, and clinical documentation. With adoption increasing among dermatologists, a thorough review of the uses, benefits, and drawbacks of ChatGPT is needed. This review illustrates how ChatGPT can improve dermatologic care and highlights areas where improvement is needed so that clinical applications can be used more effectively and responsibly.

Background and Current State of AI in Dermatology

Over the last few decades, artificial intelligence (AI) has become an indispensable element in the field of dermatology diagnostics. Going from pattern recognition to deep learning systems, the evolution of AI in dermatological studies, both technologically and academically, has become more intricate [4]. Many of these early programs were designed to help clinicians recognize simple patterns in skin lesions through imaging techniques and computerized algorithms primarily to diagnose common skin conditions, such as melanoma and psoriasis [10].
Due to progress in machine learning (ML) and deep learning (DL), AI systems have become far more capable and adaptable [11]. DL models have proven especially useful in the analysis of skin lesions, where they learn from large sets of carefully annotated images and find patterns that are difficult for the human eye to detect [12]. For example, convolutional neural networks (CNNs) have attained an accuracy high enough to distinguish between benign and malignant lesions, sometimes performing at a level comparable to that of dermatologists [13,14]. However, traditional AI models have limitations. These include problems of generalizability across populations and lighting conditions and reliance on large, diverse datasets that are often difficult to collect and annotate within dermatology [15,16].
The development of AI has ushered in the era of large language models (LLMs), including OpenAI’s ChatGPT, which mark a change in the utility of AI in fields of medicine like dermatology. Although LLMs were originally built for natural language processing, they are now advanced enough to process and generate text in a manner similar to that of humans [17]. Unlike conventional ML or DL models, which mainly focus on image analysis, LLMs like ChatGPT can analyze text inputs for clinical decision support and facilitate patient–physician communication by explaining diagnostic reasoning or treatment options in lay language [18]. Such versatility makes LLMs an exciting addition to dermatology, especially in areas where image-based AI tools fall short, like patient-specific concerns or contextualizing image findings with patient histories [19].
ChatGPT can contribute to diagnostic workflows beyond the limits of image-only models by drawing on a substantial body of general medical knowledge: addressing patient concerns, supporting a diagnosis with evidence-based insights [20], and interpreting complicated skin conditions from detailed descriptions given by the patient. For instance, ChatGPT can offer plausible differential diagnoses or suggest follow-ups, which is especially advantageous in primary-care settings [21], where dermatologists are not readily accessible. However promising this area may be, the integration of LLMs such as ChatGPT trained on dermatology datasets must be approached cautiously, as they have limitations, for example, in precision-dependent cases or situations requiring critical diagnostic accuracy [22].

2. Image Analysis Using ChatGPT and Its Integration with Diagnostic Tools

In dermatology, image analysis plays a pivotal role in diagnostics and spans clinical, dermoscopic, histology, and confocal microscopy images. Instruments such as dermoscopes and clinical imaging systems provide high-resolution views that help with early detection and facilitate accurate diagnosis of conditions like melanoma and other skin problems [23]. Dermoscopic images in particular enable dermatologists to detect fine changes within skin lesions, since such structures are not readily visible to the naked eye, and this substantially increases diagnostic accuracy [24]. ChatGPT, being a large language model, is notable for interpreting prompts and formulating self-consistent text descriptions. When clinicians supply detailed image-related descriptions, ChatGPT can therefore provide relevant information on potential diagnostic implications or suggest areas for further examination [25]. This capability enables dermatologists to use ChatGPT as an auxiliary tool for generating or refining descriptions, notably in cases where immediate computer vision assessment is unavailable.
With descriptive assistance, ChatGPT may help dermatologists generate detailed and relevant descriptions that emphasize clinical signs, potentially minimizing human error and improving diagnostic accuracy [26]. ChatGPT can therefore assist the clinician by rephrasing observations or suggesting alternative diagnostic wording to elaborate an uncertain lesion description in a complex case; combined with expert input, this was reported to increase diagnostic reliability [27].
Combining computer vision (CV) neural networks and deep learning frameworks with ChatGPT might enhance existing diagnostic workflows in dermatology [28]. CNNs, trained on extensive datasets, are effective at identifying features of skin lesions, and their output could be passed to ChatGPT, whose natural language processing can turn it into interpretive summaries or comments that assist human interpreters [28,29]. This symbiotic approach combines the strengths of both systems: computer vision identifies the features, and the language model relates them and their descriptions to a clinically relevant narrative.
Earlier GPT models accepted only text input, so visual data had to be conveyed through detailed descriptions. Such descriptions were often incomplete or lost something vital, especially in dermatology, where visual subtleties are central [30]. GPT-4’s multimodal capabilities partly circumvent this limitation, as the model can directly analyze an image together with text. However, GPT-4 still lags behind specialized computer vision models for medical imaging and lacks the ability to cross-check its interpretations. It could therefore be a very handy adjunct but is not suitable as an independent diagnostic tool.
Further progress in dermatologic diagnostic technologies may come from combining off-the-shelf large language models, like ChatGPT, with next-generation computer vision technologies in a joint approach in which LLMs interpret visual data together with clinical parameters [22]. A conceptual overview of such a multimodal system is illustrated in Figure 1.
As LLMs evolve to interact more directly with visual input, they could provide predictive insights along with image analysis, streamlining diagnostics and maximizing early detection of dermatologic conditions. Such advances could lead to systems where LLMs like ChatGPT continuously refine image interpretations made by CNNs, culminating in a hybrid diagnostic model that integrates textual and visual analytics for clinical support [10,12].
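The hybrid workflow described in this section, in which an image classifier's output is handed to a language model together with the patient history, can be illustrated with a minimal sketch. It only assembles the text prompt and makes no call to any model. The function name `build_llm_prompt`, the label names, the probabilities, and the prompt wording are all hypothetical and are not taken from any of the reviewed studies.

```python
from typing import Dict


def build_llm_prompt(class_probs: Dict[str, float], history: str, top_k: int = 3) -> str:
    """Combine image-classifier scores with clinical text into one LLM prompt."""
    if not class_probs:
        raise ValueError("class_probs must not be empty")
    # Rank the classifier's labels from most to least probable and keep top_k.
    ranked = sorted(class_probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    findings = "\n".join(f"- {label}: {prob:.0%}" for label, prob in ranked)
    return (
        "Image classifier output (top candidates):\n"
        f"{findings}\n\n"
        f"Clinical history: {history.strip()}\n\n"
        "Task: summarize these findings for a dermatologist, list differential "
        "diagnoses to consider, and state what further examination is needed. "
        "Do not give a final diagnosis."
    )


# Hypothetical classifier scores and history, for illustration only.
example = build_llm_prompt(
    {"melanoma": 0.62, "benign nevus": 0.30, "seborrheic keratosis": 0.08},
    "54-year-old with a pigmented lesion on the back that changed over 3 months.",
    top_k=2,
)
print(example)
```

Keeping the classifier scores explicit in the prompt lets the clinician see which part of the narrative comes from the image model and which from the language model.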

3. Natural Language Processing (NLP) Capabilities of ChatGPT in Dermatology

NLP has already become a disruptive technology in the healthcare domain, mainly in diagnostics and patient management. NLP can analyze unstructured data in electronic health records (EHRs), such as patient notes, clinical histories, and diagnostic reports, which helps healthcare providers gain insights that support better and more accurate decisions in diagnosis and planning. In pathology, NLP can help to extract relevant data from the patient history and allow quick analysis of text-based information, supporting diagnostic workflows while easing the clinician’s workload [6,7,8,9].
ChatGPT, developed by OpenAI, is a prominent example of efficient natural language processing applied to dermatology. Using the patient anamnesis and clinical notes, ChatGPT can extract, analyze, and present the pertinent information that could guide a clinician’s diagnostic decision. For example, when reviewing patient medical histories, ChatGPT can surface dermatologic symptoms such as pruritus, erythema, and lesions, allowing better initial assessments. In addition, ChatGPT can pick out relevant dermatologic symptoms from free-text input, giving clinicians a more efficient way to gather the information needed for diagnosis and smoothing the initial triage process in a wide range of clinical and telemedicine settings [6,9,30,31].
Teledermatology, the use of telemedicine to diagnose skin conditions, has gained ground in recent years as demand for remote healthcare solutions has grown. ChatGPT’s NLP capabilities strengthen documentation tools, notably through automated summaries of patient interactions and symptom reports. These features let dermatologists concentrate on the case while helping to keep the necessary medical records accurate and comprehensive. ChatGPT also supports communication by generating responses that answer patients’ questions, clarify instructions, and lay out the treatment plan in terms a patient can understand, especially for patients seeking dermatoscopy through virtual modalities [8,9,30,32,33,34].
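To make the extraction task concrete, the following minimal sketch pulls symptom mentions out of a free-text note with a keyword lexicon. This is a simple rule-based baseline, not how ChatGPT works internally; it only shows the kind of structured output such a step produces. The vocabulary in `SYMPTOM_TERMS` and the sample note are hypothetical, and the sketch does not handle negation (for example, "denies itching").

```python
import re
from typing import Dict, List

# Hypothetical, deliberately tiny vocabulary; a real lexicon would be far larger.
SYMPTOM_TERMS: Dict[str, List[str]] = {
    "pruritus": ["pruritus", "itch", "itching", "itchy"],
    "erythema": ["erythema", "redness", "erythematous"],
    "lesion": ["lesion", "lesions", "plaque", "plaques", "papule", "papules"],
}


def extract_symptoms(note: str) -> List[str]:
    """Return the canonical symptoms whose variants appear as whole words in the note."""
    text = note.lower()
    found = []
    for symptom, variants in SYMPTOM_TERMS.items():
        # Whole-word match on any variant of this symptom.
        pattern = r"\b(?:" + "|".join(map(re.escape, variants)) + r")\b"
        if re.search(pattern, text):
            found.append(symptom)
    return found


note = "Pt reports intense itching for 2 weeks. Erythematous plaques on both elbows."
print(extract_symptoms(note))
```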
Recent studies have begun to examine the applicability of ChatGPT and comparable natural language processing models in dermatology. In one study, ChatGPT was able to extract disease-specific terminology from dermatological case notes, supporting the diagnostic work of clinicians [6,7,26]. Another study suggested that ChatGPT and various NLP approaches may further compress and summarize patient records, as well as create documentation errors in other medical methods [26,35,36]. In comparisons with traditional dermatological diagnostic methods, some studies used OpenAI’s ChatGPT and other NLP models to help non-specialist providers identify common skin diseases, which could give patients in under-served areas greater access to dermatological services [9,37].
Like other natural language processing technologies used in healthcare, ChatGPT in dermatology raises significant ethical concerns regarding patient data, confidentiality, and privacy. Its use must conform to the legal requirements for patient confidentiality, meaning that all applicable data protection standards, such as HIPAA in the United States, must be observed. A further ethical concern is the accuracy of NLP: imprecise wording or interpretation may lead to improper diagnoses. Automated processing of clinical data in dermatology therefore carries risk, and appropriate data governance mechanisms and clinician oversight are necessary to mitigate it [38,39,40].
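One data governance step that is often considered before clinical text is sent to an external model is removing obvious identifiers. The sketch below is only an illustration under strong assumptions: the three regular expressions and the placeholder tokens are hypothetical, they cover only e-mail addresses, numeric dates, and one phone number format, and they do not detect names. It should not be read as a method for achieving HIPAA compliance.

```python
import re

# Hypothetical patterns; real de-identification needs far broader coverage.
PATTERNS = [
    (re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"), "[EMAIL]"),
    (re.compile(r"\b\d{1,2}[/-]\d{1,2}[/-]\d{2,4}\b"), "[DATE]"),
    (re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"), "[PHONE]"),
]


def redact(text: str) -> str:
    """Replace each matched identifier with its placeholder token."""
    for pattern, placeholder in PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


note = "Seen on 03/14/2024, call 555-123-4567 or mail jane.doe@example.org re: psoriasis flare."
print(redact(note))
```

Clinical content such as "psoriasis flare" is left untouched, so the redacted note remains usable for summarization while carrying fewer direct identifiers.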

4. Machine Learning and Programming Applications in Dermatology Diagnostics

Researchers can use ChatGPT to create and refine ML algorithms [41] oriented toward dermatology diagnostics. Its suggestions revolve around feature selection, model optimization, and data preprocessing strategies tailored to the clinical requirements and the structure of the data [42]. For example, it may suggest ways to improve the performance of convolutional neural networks (CNNs) that analyze dermatological images, such as data augmentation to reduce overfitting [28,43,44]. ChatGPT also helps dermatology researchers develop, optimize, and debug code, so they can complete programming tasks more quickly [9,45]. For example, it can help write Python scripts for training ML models on dermatological image datasets, create data preprocessing pipelines for skin lesion image datasets, or resolve problems in integrating libraries like TensorFlow (version 2.19.0) or PyTorch (v2.7.0) into existing workflows. ChatGPT can also explain coding errors or suggest alternative methods to improve computational efficiency [46,47].
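As an illustration of the data augmentation mentioned above, the following minimal sketch applies random flips and 90-degree rotations to an image. To stay self-contained it represents a grayscale image as a nested list and uses only the Python standard library; this representation and the function names are assumptions made for the example. In practice, the augmentation utilities of a framework such as TensorFlow or PyTorch would normally be used instead.

```python
import random
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel values (assumed format)


def hflip(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def vflip(img: Image) -> Image:
    """Mirror the image top to bottom."""
    return [row[:] for row in img[::-1]]


def rot90(img: Image) -> Image:
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def augment(img: Image, rng: random.Random) -> Image:
    """Return a randomly flipped and rotated copy; pixel values are unchanged."""
    out = [row[:] for row in img]
    if rng.random() < 0.5:
        out = hflip(out)
    if rng.random() < 0.5:
        out = vflip(out)
    for _ in range(rng.randrange(4)):  # 0 to 3 quarter turns
        out = rot90(out)
    return out


lesion = [[1, 2], [3, 4]]
rng = random.Random(0)  # fixed seed so a run can be repeated
variants = [augment(lesion, rng) for _ in range(4)]
print(variants)
```

Flips and quarter turns are a natural choice for skin lesion images because a lesion has no canonical orientation, so each variant is as plausible as the original.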

5. Summary of the Literature Findings on Machine Learning and Programming Applications in Dermatology Diagnostics

Recent advances in artificial intelligence widen its applications in dermatology, and ChatGPT emerges as a new tool to aid in diagnosis. This review presents findings from studies of ChatGPT’s utility in dermatological diagnosis, compares it with traditional AI models, and outlines its reception by dermatologists and patients.
ChatGPT has been evaluated for its potential in dermatology diagnostics in a few studies. For example, Ferreira et al. (2023) found that ChatGPT can competently recognize some symptoms of conditions like eczema and acne, showing 88% agreement with expert dermatologists [37]. Similarly, Lam Hoai and Simonart (2023) and Shapiro et al. (2024) evaluated its treatment advice, reporting a moderate to high degree of accuracy in conformity with current clinical guidelines [6,7].
ChatGPT has recently been compared with other AI models in dermatology, bringing various strengths and weaknesses to light. One study compared ChatGPT and Claude 3 Opus on 100 dermoscopic images of malignant and benign melanocytic lesions. ChatGPT was slightly more accurate in its top three diagnoses (78% versus 76%), while Claude 3 Opus, with higher sensitivity and specificity, was better than ChatGPT at malignancy discrimination. ChatGPT, despite its greater breadth of knowledge and contextual consideration, sometimes misclassified benign conditions as malignant, a concern that underlines the need for better fine-tuning before clinical use [48].
The user experience of ChatGPT is another major area of study. Goktas and Grzybowski found ChatGPT to be conversationally useful in supporting patient education and decision-making [45,49]. Alanezi (2024) reported that patients were highly satisfied with ChatGPT for preliminary consultations, citing accessibility and a non-threatening interface as its main strengths [50].
A recent review assessing the performance of ChatGPT across a range of dermatological conditions found that its accuracy varied. ChatGPT has been reported to be highly accurate for conditions such as psoriasis and eczema, yet users reported poor performance on more complex cases such as cutaneous neoplasms, some of which require nuance for proper identification. For example, ChatGPT fared better at identifying benign lesions from descriptions but had serious trouble correctly diagnosing malignant lesions compared to image-based AI models.
Performance varied widely, which calls for larger datasets or model adjustments so that the complexity and variability of dermatological manifestations can be handled [45].
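The metrics used in such comparisons (top-three accuracy, sensitivity, and specificity) can be made concrete with a short sketch. The labels and predictions below are toy values invented for illustration and are not data from the cited study; the function names are likewise hypothetical.

```python
from typing import List, Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Binary labels: 1 = malignant, 0 = benign. Returns (sensitivity, specificity)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one malignant and one benign case")
    return tp / (tp + fn), tn / (tn + fp)


def top_k_accuracy(true_labels: Sequence[str], ranked: Sequence[List[str]], k: int = 3) -> float:
    """Share of cases whose true label appears among the first k ranked diagnoses."""
    if not true_labels:
        raise ValueError("no cases given")
    hits = sum(1 for t, r in zip(true_labels, ranked) if t in r[:k])
    return hits / len(true_labels)


# Toy data, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)

truth = ["melanoma", "nevus", "melanoma", "nevus"]
ranked = [
    ["nevus", "melanoma", "bcc"],
    ["nevus", "sk"],
    ["sk", "bcc", "nevus"],
    ["melanoma", "sk", "bcc", "nevus"],
]
top3 = top_k_accuracy(truth, ranked, k=3)
print(sens, spec, top3)
```

The two metrics answer different questions, which is why one model can lead on top-three accuracy while another is better at separating malignant from benign lesions.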
The reliability of ChatGPT in generating accurate differential diagnoses remains a topic of ongoing debate. Some studies highlight its ability to provide consistent and reasonable diagnostic suggestions for common dermatological conditions, demonstrating its potential as a supportive tool in clinical settings [7,9,45]. However, other research highlights occasional inaccuracies, particularly in cases involving rare or atypical presentations [48,49,51]. These discrepancies underline that while ChatGPT can complement clinical efforts by offering differential diagnoses, it should not be relied upon as a standalone diagnostic tool. Experts consistently emphasize its role as an adjunct to enhance, rather than replace, professional judgment.
To date, several studies on the uses of Generative Pre-trained Transformers (GPTs) in dermatology have explored diagnostics, therapy recommendations, and patient education. Table 1 summarizes 18 key studies that illustrate the variety of GPT applications.
Most of these studies (10) address the capacity of GPTs to support dermatologic diagnosis and treatment through text-based, multimodal, or image-based systems. For instance, GPT-4 showed high diagnostic accuracy across several dermatologic conditions in both text and multimodal scenarios but does not yet reach a satisfactory level of accuracy in image-based diagnosis, as discussed by Pillai et al. [52]. Comparative studies, such as that of Liu et al., examined GPTs against other AI models, showing areas of strength as well as areas needing refinement in melanoma diagnosis [18].
Less research has focused on management and treatment recommendations, but a study by Iqbal et al. indicated that GPT-4 may offer second opinions on dermatology treatments that earned wide approval among dermatologists [53]. Yet it has trouble with more difficult cases, which leaves room for improvement.
As for educational and examination performance, GPT-4 is notably more effective than its predecessors, attaining accuracies of 75 to 93% in various dermatological examinations (Elias et al.; Passby et al.) [54,55]. These works suggest that GPT-4 could serve as an adjunct to medical education; however, its shortcomings on high-difficulty questions indicate that its answers still require clinical assessment.
Finally, studies on NLP applications have shown GPT analyzing unstructured patient data and producing patient education materials. Examples include Shapiro et al., where GPT-4 achieved high accuracy on psoriasis patient records [6], and Lambert et al., where it generated accessible patient education materials tailored to different reading levels [56].
In general, while most studies center on diagnostic applications, the growing exploration of GPT in treatment and education reflects interest in its multifaceted potential. However, further research into the integration of GPT into dermatology is needed because of challenges such as ethical concerns, limitations in dealing with complex cases, and variable performance. An overview of the published studies on ChatGPT and other LLMs in dermatology is summarized in Table 2.
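Reading level, as targeted in patient education materials, is commonly estimated with a readability formula. The sketch below computes the Flesch-Kincaid grade level, 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The choice of this formula is an assumption made for illustration, since the text does not state which measure the cited study used, and the vowel-group syllable counter is a rough heuristic rather than a linguistic tool.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: count vowel groups and discount a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level of an English text."""
    words = re.findall(r"[A-Za-z]+", text)
    if not words:
        raise ValueError("text contains no words")
    sentences = max(len(re.findall(r"[.!?]+", text)), 1)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / sentences + 11.8 * syllables / len(words) - 15.59


# Two hypothetical explanations of a skin condition at different reading levels.
simple = "The rash is red. It may itch. Use the cream each day."
complex_text = (
    "Psoriasis is a chronic immune-mediated inflammatory dermatosis "
    "characterized by erythematous, hyperkeratotic plaques."
)
print(flesch_kincaid_grade(simple), flesch_kincaid_grade(complex_text))
```

A score like this could be used to check whether generated material actually lands near the requested grade level before it is given to patients.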
Table 1. Categorization of studies on GPT in dermatology.
| Type | Focus Area | Paper Title/Details | Authors | Year | Summary of Findings | Version of GPT | Image Dataset Source | Diagnosis Compared to | Study Type | Diagnosis Made Based on Images Alone or Metadata | Image no. | Types of Images |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Diagnostics | Image Processing | Evaluating the Diagnostic and Treatment Recommendation Capabilities of GPT-4 Vision in Dermatology [52] | Pillai et al. | 2024 | GPT-4V demonstrated strong diagnostic accuracy for dermatological conditions, especially in text-based scenarios, with 89% accuracy in both text and multimodal setups. However, its image-based diagnosis showed lower performance, highlighting the need for further model development. | GPT-4.0 | Publicly available sources: dermnet.nz and dermatlas.org. | Two board-certified dermatologists | N/A | A combination of both images and metadata | 54 images. | Images depicting 9 common dermatological conditions, showcasing classic manifestations of these conditions. |
| | | Argentine dermatology and ChatGPT: infrequent use and intermediate stance [57] | Ko et al. | 2024 | A survey of 257 Argentine dermatologists showed 83.7% were familiar with ChatGPT, but 65.4% had never used it. While 74.9% expressed interest in future use, only 5.4% used it frequently. Most were ‘early majority’ adopters. | N/A | N/A | N/A | Prospective | N/A | N/A | N/A |
| | | Claude 3 Opus and ChatGPT With GPT-4 in Dermoscopic Image Analysis for Melanoma Diagnosis: Comparative Performance Analysis [48] | Liu et al. | 2024 | This study compared the diagnostic performance of Claude 3 Opus and ChatGPT for melanoma detection, finding no significant difference in primary diagnosis accuracy but superior malignancy discrimination by Claude 3 Opus. Both models showed potential, but their limitations highlight the need for further development in AI-driven dermatology tools. | GPT-4.0 | The International Skin Imaging Collaboration (ISIC) archive. | N/A | N/A | Image | 100 | Dermoscopic images of melanocytic lesions. |
| | | ChatGPT versus clinician: challenging the diagnostic capabilities of artificial intelligence in dermatology [58] | Stoneham et al. | 2024 | ChatGPT correctly diagnosed 56% of cases with expert data and 39% with non-specialist data, lower than dermatologists (83%). It always provided a differential diagnosis but did not significantly improve diagnostic accuracy in primary or secondary care. | GPT-4.0 | N/A | The diagnosis made by ChatGPT was compared to those made by dermatologists (experts) and nonspecialists. | Retrospective | Metadata | N/A | N/A |
| | NLP | A Qualitative Analysis of Provider Notes of Atopic Dermatitis-Related Visits Using Natural Language Processing Methods [8] | Pierce et al. | 2021 | This study analyzed provider notes for 133,025 patients with atopic dermatitis (AD), revealing a focus on symptoms (primarily itch) and treatment, but limited documentation of AD’s impact on patients’ work or lifestyle. The findings highlight a care gap that requires further investigation. | N/A | N/A | N/A | Retrospective | N/A | N/A | N/A |
| | | Application of a natural language processing artificial intelligence tool in psoriasis: A cross-sectional comparative study on identifying affected areas in patients’ data [6] | Shapiro et al. | 2024 | ChatGPT-4 accurately analyzed unstructured EMR data from psoriasis patients, identifying affected body areas with 92.8% accuracy. It demonstrated high performance in detecting nail and joint involvement, though errors were more common in complex cases. | GPT-4.0 | The study does not involve images; it uses unstructured text data from EMRs. | Senior dermatologist | Retrospective | Metadata | N/A | N/A |
| | | Comparing Meta-Analyses with ChatGPT in the Evaluation of the Effectiveness and Tolerance of Systemic Therapies in Moderate-to-Severe Plaque Psoriasis [7] | Lam Hoai et al. | 2023 | ChatGPT-4 accurately analyzed psoriasis patient data, identifying affected areas with 92.8% accuracy. It performed well in detecting nail and joint involvement, though errors occurred in complex cases. | N/A | The study does not mention the use of an image dataset; it focuses on evaluating textual data and conclusions from meta-analyses. | Experts | Retrospective | The study did not involve image-based diagnosis; it focused on evaluating textual conclusions from meta-analyses and ChatGPT outputs | N/A | N/A |
| Patient Interaction | Use of ChatGPT for Query Handling | Trends in Accuracy and Appropriateness of Alopecia Areata Information Obtained from a Popular Online Large Language Model, ChatGPT [59] | O’Hagan et al. | 2023 | ChatGPT 4.0 demonstrated higher accuracy (4.53/5) than ChatGPT 3.5 (4.29/5) in addressing patient questions about alopecia areata. Responses were rated highly appropriate for general information and moderately suitable for EHR drafts, indicating potential for patient education and clinical use. | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| | | Comparing the quality of ChatGPT- and physician-generated responses to patients’ dermatology questions in the electronic medical record [9] | Reynolds et al. | 2024 | This study was evaluating responses to patient questions, physician-generated responses were preferred over ChatGPT’s, especially for readability and empathy. However, ChatGPT was seen as useful for drafting initial responses and providing educational information. | GPT-3.5 | N/A | The diagnosis was compared to responses from dermatology physicians, as well as nonphysicians (blinded reviewers). | Retrospective | N/A | N/A | N/A |
| | Generates learning materials | Assessing the Application of Large Language Models in Generating Dermatologic Patient Education Materials According to Reading Level: Qualitative Study [56] | Lambert et al. | 2024 | LLMs like GPT-4 generate dermatologic patient education materials (PEMs) at specified reading levels, with GPT-4 performing best at the fifth-grade level for both common and rare conditions. PEMs produced by LLMs are generally accurate, easy to read, and understandable for patients, with variable results at the seventh-grade level. | ChatGPT-3.5, GPT-4.0. | N/A | The diagnosis was compared to 2 blinded dermatology resident trainees. | N/A | N/A | N/A | N/A |
| Others | Performing exams | ChatGPT-3.5 and ChatGPT-4 dermatological knowledge level based on the Specialty Certificate Examination in Dermatology [60] | Lewandowski et al. | 2024 | ChatGPT-4 outperformed ChatGPT-3.5 in dermatology exams, achieving 80–93% accuracy in English and 70–84% in Polish. While effective in clinical decision support, it struggles with high-difficulty questions. Recommended for aiding but not replacing physicians. | ChatGPT-3.5, GPT-4.0. | N/A | The study compared the performance of ChatGPT to that of a dermatologist with 25 years of experience, who reviewed the questions for compliance with current knowledge. | Retrospective | N/A | N/A | N/A |
| | | Performance of ChatGPT on Specialty Certificate Examination in Dermatology multiple-choice questions [55] | Passby et al. | 2024 | ChatGPT-4 scored 90% on 84 multiple-choice dermatology questions, outperforming ChatGPT-3.5 (63%). This highlights AI’s potential in clinical decision-making, with caution regarding complex cases and patient safety. | ChatGPT-3.5 and ChatGPT-4. | N/A | N/A | N/A | N/A | N/A | N/A |
| | | OpenAI’s GPT-4 performs to a high degree on board-style dermatology questions [54] | Elias et al. | 2024 | GPT-4 achieved 75% accuracy on dermatology board-style questions, showing potential as an educational tool but requiring improvements in response depth and completeness for unsupervised learning. Its performance was consistent across subspecialties and question difficulty. | GPT-4.0 | N/A | The diagnosis was compared to the correct answers evaluated by two physicians. | Cross-sectional study | N/A | N/A | N/A |
| | | Pediatric dermatologists versus AI bots: Evaluating the medical knowledge and diagnostic capabilities of ChatGPT [61] | Huang et al. | 2024 | This study compares OpenAI’s ChatGPT (versions 3.5 and 4.0) to pediatric dermatologists in answering multiple-choice and case-based questions. Results show that while human clinicians outperformed both AI versions, ChatGPT-4.0 performed comparably in some areas, highlighting AI’s potential with clinician oversight. | GPT-3.5 and GPT-4.0 | The image dataset was not used. For cases with accompanying images, only text descriptions were included | The diagnosis was compared to pediatric dermatologists | Prospective | Text descriptions alone | N/A | N/A |
| | Management | An evaluation of ChatGPT compared with dermatological surgeons’ choices of reconstruction for surgical defects after Mohs surgery [62] | Cuellar-Barboza et al. | 2024 | This study found that while ChatGPT-4 showed slight concordance with dermatologists in reconstructive decision-making for skin cancer surgery, the agreement was lower than that between dermatologists themselves. The findings highlight the variability in AI-driven medical decisions and the importance of certified expertise. | ChatGPT-4.0 | N/A | The diagnosis was compared to dermatological surgeons’ choices. | Retrospective | N/A | N/A | N/A |
| | | Evaluation of ChatGPT’s acne advice [63] | Li et al. | 2024 | This study assessed ChatGPT’s responses to acne-related queries, evaluating accuracy, completeness, and relevance. Dermatologists rated the responses as satisfactory, limited, or problematic, revealing variable quality in the answers. | ChatGPT-3.5 | N/A | The diagnosis was compared to two board-certified dermatologists | Retrospective | N/A | N/A | N/A |
| Machine Learning | Diagnostics | Pre-trained multimodal large language model enhances dermatological diagnosis using SkinGPT-4 [64] | Zhou et al. | 2024 | SkinGPT-4 is an interactive dermatology diagnostic system that combines a vision transformer and Llama-2-13b-chat, trained on 52,929 skin disease images. It autonomously diagnoses skin conditions, analyzes characteristics, and offers treatment recommendations based on real-life evaluations with dermatologists.
TreatmentCan large language models provide secondary reliable opinion on treatment options for dermatological diseases? [53]Iqbal et al. 2024Proven potential to provide accurate second opinions on dermatological medication recommendations (98.87% approval rate by dermatologists). However, limitations include occasional coding inaccuracies and incomplete data, suggesting the need for domain-specific knowledge integration.
ExamsAssessing large language models’ accuracy in providing patient support for choroidal melanoma [65]Anguita et al. 2024ChatGPT provided the most accurate answers (92%) for medical advice questions about choroidal melanoma compared to Bing AI and DocsGPT. However, inconsistencies highlight the need for fine-tuning and oversight before clinical use. Assessing large language models’ accuracy in providing patient support for choroidal melanoma [65]Anguita et al.2024
Performance of Three Large Language Models on Dermatology Board Examinations [66]Mirza et al.2024GPT-4 outperformed GPT-3.5 and Google Bard in dermatology board-style questions, achieving 81.7% accuracy and passing CORE and APPLIED exams. Challenges included difficulty with higher-order and complex questions. Performance of Three Large Language Models on Dermatology Board Examinations [66]Mirza et al.2024
[53,64] ReviewsReview—diagnosticsAssessing the Impact of ChatGPT in Dermatology: A Comprehensive Rapid Review. [67]Goktas et al.2024ChatGPT shows promise in patient education and teledermatology but faces challenges in diagnosing complex cases and raises ethical concerns regarding data privacy and algorithmic bias. Future research should focus on improving its diagnostic accuracy and addressing these issues.
General review on the potentialChatGPT and dermatology [68]D’AGOSTINO et al.2024This review explores the potential applications of ChatGPT in dermatology, highlighting its role in clinical practice and patient support. It emphasizes the synergy between AI and dermatology, driving innovation in healthcare delivery.
Potential applications of ChatGPT in dermatology [69]Kluger2023Supports clinical decision-making and treatment planning with high accuracy for common conditions. Facilitates patient education, simplifies medical writing, and integrates into teledermatology platforms for consultations and triage. Limitations include challenges in multilingual settings, image interpretation, and ethical concerns.
Ethical considerations for artificial intelligence in dermatology: a scoping review [70]Gordon et al.2024AI applications span mobile apps for skin cancer detection, clinical image analysis, and large language models for diagnostic queries. Ethical concerns include biases, misdiagnosis risks, data privacy, and exacerbation of health disparities in teledermatology. Benefits include improved access, decision-making, and efficiency in clinical practice, but safeguards are necessary for ethical use.
Analyzing potentialChatGPT for healthcare providers and patients: Practical implications within dermatology [71]Jin et al.2023Identified five domains of use in dermatology, including automating administrative tasks, enhancing patient education, supporting medical education, aiding clinical research, and improving health literacy. Ethical challenges include risks of “artificial hallucinations,” biases, and outdated datasets, necessitating systematic validation and usage guidelines.
The Arrival of Artificial Intelligence Large Language Models and Vision-Language Models: A Potential to Possible Change in the Paradigm of Healthcare Delivery in Dermatology [19]Gupta et al.2024The study explores the potential of large language models (LLMs) and vision-language models (VLMs) in dermatology, addressing how AI can improve patient care amidst challenges like workload and staffing shortages. AI technologies, such as ChatGPT and Google Bard, could transform dermatology by integrating text and image inputs.

6. Limitations of Using Large Language Models (LLMs) Like ChatGPT in Dermatology Diagnostics

Although LLMs like ChatGPT have shown promise in aiding dermatology diagnostics, they still have considerable shortcomings. A major limitation is their inability to directly interpret dermatoscopic or clinical photographs, which are central to the diagnosis of most skin conditions. As a result, they cannot make independent diagnoses and must rely on other tools or be integrated with image-processing systems [72,73]. The introduction of GPT-4 has partly addressed this limitation. When combined with vision models or external APIs, GPT-4 can process and analyze images to some degree, which makes it more relevant to dermatological practice by joining text and visual diagnostic workflows [74].
Despite these advancements, the risk of “hallucination” (generating incorrect or misleading suggestions) remains a concern, particularly in high-stakes clinical scenarios [75]. Furthermore, dermatologic conditions with similar visual presentations often require nuanced differential diagnosis, a task that AI-based imaging tools may not reliably perform without contextual clinical information [48,58]. These models also still require integration with domain-specific knowledge bases to ensure relevance and accuracy, as their general training data often lack the granularity necessary for specialized medical contexts [76,77].
Uploading images to GPT also raises ethical concerns. Before images are shared, patient permission must be obtained, with due regard for dignity and confidentiality. Temporary chat mode, in which the conversation and any uploaded data are automatically deleted when the session closes, is preferable for limiting potential risks, as it helps prevent misuse and offers better privacy protection. The availability of such temporary, single-session interaction is a considerable advantage for maintaining ethical standards when AI technologies are used in healthcare [70,78,79].

7. Future Perspectives

The future of integrating LLMs like ChatGPT into dermatology looks bright, with several anticipated developments listed below:
  • Better Integration with Diagnostic Tools:
Combining ChatGPT with dermatoscopes, smartphone applications, and other diagnostic technologies has the potential to improve clinicians’ workflow by facilitating rapid data interpretation and prompt decision-making. For instance, linking ChatGPT with a smartphone app designed for clinical use would allow real-time, dialogue-based analysis of patient-reported symptoms in conjunction with visual data from dermatoscopic images.
  • Multimodal Models:
Future AI systems that combine text and image analysis could support a more holistic diagnostic approach. Such multimodal models could better connect the language capabilities of ChatGPT with imaging technologies, improving diagnostic accuracy for conditions that require visual assessment, such as melanoma and psoriasis.
  • Personalized Treatment Recommendations:
LLMs could be personalized to incorporate patient-specific data such as genetic profiles, medical history, and lifestyle preferences into treatment recommendations. This would enable dermatologists to provide more targeted care to their patients.
  • AI-Assisted Diagnostic Frameworks:
The development of frameworks built around models like ChatGPT could reshape dermatology practice through AI-assisted consultations and clinical decision support systems for diagnosis, enhancing clinical efficiency and precision.
  • Future Research Directions:
Longitudinal studies assessing the long-term reliability and outcomes of these technologies are needed for the promise of AI in dermatology to be fully realized. These can include real-world implementation projects and clinical trials that explore efficiency and scalability and verify that AI models perform without bias across different populations and settings.
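
To make the idea of a multimodal, AI-assisted framework more concrete, the following is a minimal sketch of one simple way text-based and image-based outputs could be combined, namely late fusion by weighted averaging of per-diagnosis probabilities. It is not the method of any study reviewed here. The function names, the `image_weight` parameter, the diagnosis labels, and all probability values are hypothetical and chosen only for illustration; in practice the two inputs would come from a language model and an image classifier, and any output would still require clinician review.

```python
from typing import Dict, List, Tuple


def late_fusion(text_probs: Dict[str, float],
                image_probs: Dict[str, float],
                image_weight: float = 0.5) -> Dict[str, float]:
    """Weighted average of two probability maps over candidate diagnoses.

    Hypothetical helper: a label missing from one map is treated as
    probability 0 for that modality. The result is renormalized to sum to 1.
    """
    if not 0.0 <= image_weight <= 1.0:
        raise ValueError("image_weight must be between 0 and 1")
    labels = set(text_probs) | set(image_probs)
    fused = {
        label: (1.0 - image_weight) * text_probs.get(label, 0.0)
        + image_weight * image_probs.get(label, 0.0)
        for label in labels
    }
    total = sum(fused.values())
    if total == 0:
        raise ValueError("no probability mass to fuse")
    return {label: p / total for label, p in fused.items()}


def rank_diagnoses(fused: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (diagnosis, probability) pairs, most likely first."""
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Illustrative, made-up inputs: one map from a text model, one from an image model.
text_probs = {"psoriasis": 0.6, "eczema": 0.3, "melanoma": 0.1}
image_probs = {"psoriasis": 0.5, "eczema": 0.4, "melanoma": 0.1}

fused = late_fusion(text_probs, image_probs, image_weight=0.5)
for diagnosis, probability in rank_diagnoses(fused):
    print(f"{diagnosis}: {probability:.2f}")
```

A fixed weight is the simplest possible design choice; a real system would more plausibly learn how much to trust each modality and would need validation across different populations and settings, in line with the research directions outlined above.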

8. Conclusions

The integration of ChatGPT and similar LLMs in dermatology is a major step forward in diagnostic assistance and clinical support. While the model offers significant advantages, especially in natural language processing, it has limitations that can affect the diagnostic interpretation of visual findings. ChatGPT can also serve as a helpful assistant for dermatologists, speeding up the processing of patient information and differential diagnosis, particularly in teledermatology and remote care contexts. Nevertheless, we argue for further research and development to optimize such instruments. Improving accuracy should remain the priority, alongside weighing the possible benefits against challenges such as data privacy and clinical liability. As AI technology continues to evolve, the future of dermatology will likely involve a more seamless integration of AI diagnostic tools, with the aim of improving efficiency, accuracy, and equitable patient care.

Author Contributions

Conceptualization, Z.K.; writing—original draft preparation, M.A.; writing—review and editing, B.J.; formal analysis, N.B.; supervision, J.S. All authors contributed equally. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

All the authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Alowais, S.A.; Alghamdi, S.S.; Alsuhebany, N.; Alqahtani, T.; Alshaya, A.I.; Almohareb, S.N.; Aldairem, A.; Alrashed, M.; Saleh, K.B.; Badreldin, H.A.; et al. Revolutionizing healthcare: The role of artificial intelligence in clinical practice. BMC Med. Educ. 2023, 23, 689. Available online: https://bmcmededuc.biomedcentral.com/articles/10.1186/s12909-023-04698-z (accessed on 12 November 2024). [CrossRef] [PubMed]
  2. Alqahtani, T.; Badreldin, H.A.; Alrashed, M.; Alshaya, A.I.; Alghamdi, S.S.; bin Saleh, K.; Alowais, S.A.; Alshaya, O.A.; Rahman, I.; Al Yami, M.S.; et al. The emergent role of artificial intelligence, natural learning processing, and large language models in higher education and research. Res. Soc. Adm. Pharm. 2023, 19, 1236–1242. [Google Scholar] [CrossRef] [PubMed]
  3. De Angelis, L.; Baglivo, F.; Arzilli, G.; Privitera, G.P.; Ferragina, P.; Tozzi, A.E.; Rizzo, C. ChatGPT and the rise of large language models: The new AI-driven infodemic threat in public health. Front. Public Health 2023, 11, 1166120. Available online: https://pubmed.ncbi.nlm.nih.gov/37181697/ (accessed on 12 November 2024).
  4. Du-Harpur, X.; Watt, F.M.; Luscombe, N.M.; Lynch, M.D. What is AI? Applications of artificial intelligence to dermatology. Br. J. Dermatol. 2020, 183, 423–430. Available online: https://pubmed.ncbi.nlm.nih.gov/31960407/ (accessed on 12 November 2024). [CrossRef]
  5. Khalifa, M.; Albadawy, M. AI in diagnostic imaging: Revolutionising accuracy and efficiency. Comput. Methods Programs Biomed. Update 2024, 5, 100146. [Google Scholar] [CrossRef]
  6. Shapiro, J.; Baum, S.; Pavlotzky, F.; Mordehai, Y.B.; Barzilai, A.; Freud, T.; Gershon, R. Application of a natural language processing artificial intelligence tool in psoriasis: A cross-sectional comparative study on identifying affected areas in patients’ data. Clin. Dermatol. 2024, 42, 480–486. [Google Scholar] [CrossRef]
  7. Lam Hoai, X.L.; Simonart, T. Comparing Meta-Analyses with ChatGPT in the Evaluation of the Effectiveness and Tolerance of Systemic Therapies in Moderate-to-Severe Plaque Psoriasis. J. Clin. Med. 2023, 12, 5410. Available online: https://www.mdpi.com/2077-0383/12/16/5410/htm (accessed on 15 November 2024). [CrossRef] [PubMed]
  8. Pierce, E.J.; Boytsov, N.N.; Vasey, J.J.; Sudaria, T.C.; Liu, X.; Lavelle, K.W.; Bogdanov, A.N.; Goldblum, O.M. A Qualitative Analysis of Provider Notes of Atopic Dermatitis-Related Visits Using Natural Language Processing Methods. Dermatol. Ther. 2021, 11, 1305–1318. Available online: https://link.springer.com/article/10.1007/s13555-021-00553-5 (accessed on 15 November 2024). [CrossRef]
  9. Reynolds, K.; Nadelman, D.; Durgin, J.; Ansah-Addo, S.; Cole, D.; Fayne, R.; Harrell, J.; Ratycz, M.; Runge, M.; Shepard-Hayes, A.; et al. Comparing the Quality of ChatGPT-and Physician-Generated Responses to Patients’ Dermatologic Questions in the Electronic Medical Record. Available online: https://academic.oup.com/ced/advance-article-abstract/doi/10.1093/ced/llad456/7511330 (accessed on 15 November 2024).
  10. Fliorent, R.; Fardman, B.; Podwojniak, A.; Javaid, K.; Tan, I.J.; Ghani, H.; Truong, T.M.; Rao, B.; Heath, C. Artificial intelligence in dermatology: Advancements and challenges in skin of color. Int. J. Dermatol. 2024, 63, 455–461. Available online: https://onlinelibrary.wiley.com/doi/full/10.1111/ijd.17076 (accessed on 13 November 2024). [CrossRef]
  11. Soori, M.; Arezoo, B.; Dastres, R. Artificial intelligence, machine learning and deep learning in advanced robotics, a review. Cogn. Robot. 2023, 3, 54–70. [Google Scholar] [CrossRef]
  12. Abhishek, K.; Kawahara, J.; Hamarneh, G. Predicting the clinical management of skin lesions using deep learning. Sci. Rep. 2021, 11, 7769. Available online: https://www.nature.com/articles/s41598-021-87064-7 (accessed on 13 November 2024).
  13. Esteva, A.; Kuprel, B.; Novoa, R.A.; Ko, J.; Swetter, S.M.; Blau, H.M.; Thrun, S. Dermatologist-level classification of skin cancer with deep neural networks. Nature 2017, 542, 115–118. Available online: https://www.nature.com/articles/nature21056 (accessed on 13 November 2024). [CrossRef]
  14. Voon, W.; Hum, Y.C.; Tee, Y.K.; Yap, W.S.; Salim, M.I.M.; Tan, T.S.; Mokayed, H.; Lai, K.W. Performance analysis of seven Convolutional Neural Networks (CNNs) with transfer learning for Invasive Ductal Carcinoma (IDC) grading in breast histopathological images. Sci. Rep. 2022, 12, 19200. Available online: https://www.nature.com/articles/s41598-022-21848-3 (accessed on 13 November 2024). [CrossRef] [PubMed]
  15. Venkatesh, K.P.; Raza, M.M.; Nickel, G.; Wang, S.; Kvedar, J.C. Deep learning models across the range of skin disease. NPJ Digit. Med. 2024, 7, 32. Available online: https://www.nature.com/articles/s41746-024-01033-8 (accessed on 13 November 2024). [CrossRef] [PubMed]
  16. Ozmen Garibay, O.; Winslow, B.; Andolina, S.; Antona, M.; Bodenschatz, A.; Coursaris, C.; Falco, G.; Fiore, S.M.; Garibay, I.; Grieman, K.; et al. Six Human-Centered Artificial Intelligence Grand Challenges. Int. J. Hum. Comput. Interact. 2023, 39, 391–437. Available online: https://www.tandfonline.com/doi/abs/10.1080/10447318.2022.2153320 (accessed on 13 November 2024). [CrossRef]
  17. Hubert, K.F.; Awa, K.N.; Zabelina, D.L. The current state of artificial intelligence generative language models is more creative than humans on divergent thinking tasks. Sci. Rep. 2024, 14, 3440. Available online: https://www.nature.com/articles/s41598-024-53303-w (accessed on 13 November 2024). [CrossRef]
  18. Liu, J.; Wang, C.; Liu, S. Utility of ChatGPT in Clinical Practice. J. Med. Internet Res. 2023, 25, e48568. Available online: https://pmc.ncbi.nlm.nih.gov/articles/PMC10365580/ (accessed on 13 November 2024). [CrossRef]
  19. Gupta, A.K.; Talukder, M.; Wang, T.; Daneshjou, R.; Piguet, V. The Arrival of Artificial Intelligence Large Language Models and Vision-Language Models: A Potential to Possible Change in the Paradigm of Healthcare Delivery in Dermatology. J. Investig. Dermatol. 2024, 144, 1186–1188. Available online: https://www.jidonline.org/article/S0022-202X(24)00004-6/fulltext (accessed on 13 November 2024). [CrossRef]
  20. Mu, Y.; He, D. The Potential Applications and Challenges of ChatGPT in the Medical Field. Int. J. Gen. Med. 2024, 17, 817–826. [Google Scholar] [CrossRef]
  21. Chen, J.; Liu, L.; Ruan, S.; Li, M.; Yin, C. Are Different Versions of ChatGPT’s Ability Comparable to the Clinical Diagnosis Presented in Case Reports? A Descriptive Study. J. Multidiscip. Healthc. 2023, 16, 3825–3831. [Google Scholar] [CrossRef]
  22. Clusmann, J.; Kolbinger, F.R.; Muti, H.S.; Carrero, Z.I.; Eckardt, J.N.; Laleh, N.G.; Löffler, C.M.L.; Schwarzkopf, S.-C.; Unger, M.; Veldhuizen, G.P.; et al. The future landscape of large language models in medicine. Commun. Med. 2023, 3, 141. Available online: https://www.nature.com/articles/s43856-023-00370-1 (accessed on 13 November 2024). [CrossRef]
  23. Sonthalia, S.; Yumeen, S.; Kaliyadan, F. Dermoscopy Overview and Extradiagnostic Applications. Available online: https://www.ncbi.nlm.nih.gov/books/NBK537131/ (accessed on 14 November 2024).
  24. Wang, Z.; Wang, C.; Peng, L.; Lin, K.; Xue, Y.; Chen, X.; Bao, L.; Liu, C.; Zhang, J.; Xie, Y. Radiomic and deep learning analysis of dermoscopic images for skin lesion pattern decoding. Sci. Rep. 2024, 14, 19781. Available online: https://www.nature.com/articles/s41598-024-70231-x (accessed on 14 November 2024). [CrossRef]
  25. Omar, M.; Ullanat, V.; Loda, M.; Marchionni, L.; Umeton, R. ChatGPT for digital pathology research. Lancet Digit. Health 2024, 6, e595–e600. Available online: http://www.thelancet.com/article/S2589750024001146/fulltext (accessed on 14 November 2024). [CrossRef] [PubMed]
  26. Goktas, P.; Grzybowski, A. Assessing the Impact of ChatGPT in Dermatology: A Comprehensive Rapid Review. J. Clin. Med. 2024, 13, 5909. Available online: https://pmc.ncbi.nlm.nih.gov/articles/PMC11477344/ (accessed on 14 November 2024). [CrossRef] [PubMed]
  27. Gabashvili, I.S. ChatGPT in Dermatology: A Comprehensive Systematic Review. 2023. Available online: http://medrxiv.org/lookup/doi/10.1101/2023.06.11.23291252 (accessed on 14 November 2024).
  28. Nazir, A.; Wang, Z. A comprehensive survey of ChatGPT: Advancements, applications, prospects, and challenges. Meta-Radiology 2023, 1, 100022. [Google Scholar] [CrossRef]
  29. Alzubaidi, L.; Zhang, J.; Humaidi, A.J.; Al-Dujaili, A.; Duan, Y.; Al-Shamma, O.; Santamaría, J.; Fadhel, M.A.; Al-Amidie, M.; Farhan, L. Review of deep learning: Concepts, CNN architectures, challenges, applications, future directions. J. Big Data 2021, 8, 53. Available online: https://journalofbigdata.springeropen.com/articles/10.1186/s40537-021-00444-8 (accessed on 14 November 2024). [CrossRef]
  30. Ray, P.P. ChatGPT: A comprehensive review on background, applications, key challenges, bias, ethics, limitations and future scope. Internet Things Cyber-Phys. Syst. 2023, 3, 121–154. [Google Scholar] [CrossRef]
  31. Roumeliotis, K.I.; Tselikas, N.D. ChatGPT and Open-AI Models: A Preliminary Review. Future Internet 2023, 15, 192. Available online: https://www.mdpi.com/1999-5903/15/6/192/htm (accessed on 15 November 2024). [CrossRef]
  32. Shapiro, J.; Lyakhovitsky, A. Revolutionizing teledermatology: Exploring the integration of artificial intelligence, including Generative Pre-trained Transformer chatbots for artificial intelligence-driven anamnesis, diagnosis, and treatment plans. Clin. Dermatol. 2024, 42, 492–497. Available online: http://www.cidjournal.com/article/S0738081X24001044/fulltext (accessed on 15 November 2024). [CrossRef]
  33. McKoy, K.; Halpern, S.; Mutyambizi, K. International Teledermatology Review. Curr. Dermatol. Rep. 2021, 10, 55. Available online: https://pmc.ncbi.nlm.nih.gov/articles/PMC8317676/ (accessed on 15 November 2024). [CrossRef]
  34. Campagna, M.; Naka, F.; Lu, J. Teledermatology: An updated overview of clinical applications and reimbursement policies. Int. J. Womens Dermatol. 2017, 3, 176–179. [Google Scholar] [CrossRef]
  35. Alomari, E.A. Unlocking the Potential: A Comprehensive Systematic Review of ChatGPT in Natural Language Processing Tasks. Comput. Model. Eng. Sci. 2024, 141, 43–85. [Google Scholar] [CrossRef]
  36. Bansal, G.; Chamola, V.; Hussain, A.; Guizani, M.; Niyato, D. Transforming Conversations with AI—A Comprehensive Study of ChatGPT. Cognit. Comput. 2024, 16, 2487–2510. Available online: https://link.springer.com/article/10.1007/s12559-023-10236-2 (accessed on 15 November 2024). [CrossRef]
  37. Ferreira, A.L.; Chu, B.; Grant-Kels, J.M.; Ogunleye, T.; Lipoff, J.B. Evaluation of ChatGPT Dermatology Responses to Common Patient Queries. JMIR Dermatol. 2023, 6, e49280. Available online: https://pmc.ncbi.nlm.nih.gov/articles/PMC10692871/ (accessed on 15 November 2024). [CrossRef] [PubMed]
  38. Jerfy, A.; Selden, O.; Balkrishnan, R. The Growing Impact of Natural Language Processing in Healthcare and Public Health. Inq. J. Health Care Organ. Provis. Financ. 2024, 61, 00469580241290095. Available online: https://journals.sagepub.com/doi/full/10.1177/00469580241290095 (accessed on 15 November 2024). [CrossRef] [PubMed]
  39. Thirunavukarasu, A.J.; Ting, D.S.J.; Elangovan, K.; Gutierrez, L.; Tan, T.F.; Ting, D.S.W. Large language models in medicine. Nat. Med. 2023, 29, 1930–1940. Available online: https://www.nature.com/articles/s41591-023-02448-8 (accessed on 15 November 2024). [CrossRef] [PubMed]
  40. Haltaufderheide, J.; Ranisch, R. The ethics of ChatGPT in medicine and healthcare: A systematic review on Large Language Models (LLMs). NPJ Digit. Med. 2024, 7, 183. Available online: https://www.nature.com/articles/s41746-024-01157-x (accessed on 15 November 2024). [CrossRef]
  41. Hariri, W. Unlocking the Potential of ChatGPT: A Comprehensive Exploration of Its Applications, Advantages, Limitations, and Future Directions in Natural Language Processing. Available online: http://arxiv.org/abs/2304.02017 (accessed on 16 November 2024).
  42. OpenAI. ChatGPT (Version GPT-4) [Large Language Model]. OpenAI. 2024. Available online: https://openai.com/chatgpt (accessed on 14 June 2024).
  43. Nayak, I.; Kanagaraj, K.; Shrimali, M.; Kumar, D.; Ghamande, M.V.; Kumar, C.S. Improved Convolutional Neural Network based Approach for Analyzing the ChatGPT and Different Fields Impact. In Proceedings of the 2023 2nd International Conference on Automation, Computing and Renewable Systems (ICACRS), Pudukkottai, India, 11–13 December 2023; pp. 629–635. [Google Scholar]
  44. Gao, Y.; Zhang, L.; Xu, Y. CKG: Improving ABSA with text augmentation using ChatGPT and knowledge-enhanced gated attention graph convolutional networks. PLoS ONE 2024, 19, e0301508. Available online: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0301508 (accessed on 16 November 2024). [CrossRef]
  45. Ahmed, W.; Afzal, M.S.; Khan, M.N. ChatGPT in Dermatology: Evaluating Its Diagnostic Accuracy and Potential Role in Clinical Practice. Clin. Cosmet. Investig. Dermatol. 2023, 16, 1795–1802. [Google Scholar]
  46. DePalma, K.; Miminoshvili, I.; Henselder, C.; Moss, K.; AlOmar, E.A. Exploring ChatGPT’s code refactoring capabilities: An empirical study. Expert Syst. Appl. 2024, 249, 123602. [Google Scholar] [CrossRef]
  47. Masum, M. Using ChatGPT for Coding Challenges. Medium. Available online: https://medium.com/@m.m.kiron/using-chatgpt-for-coding-challenges-4cab6ca3bdc7 (accessed on 16 November 2024).
  48. Liu, X.; Duan, C.; Kim, M.K.; Zhang, L.; Jee, E.; Maharjan, B.; Huang, Y.; Du, D.; Jiang, X. Claude 3 Opus and ChatGPT With GPT-4 in Dermoscopic Image Analysis for Melanoma Diagnosis: Comparative Performance Analysis. JMIR Med. Inform. 2024, 12, e59273. Available online: https://pubmed.ncbi.nlm.nih.gov/39106482/ (accessed on 17 November 2024). [CrossRef]
  49. Lakdawala, N.; Channa, L.; Gronbeck, C.; Lakdawala, N.; Weston, G.; Sloan, B.; Feng, H. Assessing the Accuracy and Comprehensiveness of ChatGPT in Offering Clinical Guidance for Atopic Dermatitis and Acne Vulgaris. JMIR Dermatol. 2023, 6, e50409. Available online: https://pmc.ncbi.nlm.nih.gov/articles/PMC10685272/ (accessed on 17 November 2024). [CrossRef]
  50. Alanezi, F. Factors influencing patients’ engagement with ChatGPT for accessing health-related information. Crit. Public Health 2024, 34, 1–20. Available online: https://www.tandfonline.com/doi/abs/10.1080/09581596.2024.2348164 (accessed on 17 November 2024). [CrossRef]
  51. Zhang, D.; Zhao, X. Understanding adoption intention of virtual medical consultation systems: Perceptions of ChatGPT and satisfaction with doctors. Comput. Human Behav. 2024, 159, 108359. [Google Scholar] [CrossRef]
  52. Pillai, A.; Parappally Joseph, S.; Hardin, J. Evaluating the Diagnostic and Treatment Recommendation Capabilities of GPT-4 Vision in Dermatology. medRxiv 2024. [Google Scholar] [CrossRef]
  53. Iqbal, U.; Lee, L.T.J.; Rahmanti, A.R.; Celi, L.A.; Li, Y.C.J. Can large language models provide secondary reliable opinion on treatment options for dermatological diseases? J. Am. Med. Inform. Assoc. 2024, 31, 1341–1347. Available online: https://pubmed.ncbi.nlm.nih.gov/38578616/ (accessed on 20 November 2024). [CrossRef] [PubMed]
  54. Elias, M.L.; Burshtein, J.; Sharon, V.R. OpenAI’s GPT-4 performs to a high degree on board-style dermatology questions. Int. J. Dermatol. 2024, 63, 73–78. Available online: https://pubmed.ncbi.nlm.nih.gov/38131454/ (accessed on 20 November 2024). [CrossRef] [PubMed]
  55. Passby, L.; Jenko, N.; Wernham, A. Performance of ChatGPT on Specialty Certificate Examination in Dermatology multiple-choice questions. Clin. Exp. Dermatol. 2024, 49, 722–727. Available online: https://pubmed.ncbi.nlm.nih.gov/37264670/ (accessed on 20 November 2024). [CrossRef]
  56. Lambert, R.; Choo, Z.Y.; Gradwohl, K.; Schroedl, L.; De Luzuriaga, A.R. Assessing the Application of Large Language Models in Generating Dermatologic Patient Education Materials According to Reading Level: Qualitative Study. JMIR Dermatol. 2024, 7, e55898. Available online: https://pubmed.ncbi.nlm.nih.gov/38754096/ (accessed on 20 November 2024). [CrossRef]
  57. Ko, E.A.; Torre, A.C.; Hernandez, B.; Bibiloni, N.; Covián, E.; Salerni, G.; Alonso, C.; Ochoa, A.K.; Mazzuoccolo, L.D. Argentine dermatology and ChatGPT: Infrequent use and intermediate stance. Clin. Exp. Dermatol. 2024, 49, 734–736. [Google Scholar] [CrossRef]
  58. Stoneham, S.; Livesey, A.; Cooper, H.; Mitchell, C. ChatGPT versus clinician: Challenging the diagnostic capabilities of artificial intelligence in dermatology. Clin. Exp. Dermatol. 2024, 49, 707–710. Available online: https://pubmed.ncbi.nlm.nih.gov/37979201/ (accessed on 20 November 2024). [CrossRef]
  59. O’Hagan, R.; Kim, R.H.; Abittan, B.J.; Caldas, S.; Ungar, J.; Ungar, B. Trends in Accuracy and Appropriateness of Alopecia Areata Information Obtained from a Popular Online Large Language Model, ChatGPT. Dermatology 2023, 239, 952–957. Available online: https://pubmed.ncbi.nlm.nih.gov/37722370/ (accessed on 20 November 2024). [CrossRef]
  60. Lewandowski, M.; Łukowicz, P.; Świetlik, D.; Barańska-Rybak, W. ChatGPT-3.5 and ChatGPT-4 dermatological knowledge level based on the Specialty Certificate Examination in Dermatology. Clin. Exp. Dermatol. 2024, 49, 686–691. Available online: https://pubmed.ncbi.nlm.nih.gov/37540015/ (accessed on 20 November 2024). [CrossRef] [PubMed]
  61. Huang, C.Y.; Zhang, E.; Caussade, M.C.; Brown, T.; Stockton Hogrogian, G.; Yan, A.C. Pediatric dermatologists versus AI bots: Evaluating the medical knowledge and diagnostic capabilities of ChatGPT. Pediatr. Dermatol. 2024, 41, 831–834. Available online: https://pubmed.ncbi.nlm.nih.gov/38721744/ (accessed on 20 November 2024). [CrossRef] [PubMed]
  62. Cuellar-Barboza, A.; Brussolo-Marroquín, E.; Cordero-Martinez, F.C.; Aguilar-Calderon, P.E.; Vazquez-Martinez, O.; Ocampo-Candiani, J. An evaluation of ChatGPT compared with dermatological surgeons’ choices of reconstruction for surgical defects after Mohs surgery. Clin. Exp. Dermatol. 2024, 49, 1367–1371. Available online: https://pubmed.ncbi.nlm.nih.gov/38738492/ (accessed on 20 November 2024). [CrossRef]
  63. Li, K.; Hsu, J.T.S.; Li, M.K. Evaluation of ChatGPT’s acne advice. Clin. Exp. Dermatol. 2024, 49, 746–749. [Google Scholar] [CrossRef] [PubMed]
  64. Zhou, J.; He, X.; Sun, L.; Xu, J.; Chen, X.; Chu, Y.; Zhou, L.; Liao, X.; Zhang, B.; Afvari, S.; et al. Pre-trained multimodal large language model enhances dermatological diagnosis using SkinGPT-4. Nat. Commun. 2024, 15, 5649. Available online: https://www.nature.com/articles/s41467-024-50043-3 (accessed on 19 November 2024). [CrossRef]
  65. Anguita, R.; Downie, C.; Ferro Desideri, L.; Sagoo, M.S. Assessing large language models’ accuracy in providing patient support for choroidal melanoma. Eye 2024, 38, 3113–3117. Available online: https://pubmed.ncbi.nlm.nih.gov/39003430/ (accessed on 20 November 2024). [CrossRef]
  66. Mirza, F.N.; Lim, R.K.; Yumeen, S.; Wahood, S.; Zaidat, B.; Shah, A.; Tang, O.Y.; Kawaoka, J.; Seo, S.-J.; DiMarco, C.; et al. Performance of Three Large Language Models on Dermatology Board Examinations. J. Investig. Dermatol. 2024, 144, 398–400. Available online: https://pubmed.ncbi.nlm.nih.gov/37541614/ (accessed on 20 November 2024). [CrossRef]
  67. Diazo, J.M.; Ngo, D.Q.; Wetter, D.A.; Davis, D.M.R.; Tollefson, M.M. Performance of ChatGPT in Dermatology: A Systematic Review. JAAD Int. 2024, 14, 100137. [Google Scholar]
  68. D’Agostino, M.; Feo, F.; Martora, F.; Genco, L.; Megna, M.; Cacciapuoti, S.; Villani, A.; Potestio, L. ChatGPT and dermatology. Ital. J. Dermatol. Venereol. 2024, 159, 566–571. Available online: https://pubmed.ncbi.nlm.nih.gov/39039954/ (accessed on 19 November 2024). [CrossRef]
  69. Kluger, N. Potential applications of ChatGPT in dermatology. J. Eur. Acad. Dermatol. Venereol. 2023, 37, e941–e942. Available online: https://onlinelibrary.wiley.com/doi/full/10.1111/jdv.19152 (accessed on 20 November 2024). [CrossRef]
  70. Gordon, E.R.; Trager, M.H.; Kontos, D.; Weng, C.; Geskin, L.J.; Dugdale, L.S.; Samie, F.H. Ethical considerations for artificial intelligence in dermatology: A scoping review. Br. J. Dermatol. 2024, 190, 789–797. Available online: https://pubmed.ncbi.nlm.nih.gov/38330217/ (accessed on 20 November 2024). [CrossRef]
  71. Jin, J.Q.; Dobry, A.S. ChatGPT for healthcare providers and patients: Practical implications within dermatology. J. Am. Acad. Dermatol. 2023, 89, 870–871. Available online: https://pubmed.ncbi.nlm.nih.gov/37315798/ (accessed on 20 November 2024). [CrossRef]
  72. Jabour, T.B.F.; Ribeiro, J.P.; Fernandes, A.C.; Honorato, C.M.A.; Queiroz Mdo, C.A.P. ChatGPT: Performance of artificial intelligence in the dermatology specialty certificate examination. An. Bras. Dermatol. 2023, 99, 277. Available online: https://pmc.ncbi.nlm.nih.gov/articles/PMC10943280/ (accessed on 19 November 2024). [CrossRef]
  73. Ullah, E.; Parwani, A.; Baig, M.M.; Singh, R. Challenges and barriers of using large language models (LLM) such as ChatGPT for diagnostic medicine with a focus on digital pathology—A recent scoping review. Diagn. Pathol. 2024, 19, 43. Available online: https://diagnosticpathology.biomedcentral.com/articles/10.1186/s13000-024-01464-7 (accessed on 19 November 2024). [CrossRef]
  74. OpenAI. GPT-4 Technical Report. arXiv 2023, arXiv:2303.08774. [Google Scholar]
  75. Bender, E.M.; Gebru, T.; McMillan-Major, A.; Shmitchell, S. On the dangers of stochastic parrots: Can language models be too big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, Toronto, ON, Canada, 3–10 March 2021; pp. 610–623. Available online: https://dl.acm.org/doi/10.1145/3442188.3445922 (accessed on 19 November 2024).
  76. Kololgi, S.P.; Lahari, C. Harnessing the Power of Artificial Intelligence in Dermatology: A Comprehensive Commentary. Indian J. Dermatol. 2024, 68, 678. Available online: https://pmc.ncbi.nlm.nih.gov/articles/PMC10868991/ (accessed on 19 November 2024). [CrossRef]
  77. Jeong, H.K.; Park, C.; Henao, R.; Kheterpal, M. Deep Learning in Dermatology: A Systematic Review of Current Approaches, Outcomes, and Limitations. JID Innov. 2023, 3, 100150. [Google Scholar] [CrossRef]
  78. Busik, V. How artificial intelligence and large language models are revolutionizing dermatology. Dermatologie 2024, 75, 743–746. Available online: https://pubmed.ncbi.nlm.nih.gov/38900290/ (accessed on 19 November 2024). [CrossRef]
  79. Porter, E.; Murphy, M.; O’Connor, C. Chat GPT in dermatology: Progressive or problematic? J. Eur. Acad. Dermatol. Venereol. 2023, 37, e943–e944. Available online: https://pubmed.ncbi.nlm.nih.gov/37150950/ (accessed on 19 November 2024). [CrossRef]
Figure 1. Illustration of a multimodal large language model (LLM) system integrating both image and text data. The image pathway is encoded via convolutional neural networks (CNNs) or vision transformers (ViTs), while text inputs (e.g., patient history) are tokenized and processed by a large language model (e.g., GPT-4). A projection module aligns visual embeddings with the text domain, enabling the model to generate comprehensive outputs such as diagnostic suggestions, patient education, and clinical explanations.
Table 2. Volume of articles on ChatGPT vs. other LLMs in dermatology diagnostics.
| Model | Volume of Articles | Applications in Dermatology |
| --- | --- | --- |
| ChatGPT (OpenAI) | High (50+ articles) | Used for diagnostic support, clinical note analysis, and patient interaction. Achieves 88% accuracy in handling common queries. |
| Google Bard (PaLM) | Low (2–5 articles) | No specific dermatology focus found; primarily applied to general conversational AI and integration with Google systems. |
| Claude by Anthropic | Low (1–2 articles) | Limited mentions in healthcare; prioritized for safe, ethical applications in sensitive tasks. |
| Microsoft Copilot (powered by GPT-4) | Moderate (5–10 articles) | Integrated in teledermatology workflows via Office Suite for clinical reporting and diagnostic data organization. |
| LLaMA (Meta) | Low (<5 articles) | Primarily used in research; no specific dermatology-related applications identified. |
| Mistral | None found | No known dermatology-related applications. |
| Cohere | None found | Primarily enterprise knowledge management; no dermatology-specific use cases identified. |
| Amazon Bedrock | None found | Focus on general enterprise flexibility; no dermatology-specific use cases identified. |
| xAI (Grok) | None found | Still emerging; no dermatology-specific use cases identified. |
| IBM Watson Assistant | Low (1–2 articles) | Some use cases in patient engagement and healthcare support, but minimal focus on dermatology. |
| Jasper AI | None found | No known applications in dermatology; focused on content creation. |
| Character.ai | None found | Used for entertainment; no healthcare or dermatology use cases. |
| DeepMind Gemini | Emerging | Promising capabilities in diagnostics, but still in development with no dermatology applications yet. |
| Perplexity.ai | None found | Focused on information retrieval with no dermatology-specific applications. |
Table 2 shows that few of the evaluated LLMs have published studies on their use in dermatology. ChatGPT has by far the largest number of published articles: over 50 studies covering diagnostic support, clinical note analysis, and patient interaction, with a reported 88% accuracy in handling common queries. Microsoft Copilot, which is powered by GPT-4, accounts for 5–10 articles, mainly on its integration into teledermatology workflows for clinical reporting and organizing diagnostic data. Google Bard (PaLM) has 2–5 published studies and Claude by Anthropic has 1–2; however, these studies do not specifically address dermatology, as both models have mostly been applied to broader conversational AI tasks. Other models with limited dermatology-related publications include IBM Watson Assistant, LLaMA, and emerging models such as DeepMind Gemini. Several models, such as Mistral, Cohere, and Amazon Bedrock, have no known applications in dermatology and have focused on other domains, such as enterprise knowledge management and more general AI versatility. Importantly, the comparatively higher number of studies on GPT-based models reinforces their established role in dermatology research while highlighting the opportunity for further exploratory work on their diagnostic potential.
Citation: Khamaysi, Z.; Awwad, M.; Jiryis, B.; Bathish, N.; Shapiro, J. The Role of ChatGPT in Dermatology Diagnostics. Diagnostics 2025, 15, 1529. https://doi.org/10.3390/diagnostics15121529
