1. Introduction
Melanoma outcomes hinge on early detection because prognosis is tightly coupled to pathological stage and Breslow thickness at diagnosis. The AJCC 8th edition formalized this with evidence-based stage refinements and risk stratification that directly translate thickness, ulceration, and nodal status into survival differences relevant to both patients and planners [1,2]. Beyond thickness and AJCC stage, molecular alterations (e.g., BRAF/NRAS mutations, gene-expression signatures) and features of the host immune response (such as tumor-infiltrating lymphocytes) further stratify risk and inform systemic therapy decisions, although these factors are rarely available at the point of initial referral and triage. European incidence has risen over recent decades, with aging populations, intermittent high-intensity UV exposure, improved ascertainment, and broader behavioral and socioeconomic factors (such as skin-phototype distribution, sun-seeking and tanning behaviors, occupational and recreational exposures, and variable uptake of opportunistic screening) driving a steady case load that challenges specialist capacity; GLOBOCAN 2020 estimates underscore melanoma’s disproportionate mortality burden relative to its incidence share [3]. Against this backdrop, service models that can accelerate evaluation of suspicious lesions without compromising diagnostic safety are public-health priorities.
Across many health systems, suspected-cancer pathways aim to compress time from referral to specialist assessment and from decision-to-treat to treatment. Within the United Kingdom, for example, NICE NG12 specifies features warranting urgent referral from primary care to dermatology [4], and national cancer-waiting-time standards articulate a 62-day referral-to-treatment benchmark and a 31-day decision-to-treatment benchmark, with an additional 28-day ‘Faster Diagnosis Standard’ for time to rule-in/rule-out cancer [5]. These UK standards are not uniformly replicated across Europe but serve as illustrative time-bound targets when considering pathway performance. As policy anchors, they are useful comparators when judging whether novel triage tools such as teledermatology (TD) and teledermoscopy (TDS) deliver pathway-level benefits that matter to patients (earlier decisions) and systems (on-target throughput).
Why speed matters is quantifiable. A large meta-analysis across multiple cancers estimates that each four-week delay in curative-intent treatment increases mortality risk, highlighting the population-level stakes of reducing system delays [6]. Melanoma-specific datasets point the same way: analyses of national and state cohorts associate longer surgical intervals with worse survival, particularly when treatment extends beyond commonly cited windows (e.g., >6–8 weeks or >90 days) [7,8]. While confounding and selection must be considered in observational designs, the consistency of directionality supports timeliness as a legitimate quality domain for melanoma pathways and a pertinent endpoint for TD/TDS evaluations.
Teledermatology’s diagnostic performance for skin cancer has been synthesized across pre-pandemic and contemporary eras. Earlier systematic reviews reported variable but generally high agreement with in-person care for cancer triage, while acknowledging heterogeneity in study quality and reference standards [9,10]. Post-pandemic updates continue to report acceptable accuracy for pigmented lesion triage with store-and-forward and hybrid models, though reliability varies by image quality, lesion selection, and workflow integration, reinforcing the need for robust implementation studies and standardized reporting [11,12]. These syntheses are limited by substantial heterogeneity in design and reference standards, selective publication of better-performing implementations, and the fact that many studies did not apply histopathology to all triaged lesions, which may overestimate accuracy. Practical reviews and focused analyses of neoplasm triage suggest accuracy in the 75–80% range in many programs, approaching face-to-face benchmarks when dermoscopy and structured protocols are used [13]. Framing TD/TDS against pathway targets (faster-diagnosis and 62-day standards) clarifies its role as a front-end accelerator rather than a universal substitute for dermato-oncology examination and biopsy.
Pandemic disruptions provided a stress test of pathway resilience. Across Europe, service interruptions and reduced face-to-face access were associated with fewer melanoma diagnoses during lockdowns, followed by a rebound enriched for thicker tumors and adverse features; modeling suggests substantial years-of-life lost and macroeconomic costs linked to these delays [14,15]. These findings sharpen our study focus: the value proposition of TD/TDS is not generic convenience but targeted pathway decongestion that preserves diagnostic safety while pulling forward the moment of decision for lesions at genuine risk of invasion.
Accordingly, the current study aims to address the following melanoma-related issues: (i) timeliness (referral-to-dermatologist and time-to-excision), (ii) diagnostic performance (sensitivity/specificity, PPV, false-negative rates) using appropriate reference standards, and (iii) initial prognosis at diagnosis (Breslow thickness and stage). By mapping these outcomes to pathway targets and staging-anchored risk, we aim to identify when and how TD/TDS improves patient-important outcomes, where reporting needs standardization, and what policy levers can translate evidence into scalable cancer-pathway design.
2. Materials and Methods
2.1. Protocol and Registration
This review was planned, conducted, and reported according to PRISMA-2020 guidance [16]. The protocol prespecified the review questions, eligibility criteria, outcomes, data extraction fields, and bias-appraisal tools, and was prospectively registered on the Open Science Framework (OSF code osf.io/kbpt6). Protocol fidelity was verified by a second reviewer who was not involved in initial screening, through audit of the final eligibility criteria, search strategies, and extracted data fields against the OSF-registered protocol and by checking that no post hoc changes were introduced that would alter study selection or primary outcomes.
2.2. Review Question and Framework
The review asked whether image-enabled teledermatology, including teledermoscopy, for patients referred with suspicious cutaneous lesions improves pathway timeliness, diagnostic performance, and initial prognostic severity at diagnosis compared with conventional non-image-based referral. The population comprised individuals of any age evaluated in primary care, urgent cancer pathways, or mixed dermatology referral streams where melanoma outcomes were reported at the study or subgroup level. The intervention was asynchronous or synchronous teledermatology using clinical and/or dermoscopic images captured in the community or clinic, including smartphone workflows, virtual lesion clinics, and regional or national teledermoscopy programs. Comparators were standard paper or electronic referrals without images, face-to-face first assessments, or pre/post implementation periods.
Because the overarching policy question concerned whether TD/TDS can safely ‘pull forward’ decisions along melanoma pathways, pathway timeliness (days to first specialist decision, biopsy or excision, and histopathology report) and diagnostic performance (sensitivity, specificity, positive predictive value, number-needed-to-excise, and false-negative rate) were prespecified as co-primary outcome domains when an appropriate reference standard was used. Initial prognosis at diagnosis (Breslow thickness in millimeters, ulceration, and AJCC 8th-edition stage distribution) was treated as a secondary mechanistic domain, used to explore potential stage-related consequences of earlier triage.
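The diagnostic-performance measures named above all derive from the same 2×2 triage table. The following is a minimal sketch, not the extraction code used in this review: the class and function names are hypothetical, the counts are invented purely for illustration, and number-needed-to-excise is computed as the reciprocal of PPV on the assumption that every lesion flagged by triage is excised.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Triage result versus reference standard (hypothetical container)."""
    tp: int  # melanomas flagged by triage
    fp: int  # non-melanomas flagged by triage
    fn: int  # melanomas not flagged by triage
    tn: int  # non-melanomas not flagged by triage


def diagnostic_metrics(t: TwoByTwo) -> dict[str, float]:
    """Return the prespecified diagnostic measures as proportions."""
    sensitivity = t.tp / (t.tp + t.fn)
    specificity = t.tn / (t.tn + t.fp)
    ppv = t.tp / (t.tp + t.fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        # Missed melanomas as a share of all melanomas.
        "false_negative_rate": 1.0 - sensitivity,
        # Assumption: all flagged lesions are excised, so NNE = 1 / PPV.
        "number_needed_to_excise": 1.0 / ppv,
    }


# Invented counts for illustration only; not data from any included study.
example = TwoByTwo(tp=45, fp=55, fn=5, tn=895)
metrics = diagnostic_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Whether the denominators count lesions or patients changes every value in this table, which is why the unit of analysis was extracted for each study.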
2.3. Eligibility Criteria
Eligible designs were randomized trials; prospective or retrospective comparative cohorts; controlled before–after evaluations; service evaluations with extractable arm-level or patient-level data for at least one prespecified primary outcome; and diagnostic accuracy or triage studies using verification by histopathology or structured follow-up. Ineligible records were case reports or very small series with fewer than ten melanoma cases; editorials, letters, narrative reviews, or technical validation without clinical pathway endpoints; pure algorithm papers without human clinical decision-making; and non-human research. No limits by language, geography, or publication year were applied at search stage. Studies that reported mixed “skin cancer” cohorts were included only when melanoma-specific results were extractable or verifiably incorporated into the endpoint.
2.4. Information Sources and Dates
We interrogated PubMed/MEDLINE, Embase (Elsevier), and Scopus (Elsevier) from database inception through 1 September 2025. Syntax and controlled vocabulary were translated across databases using validated filters where applicable. Searches were last executed on 1 September 2025.
PubMed/MEDLINE (inception to 1 September 2025). (“Melanoma”[Mesh] OR melanoma*[tiab] OR “skin neoplasm*”[tiab] OR “pigmented lesion*”[tiab]) AND (teledermatolog*[tiab] OR “tele-dermatolog*”[tiab] OR teledermoscop*[tiab] OR “tele-dermoscop*”[tiab] OR ((dermoscop*[tiab] OR “dermoscopy”[Mesh]) AND (telemedicin*[tiab] OR smartphone*[tiab] OR mobile[tiab] OR “store-and-forward”[tiab] OR asynchronous[tiab] OR virtual[tiab] OR “two-week wait”[tiab] OR 2WW[tiab] OR “urgent cancer”[tiab]))).
Embase (Elsevier; incept–1 September 2025; Emtree mapped). (‘melanoma’/exp OR melanoma*:ti,ab OR ‘skin neoplasm*’:ti,ab OR ‘pigmented lesion*’:ti,ab) AND (teledermatology:ti,ab OR ‘tele-dermatology’:ti,ab OR teledermoscopy:ti,ab OR ‘tele-dermoscopy’:ti,ab OR (dermoscopy:ti,ab OR ‘dermoscopy’/exp) AND (telemedicine:ti,ab OR ‘telemedicine’/exp OR smartphone*:ti,ab OR ‘smartphone’/exp OR mobile:ti,ab OR ‘store and forward’:ti,ab OR asynchronous:ti,ab OR virtual:ti,ab OR ‘two week wait’:ti,ab OR 2ww:ti,ab OR ‘urgent cancer’:ti,ab)).
Scopus (Elsevier; incept–1 September 2025; TITLE-ABS-KEY). TITLE-ABS-KEY(melanoma* OR “skin neoplasm*” OR “pigmented lesion*”) AND TITLE-ABS-KEY(teledermatolog* OR tele-dermatolog* OR teledermoscop* OR tele-dermoscop* OR (dermoscop* AND (telemedicin* OR smartphone* OR mobile OR “store-and-forward” OR asynchronous OR virtual OR “two-week wait” OR 2WW OR “urgent cancer”))).
2.5. Study Selection
Records were exported to EndNote X20 for deduplication and then into Rayyan QCRI for blinded dual screening. Two reviewers independently screened titles and abstracts against a piloted form, resolving disagreements by consensus or third-party adjudication. Calibration on a 10% random subset achieved a Cohen’s κ of at least 0.80 for inclusion decisions, after which the screening form was locked. The PRISMA flow is as follows: 655 records were identified across databases (PubMed 218, Embase 241, Scopus 196). After deduplication of 39 records, 616 unique records underwent title and abstract screening, of which 593 were excluded as irrelevant by condition, intervention, or article type. Twenty-three full texts were assessed for eligibility; eleven were excluded due to absent melanoma-specific extractable data in mixed cohorts (n = 5) or ineligible design/setting without pathway endpoints (n = 6). Mixed ‘skin cancer’ cohorts without extractable melanoma-specific outcomes were excluded to avoid misattributing non-melanoma endpoints (e.g., basal cell or squamous cell carcinoma) to melanoma. This approach inevitably favors settings with more granular diagnostic coding and reporting; as a result, our included sample may under-represent programs where melanoma outcomes are embedded in broader skin cancer metrics. Twelve studies met all criteria and were included in the narrative synthesis (Figure 1).
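The PRISMA counts reported above can be checked for internal consistency with a few lines of arithmetic. This is a minimal sketch using only the numbers stated in the text; the variable names are illustrative and not part of any review software.

```python
# Counts as reported in the study-selection paragraph.
identified = {"PubMed": 218, "Embase": 241, "Scopus": 196}
duplicates_removed = 39
excluded_at_screening = 593
full_text_exclusions = {
    "no melanoma-specific extractable data": 5,
    "ineligible design/setting without pathway endpoints": 6,
}

# Each stage of the flow is the previous stage minus its exclusions.
total_identified = sum(identified.values())
screened = total_identified - duplicates_removed
full_text_assessed = screened - excluded_at_screening
included = full_text_assessed - sum(full_text_exclusions.values())

print(f"identified={total_identified}, screened={screened}, "
      f"full_text={full_text_assessed}, included={included}")
# prints: identified=655, screened=616, full_text=23, included=12
```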
2.6. Risk-of-Bias Assessment and Synthesis
Risk of bias was appraised independently by two reviewers at the outcome level using RoB 2 for randomized trials, ROBINS-I for non-randomized comparative designs and service evaluations, and QUADAS-2 for diagnostic accuracy and triage studies. Prespecified confounders for ROBINS-I included secular trends, concurrent pathway reconfiguration, differential imaging quality, and variation in verification intensity. Any discrepancies were resolved by consensus.
Given anticipated clinical and methodological heterogeneity, we prespecified a structured narrative synthesis. For each study, we extracted the operational definitions used for timeliness (referral-to-first specialist contact vs. referral-to-biopsy), diagnostic performance (per-lesion vs. per-patient denominators, inclusion of benign nevi), and prognostic severity (Breslow thickness distribution, AJCC stage). These definitions were tabulated and compared qualitatively. Because of non-overlapping time points, differing denominators, and variable reference standards, we did not perform meta-analytic pooling; instead, we focused on the direction and consistency of effects within outcome domains.
Risk-of-bias assessments were performed for all twelve included studies [17,18,19,20,21,22,23,24,25,26,27,28], as summarized in Table 1, Table 2 and Table 3. The single randomized trial [27] was judged to have some concerns due to limited information on allocation concealment and small sample size. Most non-randomized comparative and service-evaluation studies were at serious risk of bias for confounding under ROBINS-I, reflecting before–after designs without formal adjustment for secular trends, pathway reconfiguration, or changes in referral behavior. Selection of participants into TD versus conventional pathways was frequently clinician- or patient-driven rather than random. Outcome measurement (Breslow thickness, time stamps) was generally at low risk of bias, but diagnostic-accuracy studies commonly had unclear or high risk in QUADAS-2 verification domains because not all triaged lesions underwent histopathology. These limitations constrain causal inference, particularly for prognostic endpoints.
For ROBINS-I, prespecified confounders included secular trends in melanoma incidence and pathway performance, concurrent organizational changes (e.g., introduction of faster-diagnosis standards or additional theatre lists), case-mix differences between teledermatology and conventional streams, and differential verification intensity. During risk-of-bias assessment, we recorded whether studies reported adjustment for these factors (multivariable regression, difference-in-differences analyses, or stratified reporting); none fully accounted for all prespecified confounders. Risk of bias is presented in Table 1, Table 2 and Table 3.
We explored the feasibility of pooling melanoma triage sensitivity and positive predictive value (PPV) but found that only two real-world programs reported numerators and denominators in a sufficiently comparable fashion, with differences in inclusion of in situ disease and verification strategies. A meta-analysis based on such sparse and heterogeneous data was deemed unlikely to yield a meaningful pooled estimate, and results are therefore presented descriptively.
3. Results
Across 12 melanoma-relevant evaluations, settings ranged from single-service pilots to national programs, with several providing sizable melanoma samples. The 12 evaluations comprised one randomized controlled trial [27], several prospective comparative or before–after cohorts [20,25], diagnostic-agreement and cross-sectional studies [21,28], and multiple retrospective cohorts and service evaluations [17,18,19,22,23,24,26], reflecting the predominantly implementation-oriented nature of the evidence base. A Spanish retrospective cohort focused exclusively on melanoma (n = 201) compared store-and-forward teledermatology (TD) with conventional referral and prespecified prognosis endpoints [17]. Early feasibility and workflow papers established smartphone teledermoscopy (TDS) in Sweden and a Virtual Lesion Clinic (VLC) in New Zealand for mixed referrals that included melanoma subsets [18,19]. In Sweden, countywide TDS correctly handled 94/95 melanomas (sensitivity 98.9%; false-negative rate ≈1.1%) over the observed follow-up [28], and a mature Virtual Lesion Clinic in New Zealand reported a similarly low false-negative rate (~1.2%) in a high-throughput triage context [22]. However, neither program provided long-term linkage to subsequent advanced melanoma diagnoses among initial false negatives.
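As a simple consistency check on the design breakdown above, the citation numbers assigned to each design category can be tallied to confirm that every included study [17–28] is counted exactly once. This is an illustrative sketch; the dictionary keys paraphrase the categories in the text and the variable names are hypothetical.

```python
# Reference numbers grouped by study design, as listed in the text.
design_groups = {
    "randomized controlled trial": [27],
    "prospective comparative or before-after cohort": [20, 25],
    "diagnostic-agreement or cross-sectional study": [21, 28],
    "retrospective cohort or service evaluation": [17, 18, 19, 22, 23, 24, 26],
}

# Flatten and sort so duplicates or gaps become visible.
all_refs = sorted(ref for refs in design_groups.values() for ref in refs)
total_studies = len(all_refs)
covers_each_once = all_refs == list(range(17, 29))

for design, refs in design_groups.items():
    print(f"{design}: {len(refs)}")
print(f"total={total_studies}, each of [17..28] counted once: {covers_each_once}")
```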
A UK prospective comparative study examined time to clinic within an urgent cancer pathway [20], while a Swedish agreement study assessed diagnostic concordance and direct-to-surgery booking using TDS referrals versus paper referrals [21]. Service-level triage performance was reported from a mature VLC (PPV/FNR endpoints) [22], and a retrospective review from New Zealand quantified throughput (402 referrals; 19 melanomas) and time-to-advice [23]. A country-wide Estonian TDS program contributed 4748 cases with diagnostic and plan accuracy benchmarking [24]. A UK two-week wait (2WW) evaluation compared virtual versus face-to-face outcomes [25]. A U.S. institutional comparison contrasted TD-first versus in-person-first melanoma pathways with head-to-head clinical endpoints [26]. A French randomized trial studied smartphone photo relay from primary care and its effect on time to dermatology consult [27], and a Swedish countywide analysis reported melanoma triage sensitivity in a real-world TDS rollout (n = 135 melanomas; 95 via TDS) [28], as presented in Table 4.
The included evaluations spanned heterogeneous designs and settings, ranging from single-center pilots and feasibility studies to regional and national services. Operational definitions for timeliness varied (time from referral to first dermatology contact, time to biopsy, or time to histology report), as did the staging metrics (Breslow thickness as a continuous variable, dichotomous in situ/T1 vs. >T1, or AJCC stage categories). Diagnostic performance was variously reported per lesion or per patient, and verification strategies ranged from systematic histopathology to selective biopsy with clinical follow-up. This heterogeneity precluded meaningful quantitative pooling and informed our choice of a narrative synthesis.
In Spain, a regional store-and-forward TD network was associated with a lower mean Breslow thickness (1.06 mm vs. 1.64 mm) and a higher proportion of in situ/T1 melanomas (70.1% vs. 56.9%) in the TD arm compared with conventional referral [17]. In contrast, the U.S. institutional cohort by Jaklitsch et al. [26] reported that melanomas detected through TD-first pathways were, on average, thicker and more often ulcerated than those detected through in-person-first pathways, reflecting selective channeling of more clinically worrisome lesions into TD rather than a simple, uniform ‘thinning’ effect across all settings.
TD/TDS compressed early pathway intervals and, in some contexts, was associated with more favorable staging at diagnosis. In the UK, photo-triage shortened the median time to clinic by 10 days (14 vs. 24 days, melanoma subset) without breaching the 62-day standard for nearly all patients [20]. In Spain, TD was associated with a lower mean Breslow thickness (1.06 mm vs. 1.64 mm) and a higher proportion of in situ/T1 melanomas (70.1% vs. 56.9%) compared with conventional referral [17]. Operationally, a New Zealand review reported dermatologist advice at ≈1.02 days but a downstream “time to action” around 64.8 days, highlighting post-triage capacity constraints [23]. A French RCT showed a shorter delay to dermatology consultation when GPs relayed smartphone photos versus usual scheduling [27]. In a U.S. cohort, TD-first accelerated evaluation and biopsy (both p < 0.001), although TD-first melanomas presented thicker and more often ulcerated, consistent with selection of higher-risk lesions into the TD-first stream [26], as seen in Table 5.
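The size of the UK interval change can be expressed both as an absolute and as a relative reduction. The sketch below uses only the two medians reported above (14 vs. 24 days); the function name is hypothetical, and a ratio of medians is a descriptive summary rather than an effect estimate with uncertainty.

```python
def interval_reduction(baseline_days: float, new_days: float) -> tuple[float, float]:
    """Return (absolute reduction in days, relative reduction as a proportion)."""
    absolute = baseline_days - new_days
    relative = absolute / baseline_days
    return absolute, relative


# Medians reported for the UK urgent pathway: conventional 24 days, photo-triage 14 days.
absolute_days, relative = interval_reduction(baseline_days=24, new_days=14)
print(f"absolute reduction: {absolute_days:.0f} days")
print(f"relative reduction: {relative * 100:.1f}%")
# prints: absolute reduction: 10 days
# prints: relative reduction: 41.7%
```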
In Sweden, countywide TDS correctly handled 94/95 melanomas (sensitivity 98.9%) with a false-negative rate of ≈1.1%, supporting safe large-scale triage [28]. A mature VLC in New Zealand reported an overall PPV ≈49% for suspected melanoma referrals with an ≈1.2% false-negative rate, acceptable in a high-throughput triage context [22]. A Swedish agreement study observed higher diagnostic concordance and more direct-to-surgery bookings with TDS referrals than with paper-based referrals, indicating improved operational precision [21]. Estonia’s national TDS service (4748 cases) showed diagnostic and management plan accuracy comparable to experimental settings, suggesting strong external validity at scale [24]. Additional service reports underscored speed-of-advice (~1 day) with relatively few melanomas within total referrals [23], maintenance of 2WW breach times with a notable 42.9% increase in SCC detection under virtual pre-assessment (p = 0.03) [25], and feasibility signals from early smartphone-based TDS that later informed pathway design [18] (Table 6).
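Because the Swedish sensitivity estimate rests on 95 melanomas, it is worth seeing how much uncertainty a count of 94/95 carries. The sketch below recomputes sensitivity and false-negative rate from the reported counts and adds a 95% Wilson score interval. The Wilson interval is an illustrative choice made here, not a method reported by the source study or used in this review, and the function name is hypothetical.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Counts reported for the countywide TDS rollout: 94 of 95 melanomas correctly handled.
detected, total = 94, 95
sensitivity = detected / total
false_negative_rate = 1 - sensitivity
low, high = wilson_interval(detected, total)

print(f"sensitivity: {sensitivity * 100:.1f}%")
print(f"false-negative rate: {false_negative_rate * 100:.2f}%")
print(f"95% Wilson interval for sensitivity: {low * 100:.1f}% to {high * 100:.1f}%")
```

Under these assumptions the lower bound falls noticeably below the point estimate, which is consistent with the caution above about incomplete verification and limited follow-up of false negatives.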
UK photo-triage reduced median time to clinic to 14 days versus 24 days while meeting 62-day treatment targets [20]. In Spain, a regional TD network showed thinner melanomas and a higher in situ/T1 share in the TD arm [17]. Swedish pilots demonstrated that GP-captured smartphone dermoscopy could return expert reads often within 24 h and supported multicenter scaling [18,21]. New Zealand VLC models typically delivered advice in ~1 day with moderate PPV and low FNR, reflecting robust triage but revealing downstream capacity as the main bottleneck [19,22,23]. A UK 2WW virtual pathway maintained breach times and improved SCC detection [25]. Estonia’s nationwide TDS embedded 1–2 day dermatologist turnaround within the e-health infrastructure and reproduced experimental-level diagnostic/plan accuracy at national scale [24]. A French RCT confirmed faster access to dermatology consultation via smartphone photo relay from primary care [27], and a U.S. academic system showed TD-first accelerated evaluation and biopsy while triaging a clinically higher-risk case-mix [26]. A Swedish countywide rollout documented triage sensitivity of 98.9% for melanoma, reinforcing safety at scale [28] (Table 7).
Triage safety was high where quantified, with melanoma sensitivity of 98.9% in a county-wide teledermoscopy (TDS) rollout (false-negative rate ≈ 1.1%) in Sweden [28], while a mature Virtual Lesion Clinic reported PPV ≈ 49% and FNR ≈ 1.2% for suspected-melanoma triage [22]. Operational speed at the front end was rapid: median time to dermatologist advice via store-and-forward teledermatology was ~1.02 days in a New Zealand service [23]. For pathway timeliness, a UK prospective study showed a median time to first clinic of 14 days with photo-triage versus 24 days via conventional urgent referral, illustrating upstream acceleration with TD/TDS [20], as presented in Figure 2.
Front-end advice occurred at ~1.02 days via TD in New Zealand [23], and photo-triage cut the median time to first clinic from 24 to 14 days in the UK [20]. In contrast, later milestones were longer: a national-scale Estonian program reported median ~45.5 days to excision and ~67.4 days to histology for triaged lesions [24], while a New Zealand service observed ~64.8 days to “time to action,” indicating that gains from rapid triage can be attenuated by surgical and histopathology throughput constraints downstream [23,24] (Figure 3).
Across the 12 studies, reporting of equity-relevant variables was limited. Few evaluations provided outcomes stratified by socioeconomic status, ethnicity, rural versus urban residence, or digital literacy, and none reported triage performance separately in these strata. Lesion-level detail (anatomic site, histologic subtype) was also variably reported, limiting subgroup analyses.