Article

Machine Learning-Enhanced Architecture Model for Integrated and FHIR-Based Health Data

1
Institute for High Performance Computing and Networking-National Research Council of Italy (ICAR-CNR), 80131 Napoli, Italy
2
Department of Diagnostic Imaging, University of Naples “Federico II”, 80138 Napoli, Italy
3
Pineta Grande Hospital, 81030 Castel Volturno, Italy
*
Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Information 2025, 16(12), 1054; https://doi.org/10.3390/info16121054
Submission received: 7 November 2025 / Revised: 26 November 2025 / Accepted: 30 November 2025 / Published: 2 December 2025
(This article belongs to the Special Issue Artificial Intelligence-Based Digital Health Emerging Technologies)

Abstract

The widespread fragmentation of patient information across heterogeneous systems and the lack of standardized integration mechanisms hinder efficient and comprehensive medical diagnostics. To address these limitations, this work presents an architecture framework designed to support physicians in the diagnostic process by integrating clinical and socio-health information (patient medical histories), structured documents extracted from Health Information Systems (HISs), and data automatically extracted from diagnostic images using Artificial Intelligence (AI) techniques. The proposed architecture is composed of several modules, in particular a Decision Support System (DSS) that enables risk assessment related to a patient’s specific clinical conditions. In addition, the clinical information retrieved is aggregated, standardized, and transmitted to external systems for follow-up. Standardization and data interoperability are ensured through the adoption of the international HL7 Fast Healthcare Interoperability Resources (FHIR) standard, which facilitates seamless connection with HISs. An Android application has been developed to communicate with different HISs in order to: (i) retrieve information, (ii) aggregate clinical data, (iii) calculate patient risk scores using AI algorithms, (iv) display results to healthcare professionals, and (v) generate and share relevant clinical information with external systems in a standardized format. To demonstrate the architecture’s applicability, a case study on breast cancer diagnosis is presented. In this context, an AI-based Risk Assessment module was developed using the Breast Ultrasound Images Dataset (BUSI), which includes benign, malignant, and normal cases. Machine Learning algorithms were applied to perform the classification task. Model performance was evaluated using a 4-fold cross-validation strategy to ensure robustness and generalizability. The best results were achieved using the Multilayer Perceptron method, with a competitive F1-score of 0.97.

1. Introduction

In recent years, digital healthcare has witnessed an exponential growth in the volume and variety of clinical data generated from heterogeneous sources such as Health Information Systems (HISs), wearable devices, imaging systems, laboratories, and mobile applications. However, this abundance of information has not translated into effective clinical integration: data often remain fragmented, inaccessible, or non-interoperable across different solutions, platforms, and companies [1,2,3].
To address these issues, HL7 Fast Healthcare Interoperability Resources (FHIR) (https://www.hl7.org/fhir/, accessed on 26 July 2025) has emerged as a modern and modular standard designed to support interoperable health data exchange through well-known web technologies such as RESTful APIs, JSON, and XML. FHIR organizes clinical information into granular and reusable “resources” (e.g., Patient, Observation, Condition), enabling flexible integration among HISs, clinical systems, and computational models [4], including those based on Artificial Intelligence (AI). Before the adoption of such standards, the absence of structured and semantically consistent data hindered the full potential of AI-based tools and Decision Support Systems (DSS), which rely on standardized inputs to produce reliable, reproducible, and personalized recommendations [5]. By standardizing diagnostic metadata and clinical information, FHIR facilitates the development of interoperable, reusable, and scalable DSS, accelerating knowledge transfer across diverse clinical environments [6]. According to a recent systematic review, over 98% of DSS tools developed between 2018 and 2021 adopted FHIR as the main interoperability standard [7].
AI-based DSS and Computer-Aided Diagnosis (CAD) tools can be used to support clinicians in their daily diagnostic process, for example, by automatically analyzing medical images. In the context of breast cancer, systems like the Automated Breast Ultrasound System (ABUS) have demonstrated advantages over handheld Ultrasound (US) imaging, enhancing lesion visualization and reducing operator dependency [8]. However, diagnosing breast cancer through US imaging remains a challenging task. Although US is a non-invasive and widely available technique, its interpretation strongly depends on the operator’s expertise and is often influenced by inter-observer variability and image noise. Furthermore, the morphological similarity between benign and malignant lesions can make accurate discrimination difficult, particularly in dense breast tissues [9,10,11]. To address these challenges, Machine Learning (ML)-based CAD systems are being increasingly developed to support clinicians in the automatic analysis of US images [12,13,14,15]. Many studies combine ML approaches with Deep Learning (DL) models, aiming to obtain interpretable models while also improving performance [16,17,18,19]. Other methods are based only on deep networks [20,21]. Existing FHIR-based architectures provide significant but partial contributions: some focus on interoperability and integration with EHRs, others illustrate how FHIR can enable AI-driven clinical workflows, while others define oncology-specific profiles.
Our research aims to improve the integration of AI-based DSSs into clinical practice by defining an architecture that incorporates several modules capable of communicating with each other and with heterogeneous external eHealth systems using the interoperable HL7 FHIR standard.
In this work, we demonstrate the use of the architecture through a case study related to the automated diagnosis of breast cancer via US image analysis based on AI approaches. Finally, we describe the patient-centered mobile application developed for the architecture, called InferCare.
The main innovations and distinctive aspects of our approach are:
  • Data Interoperability: Existing clinical systems do not provide an integrated and patient-centered ecosystem that unifies anamnesis, structured clinical documents, diagnostic imaging, and AI-based decision support within a single workflow. The proposed architecture addresses this gap by integrating patient historical records, diagnostic imaging data, and risk assessment information originating from heterogeneous data sources. Interoperability is ensured through compliance with the HL7 FHIR standard, facilitating standardized data exchange and semantic consistency across HISs.
  • Standard-driven AI integration: AI modules are integrated in compliance with recognized interoperability and transparency standards, ensuring that all AI-based results remain traceable, interpretable, and consistent with the overall system workflow.
  • Bidirectional data flow: The clinical information collected from patients, encompassing medical history and diagnostic results, is semantically structured and encoded as HL7 FHIR resources. This approach enables standardized representation and supports bidirectional interoperability with heterogeneous external HIS, allowing both the retrieval of existing clinical data and the transmission of processed or newly generated information.
  • User interaction: The InferCare application offers an intuitive interface that enables seamless interaction between patients and clinicians, effectively bridging mobile health (mHealth) solutions and institutional HIS.
  • Clinical focus and validation: The interactions among the modules within the proposed architecture are demonstrated through a breast cancer case study based on US image analysis. This evaluation highlights the feasibility of integrating modules implementing ML algorithms with interoperability modules, thereby validating the system’s capacity to support clinically relevant diagnostic workflows.
This combination of (i) interoperability, (ii) AI-based results, and (iii) patient-centered design represents a comprehensive and scalable architecture capable of bridging the gap between fragmented healthcare data and DSS.

2. Related Works

The urgent need to adopt standardized data models, such as the HL7 FHIR standard, has stimulated a wide range of initiatives addressing different aspects of healthcare data management, from interoperability with legacy systems to AI-driven clinical applications and oncology-specific profiling. However, these contributions are often fragmented, each addressing a specific challenge, such as data exchange, integration of mobile health applications, or definition of oncology data elements, without providing a comprehensive framework that unifies anamnesis, structured data (e.g., diagnostic imaging, medical reports, etc.), and DSS.
The adoption of the FHIR standard in HISs has favored the development of solutions aimed at professionals, enhancing data accessibility, quality, and interoperability.
Within the field of interoperability, one of the most influential initiatives is “SMART on FHIR” [22]. This approach proposes a modular framework that not only fosters interoperability but also facilitates the secure integration of third-party applications into HISs, and in particular into Electronic Health Records (EHRs). Through semantically constrained FHIR profiles, OAuth2 (https://oauth.net/, accessed on 12 September 2025) authorization, and OpenID Connect (https://openid.net/, accessed on 12 September 2025) authentication, the platform SMART on FHIR enables the development of reusable clinical applications, easily integrated into healthcare professionals’ workflows.
However, despite its flexibility, the framework focuses mainly on application interoperability, without addressing semantic consistency or integration of heterogeneous data sources such as imaging and patient-generated content.
Another proposal that uses the FHIR standard in an integrated way within a framework architecture is Drishti [23], which extends the “Open mHealth” framework with a modular sense–plan–act architecture designed to enable personalised behavioural interventions in mHealth. It connects data collection, planning, and alert delivery modules via RESTful APIs and FHIR-compatible backends by integrating FHIR resources such as Observation and CarePlan. This supports seamless integration with clinical systems such as “OpenMRS” (https://openmrs.org/, accessed on 12 September 2025).
Open mHealth uses FHIR as the canonical format for data exchange and storage, promoting interoperability, reusability, and modular development across different mobile health applications.
Nevertheless, this architectural proposal is primarily oriented toward data collection and monitoring patient behavior, offering limited support for complex clinical workflows and multimodal data integration.
The “MDIRA” initiative [24] offers a vendor-neutral, standards-based reference architecture for clinical device interoperability. Using IEEE 11073 semantics, IHE profiles, and HL7 FHIR messaging, MDIRA enables the seamless and secure integration of medical devices in hospital, home, and hospital-in-home settings. MDIRA supports both peer-to-peer and ICE-style (https://mdpnp.mgh.harvard.edu/astra-portfolio/ice-standard-integrated-clinical-environment, accessed on 26 July 2025) peer-to-aggregator communications and facilitates the development of autonomous, reliable, and reusable systems for critical care delivery.
Although MDIRA is effective in integrating medical devices, its scope remains strictly limited to devices in hospital, home, and hospital-in-home settings. It does not take into account other sources of clinical data, such as patient-generated information or diagnostic images, nor does it fully support complex clinical workflows.
Complementing these approaches, an ECG stream analysis framework [25] demonstrates how FHIR can support AI-driven healthcare applications in a cloud native environment. Using the Google Cloud Healthcare API, it securely stores FHIR-encoded ECG data and processes it using tools such as Scikit-Learn and PyTorch (https://pytorch.org/), thereby bridging the gap between clinical data interoperability and advanced AI real-time analytics and personalized monitoring.
The ECG Stream Analysis Framework effectively demonstrates how FHIR can enable AI-based analysis of real-time physiological data in a cloud-native setting, but its scope is limited to unidimensional ECG signals. It does not address multimodal integration, semantic standardization, or comprehensive decision-support workflows, which are instead central to the architecture proposed in this work.
In the field of precision medicine, the mCODE initiative [26] defines standardized FHIR profiles for oncology data, promoting reuse in both clinical systems and research. This model has been extended to international contexts [27], showcasing its adaptability to different healthcare systems. Similarly, the OSIRIS project [28] proposes a FHIR-compatible framework to enhance the sharing and analysis of clinical and genomic data in oncology.
Despite their contribution to oncology data standardization, these initiatives focus on defining specific FHIR profiles rather than proposing integrated architectures that connect clinical, imaging, and AI-driven components.
Major et al. [29] demonstrate how a FHIR back-end integrated with the “Epic” system, one of the most widely used EHR systems in hospital settings worldwide [30], allows real-time retrieval and analysis of clinical notes, medications, and vital signs to support AI-based models, enhancing timely clinical decision-making. Similarly to our approach, this demonstrates how FHIR-structured data can support decision-making processes across diverse clinical scenarios.
Although the literature demonstrates significant progress in standardizing healthcare data and promoting interoperability, particularly through the adoption of HL7 FHIR, existing solutions remain fragmented, each addressing only a portion of the clinical data ecosystem. This work fills this gap by proposing an end-to-end, FHIR-native architecture that consolidates heterogeneous patient information, embeds a modular DSS for risk assessment, and supports standardized communication with external HISs. Such an architecture is necessary because it provides a unified, FHIR-native workflow that ensures the standardized and integrated data required for reliable AI-based clinical decision support.

3. Integrated Patient Decision Support System—Architecture and Methodology

This section illustrates the definition of the architecture of the Integrated Patient Decision Support System, namely IPDSS, a solution designed to centralize, integrate, and standardize heterogeneous patient information in order to provide effective decision support to healthcare professionals and facilitate integration with existing HISs. Figure 1 presents a comprehensive view of the proposed modular architecture.
The architecture is designed as a modular, standards-based ecosystem that captures clinical and imaging data from multiple sources, applies algorithmic analysis and decision-support logic, and presents consolidated outputs via a user-centric interface. All information is formalized as HL7 FHIR resources and governed by a dedicated FHIR Implementation Guide, ensuring alignment with external HIS/EHR systems for seamless interoperability and future reuse.
The main objectives of IPDSS are:
  • Data Aggregation and Collection: Aggregating information from different sources, such as (i) information collected from interviews or through forms filled in by patients (medical and family history); (ii) data derived from diagnostic image analysis systems and reports; (iii) structured data obtained from HIS.
  • Data Standardization: Formalizing all data (source, integrated, and processed) using the international HL7 FHIR standard to realize an interoperable solution.
  • Intuitive and fast visualization of information of interest: Providing integrated information of interest to healthcare professionals through a user interface that is easy to navigate and understand. Only the information deemed useful for the specific diagnostic case is displayed.
  • Decision Support: Generating a concise summary of the patient’s health status based on the integrated data and the use of AI algorithms to propose an AI-based risk assessment.
  • Data Integration and Communication: Following data processing and integration, the architecture builds and manages HL7 FHIR-compliant resources that represent patient information, diagnostic results, and derived information. These resources are then sent to external systems such as HIS and EHR, contributing to the enrichment of patient clinical information and the overall increase in clinical knowledge related to patient health status.
In the following, we go into the details of the architecture’s description, with reference to functional requirements, data model, data flow, and Diagnostic Image Analysis Module.

3.1. Architecture

The proposed solution adopts a modular architecture based on logically separated modules or components that communicate with each other.
In Figure 1, the front-end components are represented by colored boxes, while the overlapped black boxes indicate the back-end modules of the architecture. Input components are represented with red dashed arrows and red boxes, whereas output components use green dotted arrows and green boxes.
The front-end modules include:
  • Front-end Patient
    It allows the patient to independently collect a range of anamnestic information (possibly specific to the clinical condition). The module allows for the management of data obtained by a form presented to the patient for the collection of medical history and other relevant information (allergies, medication, family history, symptoms, etc.). This module also allows the patient to retrieve any information present in the system through interaction with the HL7 FHIR-HIS interface module.
  • Diagnostic Image Analysis Module (DIAM)
    It performs processing and analysis of diagnostic images (e.g., detection, segmentation, classification, feature extraction) using ML algorithms or specific analysis tools. The aim of this module is to return an AI-based risk assessment.
  • HL7 FHIR Formalization Module (HFFM)
    This module formalizes, through international standard HL7 FHIR profiles, the data provided by the patient during the anamnesis phase (coming from the patient’s anamnesis form) and the results of image analysis (from the DIAM). It uses FHIR profiles and resources (e.g., Patient, Observation, DiagnosticReport, AllergyIntolerance, MedicationStatement), which are appropriately defined to ensure compliance with the FHIR standard. It manages the creation of FHIR resources and validates them for the purposes of the proposed architecture.
  • Clinical Information of Interest Presentation Module (CIIPM)
    This module allows identifying and collecting all and only the clinical information of interest for a specific diagnosis. It thus enables the presentation of an integrated and intuitive view of the information: personal data, structured medical history, and image analysis results.
  • Health Status Summary Module
    It applies logical rules and algorithms to extract and analyze integrated data (history, image results, other available data), and generates a concise summary of the patient’s health status, highlighting key information, potential risks (AI-based risk assessment), and recommendations.
  • HL7 FHIR-HIS Interface
    It enables the CIIPM to communicate with the existing platform (EHR) using the HL7 FHIR standard and supports FHIR operations such as Create, Read, and Update on Patient, Observation, DiagnosticReport, and other relevant resources.
The workflow illustrated in Figure 1 begins with the collection of data from multiple sources and proceeds through aggregation, analysis, standardization, and the presentation of results, ensuring interoperability via HL7 FHIR. Patients actively contribute by providing personal and clinical information through the system, including structured anamnesis, demographic data, family medical history, symptoms, allergies, medications, lifestyle factors, consent documentation, and optional clinical attachments. Physicians access patient data either entered directly by patients or retrieved from HIS/EHR systems via an interface that communicates through XDS with external HIS/EHRs. Additional input data come from diagnostic images associated with the patient, processed by the DIAM using ML algorithms, and from structured documents available in HIS/EHR systems. These documents may be generated by physicians in external systems and retrieved through the HL7 FHIR interface (XDS). The decision support module generates a concise summary of the patient’s health status based on integrated data and AI algorithms, providing an AI-driven risk assessment. Results are presented to physicians through the CIIPM, displaying only the most relevant clinical information, including integrated medical history, image results, and risk scores. All collected data are processed by the HFFM, which converts them into FHIR resources (e.g., Patient, Observation, DiagnosticReport) and validates them according to the relevant Implementation Guides (IGs). Finally, standardized FHIR data are transmitted to external HIS/EHR systems to support data enrichment and follow-up, completing the care cycle with healthcare facilities and local physicians.

Data Model

The data used in the architecture presented in the previous section are anamnestic data, i.e., a collection of information about the patients and their medical history, carried out by the physician to better understand the situation and make an accurate diagnosis. These data help the physician to identify possible diseases, risk factors, and family predispositions, which are essential for appropriate treatment. In addition to the medical history data, some data related to patient images are provided to the clinicians to support the diagnosis (Pathological Features). Moreover, other data features are also extracted from the images (Hand-crafted Features) that allow the computation of the AI-based risk assessment. Section 4 explains the process for the computation of AI-based risk assessment.

3.2. Architecture Modules Details

The following subsections describe each module and the flow of data within the proposed IPDSS architecture.

3.2.1. Front-End Patient

This module is designed to enable patients to autonomously provide a wide range of anamnestic information, which can be tailored to their specific clinical condition. Through an intuitive user interface, patients are presented with a customizable form where they can input relevant data regarding their personal and family medical history. This includes, but is not limited to, information about existing or past symptoms, current medications, known allergies, past diagnoses, and hereditary conditions.
The data can be collected either directly through the dedicated mobile application or remotely during teleconsultation interviews with a physician, ensuring flexibility and accessibility for different clinical contexts.
The collected data are systematically managed and stored by the module, ensuring that healthcare professionals can access accurate and up-to-date patient information. Furthermore, the module is integrated with HIS/EHR, allowing patients to retrieve any relevant data already present in the system.
This bidirectional communication ensures that both patient-provided information and existing clinical records are seamlessly synchronized, enhancing the completeness and reliability of the patient’s health profile.

3.2.2. Diagnostic Image Analysis Module (DIAM)

DIAM aims to provide two types of support to the healthcare professionals:
  • Some specific features related to the pathology, namely Pathological Features (PFs), are automatically extracted from the images by using properly designed Computer Vision algorithms. These features are shown to the healthcare professionals and are useful to support the diagnosis.
  • A risk assessment, computed using suitably designed ML algorithms. To compute the AI-based risk assessment, the ML methods use a set of features automatically extracted from US images, called Hand-crafted Features (HFs), which is larger than the PF set.

3.2.3. HL7 FHIR Formalization Module (HFFM)

HFFM transforms clinical and contextual data collected by the user into resources that comply with the HL7 FHIR standard. Specifically, the module processes three main categories of information:
  • Anamnestic data: information provided directly by the patient through the Front-end Patient module. This includes, for example, self-reported medical conditions, self-reported symptoms, ongoing treatments, and relevant lifestyle factors.
  • Clinical data derived from structured documents: information provided by external systems such as HIS/EHR.
  • Clinical information derived from diagnostic images: includes results obtained through the automated analysis of diagnostic images (e.g., US images) by the DIAM. This information is then translated into formalized clinical observations according to the HL7 FHIR standard.
In line with international approaches that enable information management through the FHIR standard and the formal definition of a project-specific Implementation Guide (e.g., [31]), which specifies the rules and constraints governing the interoperable use of health data and information, we adopted the IG presented in [32]. In the proposed IG, profiles, value sets, and system codes were defined and formalized. The developed IG also incorporates concepts from internationally recognized guides, such as the International Patient Summary (IPS) [33], which establishes an interoperable structure for summarizing a patient’s clinical information, thereby facilitating the exchange of essential health data. Within the proposed architecture, the IG defined in [34,35] is leveraged to support the structuring of the information used by our solution. To guarantee a uniform mapping between the heterogeneous input data and the HL7 FHIR-based representation, a structured correspondence was defined. Table 1 summarizes this mapping, linking each clinical concept with its data source and the corresponding FHIR profile used in the IG. The information originates from various contexts, such as IPS CDA 2.0 sections, patient interviews, or the DIAM, and is semantically aligned with the related FHIR resources.
In addition, to exemplify the transformation process carried out by the HFFM, Table 2 illustrates the detailed correspondence between specific CDA 2.0 elements and the equivalent FHIR elements for the AllergyIntolerance resource. This mapping provides a concrete example of how information from CDA-based documents is converted into structured FHIR resources by the formalization module.
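As a rough illustration of the kind of transformation the HFFM performs, the sketch below maps an already-parsed allergy entry to an R4-style AllergyIntolerance resource. It is not the mapping of Table 2: the input keys (`code_system`, `code`, `display`, `onset`, `reaction`), the function name, and the example code system URL are all hypothetical, and only standard R4 element names (`clinicalStatus`, `code`, `patient`, `onsetDateTime`, `reaction.manifestation`) are assumed.

```python
import json

# Standard HL7 terminology system for AllergyIntolerance.clinicalStatus (FHIR R4).
CLINICAL_STATUS_SYSTEM = (
    "http://terminology.hl7.org/CodeSystem/allergyintolerance-clinical"
)


def to_allergy_intolerance(entry: dict, patient_id: str) -> dict:
    """Build an R4-style AllergyIntolerance from a simplified allergy entry.

    `entry` is a hypothetical intermediate structure, e.g. the result of
    parsing one allergy entry of a CDA document or of the anamnesis form.
    """
    resource = {
        "resourceType": "AllergyIntolerance",
        "clinicalStatus": {
            "coding": [
                {
                    "system": CLINICAL_STATUS_SYSTEM,
                    "code": entry.get("status", "active"),
                }
            ]
        },
        "code": {
            "coding": [
                {
                    "system": entry["code_system"],
                    "code": entry["code"],
                    "display": entry["display"],
                }
            ]
        },
        "patient": {"reference": f"Patient/{patient_id}"},
    }
    # Optional elements are only emitted when the source provides them.
    if entry.get("onset"):
        resource["onsetDateTime"] = entry["onset"]
    if entry.get("reaction"):
        resource["reaction"] = [{"manifestation": [{"text": entry["reaction"]}]}]
    return resource


# Toy input with placeholder codes (not taken from the paper or a real terminology).
example_entry = {
    "code_system": "http://example.org/fhir/CodeSystem/allergens",
    "code": "EX-001",
    "display": "Example allergen",
    "onset": "2020-05-01",
    "reaction": "Skin rash",
}
allergy = to_allergy_intolerance(example_entry, patient_id="example-patient")
allergy_json = json.dumps(allergy, indent=2)
```

In a real deployment, the resulting resource would additionally be validated against the profiles of the adopted Implementation Guide before being sent to external systems.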

3.2.4. Clinical Information of Interest Presentation Module (CIIPM)

This module is responsible for identifying and aggregating all the clinical information that is specifically relevant to a given diagnosis, based on the patient’s current clinical condition and medical history. Its goal is to filter out non-essential data and focus only on what is diagnostically significant, thereby reducing information overload for healthcare professionals.
The module presents this curated information through an integrated and user-friendly interface. It combines various data sources, including personal and demographic information, structured and categorized medical history, and the results obtained from medical image analysis, into a coherent and accessible overview. This facilitates faster and more informed clinical decision-making by providing a comprehensive yet focused snapshot of the patient’s health status.
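One simple way to picture the filtering step is a per-diagnosis allow-list applied to a FHIR Bundle. The sketch below is only an assumption about how such a filter could look: the `RELEVANT_TYPES` table, the `"breast-cancer"` key, and the function name are hypothetical and not part of the described system, which may apply finer-grained criteria (for example, specific codes or profiles).

```python
# Hypothetical allow-list: resource types shown for a given diagnostic case.
RELEVANT_TYPES = {
    "breast-cancer": {
        "Patient",
        "Observation",
        "DiagnosticReport",
        "AllergyIntolerance",
        "MedicationStatement",
    },
}


def select_relevant(bundle: dict, diagnosis: str) -> list:
    """Return only the Bundle resources whose type is relevant to `diagnosis`."""
    allowed = RELEVANT_TYPES.get(diagnosis, set())
    return [
        entry["resource"]
        for entry in bundle.get("entry", [])
        if entry.get("resource", {}).get("resourceType") in allowed
    ]


# Toy Bundle; the Immunization entry is dropped by the allow-list above.
example_bundle = {
    "resourceType": "Bundle",
    "type": "collection",
    "entry": [
        {"resource": {"resourceType": "Patient", "id": "p1"}},
        {"resource": {"resourceType": "Observation", "id": "o1"}},
        {"resource": {"resourceType": "Immunization", "id": "i1"}},
    ],
}
shown = select_relevant(example_bundle, "breast-cancer")
shown_types = [resource["resourceType"] for resource in shown]
```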

3.2.5. Health Status Summary Module

This module performs a synthesis of the patient’s overall health condition by applying predefined logical rules and advanced algorithms to the available integrated data (diagnostic image analysis results, and any other clinically relevant data available in the system).
Based on this analysis, the module generates a clear and concise summary of the patient’s health status. It highlights key clinical findings, flags potential health risks through AI-based risk assessment models, and provides actionable recommendations when appropriate. This summary is designed to support healthcare providers by offering a quick yet comprehensive overview, aiding in both diagnosis and treatment planning.
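The following sketch shows, under stated assumptions, how rule-based logic could combine anamnestic data with the AI-based risk assessment into a short summary. The rules, the `review_threshold` parameter, the field names, and the recommendation text are illustrative placeholders and do not reproduce the rules used by the module.

```python
from dataclasses import dataclass


@dataclass
class RiskAssessment:
    predicted_class: str  # "benign" or "malignant"
    probability: float    # probability assigned to the predicted class


def summarize_health_status(
    history: dict, risk: RiskAssessment, review_threshold: float = 0.5
) -> dict:
    """Apply simple illustrative rules to integrated data and return a summary."""
    key_findings = []
    # Rule 1: report each non-empty anamnestic section as a key finding.
    for field, label in (
        ("family_history", "Family history"),
        ("allergies", "Allergies"),
        ("medications", "Medications"),
    ):
        values = history.get(field) or []
        if values:
            key_findings.append(f"{label}: " + "; ".join(values))

    # Rule 2: flag the case when the model predicts malignancy with
    # probability at or above the (hypothetical) review threshold.
    flagged = (
        risk.predicted_class == "malignant"
        and risk.probability >= review_threshold
    )
    recommendations = ["Prioritize clinician review of the imaging result"] if flagged else []

    return {
        "key_findings": key_findings,
        "risk": {"class": risk.predicted_class, "probability": risk.probability},
        "flagged": flagged,
        "recommendations": recommendations,
    }


# Toy inputs, for illustration only.
summary = summarize_health_status(
    {"family_history": ["breast cancer (mother)"], "allergies": []},
    RiskAssessment(predicted_class="malignant", probability=0.91),
)
```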

3.2.6. HL7 FHIR—HIS Interface

This module facilitates seamless communication between the CIIPM and the existing HIS, such as the EHR platform. It leverages the HL7 FHIR standard, which is widely adopted for the secure and efficient exchange of healthcare data.
The interface supports a range of core FHIR operations, including the creation, retrieval, and updating of key healthcare resources such as Patient, Observation, DiagnosticReport, and other relevant resource types. By implementing these operations, the module ensures interoperability with other systems and enables real-time synchronization and sharing of clinical information across the healthcare infrastructure. This interoperability enhances the consistency, accuracy, and accessibility of patient data throughout the clinical workflow.
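The create, read, and update interactions mentioned above follow the standard FHIR RESTful pattern (`POST [base]/[type]`, `GET [base]/[type]/[id]`, `PUT [base]/[type]/[id]`). The sketch below only constructs the corresponding HTTP requests with the Python standard library and does not send them; the base URL and helper names are hypothetical, and authentication is deliberately omitted.

```python
import json
from urllib.request import Request

FHIR_JSON = "application/fhir+json"  # FHIR JSON media type
BASE_URL = "https://fhir.example.org/baseR4"  # hypothetical server base


def _body(resource: dict) -> bytes:
    return json.dumps(resource).encode("utf-8")


def create_request(base: str, resource: dict) -> Request:
    """Create: POST [base]/[type] with the resource as the body."""
    url = f"{base}/{resource['resourceType']}"
    headers = {"Content-Type": FHIR_JSON, "Accept": FHIR_JSON}
    return Request(url, data=_body(resource), headers=headers, method="POST")


def read_request(base: str, resource_type: str, resource_id: str) -> Request:
    """Read: GET [base]/[type]/[id]."""
    url = f"{base}/{resource_type}/{resource_id}"
    return Request(url, headers={"Accept": FHIR_JSON}, method="GET")


def update_request(base: str, resource: dict) -> Request:
    """Update: PUT [base]/[type]/[id]; the resource must carry its id."""
    url = f"{base}/{resource['resourceType']}/{resource['id']}"
    headers = {"Content-Type": FHIR_JSON, "Accept": FHIR_JSON}
    return Request(url, data=_body(resource), headers=headers, method="PUT")


observation = {"resourceType": "Observation", "id": "obs-1", "status": "final"}
create_req = create_request(BASE_URL, observation)
read_req = read_request(BASE_URL, "Patient", "example-patient")
update_req = update_request(BASE_URL, observation)
```

Passing one of these objects to `urllib.request.urlopen` would perform the call; in practice a dedicated FHIR client library and an authorization layer would typically replace these helpers.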

4. Case Study: Breast Cancer

This work proposes a case study related to the clinical condition of breast cancer.
In the following section, we describe the modules, DIAM and HFFM, that require customization in relation to the selected case study. Using this specific scenario allowed us to validate the definition of the entire architecture: it shows the relationships among the components, the flow of heterogeneous data, the processing of these data to obtain a summarized result (the risk assessment), and how this information can be standardized and sent to external systems using the FHIR standard.

4.1. Implementation of DIAM in the Breast Cancer Case

  • Dataset description
In this work, the Breast Ultrasound Images Dataset (https://www.kaggle.com/datasets/aryashah2k/breast-ultrasound-images-dataset, accessed on 19 March 2025) (BUSI) is used to evaluate the classification of US images. The dataset includes 780 breast US images of women aged between 25 and 75 years. A total of 210 images contain malignant lesions, 437 are annotated as benign, and 133 are normal breast images. Only malignant and benign cases are included in this study. Each US image is associated with a ground-truth annotation of the lesion.
As an example, Figure 2 illustrates two breast US images: one showing a malignant lesion and the other a benign one. In both cases, the lesions are contoured in red by the expert radiologists, which allowed us to extract the key morphological and color-related features directly in that annotated portion and use them for the classification process.
  • AI-based Risk assessment calculation
For the selected case study, the DIAM provides the PF set reported in Table 3. These features are among those most commonly used by radiologists to diagnose breast cancer [36], so they are shown to the clinician to support the diagnostic decision process.
For the AI-based risk assessment, a classification of the lesion as benign or malignant is proposed. For the classification purpose, we chose to use handcrafted features. Unlike DL approaches, the use of handcrafted features has the advantage of producing an interpretable model that can show the clinician which features were used to achieve that result. Thus, we use the HF set, composed of morphological and color-related features, reported in Table 4, extracted from the annotated lesions.
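To make the idea of a morphological handcrafted feature concrete, the sketch below computes two shape descriptors from a lesion contour given as polygon vertices. It uses common textbook definitions (circularity as 4πA/P², solidity as area divided by convex-hull area); these are assumptions and may differ from the exact definitions used in Table 4. All function names are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area(pts: List[Point]) -> float:
    """Area of a simple polygon via the shoelace formula."""
    n = len(pts)
    s = sum(pts[i][0] * pts[(i + 1) % n][1] - pts[(i + 1) % n][0] * pts[i][1]
            for i in range(n))
    return abs(s) / 2.0


def polygon_perimeter(pts: List[Point]) -> float:
    """Sum of the edge lengths of the closed contour."""
    n = len(pts)
    return sum(math.dist(pts[i], pts[(i + 1) % n]) for i in range(n))


def convex_hull(pts: List[Point]) -> List[Point]:
    """Convex hull via Andrew's monotone chain."""
    pts = sorted(set(pts))
    if len(pts) <= 2:
        return pts

    def cross(o: Point, a: Point, b: Point) -> float:
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def circularity(pts: List[Point]) -> float:
    """4*pi*A / P^2: equals 1 for a circle and decreases for irregular shapes."""
    return 4.0 * math.pi * polygon_area(pts) / polygon_perimeter(pts) ** 2


def solidity(pts: List[Point]) -> float:
    """Contour area divided by convex-hull area: 1 for convex shapes."""
    return polygon_area(pts) / polygon_area(convex_hull(pts))


# Toy L-shaped contour (not taken from BUSI): concave, so solidity < 1.
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
print(round(circularity(l_shape), 3), round(solidity(l_shape), 3))
```

Because each descriptor has a direct geometric meaning, its value can be shown to the clinician next to the prediction.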
A feature selection step is applied to identify the HFs that are most important for the classification. Recursive Feature Elimination (RFE) with Logistic Regression (LR) proved to be the best method for the feature selection phase [37]. After feature selection, a classification pipeline is run to assess the predictive capabilities of various ML algorithms. Specifically, four well-established classifiers are considered: Decision Tree (DT) [38], Multi-Layer Perceptron (MLP) [39], Naive Bayes (NB) [40], and Random Forest (RF) [41]. Model performance is evaluated using a four-fold cross-validation approach, where the dataset is randomly divided into four equally sized folds. At each iteration, three folds are used for training and one for testing.
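The four-fold protocol described above can be sketched in a few lines of standard-library Python. The helper name `k_fold_indices`, the seed, and the plain random (non-stratified) split are assumptions made for this illustration; the source does not state whether the folds were stratified.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(n_samples: int, k: int = 4,
                   seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # random assignment to folds
    folds = [indices[i::k] for i in range(k)]     # k folds of near-equal size
    for i in range(k):
        test = folds[i]
        # The remaining k-1 folds form the training set for this iteration.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 210 malignant + 437 benign images = 647 samples.
for train, test in k_fold_indices(647, k=4):
    print(len(train), len(test))
```

With 647 samples the folds cannot be exactly equal, so three folds hold 162 images and one holds 161.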
In order to mitigate the issue of class imbalance inherent in the dataset, the Synthetic Minority Oversampling Technique (SMOTE) [42] is applied. Unlike traditional random oversampling methods, which tend to increase the risk of overfitting by merely duplicating existing minority class instances, SMOTE addresses the imbalance by synthetically generating new samples. This is achieved through interpolation between existing minority class samples that are in close proximity within the feature space. Such an approach fosters a more balanced class distribution and enhances the generalization ability of the trained models. For each image, the predicted class is the one associated with the highest probability score assigned by the ML algorithm. The risk assessment proposed to the clinician consists of the predicted class (benign or malignant) and the associated probability for that class.
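The interpolation step at the heart of SMOTE can be illustrated with the simplified sketch below. It is not the reference implementation from [42]: the function name `smote`, the uniform choice of base samples, and the toy points are assumptions made for this example.

```python
import math
import random
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def smote(minority: Sequence[Vector], n_new: int, k: int = 5,
          seed: int = 0) -> List[Vector]:
    """Create n_new synthetic samples by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic: List[Vector] = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample (excluding itself).
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b)
                               for b, n in zip(base, neighbour)))
    return synthetic


# Toy 2-D minority class (hypothetical feature values).
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
print(smote(minority, n_new=3, k=2))
```

Each synthetic point lies on the segment between two real minority samples, which is what distinguishes SMOTE from duplicating existing instances.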
Typically, clinicians interpret risk scores by mapping the numerical output of the AI model to well-defined clinical categories that reflect the likelihood of a specific condition, thereby guiding subsequent decision-making steps. In detail, clinicians combine three complementary elements: (i) the predicted class (benign or malignant) and its associated probability; (ii) the set of handcrafted features that contributed to the prediction, such as perimeter regularity, axis ratio, solidity, and circularity, which correspond to well-established radiological criteria; and (iii) the clinical context derived from the patient anamnesis and structured data. This interpretability enables clinicians to validate the AI output against their own visual assessment.
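A minimal sketch of how the risk assessment shown to the clinician could be assembled from the classifier output is given below: the predicted class is the one with the highest probability, and it is returned together with that probability and the feature values used. The function name, the dictionary layout, and the numeric values are hypothetical.

```python
from typing import Dict


def risk_assessment(class_probabilities: Dict[str, float],
                    features: Dict[str, float]) -> dict:
    """Pick the most probable class and bundle it with its probability
    and the handcrafted feature values that fed the classifier."""
    if not class_probabilities:
        raise ValueError("no class probabilities given")
    predicted = max(class_probabilities, key=class_probabilities.get)
    return {
        "predicted_class": predicted,
        "probability": class_probabilities[predicted],
        "contributing_features": dict(features),
    }


# Hypothetical classifier output and feature values for one image.
result = risk_assessment(
    {"benign": 0.12, "malignant": 0.88},
    {"perimeter_regularity": 0.41, "axis_ratio": 1.7, "solidity": 0.79},
)
print(result["predicted_class"], result["probability"])
```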
  • Results
In Table 5, the obtained results, together with the standard deviation across the folds, are reported. They highlight the importance of carefully selecting discriminative handcrafted features for breast lesion classification from US images. The use of RFE proved particularly effective in identifying the most informative subset of features, suggesting that not all morphological descriptors contribute equally to the discrimination task. The fact that only three features—perimeter regularity, axis ratio, and solidity—were sufficient to achieve high performance indicates that these characteristics capture complementary aspects of lesion morphology that are highly relevant for distinguishing benign from malignant patterns. This is also consistent with radiological practice, where border irregularity, asymmetry, and spiculation are well-established hallmarks of malignancy.
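For readers who want to reproduce this kind of summary, the sketch below shows how accuracy and F1-score follow from the confusion counts of one fold, and how per-fold values are reduced to a mean and a standard deviation. The counts and fold scores are invented for illustration, and the use of the population standard deviation is an assumption, since the source does not say which estimator was used.

```python
from statistics import mean, pstdev
from typing import Sequence, Tuple


def accuracy_f1(tp: int, fp: int, fn: int, tn: int) -> Tuple[float, float]:
    """Accuracy and F1-score for the positive (malignant) class."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, f1


def mean_and_std(values: Sequence[float]) -> Tuple[float, float]:
    """Mean and population standard deviation across folds."""
    return mean(values), pstdev(values)


# Hypothetical confusion counts for one test fold of 162 images.
print(accuracy_f1(tp=50, fp=2, fn=3, tn=107))
# Hypothetical per-fold accuracies.
print(mean_and_std([0.97, 0.98, 0.99, 0.98]))
```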
Among the classifiers, MLP outperformed the others, achieving almost perfect discrimination with an accuracy close to 98% and an F1-score of 97%. This reinforces the idea that neural networks, even in relatively simple architectures, are well suited to model nonlinear feature interactions. While perimeter regularity or solidity alone can already provide meaningful information, their joint contribution, along with axis ratio, can be more effectively exploited by MLP compared to other models. In contrast, NB, constrained by its independence assumption, showed limitations in handling such interactions, which likely explains its relatively lower recall.
DT performed adequately but was more affected by the dataset size and potential noise. This behavior is expected, as DTs are prone to overfitting when trained on small datasets and may fail to generalize well. RF, on the other hand, mitigated some of these issues by averaging multiple DTs, which resulted in solid performance (97.2% accuracy and 96.8% F1-score). However, RF still did not surpass MLP, suggesting that ensembles of shallow learners might not capture subtle, higher-order nonlinear relationships as effectively as neural networks.
Overall, these findings suggest that the careful combination of feature selection and the use of flexible learning models such as MLP can yield highly accurate breast cancer classification systems, even when relying solely on handcrafted features rather than deep representations. This is a particularly relevant result for settings where computational resources are limited or where large annotated datasets required for DL are not available. At the same time, the high performance obtained with a small feature set also improves interpretability, since the decision-making process can be directly related to clinically meaningful morphological descriptors.

4.2. Implementation of HFFM in the Breast Cancer Case

This section presents the FHIR profiles developed to ensure the structured and interoperable representation of clinical and anamnestic information collected through an IG. Derived by adapting international FHIR resources to the project context, these profiles cover several areas, ranging from patient history to oncology risk analysis. Table 6 summarizes a single profile of the IG, while Figure 3 gives an overview of the Information Model and the Profiled FHIR Resources.
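As a hedged illustration of what a profiled resource instance can look like, the sketch below builds a FHIR RiskAssessment as a plain dictionary and declares conformance through `meta.profile`. The choice of the RiskAssessment resource, the profile URL, and the helper name are assumptions made for this example; the profiles actually defined in the IG may use different resources and constraints.

```python
import json


def build_risk_assessment(patient_id: str, predicted_class: str,
                          probability: float, profile_url: str) -> dict:
    """Wrap a predicted class and its probability in a FHIR RiskAssessment."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return {
        "resourceType": "RiskAssessment",
        # meta.profile states which profile the instance claims to follow.
        "meta": {"profile": [profile_url]},
        "status": "final",
        "subject": {"reference": f"Patient/{patient_id}"},
        "prediction": [{
            "outcome": {"text": predicted_class},
            "probabilityDecimal": probability,
        }],
    }


# Placeholder profile URL and patient id (both hypothetical).
resource = build_risk_assessment(
    "example-patient", "malignant", 0.88,
    "http://example.org/fhir/StructureDefinition/breast-risk-assessment",
)
print(json.dumps(resource, indent=2))
```

A FHIR server loaded with the IG would then validate such an instance against the declared profile before storing it.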

5. InferCare Android Application

To facilitate interaction with the IPDSS system by both patients and physicians, the InferCare mobile app was developed. The app’s name is derived from the combination of “Inference” (referring to AI inference) and “Care” (patient care). This section explains the interface and main features of the app, which allow guided completion of the medical history form, display of structured information, and support of diagnosis through summary views for physicians.

5.1. IPDSS Functional Requirements

This section specifies the Functional Requirements (FRs) of the IPDSS, numbered progressively (FR01, FR02, etc.). These requirements, listed in Table 7, define the expected behaviors of the system in terms of secure data acquisition and management as well as accessibility, user interface, and interoperability with external systems (such as EHR, HIS, and telemedicine systems).

5.2. InferCare

The mobile app, InferCare, is designed to provide an intuitive, interactive interface that supports patients and physicians throughout the entire information flow. From the patient’s perspective, an interface is implemented that enables the guided completion of a multi-section medical history form, covering symptoms, medications, allergies, and habits. The form can be temporarily saved locally and can be resumed later. Once the information is validated through the HFFM module, it is visible to the patient through the interface.
From the physician’s perspective, an interface is implemented that provides authenticated and secure access to the patient dashboard, serving as a centralized entry point for all clinical information. Once logged in, the physician can view the list of waiting patients, available appointments, and a summary of each patient’s health status. Figure 4 and Figure 5 show the patient’s and the physician’s views of the app interface, respectively. The next section describes the sequence diagrams for patient and physician interaction with the app.

5.3. User Actions and Interactions

The sequences of actions performed by the patient are shown in Figure 6:
  • The patient starts the mobile application (StartApp);
  • A request is sent to the Front-end Patient to access the data entry form (RequestCompilationForm);
  • The patient completes a digital form through the Front-end Patient interface;
  • The Front-end Patient sends the form data to the back-end via a REST API (POST (data anamnesis module));
  • The RESTful API forwards the “POST (data anamnesis module)” to the HL7 FHIR Formalization Module;
  • The HL7 FHIR Formalization Module transforms the received data into FHIR resources and sends them to the FHIR Server with IG for data validation (“validate data”);
  • The FHIR Server with IG validates and stores the FHIR resources, then sends a “Response FHIR Resources” message back to the HL7 FHIR Formalization Module;
  • Finally, the Front-end Patient receives the “Response Resource FHIR” and presents it in the mobile application as the anamnesis form view (View Form anamnesis).
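The patient-side steps above can be sketched as a small in-memory simulation. Everything here is a stand-in: the form layout, the mapping to AllergyIntolerance and MedicationStatement, the `InMemoryFhirServer` class, and its required-field check are assumptions that only mimic formalization, validation, and storage; a real FHIR server validates against the profiles of the IG.

```python
from typing import Dict, List


def formalize(patient_id: str, form: Dict[str, List[str]]) -> List[dict]:
    """Formalization step (sketch): turn form answers into FHIR-like resources."""
    subject = {"reference": f"Patient/{patient_id}"}
    resources: List[dict] = []
    for allergy in form.get("allergies", []):
        resources.append({"resourceType": "AllergyIntolerance",
                          "patient": subject, "code": {"text": allergy}})
    for medication in form.get("medications", []):
        resources.append({"resourceType": "MedicationStatement",
                          "status": "active", "subject": subject,
                          "medicationCodeableConcept": {"text": medication}})
    return resources


class InMemoryFhirServer:
    """Toy replacement for the FHIR Server with IG."""
    REQUIRED = {"AllergyIntolerance": ("patient",),
                "MedicationStatement": ("status", "subject")}

    def __init__(self) -> None:
        self.store: Dict[str, dict] = {}

    def validate(self, resource: dict) -> bool:
        # Simplified check: known resource type and required fields present.
        required = self.REQUIRED.get(resource.get("resourceType"))
        return required is not None and all(f in resource for f in required)

    def validate_and_store(self, resources: List[dict]) -> List[dict]:
        stored = []
        for resource in resources:
            if not self.validate(resource):
                raise ValueError(f"invalid resource: {resource}")
            rid = str(len(self.store) + 1)       # server-assigned id
            self.store[rid] = {**resource, "id": rid}
            stored.append(self.store[rid])
        return stored


def submit_anamnesis(server: InMemoryFhirServer, patient_id: str,
                     form: Dict[str, List[str]]) -> List[dict]:
    """POST of the form: formalize, validate, store, return the resources."""
    return server.validate_and_store(formalize(patient_id, form))


server = InMemoryFhirServer()
print(submit_anamnesis(server, "p1", {"allergies": ["penicillin"],
                                      "medications": ["metformin"]}))
```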
The sequence in Figure 7 describes the back-end process triggered when a user accesses the mobile application and requests the analysis of diagnostic images. The system handles image retrieval, clinical feature extraction via AI, and data transformation into HL7 FHIR format using a structured pipeline supported by RESTful communication and standardized data representation.
  • The user accesses the mobile application (AccessApp).
  • The application sends an image analysis request to the backend via RESTful API (RequestElaborationImage).
  • The RESTful API performs a POST request to the module that queries the US image database.
  • The database receives the request (RequestDiagnosticImage), retrieves the required US images (Images recovery), and returns them.
  • The US images are sent to the AI-based image analysis module.
  • The AI module extracts the clinical features from the ultrasound images (extract features images).
  • The extracted data are sent to the HL7 FHIR formalization module (send data features).
  • The HL7 FHIR module validates and transforms the data into FHIR format (validate data).
  • Finally, the FHIR Server with IG stores the validated features (validation and store features images resource).
The sequence of actions performed by the physician is shown in Figure 8:
  • The physician interacts with the mobile app to request the summary view (RequestViewSummary).
  • The app forwards the request to the Clinical Information of Interest presentation module (RequestSummary).
  • This module queries the FHIR Server through the RESTful API with a query containing the relevant data (Query FHIR Resources (DataAnamnesis, featuresImage)).
  • The FHIR Server with IG retrieves the requested clinical information and returns the FHIR Resources.
  • The Clinical Information of Interest presentation module extracts the main data from the received resources, creates a summary with a Health Status Summary and AI-based risk assessment, and highlights the principal information.
  • The summary view is presented to the physician through the mobile interface (View Summary).
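The extraction step performed by the presentation module can be sketched as follows, assuming the FHIR Server answers the query with a searchset Bundle. The summary layout, the helper name `build_summary`, and the sample resources are hypothetical and serve only to show how the entries of a Bundle are traversed.

```python
from typing import List, Optional, Tuple


def build_summary(bundle: dict) -> dict:
    """Collect anamnesis items and the AI-based risk assessment from a Bundle."""
    anamnesis: List[Tuple[Optional[str], Optional[str]]] = []
    risk = None
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        if resource.get("resourceType") == "RiskAssessment":
            predictions = resource.get("prediction") or []
            if predictions:
                risk = {"outcome": predictions[0]["outcome"]["text"],
                        "probability": predictions[0]["probabilityDecimal"]}
        else:
            # Use the human-readable text of the coded element, if any.
            coded = (resource.get("code")
                     or resource.get("medicationCodeableConcept") or {})
            anamnesis.append((resource.get("resourceType"), coded.get("text")))
    return {"anamnesis": anamnesis, "risk_assessment": risk}


# Hypothetical server response.
bundle = {
    "resourceType": "Bundle",
    "type": "searchset",
    "entry": [
        {"resource": {"resourceType": "AllergyIntolerance",
                      "code": {"text": "penicillin"}}},
        {"resource": {"resourceType": "RiskAssessment",
                      "prediction": [{"outcome": {"text": "malignant"},
                                      "probabilityDecimal": 0.88}]}},
    ],
}
print(build_summary(bundle))
```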

5.4. Data Flow Within the Architecture Enabled by the Developed InferCare

This section presents the flow of data within the proposed architecture, illustrated in Figure 9, showing how the InferCare mobile application interacts with heterogeneous HIS to ensure the interoperability, traceability, and semantic consistency of clinical data. The architecture integrates medical information from various sources, including EHRs, HISs, PACS, and patient-collected data, while adhering to international standards such as IHE-XDS, IHE-XDS-I, and HL7 FHIR. This multilevel integration enables structured access to, and the secure exchange and validation of clinical and imaging information. Ultimately, it generates a unified and interoperable health summary that is accessible to physicians and patients. The operational workflow, detailed in Figure 9, unfolds as follows.
  • Patient data collection: the process begins with the patient completing a digital medical history form through the InferCare mobile application. The guided interface enables the structured collection of personal and clinical information, including demographic data, symptoms and reasons for consultation, family and medical history, allergies, medications in use, informed consent, and the upload of relevant clinical documents.
  • Retrieval of Clinical Documents (EHR–IHE-XDS): the InferCare application interfaces with the EHR through the IHE Cross-Enterprise Document Sharing (XDS) standard. Using the Registry Stored Query [ITI-18] IHE transaction, the app queries the XDS Registry to identify structured clinical documents associated with the patient. The registry provides the Query Response, allowing the system to retrieve the corresponding document set (Retrieve Document Set [ITI-43]) from the XDS Repository.
  • Integration with Diagnostic Imaging Systems (IHE XDS-I/PACS): when a new diagnostic request is issued, the EHR communicates with the architecture to identify relevant imaging studies stored in multiple PACS systems. The application queries each PACS using the DICOM standard to identify and retrieve imaging objects associated with the patient. These objects are then referenced and managed using the IHE XDS-I profile, which ensures image accessibility within the EHR ecosystem. Once retrieved, the imaging data are processed by the DIAM. This module performs automated preprocessing, lesion detection, and feature extraction.
All information, including clinical, anamnesis, and imaging data, is aggregated and validated using the HFFM. The CIIPM then calculates health status indicators and risk assessment metrics to support decision-making processes. All FHIR data resources are registered in the HIS using the Register Document Set-B [ITI-42] IHE transaction to ensure full interoperability and traceability across institutional repositories.

6. Conclusions

In this work, we used the HL7 FHIR standard [43] and the standard tools [44,45] provided by the HL7 community to develop IPDSS, an architecture that enables the integration of data from diagnostic images, patient-collected data, and data from heterogeneous sources. As a case study, we chose AI-based risk assessment for breast cancer, based on the analysis of US images with ML models.
To manage the exchange of heterogeneous data and support multiple interactions with external modules and systems, the InferCare Android application was developed and seamlessly integrated within the proposed architecture. By leveraging the HL7 FHIR standard for data representation and communication, the application ensures interoperability across disparate HIS while providing an interactive graphical interface that facilitates diagnostic workflows and enhances user interaction for both patients and clinicians.
The application also allows patients to record their medical history and physicians to obtain a summary of the most important information for an effective and rapid diagnosis. Risk assessments associated with the patient’s clinical condition, together with other relevant information, can be transformed into standard resources and sent to external systems.
Furthermore, the goal of the architecture is to obtain useful information from different sources (EHR, PACS, medical history, etc.), standardize it, and make it available, through systems such as the EHR and HIS, to all healthcare professionals who will be treating the patient.
Thus, in the future, we intend to extend our work toward a more comprehensive and generalizable adoption of the proposed architecture, enabling the integration of unstructured clinical documents (from HIS, EHR, etc.), image-derived features across different diagnostic modalities, and information collected through patient-facing applications. Further research will focus on evaluating and validating the clinical impact of multimodal data fusion within the defined framework, assessing its contribution to diagnostic accuracy, optimizing clinical workflow, and providing personalized decision support. Moreover, future investigations could explore the incorporation of federated learning and privacy-preserving mechanisms to enable distributed AI model training across institutions without compromising data confidentiality. The integration of explainable AI components and continuous learning strategies will also be considered to enhance the transparency, adaptability, and long-term reliability of the proposed system in real-world clinical environments.

Author Contributions

Conceptualization, N.B. and M.S.; methodology, N.B. and M.S.; software, T.C. and M.R.; validation, N.B., M.S., T.C., M.R. and S.D.P.; formal analysis, N.B. and M.S.; investigation, T.C. and M.R.; resources, T.C., M.R. and S.D.P.; data curation, T.C., M.R. and S.D.P.; writing—original draft preparation, N.B. and M.S.; writing—review and editing, N.B., M.S., T.C., M.R. and S.D.P.; supervision, N.B. and M.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the project “RIGOLETTO: Creation of intelligent management platform for oncology patients”, Bando “Accordi per l’innovazione” del Ministero delle Imprese e del Made in Italy, mimit.AOO_IAI.REGISTRO INTERNO.R.0001470.08-05-2023.

Institutional Review Board Statement

This study does not involve human participants, human data, or human biological samples. The manuscript presents a theoretical framework and a conceptual architecture for integrating and processing health data within an FHIR-based system. Although examples referring to patient data collection are discussed, these are purely illustrative, and no real patient data were collected, accessed, or analyzed. Therefore, approval from an Institutional Review Board or Ethics Committee was not required.

Informed Consent Statement

Not applicable. This study does not involve human participants or the use of identifiable or non-identifiable human data.

Data Availability Statement

The image dataset BUSI, used for the experiments, is publicly available at https://www.kaggle.com/datasets/aryashah2k/breast-ultrasound-images-dataset (accessed on 19 March 2025). The Test IG “Remote Anamnesis” is publicly available at http://remote-anamnesis.na.icar.cnr.it (accessed on 8 August 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

AI: Artificial Intelligence
ABUS: Automated Breast Ultrasound System
API: Application Programming Interface
CAD: Computer-Aided Diagnosis
DSS: Clinical Decision Support System
CIIPM: Clinical Information of Interest Presentation Module
CIC: Clinical Information Council
CIMI: Clinical Information Modeling Initiative
DL: Deep Learning
DIAM: Diagnostic Image Analysis Module
DT: Decision Tree
EHR: Electronic Health Record
ECG: Electrocardiogram
EMR: Electronic Medical Record
FHIR: Fast Healthcare Interoperability Resources
FR: Functional Requirement
HFFM: HL7 FHIR Formalization Module
HF: Hand-crafted Features
HIS: Health Information System
HL7: Health Level Seven International
ICE: Integrated Clinical Environment
ICHOM: International Consortium for Health Outcomes Measurement
IG: Implementation Guide
IHE: Integrating the Healthcare Enterprise
IPDSS: Integrated Patient Decision Support System
IPS: International Patient Summary
JSON: JavaScript Object Notation
LR: Logistic Regression
mCODE: Minimal Common Oncology Data Elements
ML: Machine Learning
MLP: Multi-Layer Perceptron
MDIRA: Medical Device Interoperability Reference Architecture
mHealth: Mobile Health
NB: Naive Bayes
OSIRIS: Open Standards for Interoperable and Reusable Information in Oncology
PF: Pathological Features
RF: Random Forest
RFE: Recursive Feature Elimination
REST: Representational State Transfer
SMOTE: Synthetic Minority Oversampling Technique
US: Ultrasound
XML: Extensible Markup Language
XDS: Cross-Enterprise Document Sharing

References

  1. Adler-Milstein, J.; Holmgren, A.J.; Kralovec, P. Electronic Health Record Adoption and Interoperability among U.S. Hospitals. Health Aff. 2021, 40, 1287–1296.
  2. Lentini, S.; Grosso, E.; Masala, G.L. A Comparison of Data Fragmentation Techniques in Cloud Servers. In Proceedings of the Advances in Internet, Data & Web Technologies; Barolli, L., Xhafa, F., Javaid, N., Spaho, E., Kolici, V., Eds.; Springer: Cham, Switzerland, 2018; pp. 560–571.
  3. Mercy, W.; Annabel, L.S.P. Secure Electronic Health Record Storage in the Cloud Based on Multiple Fragmentation and Reconstruction Using Blockchain with Cryptographic Techniques. In Proceedings of the 2024 4th International Conference on Ubiquitous Computing and Intelligent Information Systems (ICUIS), Gobichettipalayam, India, 12–13 December 2024; pp. 1232–1236.
  4. Sreejith, R.; Senthil, S. Smart Contract Authentication assisted GraphMap-Based HL7 FHIR architecture for interoperable e-healthcare system. Heliyon 2023, 9, e15180.
  5. Ramgopal, S.; Sanchez-Pinto, L.N.; Horvat, C.M.; Carroll, M.S.; Luo, Y.; Florin, T.A. Artificial intelligence-based clinical decision support in pediatrics. Pediatr. Res. 2023, 93, 334–341.
  6. Duda, S.N.; Kennedy, N.; Conway, D.; Cheng, A.C.; Nguyen, V.; Zayas-Cabán, T.; Harris, P.A. HL7 FHIR-based tools and initiatives to support clinical research: A scoping review. J. Am. Med. Inform. Assoc. 2022, 29, 1642–1653.
  7. Taber, P.; Radloff, C.; Del Fiol, G.; Staes, C.; Kawamoto, K. New standards for clinical decision support: A survey of the state of implementation. Yearb. Med Inform. 2021, 30, 159–171.
  8. Zhang, X.; Lin, X.; Tan, Y.; Zhu, Y.; Wang, H.; Feng, R.; Tang, G.; Zhou, X.; Li, A.; Qiao, Y. A multicenter hospital-based diagnosis study of automated breast ultrasound system in detecting breast cancer among Chinese women. Chin. J. Cancer Res. 2018, 30, 231.
  9. Moon, W.K.; Lee, Y.W.; Ke, H.H.; Lee, S.H.; Huang, C.S.; Chang, R.F. Computer-aided diagnosis of breast ultrasound images using ensemble learning from convolutional neural networks. Comput. Methods Programs Biomed. 2020, 190, 105361.
  10. Badawy, S.M.; Mohamed, A.E.N.A.; Hefnawy, A.A.; Zidan, H.E.; GadAllah, M.T.; El-Banby, G.M. Classification of breast ultrasound images based on convolutional neural networks-a comparative study. In Proceedings of the 2021 International Telecommunications Conference (ITC-Egypt), Alexandria, Egypt, 13–15 July 2021; pp. 1–8.
  11. Seiler, S.J.; Neuschler, E.I.; Butler, R.S.; Lavin, P.T.; Dogan, B.E. Optoacoustic imaging with decision support for differentiation of benign and malignant breast masses: A 15-reader retrospective study. Am. J. Roentgenol. 2023, 220, 646–658.
  12. Kwon, H.; Oh, S.H.; Kim, M.G.; Kim, Y.; Jung, G.; Lee, H.J.; Kim, S.Y.; Bae, H.M. Enhancing Breast Cancer Detection through Advanced AI-Driven Ultrasound Technology: A Comprehensive Evaluation of Vis-BUS. Diagnostics 2024, 14, 1867.
  13. Wang, S.; Zhao, Z.; Ouyang, X.; Liu, T.; Wang, Q.; Shen, D. Interactive computer-aided diagnosis on medical image using large language models. Commun. Eng. 2024, 3, 133.
  14. Azam, S.; Montaha, S.; Raiaan, M.A.K.; Rafid, A.R.H.; Mukta, S.H.; Jonkman, M. An automated decision support system to analyze malignancy patterns of breast masses employing medically relevant features of ultrasound images. J. Imaging Inform. Med. 2024, 37, 45–59.
  15. Ragab, M.; Albukhari, A.; Alyami, J.; Mansour, R.F. Ensemble deep-learning-enabled clinical decision support system for breast cancer diagnosis and classification on ultrasound images. Biology 2022, 11, 439.
  16. Daoud, M.I.; Abdel-Rahman, S.; Bdair, T.M.; Al-Najar, M.S.; Al-Hawari, F.H.; Alazrai, R. Breast tumor classification in ultrasound images using combined deep and handcrafted features. Sensors 2020, 20, 6838.
  17. Cruz-Ramos, C.; Garcia-Avila, O.; Almaraz-Damian, J.A.; Ponomaryov, V.; Reyes-Reyes, R.; Sadovnychiy, S. Benign and malignant breast tumor classification in ultrasound and mammography images via fusion of deep learning and handcraft features. Entropy 2023, 25, 991.
  18. Khan, M.U.; Bianconi, F.; Du, H.; Jassim, S. Hand-crafted Vs. deep CNN features to distinguish benign from malignant lesions in breast ultrasound images. In Proceedings of the 2025 International Conference on Control, Automation and Diagnosis (ICCAD), Barcelona, Spain, 1–3 July 2025; pp. 1–6.
  19. Abhisheka, B.; Biswas, S.K.; Purkayastha, B.; Das, S. Integrating deep and handcrafted features for enhanced decision-making assistance in breast cancer diagnosis on ultrasound images. Multimed. Tools Appl. 2025, 84, 43263–43285.
  20. Taheri, F.; Rahbar, K. Improving breast cancer classification in fine-grain ultrasound images through feature discrimination and a transfer learning approach. Biomed. Signal Process. Control 2025, 106, 107690.
  21. Ellis, J.; Appiah, K.; Amankwaa-Frempong, E.; Kwok, S.C. Classification of 2d ultrasound breast cancer images with deep learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 17–18 June 2024; pp. 5167–5173.
  22. Mandel, J.C.; Kreda, D.A.; Mandl, K.D.; Kohane, I.S.; Ramoni, R.B. SMART on FHIR: A standards-based, interoperable apps platform for electronic health records. J. Am. Med. Inform. Assoc. 2016, 23, 899–908.
  23. Eapen, B.R.; Archer, N.; Sartipi, K.; Yuan, Y. Drishti: A Sense-Plan-Act Extension to Open mHealth Framework Using FHIR. In Proceedings of the 2019 IEEE/ACM 1st International Workshop on Software Engineering for Healthcare (SEH), Montreal, QC, Canada, 27 May 2019; pp. 49–52.
  24. Sloane, E.B.; Cooper, T.; Silva, R. MDIRA: IEEE, IHE, and FHIR Clinical Device and Information Technology Interoperability Standards, bridging Home to Hospital to “Hospital-in-Home”. In Proceedings of the SoutheastCon 2021, Virtual, 10–14 March 2021; pp. 1–4.
  25. Lee, J.; Kim, J. Design of an ECG Stream Analysis Framework Based on FHIR Data Model. In Proceedings of the 2024 Fifteenth International Conference on Ubiquitous and Future Networks (ICUFN), Budapest, Hungary, 2–5 July 2024; pp. 567–569.
  26. Osterman, T.J.; Terry, M.; Miller, R.S. Improving Cancer Data Interoperability: The Promise of the Minimal Common Oncology Data Elements (mCODE) Initiative. JCO Clin. Cancer Inform. 2020, 4, 993–1001.
  27. Chen, J.; Chiang, Y. Applying the Minimal Common Oncology Data Elements (mCODE) to the Asia-Pacific Region. JCO Clin. Cancer Inform. 2021, 5, 252–253.
  28. Guérin, J.; Laizet, Y.; Le Texier, V.; Chanas, L.; Rance, B.; Koeppel, F.; Lion, F.; Gourgou, S.; Martin, A.L.; Tejeda, M.; et al. OSIRIS: A Minimum Data Set for Data Sharing and Interoperability in Oncology. JCO Clin. Cancer Inform. 2021, 5, 256–265.
  29. Major, V.J.; Wang, W.; Aphinyanaphongs, Y. Enabling AI-Augmented Clinical Workflows by Accessing Patient Data in Real-Time with FHIR. In Proceedings of the 2023 IEEE 11th International Conference on Healthcare Informatics (ICHI), Houston, TX, USA, 26–29 June 2023; pp. 531–533.
  30. Chishtie, J.; Sapiro, N.; Wiebe, N.; Rabatach, L.; Lorenzetti, D.; Leung, A.A.; Rabi, D.; Quan, H.; Eastwood, C.A. Use of Epic electronic health record system for health care research: Scoping review. J. Med. Internet Res. 2023, 25, e51003.
  31. Peretokin, V.; Basdekis, I.; Kouris, I.; Maggesi, J.; Sicuranza, M.; Su, Q.; Acebes, A.; Bucur, A.; Mukkala, V.J.R.; Pozdniakov, K.; et al. Overview of the SMART-BEAR Technical Infrastructure. In Proceedings of the 8th International Conference on Information and Communication Technologies for Ageing Well and e-Health, Online, 23–25 April 2022; SciTePress: Setúbal, Portugal, 2022; pp. 117–125.
  32. Conte, T.; Sicuranza, M. Sistema FHIR-Based per la Raccolta Strutturata dell’Anamnesi in Remoto: Approccio, Standard e Implementazione; Rapporto Tecnico RT-ICAR-NA-2025-05; CNR-ICAR: Napoli, Italy, 2025.
  33. HL7 International. International Patient Summary (IPS) Implementation Guide. Available online: https://build.fhir.org/ig/HL7/fhir-ips/ (accessed on 25 June 2025).
  34. Conte, T.; Sicuranza, M. Remote Anamnesis Implementation Guide for Technical Report. Available online: https://anamnesi.na.icar.cnr.it/ (accessed on 8 August 2025).
  35. Conte, T.; Sicuranza, M. Remote Anamnesis Implementation Guide. Available online: http://remote-anamnesis.na.icar.cnr.it (accessed on 8 August 2025).
  36. Guo, R.; Lu, G.; Qin, B.; Fei, B. Ultrasound imaging technologies for breast cancer detection and management: A review. Ultrasound Med. Biol. 2018, 44, 37–70.
  37. Kuhn, M.; Johnson, K. Applied Predictive Modeling; Springer: Berlin/Heidelberg, Germany, 2013; Volume 26.
  38. Magee, J.F. Decision Trees for Decision Making; Harvard Business Review: Brighton, MA, USA, 1964.
  39. Rumelhart, D.E.; Hinton, G.E.; Williams, R.J. Learning Internal Representations by Error Propagation; Technical Report; Institute for Cognitive Science, California University San Diego: La Jolla, CA, USA, 1985.
  40. Langley, P.; Iba, W.; Thompson, K. An analysis of Bayesian classifiers. In Proceedings of the AAAI, San Jose, CA, USA, 12–16 July 1992; Citeseer: University Park, PA, USA, 1992; Volume 90, pp. 223–228.
  41. Ho, T.K. Random decision forests. In Proceedings of the 3rd International Conference on Document Analysis and Recognition, Montreal, QC, Canada, 14–16 August 1995; Volume 1, pp. 278–282.
  42. Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic minority over-sampling technique. J. Artif. Intell. Res. 2002, 16, 321–357.
  43. Standard HL7 FHIR. HL7 Documentation. Available online: https://www.hl7.org/fhir/documentation.html (accessed on 15 May 2025).
  44. HL7. SUSHI – SUSHI Unshortens Short Hand Inputs. Available online: https://build.fhir.org/ig/HL7/fhir-shorthand/overview.html (accessed on 15 May 2025).
  45. HL7. IG Publisher. Available online: https://confluence.hl7.org/plugins/servlet/mobile?contentId=35718627#content/view/35718627 (accessed on 15 May 2025).
Figure 1. Architecture system. The front-end components are represented by colored boxes, while the overlapped black boxes indicate the back-end modules of the architecture. Input components are represented with red dashed arrows and red boxes, whereas output components use green dotted arrows and green boxes.
Figure 2. US breast images. Example of two US images of the breast, one with a malignant lesion (A) and one with a benign lesion (B). The images show the lesions contoured in red by the expert radiologists.
Figure 3. Overview of the Information Model and Profiled FHIR Resources. The asterisk “*” indicates that the element can have multiple occurrences.
Figure 4. Patient’s view.
Figure 5. Physician’s view.
Figure 6. Sequence diagram, patient’s view.
Figure 7. Sequence diagram image analysis.
Figure 8. Sequence diagram, physician’s view.
Figure 9. This diagram illustrates the data flow within the InferCare application and the interaction between heterogeneous medical data and external systems.
Table 1. Mapping between concepts, source data, and FHIR profiles.

| Concept to Be Mapped | Source Data | FHIR Profile |
|---|---|---|
| Allergy or Intolerance | IPS CDA 2.0 | AllergyIntolerance_Patient |
| Appointment | Physician interview and source app | Appointment_Patient |
| Risk Assessment | DIAM | CancerRiskAssessment_Patient |
| Care Plan | IPS CDA 2.0 | CarePlan_Patient |
| Condition | IPS CDA 2.0 | Condition_Patient |
| Consent of Patient | App and EHR source | Consent_Patient |
| Diagnostic Report | IPS CDA 2.0 | DiagnosticReport_Patient |
| Family Member History | IPS CDA 2.0 and physician interview | FamilyMemberHistory_Patient |
| Medication | IPS CDA 2.0 | Medication_Patient / MedicationStatement_Patient |
| Feature Major Axis, Minor Axis | DIAM | Observation_Axis |
| Observation (symptoms, etc.) | IPS CDA 2.0 | Observation_Patient |
| Feature Orientation | DIAM | Observation_Orientation |
| Feature Circularity | DIAM | Observation_Circularity |
| Feature Perimeter Regularity | DIAM | Observation_PerimeterRegularity |
| Patient | Patient app | Anamnesis_Patient |
| Physician | App or interview | Practitioner_Anamnesis |
| Procedure | IPS CDA 2.0 | Procedure_Patient |
Table 2. Mapping between CDA elements and FHIR AllergyIntolerance resource attributes.

| CDA | FHIR |
|---|---|
| ClinicalDocument/structuredBody/component/section/id | AllergyIntolerance/identifier |
| ClinicalDocument/structuredBody/component/section/text | AllergyIntolerance/text |
| ClinicalDocument/structuredBody/component/section/entry/act/statusCode | AllergyIntolerance/status |
| ClinicalDocument/structuredBody/component/section/entry/act/effectiveTime | AllergyIntolerance/recordedDate |
| ClinicalDocument/structuredBody/component/section/entry/act/entryRelationship/observation/code | AllergyIntolerance/type |
| ClinicalDocument/structuredBody/component/section/entry/act/entryRelationship/observation/effectiveTime | AllergyIntolerance/onset |
| ClinicalDocument/structuredBody/component/section/entry/act/entryRelationship/observation/participant/participantRole/playingEntity/code | AllergyIntolerance/substance |
| ClinicalDocument/structuredBody/component/section/entry/act/entryRelationship/observation/entryRelationship/observation/effectiveTime | AllergyIntolerance/reaction/onset |
| ClinicalDocument/structuredBody/component/section/entry/act/entryRelationship/observation/entryRelationship/observation/value | AllergyIntolerance/reaction/manifestation |
| ClinicalDocument/structuredBody/component/section/entry/act/entryRelationship/observation/entryRelationship/observation/value | AllergyIntolerance/reaction/certainty |
| ClinicalDocument/structuredBody/component/section/entry/act/entryRelationship/observation/entryRelationship/act/text | AllergyIntolerance/note |
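The path-to-attribute mapping in Table 2 can be illustrated with a minimal Python sketch. This is not the converter used in the proposed architecture: the function names, the `CDA_TO_FHIR` dictionary, and the sample document are hypothetical; only the first four rows of Table 2 are covered; XML namespaces are omitted for brevity; and the output is a flat dictionary keyed by the attribute names of Table 2, not a validated FHIR resource.

```python
import xml.etree.ElementTree as ET

# Subset of Table 2: CDA paths (relative to ClinicalDocument) -> FHIR attribute names.
CDA_TO_FHIR = {
    "structuredBody/component/section/id": "identifier",
    "structuredBody/component/section/text": "text",
    "structuredBody/component/section/entry/act/statusCode": "status",
    "structuredBody/component/section/entry/act/effectiveTime": "recordedDate",
}


def element_value(elem):
    """Return the first of the code/value/root attributes, else the element text."""
    for attr in ("code", "value", "root"):
        if attr in elem.attrib:
            return elem.attrib[attr]
    return (elem.text or "").strip()


def cda_to_allergy_intolerance(xml_string):
    """Copy the mapped CDA elements into a flat AllergyIntolerance-like dict."""
    root = ET.fromstring(xml_string)  # root element is <ClinicalDocument>
    resource = {"resourceType": "AllergyIntolerance"}
    for cda_path, fhir_key in CDA_TO_FHIR.items():
        elem = root.find(cda_path)
        if elem is not None:  # skip elements absent from the source document
            resource[fhir_key] = element_value(elem)
    return resource


# Hypothetical, namespace-free CDA fragment used only for illustration.
SAMPLE_CDA = """<ClinicalDocument>
  <structuredBody><component><section>
    <id root="allergy-section-1"/>
    <text>Allergy to penicillin</text>
    <entry><act>
      <statusCode code="active"/>
      <effectiveTime value="20250115"/>
    </act></entry>
  </section></component></structuredBody>
</ClinicalDocument>"""

allergy = cda_to_allergy_intolerance(SAMPLE_CDA)
```

Keeping the mapping in a dictionary rather than in branching code means the remaining rows of Table 2 could be added as data, although nested targets such as AllergyIntolerance/reaction/onset would need additional handling.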
Table 3. Description of PF set.

| Feature | Description |
|---|---|
| Major Axis, Minor Axis | Provides intuitive size and shape information |
| Perimeter Regularity | Irregular contours may be indicative of malignancy |
| Orientation | An angle greater than 45° may suggest a malignant nature |
| Circularity | Lower circularity can reflect irregular lesion shapes |
Table 4. Description of HF set.

| Feature | Description |
|---|---|
| Eccentricity | Indicates how elliptical the lesion is (0 = perfect circle, 1 = highly elongated ellipse) |
| Circularity | Measures how similar the lesion is to a circle (1 = perfect, <1 = more irregular) |
| Perimeter Regularity | Evaluates the complexity of the lesion boundary |
| Axis Ratio | Relationship between the axes of an ellipse or ellipsoid |
| Solidity | Ratio between the actual area and the convex area (1 = compact, <1 = irregular) |
| Extent | Ratio between the lesion area and its bounding box |
| Elongation | Indicates whether the lesion is stretched along a particular direction |
| Fractal Dimension | Quantifies the complexity of the lesion’s contour |
| Area | Number of pixels comprising the lesion |
| Perimeter | Length of the lesion’s contour |
| Convex Area | Area of the convex hull surrounding the lesion |
| Equivalent Diameter | Diameter of a circle having the same area as the lesion |
| Kurtosis | Measures the “peakedness” of the intensity distribution |
| Skewness | Measures the asymmetry of the intensity distribution |
| Entropy | Indicates the degree of randomness in the pixel intensity distribution |
| Contrast | Measures local intensity variation |
| Homogeneity | Quantifies how similar neighboring pixels are |
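Several of the shape descriptors in Table 4 have widely used closed-form definitions, sketched below in Python. The formulas are the common textbook ones and are assumed here; the source does not state the exact expressions used in the study. The `LesionMeasurements` container and `shape_descriptors` function are hypothetical names, and the inputs are assumed to have been measured beforehand from the segmented lesion, with `major_axis >= minor_axis > 0`.

```python
import math
from dataclasses import dataclass


@dataclass
class LesionMeasurements:
    """Basic measurements of a segmented lesion (pixel units)."""
    area: float
    perimeter: float
    convex_area: float
    bbox_width: float
    bbox_height: float
    major_axis: float
    minor_axis: float


def shape_descriptors(m):
    """Compute a subset of the Table 4 descriptors from basic measurements."""
    return {
        # 1 for a perfect circle, below 1 for more irregular outlines.
        "circularity": 4 * math.pi * m.area / m.perimeter ** 2,
        # Lesion area over convex hull area: 1 = compact, below 1 = irregular.
        "solidity": m.area / m.convex_area,
        # Lesion area over bounding-box area.
        "extent": m.area / (m.bbox_width * m.bbox_height),
        # Diameter of the circle with the same area as the lesion.
        "equivalent_diameter": math.sqrt(4 * m.area / math.pi),
        # 0 for a circle, approaching 1 for a highly elongated ellipse.
        "eccentricity": math.sqrt(1 - (m.minor_axis / m.major_axis) ** 2),
    }


# Sanity check on an ideal disc of radius 10.
r = 10.0
disc = LesionMeasurements(
    area=math.pi * r ** 2,
    perimeter=2 * math.pi * r,
    convex_area=math.pi * r ** 2,
    bbox_width=2 * r,
    bbox_height=2 * r,
    major_axis=2 * r,
    minor_axis=2 * r,
)
disc_features = shape_descriptors(disc)
```

For the ideal disc, circularity and solidity evaluate to 1 and eccentricity to 0, which matches the reference values quoted in Table 4.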
Table 5. Results of the different ML algorithms used with the three key features. In bold, the best results for each measure.

| Algorithm | Accuracy | Precision | Recall | F1-Score |
|---|---|---|---|---|
| DT | 0.9440 ± 0.028 | 0.9333 ± 0.027 | 0.9439 ± 0.028 | 0.9377 ± 0.028 |
| MLP | **0.9767 ± 0.008** | **0.9746 ± 0.008** | **0.9729 ± 0.008** | **0.9736 ± 0.008** |
| NB | 0.9488 ± 0.003 | 0.9596 ± 0.003 | 0.9254 ± 0.003 | 0.9397 ± 0.003 |
| RF | 0.9720 ± 0.015 | 0.9690 ± 0.014 | 0.9683 ± 0.015 | 0.9684 ± 0.015 |
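Entries of the form "mean ± deviation" in Table 5 summarize the per-fold scores of the 4-fold cross-validation. The Python sketch below shows one way such an entry can be formed. The helper name `summarize_folds` is hypothetical, the fold scores are invented placeholders and not values from the study, and the population standard deviation is an assumption, since the source does not say whether the population or the sample estimator was used.

```python
import statistics


def summarize_folds(scores, digits=4):
    """Format per-fold scores as 'mean ± std', in the style of Table 5."""
    mean = statistics.fmean(scores)
    # Population standard deviation (assumption); statistics.stdev would give
    # the sample estimate instead.
    std = statistics.pstdev(scores)
    return f"{mean:.{digits}f} ± {std:.3f}"


# Placeholder accuracy values for four folds (NOT results from the paper).
placeholder_fold_accuracy = [0.90, 0.92, 0.94, 0.96]
summary = summarize_folds(placeholder_fold_accuracy)
```

With only four folds the two estimators differ noticeably, so the choice is worth reporting alongside the table.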
Table 6. Description of profiles.

| Profile | Description |
|---|---|
| AllergyIntolerance_Patient | Used to represent the patient’s known allergies and intolerances |
| Anamnesis_Patient | Used to represent the personal identity and demographic information of the patient subject to the anamnesis |
| Appointment_Patient | Used to describe information about scheduled clinical appointments |
| CancerRiskAssessment_Patient | Used to represent the patient’s risk of developing cancer based on anamnesis and extracted clinical features |
| CarePlan_Patient | Used to represent a patient care or treatment plan |
| Condition_Patient | Used to represent a particular clinical condition of the patient |
| Consent_Patient | Used to represent the patient’s informed consent regarding the use and sharing of their clinical data |
| DiagnosticReport_Patient | Used to document diagnostic reports associated with the patient, such as US images |
| FamilyMemberHistory_Patient | Used to document the patient’s family medical history, with special reference to inherited diseases in family members |
| Medication_Patient | Used to describe the characteristics of drugs such as active ingredient, pharmaceutical form, and dosage |
| MedicationStatement_Patient | Used to represent the set of medications taken by the patient |
| Observation_Axis | Used to represent the orientation axis of a clinical image or structure, extracted from US or imaging data |
| Observation_PerimeterRegularity | Used to represent the regularity of the perimeter of a lesion or structure observed in diagnostic imaging |
| Observation_Orientation | Used to represent the spatial orientation of a lesion as detected in clinical imaging |
| Observation_Circularity | Used to represent the circularity of a lesion or structure, derived from diagnostic image analysis |
| Observation_Patient | Used to record observations of the patient’s health status such as vital parameters, symptoms, etc. |
| Practitioner_Anamnesis | Used to represent the health professional involved in patient care, e.g., the physician in charge of the examination |
| Procedure_Patient | Used to represent medical procedures undergone by the patient, such as surgeries, biopsies, or diagnostic interventions |
Table 7. Functional requirements of the IPDSS.

| Requirement ID | Description |
|---|---|
| FR01 | IPDSS should enable patients to complete a structured form capturing personal data, teleconsultation history, family medical history, allergies, current medications, lifestyle habits, ongoing symptoms, and other clinically relevant information necessary for initial assessment. |
| FR02 | The form should support multiple input types, including free text fields, multiple-choice options, checkboxes, and date pickers, to ensure flexibility and completeness of data collection. |
| FR03 | IPDSS should implement client-side validation mechanisms to ensure data consistency, accuracy, and completeness before submission. |
| FR04 | IPDSS should allow patients to save their progress during form completion and resume the process at a later time without data loss. |
| FR05 | IPDSS should ensure the confidentiality and integrity of the submitted information during data transmission through secure communication protocols (e.g., HTTPS, encryption). |
| FR06 | A secure user authentication mechanism should be provided (where applicable) to control access to the form, particularly when enabling partial form saving or editing features. |
| FR07 | IPDSS should be responsive and accessible across various devices, including tablets and smartphones, ensuring usability and inclusivity. |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Brancati, N.; Conte, T.; De Pietro, S.; Russo, M.; Sicuranza, M. Machine Learning-Enhanced Architecture Model for Integrated and FHIR-Based Health Data. Information 2025, 16, 1054. https://doi.org/10.3390/info16121054
