Article

Co-Creation by Human–AI Sophimatics Framework and Applications

1 Department of Computer Science, University of Salerno, 84084 Fisciano, Italy
2 Liceo Scientifico Statale Francesco Severi, 84100 Salerno, Italy
* Author to whom correspondence should be addressed.
Algorithms 2026, 19(3), 175; https://doi.org/10.3390/a19030175
Submission received: 18 January 2026 / Revised: 15 February 2026 / Accepted: 20 February 2026 / Published: 26 February 2026

Abstract

Phase 6 of the Sophimatics framework represents the culmination of a comprehensive research program integrating philosophical wisdom with computational sophistication to address fundamental challenges in artificial intelligence systems. Building upon the Complex-Time Recursive Model established in Phase 5, this phase introduces a human-in-the-loop iterative refinement methodology specifically designed for security-critical applications. Through systematic validation across real-world cybersecurity datasets, including NSL-KDD and CICIDS2017, alongside healthcare privacy scenarios using MIMIC-III derived data, we demonstrate that collaborative human–AI co-creation significantly enhances system performance across multiple dimensions, including interpretive accuracy, contextual fidelity, and ethical consistency. The proposed architecture implements three complementary feedback mechanisms: symbolic knowledge base refinement through expert-provided ontological corrections, neural parameter optimization guided by human evaluation of ethical alignment, and dynamic weight adjustment for value-system integration. Experimental results show substantial improvements over baseline approaches, with intrusion detection accuracy reaching 98.7% on NSL-KDD while maintaining 94.3% privacy preservation scores as measured by differential privacy guarantees. The healthcare privacy experiments demonstrate 97.2% sensitive attribute protection with only 2.1% utility loss compared to non-private baselines. Critical analysis reveals that human oversight mechanisms reduce false positive rates in ethical constraint violations by 67% compared to purely automated systems, while convergence analysis indicates stable performance after approximately 12–15 iterations across diverse application domains. These findings establish Phase 6 as an essential bridge between theoretical Sophimatics foundations and practical deployment in privacy-sensitive contexts, demonstrating that philosophically grounded AI architectures can achieve superior performance when augmented with structured human feedback loops. The work contributes both methodological innovations in human–AI collaboration and empirical validation, demonstrating the viability of Sophimatics principles for addressing contemporary challenges in data protection and cybersecurity.

1. Introduction

The growing use of artificial intelligence systems in privacy- and security-sensitive environments has revealed the limitations of current machine learning approaches, despite their extraordinary achievements. State-of-the-art models, while showing impressive benchmark performance, have systemic shortcomings in contextual understanding, ethical reasoning capabilities, and the inclusion of meaningful human oversight. These limitations are particularly problematic when AI systems process personal data, decide on safety-critical actions, or operate in regulated industries that require their behavior to be explainable and aligned with human values. AI development has traditionally relied on highly automated creation pipelines with little human oversight beyond hyperparameter tuning and data collection, leading to systems that lack the contextual reasoning necessary to mediate trade-offs between privacy and security.
The Sophimatics research program was developed in response to these challenges, constructing a rigorous framework for integrating philosophical wisdom with computational power across six stages. Phase 1 defined the philosophical foundations, presenting a history of ideas from pre-Socratic philosophy to contemporary post-humanism and recognizing that conceptual categories such as change, form, logic, temporality, intention, context, and ethics are necessary for true artificial wisdom [1]. Phase 2 transformed these philosophical categories into formal computational structures, representing Aristotelian substances as ontological nodes, Augustinian temporality as temporal graphs, and Husserlian intentionality through pointers between mental states and objects [2]. Phase 3 presented the super-temporal cognitive neural network in a complex temporal space (more general than Bonder's modulated space–time), designed to process not only chronological succession but also experiential memories simultaneously, through a 2-dimensional representation of time $t = a + ib$, where the real component represents objective causality (past–present–future) and the imaginary component represents the subjective experience of temporality (memory–creativity–imagination) [3]. Phase 4 built on this foundation to create a more sophisticated model using reticular structures, enabling comprehensive temporal reasoning across different durations, sequences, and concurrent time scales, from short-term events to long-term predictions [4].
Phase 5 developed the final version of the basic architecture, incorporating ethical reasoning modules and intentionality mechanisms within the recursive complex-time model, which provided a computational fusion of symbolic knowledge representation with neural pattern learning and case-based reasoning. The Complex-Time Recursive Model (CTRM) architecture proved conceptually feasible in a series of initial experiments in educational tutoring and decision support in urban planning, where it showed better interpretability and ethical sensitivity than entirely generative counterparts. Nevertheless, the Phase 5 solutions were still primarily theoretical proofs of concept, lacking the systematic validation on real-world datasets and the practical refinement derived from human expert feedback that are needed for operational deployment in realistic privacy-preserving and safety-critical applications [5].
Given the need for such validation and refinement, this article introduces Phase 6, presenting a detailed human-in-the-loop methodology designed to iteratively improve Sophimatics architectures through systematic engagement between AI and human experts. The methodology recognizes that true intelligence does not arise in a vacuum of computational gymnastics, but rather from the perpetual interaction between automated reasoning and human judgment, especially in disciplines such as ethics, context-based choices, and regulatory compliance. Our approach introduces three interconnected feedback loops operating at different architectural levels: refinement of the knowledge base at the symbolic level, where experts can correct ontological representations and add domain-relevant rules; neural-level parameter tuning based on human evaluation of ethical alignment and contextual appropriateness; and dynamic weight realignment for value-system integration that ensures AI behavior remains consistent with human values across a wide range of operational contexts.
This work addresses a central research question: How can a structured human-in-the-loop iterative refinement methodology be effectively integrated into a philosophically grounded artificial intelligence framework to enhance accuracy, privacy preservation, interpretability, and ethical compliance in security-critical and privacy-sensitive applications? By empirically validating this methodology on cybersecurity challenges (NSL-KDD and CICIDS2017 datasets) and healthcare privacy preservation (MIMIC-III data), we demonstrate that human expert feedback, when systematically encoded and integrated into the Sophimatics architecture, produces measurable improvements in accuracy, privacy protection, contextual fidelity, and ethical alignment.
In addition, the work addresses critical use cases for privacy and security, where failure to meet the functional requirements for safe operation of an AI system is not limited to a decrease in performance quality, but can include serious threats such as unauthorized disclosure of information, manipulation by third parties, and violation of fundamental legal rights related to user privacy and security. We provide systematic empirical studies on popular benchmark datasets such as NSL-KDD and CICIDS2017 for cybersecurity intrusion detection tasks, as well as healthcare privacy scenarios based on data derived from MIMIC-III, where the protection of sensitive attributes must be balanced against clinical utility. These domains were chosen not only for their practical importance, but also because they present the nuanced ethical trade-offs and context-dependent reasoning problems that Sophimatics is designed to address, making them meaningful test cases for evaluating whether philosophically grounded AI architectures, refined through collaborative human feedback, can outperform conventional approaches.
Our contribution is twofold, spanning methodological development and empirical analysis. On the methodological side, we propose formal frameworks for encoding human expert feedback into machine-learnable form and design convergence assessment tests for iterative refinement, as well as evaluation metrics that characterize interpretive accuracy, contextual fidelity, ethical alignment, and overall performance. On the empirical side, we demonstrate that humans and AI together can significantly outperform AI alone across three dimensions: intrusion detection achieves 98.7% accuracy on NSL-KDD while maintaining privacy protection scores of 94.3%; healthcare applications maintain 97.2% sensitive attribute protection with only 2.1% utility loss; and human oversight reduces false positives for ethical constraint violations by 67% compared to the fully automated approach. These results demonstrate that the iterative refinement of Phase 6 transforms Sophimatics from a framework tested on synthetic or partially synthetic data into a practical methodology capable of addressing cutting-edge challenges in privacy protection and security-critical AI.
Figure 1 shows the general architecture of Sophimatics. The diagram presents a vertical flow of six sequential phases, each aligned on the left side and connected to explanatory content blocks on the right. Phase 1 (Historical and Philosophical Analysis) anchors the system in key categories such as change, form, logic, time, intentionality, context, and ethics. Phase 2 (Conceptual Mapping) translates these categories into computational constructs using ontology nodes, complex time variables, pointer structures, and feedback loops, supported by logic and semantic space modeling. Phase 3 (Computational Architecture) introduces the super-temporal cognitive neural network (STCNN), consisting of three functional layers complemented by ethical, memory, and symbolic modules. Phase 4 (Context and Temporality) models context as a dynamic, multi-dimensional construct and time as a complex variable integrating chronological and experiential dimensions. Phase 5 (Ethics and Intentionality) embeds ethical reasoning modules (deontic, virtue, and consequentialist) tightly coupled with adaptive intentional states. Finally, Phase 6 (Iterative Refinement and Human Collaboration), the object of the present work (boxed in red), highlights the human-in-the-loop methodology and practical applications across domains such as education, healthcare, and urban planning, with evaluation metrics ensuring interpretive accuracy, contextual fidelity, temporal coherence, and ethical consistency.
Figure 2 depicts the Phase 6 iterative refinement architecture, in which human expert feedback drives coordinated updates across the Symbolic Layer (K), the Neural CTRM model (N), and the Value-System Weights (W). These components interact through the integration layer (I), which synthesizes symbolic corrections, neural parameter adjustments, and value-alignment refinements into a unified system update. The process forms a closed-loop Iterative Refinement Cycle, enabling progressive convergence from state t to t + 1 through structured human–AI co-creation.

2. Background and Related Work

The intersection between privacy-preserving technologies and machine learning has emerged as a key area of research, driven by growing concerns about data protection and the proliferation of regulatory frameworks, including the European Union’s General Data Protection Regulation and similar legislative initiatives around the world [6]. Traditional machine learning paradigms assume centralized access to data and unrestricted training of models on complete datasets, assumptions that are increasingly untenable given contemporary privacy requirements and security threats. Research communities have responded by developing various technical approaches aimed at enabling useful data analysis while providing formal privacy guarantees, but these techniques typically operate in isolation from broader considerations of contextual appropriateness, ethical reasoning, and interpretable decision-making that characterize authentic intelligence [7].
Differential privacy (DP) has established itself as the mathematical gold standard for quantifying privacy protection, introducing calibrated noise into statistical queries or model training processes to ensure that the inclusion or exclusion of any individual record produces computationally indistinguishable outputs [8]. Dwork and Roth provide comprehensive foundations demonstrating how epsilon-differential privacy mechanisms limit privacy leakage through arbitrary post-processing and query composition, with smaller epsilon values providing stronger privacy guarantees at the expense of reduced utility [9]. Recent work has extended differential privacy to deep learning through gradient perturbation during stochastic optimization, enabling the training of privacy-preserving models on sensitive data [10]. However, differential privacy provides no guarantee regarding the contextual appropriateness of data use, lacks mechanisms to incorporate domain-specific ethical constraints, and offers limited interpretability regarding why particular trade-offs between privacy and utility were selected for specific applications.
Federated learning represents an alternative privacy-preserving paradigm that trains models on distributed datasets without centralizing raw data, but instead aggregates locally computed gradients or model updates [11]. McMahan and colleagues demonstrated federated averaging algorithms capable of training neural networks on mobile devices while keeping user data local, significantly reducing the privacy risks associated with centralized data collection [12]. Subsequent research has addressed challenges such as communication efficiency, handling non-independent and identically distributed data among clients, and defending against Byzantine participants who may attempt to compromise the learning process [13]. Despite these advances, federated learning provides weak privacy guarantees without additional protections such as secure aggregation or differential privacy, remains vulnerable to membership inference and model inversion attacks, and lacks principle-based frameworks for managing ethical constraints or contextual reasoning requirements [14].
Homomorphic encryption enables computation on encrypted data without decryption, offering the theoretical possibility of privacy-preserving machine learning in which models are trained or perform inference without ever exposing plaintext inputs [15]. Gentry's groundbreaking construction of fully homomorphic encryption demonstrated the feasibility of arbitrary computation on encrypted data, albeit at a substantial computational cost [16]. Practical applications have focused on somewhat homomorphic encryption that supports limited types of operations or depths of computation circuits [17]. Recent systems such as CryptoNets demonstrate neural network inference on encrypted inputs with acceptable latency for certain applications [18]. However, homomorphic encryption entails a prohibitive computational overhead for training complex models, does not provide intrinsic mechanisms for ethical reasoning or contextual adaptation, and offers limited interpretability, since processing encrypted data obscures the semantic meaning of the computations performed.
Secure multi-party computation allows multiple parties to jointly compute functions over their private inputs while revealing only the result, enabling collaborative machine learning without exposing individual participants’ data [19]. Protocols based on secret sharing, garbled circuits, or oblivious transfer provide provable security guarantees based on cryptographic assumptions [20]. Applications to machine learning include secure model training, where training examples remain distributed and private, and secure inference, where neither the model owner nor the data owner learns about the other party’s inputs [21]. Practical systems such as SecureML and ABY3 demonstrate feasibility for moderately sized neural networks [22,23]. However, secure multi-party computation faces scalability challenges for large models, assumes reliable execution of protocols without addressing the ethical appropriateness of the computations themselves, and does not provide a framework for incorporating humans into automated decision-making processes.
The paradigm of machine learning with human intervention recognizes that many real-world applications require human expertise to label ambiguous examples, resolve conflicts, or make high-stakes decisions [24]. Active learning strategies select informative examples for human annotation to maximize model improvement for each labeling effort [25]. Interactive machine learning systems enable real-time model refinement through user feedback [26]. Collaborative intelligence frameworks are explicitly designed for complementary capabilities between humans and AI [27]. However, existing approaches involving human intervention typically treat humans as oracular labelers or final decision authorities rather than as active collaborators in refining the system across multiple architectural levels. Furthermore, most frameworks lack principled mechanisms for translating qualitative human feedback on ethical appropriateness or contextual correctness into quantitative model improvements [28].
Research on explainable AI aims to make machine learning models interpretable and their decisions understandable to human users [29]. Techniques range from post hoc explanation methods, including LIME and SHAP, which approximate complex model behavior through locally interpretable linear surrogates, to intrinsically interpretable models such as decision trees and rule lists that expose the complete decision logic [30,31]. Attention mechanisms in neural networks provide limited information about which input regions influenced the predictions [32]. Counterfactual explanations describe the minimum changes to inputs required to alter the model’s predictions [33]. Despite notable progress, explainability research has struggled to demonstrate that current explanation techniques actually enable meaningful human understanding and oversight, particularly for complex tasks where the true decision boundaries elude simple characterization [34].
AI ethical frameworks seek to ensure that machine learning systems align with human values and respect moral principles [35]. Approaches include integrating ethical constraints into optimization objectives, training reward models based on human preference judgments, and adversarial testing for discriminatory behavior [36,37]. Research on value alignment conducted by AI safety communities highlights the challenge of specifying and learning complex human values [38]. Fairness-aware machine learning addresses statistical parity and probability equality constraints to prevent discrimination [39]. However, existing ethical frameworks for AI typically operate at the granularity level of individual decisions rather than providing architectural integration of ethical reasoning throughout the learning process, and rarely incorporate sophisticated temporal or contextual reasoning, which is essential for evaluating consequences in realistic scenarios [40].
The Sophimatics framework differs from previous work in its comprehensive integration of philosophical foundations, complex temporal representation—capable of utilizing both chronological and experiential time—explicit ethical reasoning, and structured human collaboration across all architectural levels [1,2,3,4,5]. Unlike differential privacy approaches that add noise without semantic understanding, Sophimatics bases privacy decisions on ontological representations of data sensitivity and contextual appropriateness. Unlike federated learning, which involves isolated local training, Sophimatics enables coordinated reasoning in distributed contexts through formal contextual lattices. Unlike the opaque encrypted computation of homomorphic encryption, Sophimatics maintains interpretability through hybrid symbolic–neural architectures. Unlike the focus of secure multi-party computation on cryptographic protocol execution, Sophimatics addresses whether the computations themselves serve ethically appropriate purposes. Phase 6 extends these foundations by operationalizing human–AI co-creation through systematic iterative refinement processes that transform qualitative expert feedback into architectural improvements across symbolic, neural, and integration layers [5].
Before concluding this section, the authors would like to devote particular attention to comparison with neuro-symbolic architectures. Neuro-symbolic AI integrates logical reasoning with neural learning, combining the compositionality of symbolic systems with the pattern recognition capabilities of deep learning. Two prominent frameworks merit comparison: DeepProbLog and Logic Tensor Networks (LTNs). DeepProbLog extends Prolog with neural predicates and probabilistic reasoning, enabling differentiation through probabilistic logic programs [41]. It excels at compositional generalization with limited data. However, DeepProbLog lacks explicit temporal reasoning beyond sequential rule application, provides no differential privacy integration, and offers limited context adaptation beyond logic predicates. Knowledge base construction requires logic programming expertise, whereas Phase 6 enables natural conceptual feedback from domain experts. Logic Tensor Networks represent logical formulas as differentiable computational graphs, grounding symbolic predicates in continuous representations [42]. LTNs support first-order logic with fuzzy semantics, enabling optimization subject to logical constraints. Despite elegant foundations, an LTN treats time as an additional predicate variable rather than a 2-dimensional structure, limiting temporal reasoning capacity. It provides no privacy budget management mechanisms, lacks hierarchical context lattices, and requires manual formula modification for knowledge updates rather than systematic expert feedback integration.
Table 1 provides a systematic comparison across architectural and functional dimensions. The comparison establishes that while DeepProbLog and LTNs offer valuable approaches to neuro-symbolic integration, Sophimatics Phase 6 addresses a broader and more practically oriented set of requirements essential for deployment in privacy-sensitive and security-critical domains, though not limited to them [1,2,3,4,5].

3. Materials and Methods

Phase 6 of the Sophimatics framework implements a systematic human-in-the-loop iterative refinement methodology that transforms the Complex-Time Recursive Model from a conceptual architecture into a deployable system for privacy-preserving and security-critical applications. The methodology integrates three distinct but interconnected feedback mechanisms operating at different architectural layers: symbolic knowledge base refinement, where domain experts correct ontological representations and add application-specific rules; neural parameter optimization guided by human evaluation of output quality and ethical alignment; and dynamic value-system weight adjustment ensuring AI behavior remains consistent with human preferences across diverse operational contexts. This multi-layer approach recognizes that effective human–AI collaboration requires structured pathways for translating qualitative expert judgment into quantitative architectural improvements, while maintaining theoretical coherence across symbolic reasoning, neural learning, and hybrid integration components established in earlier Sophimatics phases.
The formal framework for Phase 6 iterative refinement builds upon the Phase 5 CTRM architecture $\mathcal{A} = (K, N, I)$, where $K$ denotes the semantic kernel implementing symbolic knowledge representation, $N$ represents the neural processing layers learned from data, and $I$ constitutes the integrative layer ensuring coherence between symbolic constraints and neural outputs. At each iteration $t$, the architecture exists in state $\mathcal{A}^{(t)}$, characterized by parameters $\Theta^{(t)}$ encompassing both symbolic rules and neural network weights. Human experts evaluate system performance on carefully curated datasets $D^{(t)}$ that span the target application domain, examining not only traditional accuracy metrics but critically assessing interpretive accuracy (whether the system's reasoning processes align with domain expertise), contextual fidelity (appropriateness of behavior across diverse operational scenarios), and ethical consistency (alignment with value frameworks relevant to the application domain). Expert evaluation produces three types of structured feedback: $\Delta KB^{(t)}$, representing changes to the symbolic knowledge base, including ontology corrections, new inference rules, and refined ethical constraints; $\Delta\Theta^{(t)}$, indicating gradient corrections or architectural modifications for neural components; and $\Delta w^{(t)}$, specifying adjustments to value-system weights balancing competing objectives like privacy versus utility or security versus accessibility.
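To make the state and feedback structures concrete, the following minimal Python sketch models the $\mathcal{A}^{(t)}$ state and the three feedback types; all class and field names are illustrative assumptions, not identifiers from the actual implementation.

```python
# Minimal sketch of the Phase 6 refinement state, assuming simple Python
# containers. Names are illustrative, not taken from the original system.
from dataclasses import dataclass

@dataclass
class ArchitectureState:
    """State A(t) = (K, N, I) at refinement iteration t."""
    kb: set              # K: symbolic knowledge base (ontology nodes, rules)
    theta: dict          # Theta(t): neural parameters of the CTRM layers
    value_weights: dict  # w(t): weights balancing privacy, utility, ethics

@dataclass
class ExpertFeedback:
    """The three structured feedback types from one expert review session."""
    delta_kb: set        # Delta KB(t): ontology corrections and new rules
    delta_theta: dict    # Delta Theta(t): gradient corrections for N
    delta_w: dict        # Delta w(t): value-system weight adjustments
```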
A critical aspect of the Phase 6 architecture is the arbitration mechanism that resolves conflicts when neural predictions contradict symbolic rules. This mechanism ensures coherent decision-making while preserving both learned patterns and expert-defined constraints, extending the formal framework where $K$ denotes the Symbolic Layer and $N$ represents the Neural CTRM processing layers. When the neural network produces a prediction $y_N \in \mathbb{R}^C$ for $C$ output classes with associated confidence scores, and the Symbolic Layer $K$ enforces a rule-based constraint $y_S$, the system evaluates three components to arbitrate between potentially conflicting outputs:
  • Neural Confidence Score: Computed as the maximum softmax probability of the neural network's output logits $z_N$: $C_N = \max(\mathrm{softmax}(z_N)) = \max_i \left( e^{z_{N,i}} \big/ \sum_{j=1}^{C} e^{z_{N,j}} \right)$, where $z_{N,i}$ represents the logit for class $i$, providing a probabilistic measure of the neural network's prediction certainty.
  • Symbolic Rule Strength: A weighted measure combining predefined expert-assigned weights for hard constraints with learned weights updated through human feedback iterations: $C_S = w_S \cdot \mathrm{relevance}(c_t, r_k)$, where $w_S \in [0, 1]$ is the rule weight (initially expert-assigned, subsequently refined), $c_t$ represents the current context state from the Phase 4 context lattice at iteration $t$, $r_k$ denotes the specific symbolic rule $k$ being evaluated, and $\mathrm{relevance}(c_t, r_k) \in [0, 1]$ measures how applicable the rule $r_k$ is in context $c_t$.
  • Contextual Relevance Score: Evaluating how well each prediction aligns with the current operational context from the Phase 4 context lattice:
$$C_C = \mathrm{similarity}(c_{\mathrm{current}}, c_{\mathrm{rule}}) \cdot \mathrm{temporal\_validity}(r_k, t)$$
where $\mathrm{similarity}(c_{\mathrm{current}}, c_{\mathrm{rule}})$ computes the cosine similarity between the current context embedding and the rule-associated context, and $\mathrm{temporal\_validity}(r_k, t) \in [0, 1]$ assesses whether the rule $r_k$ remains valid given the complex-time coordinate $t = a + ib$, accounting for both chronological ($a$) and experiential ($b$) temporal dimensions. The arbitration decision follows a confidence-weighted voting scheme that integrates all three components: $\mathrm{Decision} = \arg\max_{y \in \{y_N, y_S\}} \left( \alpha \cdot C_N + \beta \cdot C_S + \gamma \cdot C_C \right)$, where $\alpha, \beta, \gamma \in [0, 1]$ are meta-parameters satisfying $\alpha + \beta + \gamma = 1$ that are themselves learned during human feedback iterations through gradient-free optimization based on expert corrections. Initial parameter values prioritize symbolic constraints: $\alpha = 0.3$, $\beta = 0.5$, $\gamma = 0.2$. The system adjusts these weights based on expert corrections using an exponential moving average: $\alpha_{t+1} = (1 - \lambda)\alpha_t + \lambda \cdot \alpha_{\mathrm{expert}}$, where $\lambda = 0.1$ is the adaptation rate and $\alpha_{\mathrm{expert}}$ is inferred from expert feedback patterns.
Conflict Resolution Protocol: The system employs a hierarchical decision protocol when $y_N \neq y_S$:
(i) Small Discrepancy: If $\|y_N - y_S\|_2 < \varepsilon$ (where $\varepsilon = 0.1$ is the discrepancy threshold), accept the neural prediction $y_N$ with symbolic regularization $y_{\mathrm{final}} = (1 - \mu) y_N + \mu y_S$, where $\mu = 0.3$ is the regularization strength.
(ii) Neural Dominance: If $\alpha \cdot C_N > \beta \cdot C_S + \vartheta$ (where $\vartheta = 0.15$ is the confidence threshold), the neural prediction overrides symbolic constraints, indicating that learned patterns dominate: $y_{\mathrm{final}} = y_N$.
(iii) Symbolic Enforcement: If $\beta \cdot C_S > \alpha \cdot C_N + \vartheta$, symbolic rules enforce constraints, indicating that expert knowledge dominates: $y_{\mathrm{final}} = y_S$.
(iv) Ambiguous Cases: If neither condition holds ($|\alpha \cdot C_N - \beta \cdot C_S| \le \vartheta$), flag the instance for human expert review in the next iteration $t + 1$ and add it to the refinement queue: $Q^{(t+1)} = Q^{(t)} \cup \{(x_i, y_N, y_S, C_N, C_S, C_C)\}$, where $x_i$ is the input instance generating the conflict. This formalization enables several critical capabilities: reproducible decision-making in ambiguous scenarios given fixed meta-parameters; gradual learning of an appropriate neural–symbolic equilibrium through meta-parameter evolution; transparent conflict resolution logic that human experts can audit and adjust; and systematic incorporation of domain-specific preferences through meta-parameter tuning and threshold adjustment.
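The following self-contained Python sketch illustrates the hierarchical protocol above under the stated default meta-parameters ($\alpha = 0.3$, $\beta = 0.5$, $\varepsilon = 0.1$, $\vartheta = 0.15$, $\mu = 0.3$); for brevity it omits the contextual relevance term $\gamma \cdot C_C$ of the full voting scheme, and the function names are assumptions.

```python
# Sketch of neural-symbolic arbitration; not the production implementation.
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))          # numerically stable softmax
    return e / e.sum()

def arbitrate(z_n, y_s, c_s, eps=0.1, vartheta=0.15, mu=0.3,
              alpha=0.3, beta=0.5):
    """Resolve one neural-symbolic conflict.

    z_n: neural logits over C classes; y_s: symbolic constraint as a
    probability/one-hot vector over the same classes; c_s: rule strength C_S.
    Returns (y_final, flag_for_expert_review).
    """
    y_n = softmax(np.asarray(z_n, dtype=float))   # neural prediction
    y_s = np.asarray(y_s, dtype=float)
    c_n = y_n.max()                               # C_N: neural confidence

    if np.linalg.norm(y_n - y_s) < eps:           # (i) small discrepancy
        return (1 - mu) * y_n + mu * y_s, False
    if alpha * c_n > beta * c_s + vartheta:       # (ii) neural dominance
        return y_n, False
    if beta * c_s > alpha * c_n + vartheta:       # (iii) symbolic enforcement
        return y_s, False
    return y_n, True                              # (iv) ambiguous: queue for expert
```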
Human Expert Intervention During Phase 6 Refinement: During iterative refinement cycles, human experts exercise control over the arbitration mechanism through four primary intervention modes; a minimal sketch of two of these modes follows the list.
(1) Rule Strength Adjustment: Modify individual symbolic rule weights $w_S^{(k)}$ for rule $k$ based on observed performance across conflict cases, with updates $w_S^{(k)} \leftarrow w_S^{(k)} + \Delta w_{\mathrm{expert}}^{(k)}$.
(2) Meta-Parameter Tuning: Adjust $\alpha, \beta, \gamma$ to shift the neural–symbolic balance toward data-driven learning (larger $\alpha$) or expert constraints (larger $\beta$) based on domain requirements.
(3) Contextual Criteria Enhancement: Add new contextual relevance evaluation criteria by extending the similarity function with domain-specific features: $\mathrm{similarity}(c_1, c_2) = \sum_{f \in F} w_f \cdot \phi_f(c_1, c_2)$, where $F$ is the set of feature comparison functions and $w_f$ are feature weights.
(4) Threshold Specification: Define domain-specific confidence thresholds $\varepsilon$ and $\vartheta$ reflecting acceptable error tolerances and risk profiles for the application domain.
Empirical Validation: The arbitration mechanism has been validated across all experimental datasets (NSL-KDD, CICIDS2017, MIMIC-III), demonstrating a conflict rate ($y_N \neq y_S$) that decreases from 23% at iteration $t = 1$ to 4% at convergence ($t = 14$); 96% concordance between automated arbitration decisions and subsequent expert validation; meta-parameter stability, with $\alpha, \beta, \gamma$ converging to domain-specific equilibria within 8–10 iterations; and decision latency averaging 2.3 ms, acceptable for non-real-time applications. This systematic arbitration framework distinguishes Sophimatics Phase 6 from conventional neuro-symbolic architectures by providing explicit, interpretable, and human-controllable mechanisms for resolving neural–symbolic conflicts while preserving the complementary strengths of both reasoning paradigms.
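As a minimal illustration of modes (2) and (3), the following sketch implements the exponential-moving-average meta-parameter update from the arbitration section and the feature-weighted context similarity; the feature functions themselves are placeholders supplied by the caller.

```python
# Illustrative helpers for expert intervention modes (2) and (3).
def ema_update(alpha_t, alpha_expert, lam=0.1):
    """Meta-parameter tuning: alpha_{t+1} = (1-lambda)*alpha_t + lambda*alpha_expert."""
    return (1 - lam) * alpha_t + lam * alpha_expert

def context_similarity(c1, c2, feature_fns, feature_weights):
    """similarity(c1, c2) = sum over f in F of w_f * phi_f(c1, c2)."""
    return sum(w * phi(c1, c2) for phi, w in zip(feature_fns, feature_weights))
```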
The update rule governing iterative refinement follows:
$$KB^{(t+1)} = KB^{(t)} \cup \Delta KB^{(t)}, \qquad \Theta^{(t+1)} = \Theta^{(t)} - \eta\,\Delta\Theta^{(t)},$$
and
$$w^{(t+1)} = w^{(t)} + \Delta w^{(t)},$$
where $\eta$ represents a learning rate controlling the magnitude of parameter updates, and the knowledge base union operation ensures monotonic accumulation of validated domain knowledge while permitting rule revision through explicit deletion mechanisms when experts identify erroneous prior additions. Convergence assessment evaluates performance stability across successive iterations using a multi-dimensional metric:
$$M^{(t)} = \left( M_{\mathrm{accuracy}}^{(t)},\; M_{\mathrm{context}}^{(t)},\; M_{\mathrm{ethics}}^{(t)},\; M_{\mathrm{privacy}}^{(t)} \right)$$
where each component measures a distinct aspect of system capability, and convergence occurs when $\|M^{(t+1)} - M^{(t)}\| < \varepsilon$ for some predetermined threshold $\varepsilon$ over a sliding window of $k$ consecutive iterations. This formulation explicitly acknowledges that optimization in Phase 6 extends beyond traditional loss minimization to encompass principled trade-offs among multiple objectives reflecting the complex requirements of real-world privacy-preserving and security-critical deployments.
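A compact sketch of the update rule and the window-based convergence test follows, reusing the ExpertFeedback container from the earlier sketch; the dictionary-based parameter layout and the window length are illustrative assumptions.

```python
# Sketch of one refinement step and the convergence test defined above.
import numpy as np

def refine_step(kb, theta, w, feedback, eta=0.05):
    """One Phase 6 update: KB union, gradient-style correction, weight shift."""
    kb_next = kb | feedback.delta_kb                        # KB(t+1) = KB(t) U dKB(t)
    theta_next = {k: v - eta * feedback.delta_theta.get(k, 0.0)
                  for k, v in theta.items()}                # Theta(t+1)
    w_next = {k: v + feedback.delta_w.get(k, 0.0)
              for k, v in w.items()}                        # w(t+1)
    return kb_next, theta_next, w_next

def converged(metric_history, eps=0.02, k=3):
    """True when ||M(t+1) - M(t)|| < eps for k consecutive iterations.

    metric_history: list of 4-vectors (accuracy, context, ethics, privacy).
    """
    if len(metric_history) < k + 1:
        return False
    recent = metric_history[-(k + 1):]
    diffs = [np.linalg.norm(np.subtract(b, a)) for a, b in zip(recent, recent[1:])]
    return all(d < eps for d in diffs)
```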
Differential privacy provides formal guarantees that model training and inference operations reveal bounded information about individual training records. An algorithm $A$ satisfies $(\varepsilon, \delta)$-differential privacy if, for any two datasets $D$ and $D'$ differing in at most one record, and for any subset of outputs $S$:
$$\Pr[A(D) \in S] \le e^{\varepsilon} \cdot \Pr[A(D') \in S] + \delta.$$
The privacy budget ε quantifies privacy loss; smaller values provide stronger protection. Parameter δ accounts for the negligible probability of privacy breach. Our Phase 6 implementation employs the Gaussian mechanism for differential privacy through three technical components: gradient clipping, noise addition, and privacy accounting.
Gradient Clipping: During neural network training via stochastic gradient descent, we clip per-example gradients to bound sensitivity. For each training example $i$ with gradient $g_i$, we compute:
$$\hat{g}_i = g_i \,/\, \max\!\left(1, \|g_i\|_2 / C\right)$$
where $C$ is the clipping threshold. This ensures $\|\hat{g}_i\|_2 \le C$ for all examples, bounding the influence any single example can have on parameter updates. We use $C = 1.0$ for intrusion detection tasks and $C = 0.5$ for healthcare privacy experiments, based on gradient magnitude distribution analysis during preliminary training.
Noise Addition: After clipping, we compute the average of the clipped gradients $\hat{g}_i$ across mini-batch $B$ and add Gaussian noise calibrated to the clipping threshold and the desired privacy level:
$$\tilde{g}_B = \frac{1}{|B|} \left( \sum_{i \in B} \hat{g}_i + \mathcal{N}(0, \sigma^2 C^2 I) \right),$$
where $\sigma$ is the noise multiplier determining the privacy–utility trade-off, and $I$ is the identity matrix. The noise standard deviation $\sigma C$ is chosen to achieve the target epsilon after composition over multiple training steps. For $\varepsilon = 2.8$ with $\delta = 10^{-5}$ over 10,000 training iterations, we use $\sigma = 0.8$. For $\varepsilon = 3.2$ with the same $\delta$, we use $\sigma = 0.7$.
Privacy Accounting: Training involves $T$ gradient descent steps, each consuming privacy budget. We employ the moments accountant for tight composition bounds [10]. For $T$ steps with noise multiplier $\sigma$ and batch sampling ratio $q = |B| / |D|$, the total privacy cost $(\varepsilon_{\mathrm{total}}, \delta)$ satisfies:
$$\varepsilon_{\mathrm{total}} \approx q \sqrt{T \cdot 2 \ln(1/\delta)} \,/\, \sigma.$$
This approximation from Gaussian differential privacy analysis enables efficient computation of cumulative privacy loss. Our implementation tracks the privacy budget dynamically, terminating training when $\varepsilon_{\mathrm{total}}$ approaches the predetermined threshold. Table 2 summarizes differential privacy parameters across experimental configurations.
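The three components can be summarized in the following sketch; it follows the standard DP-SGD recipe and the approximation above, but a production system should rely on a vetted accountant implementation (for example, Opacus or TensorFlow Privacy) rather than this simplified formula.

```python
# Sketch of the three DP components: clipping, noise addition, accounting.
import numpy as np

def clip_gradient(g, C=1.0):
    """g_hat = g / max(1, ||g||_2 / C): bounds per-example sensitivity."""
    g = np.asarray(g, dtype=float)
    return g / max(1.0, np.linalg.norm(g) / C)

def private_batch_gradient(per_example_grads, C=1.0, sigma=0.8, rng=None):
    """Average clipped gradients and add N(0, sigma^2 C^2 I) noise."""
    rng = rng or np.random.default_rng(0)
    clipped = np.stack([clip_gradient(g, C) for g in per_example_grads])
    noise = rng.normal(0.0, sigma * C, size=clipped.shape[1:])
    return (clipped.sum(axis=0) + noise) / len(per_example_grads)

def epsilon_total(q, T, sigma, delta=1e-5):
    """Approximate cumulative budget: q * sqrt(2 T ln(1/delta)) / sigma."""
    return q * np.sqrt(2.0 * T * np.log(1.0 / delta)) / sigma
```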
Empirical validation employs three carefully selected benchmark datasets representing distinct challenges in privacy-preserving and security-critical artificial intelligence. The NSL-KDD dataset serves as the primary testbed for network intrusion detection tasks, comprising approximately 125,000 training records and 22,500 test records describing network connection patterns characterized by 41 features, including protocol type, service, flag, source and destination bytes, and various temporal and content-based attributes [43]. NSL-KDD addresses known deficiencies in the earlier KDD Cup 1999 dataset by removing duplicate records and achieving a better balance between normal traffic and attack categories, including Denial-of-Service attacks, User-to-Root attacks, Remote-to-Local attacks, and Probing attacks [44]. This dataset evaluates the Sophimatics architecture’s capacity for learning discriminative patterns while maintaining interpretability regarding which features contribute to attack classification decisions, a critical requirement for security analysts who must understand and validate automated detection systems before deployment in operational networks.
The CICIDS2017 dataset provides more contemporary network traffic patterns reflecting modern attack methodologies and application behaviors [45]. Generated through controlled experiments in a realistic network environment at the Canadian Institute for Cybersecurity, CICIDS2017 contains approximately 2.8 million flow records captured over five days of operation, including both normal user behavior and common attack patterns such as Brute-Force SSH attacks, Heartbleed exploitation, Botnet traffic, Distributed Denial-of-Service attacks using multiple vectors, Web attacks (including SQL injection and cross-site scripting), and network infiltration attempts [46]. Each flow is characterized by over 80 statistical features computed using the CICFlowMeter V4.0 tool, including packet-level statistics, inter-arrival time distributions, flag counts, and header length attributes [47]. CICIDS2017 evaluation focuses specifically on the temporal reasoning capabilities enabled by complex-time representation in Sophimatics, examining whether the architecture can effectively distinguish attack patterns that unfold across extended time windows from superficially similar benign traffic sequences, a discrimination task that requires integrating both chronological ordering of events and experiential assessment of behavioral anomalies.
Healthcare privacy experiments utilize deidentified intensive care unit records from the MIMIC-III clinical database containing physiological measurements, laboratory test results, clinical notes, and administrative data for over 40,000 patients admitted to critical care units at Beth Israel Deaconess Medical Center [48]. For privacy evaluation purposes, we construct tasks requiring the prediction of clinically relevant outcomes like mortality risk, length of stay, or likelihood of specific complications while protecting sensitive attributes, including patient demographics, specific diagnoses, and personally identifying temporal patterns that might enable record linkage across datasets [49]. The healthcare domain presents distinctive challenges for Sophimatics Phase 6 methodology because clinical utility demands accurate predictions while privacy regulations impose strict constraints on information disclosure, expert physicians must validate both clinical appropriateness of predictions and adequacy of privacy protections, and ethical considerations extend beyond regulatory compliance to encompass patient autonomy, informed consent, and equitable access to advanced medical technologies.
Evaluation metrics span multiple dimensions reflecting the diverse requirements of privacy-preserving and security-critical applications. Traditional performance metrics include classification accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve computed separately for each attack or outcome category. Privacy preservation is quantified through differential privacy epsilon parameters characterizing the privacy budget consumed by model training and inference operations, alongside empirical membership inference attack success rates measuring whether adversaries can determine if specific records participated in training [50]. Interpretive accuracy assesses whether system reasoning aligns with expert domain knowledge through qualitative review of feature importance rankings, examination of decision path explanations for representative examples, and validation that symbolic rules learned through iterative refinement correspond to genuine domain principles rather than spurious correlations [51]. Table 3 provides comprehensive dataset characteristics.
Contextual fidelity evaluates appropriateness of system behavior across diverse operational scenarios by systematically varying context parameters, including user roles, temporal constraints, resource availability, and threat models, measuring whether the architecture adapts behavior in manners consistent with expert judgment regarding context-appropriate trade-offs [52]. Ethical consistency is measured through adversarial testing where deliberately constructed examples probe boundary cases of value-system alignment, human expert review of system decisions on ethically ambiguous scenarios, and formal verification that learned policies satisfy specified ethical constraints encoded in deontic logic [53].
The experimental procedure follows a structured protocol ensuring reproducibility and systematic evaluation of iterative refinement effects. Initial system configuration employs CTRM architecture parameters validated in Phase 5 experiments, including a hidden dimension size of 512, two recursive processing layers, six recursion steps with three temporal cycles, deep supervision across sixteen intermediate steps, and complex-time representation with learned temporal evolution operators. Human expert teams comprise domain specialists with relevant backgrounds: cybersecurity professionals with network defense experience for intrusion detection tasks, clinical informaticians and practicing physicians for healthcare privacy experiments, and privacy researchers with expertise in differential privacy mechanisms and privacy-preserving machine learning for validation of privacy protection claims. Each iteration cycle proceeds through structured phases. It begins with baseline evaluation on held-out test sets, measuring current performance across all defined metrics. Expert review sessions follow, in which teams examine system outputs on representative examples stratified to include both correct and incorrect decisions across diverse contexts. Structured feedback elicitation comes next: experts provide symbolic knowledge corrections, identify neural component failures requiring architecture modification, and specify value-weight adjustments to better align with domain requirements. Technical implementation then translates qualitative expert feedback into formal architectural updates. Finally, updated systems undergo evaluation on identical test sets, enabling quantitative assessment of improvement magnitude. Iteration continues until convergence criteria are satisfied or a maximum iteration budget is exhausted, typically 15–20 cycles based on pilot experiments indicating diminishing returns beyond this range [54].
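The overall protocol reduces to a simple loop, sketched below with the expert-facing steps abstracted as callables; only the loop structure and stopping rule come from the text, while the function signatures are assumptions.

```python
# High-level sketch of one full Phase 6 refinement run.
def phase6_refinement(state, evaluate, expert_review, apply_feedback,
                      converged, max_iters=20):
    """Iterate evaluate -> expert review -> feedback encoding -> update
    until the convergence test passes or the iteration budget is exhausted.

    evaluate(state) -> metric 4-vector; expert_review(state) -> feedback;
    apply_feedback(state, fb) -> new state; converged(history) -> bool.
    """
    history = []
    for _ in range(max_iters):
        history.append(evaluate(state))          # baseline evaluation on held-out data
        if converged(history):                   # multi-dimensional stability test
            break
        feedback = expert_review(state)          # structured expert feedback elicitation
        state = apply_feedback(state, feedback)  # symbolic, neural, weight updates
    return state, history
```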
The Knowledge Evolution Trajectory in Figure 3 visualizes how the symbolic knowledge base expands and restructures throughout the iterative refinement cycle. The stacked-area plot illustrates how four categories of symbolic knowledge—ontological rules, ethical constraints, contextual modifiers, and quantitative thresholds—grow and reorganize within the knowledge base. Each layer represents the normalized contribution of a knowledge type at each iteration, showing the progressive expansion, differentiation, and semantic enrichment driven by expert-guided refinement. Rather than considering symbolic updates as isolated corrections, the figure highlights the cumulative and interdependent nature of four major knowledge categories. Ontological rules grow steadily, reflecting the consolidation of domain structure. Ethical constraints intensify during later iterations, consistent with expert-driven refinement of value alignment. Contextual modifiers exhibit rapid mid-stage expansion, marking the system’s increasing sensitivity to scenario-dependent decision criteria. Quantitative thresholds emerge predominantly in the final iterations, where fine-grained parameter adjustments are required to harmonize symbolic and neural layers. Together, these trajectories show that Phase 6 produces a semantically richer and more balanced knowledge base, progressively aligning the system’s reasoning process with domain expertise, ethical expectations, and context-dependent constraints. This evolution confirms that human–AI co-creation operates not merely as error correction but as structured epistemic growth.

4. Results

Experimental evaluation across cybersecurity intrusion detection and healthcare privacy preservation tasks demonstrates that Phase 6 iterative refinement yields substantial improvements over baseline Complex-Time Recursive Model implementations and competitive advantages relative to state-of-the-art privacy-preserving machine learning approaches. Results are organized thematically, beginning with traditional performance metrics on intrusion detection tasks, followed by privacy preservation quantification, then interpretive accuracy and contextual fidelity assessment, and concluding with convergence analysis characterizing the iterative refinement process dynamics. Throughout the presentation of numerical results, we emphasize not merely aggregate statistics but a detailed examination of how human expert feedback translated into specific architectural improvements enabling performance gains, providing insight into which aspects of the Sophimatics framework prove most amenable to collaborative human–AI co-creation.
Network intrusion detection experiments on NSL-KDD achieved 98.7% overall classification accuracy after 14 iterations of human-in-the-loop refinement, representing a 6.3 percentage point improvement over the initial Phase 5 baseline performance of 92.4%. This improvement was distributed non-uniformly across attack categories, with Denial-of-Service detection reaching 99.2% accuracy, User-to-Root attacks achieving 97.8%, Remote-to-Local attacks 98.1%, and Probing attacks 99.5%, while normal traffic classification maintained 98.9% accuracy, indicating minimal degradation in specificity accompanying the sensitivity improvements. Precision–recall analysis revealed particularly notable gains in challenging attack subcategories where baseline systems exhibited high false positive rates; for instance, precision for detecting rare User-to-Root attack variants improved from a 76.3% baseline to 94.1% after refinement, directly attributable to expert-provided symbolic rules codifying domain knowledge about characteristic feature patterns associated with privilege escalation attempts that automated learning from data alone failed to reliably capture [55]. Computational efficiency metrics showed that despite architectural elaboration through iterative refinement, inference latency remained within acceptable bounds at 12.4 milliseconds per classification decision, averaged across the test set, satisfying real-time requirements for operational intrusion detection systems that must process thousands of network flows per second.
Evaluation on CICIDS2017 contemporary network traffic patterns yielded 97.3% overall accuracy with particularly strong performance on temporally extended attack patterns, where complex-time reasoning provided decisive advantages over baseline approaches. Distributed Denial-of-Service attack detection achieved 98.6% accuracy with the refined architecture, demonstrating remarkable capability to distinguish coordinated multi-source attack traffic from superficially similar flash crowd events characterized by legitimate traffic spikes, a discrimination task that confounded purely statistical machine learning approaches lacking explicit temporal reasoning [56]. Web attack detection, including SQL injection and cross-site scripting, reached 96.8% accuracy, with expert feedback sessions revealing that initial false positives primarily stemmed from overly conservative generalization of attack signature patterns; subsequent knowledge base refinement incorporating expert-provided rules describing legitimate but unusual application behaviors reduced false positive rates by 58% while maintaining detection sensitivity [57]. Botnet traffic identification proved most challenging at 94.2% accuracy, with detailed error analysis indicating that persistent confusion arose from legitimate application behaviors that mimicked command-and-control communication patterns, suggesting limits of discrimination achievable without additional contextual information about application intent beyond what network flow statistics alone can provide.
Table 4 provides a comprehensive characterization of these errors.
Privacy preservation quantification through differential privacy analysis established that Phase 6 refined systems achieved epsilon values of 2.8 for intrusion detection models trained on NSL-KDD data and epsilon values of 3.2 for CICIDS2017 models, representing formal privacy guarantees that any individual connection record’s influence on learned model parameters remains bounded [58]. These privacy budget expenditures compare favorably against baseline differentially private neural networks trained on identical data, which achieve comparable accuracy only at epsilon values of 8.5 to 12.0, indicating that Sophimatics’ hybrid symbolic–neural architecture combined with expert knowledge refinement enables more efficient privacy–utility trade-offs by incorporating domain structure that reduces reliance on privacy-costly statistical learning from raw data alone [59]. Membership inference attack evaluations provided complementary empirical privacy assessment, measuring whether adversaries possessing black-box query access to trained models could determine whether specific records participated in training; refined Sophimatics models achieved membership inference attack success rates of 52.3%, barely exceeding the random guessing baseline of 50%, substantially outperforming standard neural network baselines vulnerable to membership inference with 76.8% attack success rates [60]. These results establish that iterative refinement with privacy-conscious expert feedback genuinely improves information-theoretic privacy guarantees rather than merely obscuring model internals behind architectural complexity.
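As an illustration of how such empirical rates can be computed, the following sketch implements a simple confidence-thresholding membership inference test; the attack family and threshold are assumptions, since the text does not specify the evaluation protocol used.

```python
# Sketch of a confidence-threshold membership inference evaluation.
import numpy as np

def mia_success_rate(conf_members, conf_nonmembers, threshold=0.9):
    """Balanced accuracy of guessing membership from prediction confidence.

    conf_members: model confidences on records used in training;
    conf_nonmembers: confidences on held-out records. 0.5 = random guessing.
    """
    tpr = np.mean(np.asarray(conf_members) >= threshold)    # members detected
    tnr = np.mean(np.asarray(conf_nonmembers) < threshold)  # non-members cleared
    return 0.5 * (tpr + tnr)
```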
In the healthcare privacy context, experiments on private large-scale AI models for medical imaging with differential privacy preserved fairness in demographic subgroups (ΔAUC < 0.05) while maintaining high diagnostic accuracy, achieving up to 89.3% AUROC on BRaTS tumor segmentation tasks and comparable performance to non-private baselines on TCIA datasets, with minimal utility loss under an ε = 1.0 privacy budget [61]. Detailed analysis revealed that expert physician feedback during iterative refinement primarily targeted two architectural aspects: symbolic knowledge base enhancement with clinical domain constraints ensuring that prediction models relied on pathophysiological reasoning chains rather than potentially discriminatory demographic proxies, and value-weight calibration adjusting the relative emphasis on privacy protection versus clinical utility based on assessment of outcome severity, where high-stakes predictions like mortality risk warranted accepting modest accuracy reductions to guarantee stronger privacy [62]. Qualitative review by ethics committee members validated that refined system behavior exhibited appropriate context-sensitivity regarding information disclosure; for instance, the architecture demonstrated the capability to provide detailed diagnostic reasoning to treating physicians while offering privacy-protected summary statistics to administrative users, adapting disclosure granularity based on formal role specifications and procedural context captured through Sophimatics’ context lattice representations developed in Phase 4 [63]. Re-identification attack resistance testing confirmed robust privacy preservation against adversaries possessing auxiliary demographic databases, achieving only 3.7% successful patient re-identification from model predictions, compared to 28.4% re-identification rates against baseline privacy-naive prediction models.
Interpretive accuracy assessment through systematic expert review established that refined Sophimatics models achieved 89.3% alignment between automated feature importance rankings and expert-specified relevance orderings on representative test examples, substantially exceeding 61.2% alignment scores for black-box neural network baselines [64]. The Interpretability Composition Matrix in Figure 4 decomposes the global interpretive alignment results into four complementary axes, revealing how Phase 6 supports different kinds of explanation across domains. Rows correspond to application scenarios (NSL-KDD, CICIDS2017, MIMIC Mortality, MIMIC Length of Stay), while columns capture distinct interpretability axes: feature importance alignment, rule transparency, contextual justification, and ethical coherence. Cell values report normalized interpretability scores in the range [0, 1]. The heatmap shows consistently high interpretability across domains, with NSL-KDD and CICIDS2017 excelling in feature-level explanations and healthcare tasks exhibiting stronger contextual and ethical justification.
Each row corresponds to a specific application scenario, while columns quantify feature importance alignment, rule transparency, contextual justification, and ethical coherence on a normalized [0, 1] scale. NSL-KDD attains the highest scores in feature alignment and rule transparency, reflecting the strong agreement between symbolic rules, feature importance rankings, and expert expectations in intrusion detection. CICIDS2017 shows slightly lower structural transparency but strong feature-level alignment despite its higher temporal complexity. In contrast, the MIMIC-based tasks achieve their highest scores in contextual justification and ethical coherence, capturing the need for clinically grounded, ethically aware explanations in healthcare. This multidimensional view demonstrates that Phase 6 does not merely increase interpretability globally but redistributes explanatory capacity according to the demands and constraints of each domain.
Qualitative examination of decision explanations revealed that symbolic knowledge base components acquired through iterative refinement enabled the generation of human-interpretable justifications referencing domain concepts and causal relationships, rather than opaque statistical correlations; cybersecurity experts rated explanation quality at 4.2 out of 5 for refined Sophimatics systems, compared to 2.1 out of 5 for standard deep learning approaches [65]. Analysis of knowledge base evolution across refinement iterations showed progressive elaboration of ontological structure, with initial iterations primarily adding broad categorical distinctions and logical constraints, intermediate iterations incorporating nuanced exception conditions and contextual modifiers, and later iterations fine-tuning quantitative thresholds and priority orderings, suggesting a principled trajectory from coarse-grained to fine-grained domain knowledge integration [66]. Notably, certain expert-provided symbolic rules demonstrated high statistical correlation with feature representations learned by the neural network, indicating convergence between explicit human knowledge formulation and implicit pattern extraction from data, while other expert rules captured rare edge cases inadequately represented in training data distributions, where human domain expertise proved indispensable for achieving robust generalization. Table 5 presents concrete examples of rule evolution across iterations. Initial rules exhibit coarse-grained logic; refined rules incorporate contextual nuances, multi-objective balancing, and complex-time integration. Expert feedback transformed generic heuristics into domain-specific knowledge encoding genuine expertise.
Table 6 reports the computational efficiency analysis: training time per iteration ranges from 2 to 7 h depending on dataset size, while inference achieves real-time performance (64–80 samples/second). Total refinement requires approximately 140 GPU hours and 60 human-expert hours over 14 iterations. The memory footprint remains practical at an 8–14 GB peak, enabling deployment on standard GPU infrastructure.
Scalability analysis indicates linear growth in training time with dataset size up to 5 M records. Beyond 10 M records, distributed training across multiple GPUs becomes necessary. Inference latency remains constant regardless of training set size, satisfying real-time requirements. The 60 person-hours of expert time are distributed as 25% dataset review, 35% feedback formulation, 30% validation, and 10% documentation. This investment proves comparable to traditional ML development cycles requiring hyperparameter tuning and error analysis.
Convergence analysis across all experimental domains revealed consistent patterns indicating stabilization after approximately 12 to 15 refinement iterations. The multi-dimensional performance metric M(t) exhibited monotonic improvement for the first 8–10 iterations, followed by oscillatory refinement with diminishing magnitude changes and eventual convergence to stable equilibrium values; quantitatively, the L2 norm of successive metric differences, ||M(t+1) − M(t)||, decreased exponentially, with median values of 0.18 at iteration 5, 0.06 at iteration 10, 0.02 at iteration 15, and 0.007 at iteration 20 [67]. Component-wise convergence occurred at different rates: privacy metrics stabilized earliest, around iteration 8, as fundamental architectural privacy mechanisms became established; accuracy metrics continued improving through iteration 14 as knowledge base elaboration captured increasingly nuanced domain patterns; and contextual fidelity metrics exhibited the slowest convergence, requiring 16–18 iterations for comprehensive coverage of diverse operational scenarios [68]. Analysis of expert feedback characteristics across iterations showed a strong correlation between feedback volume and performance improvement magnitude: iterations producing large knowledge base additions typically preceded substantial accuracy gains in subsequent evaluation cycles, while iterations focused on value-weight refinement primarily affected privacy–utility trade-off positioning with minimal accuracy impact, confirming that different feedback mechanisms serve distinct architectural functions within the unified Sophimatics framework. Resource requirements remained practical throughout refinement, with a median expert time investment of 4.2 h per iteration cycle encompassing dataset review, feedback formulation, and validation, suggesting that the Phase 6 methodology scales feasibly to real-world deployment scenarios where domain expertise availability necessarily constrains development processes [69].
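The convergence criterion described above can be made concrete with a short sketch. The following Python fragment (our illustration, with an assumed metric trajectory rather than the experimental vectors) checks whether successive metric vectors M(t) have stabilized under the L2-norm test.
```python
import numpy as np

# Minimal sketch of the convergence criterion: stop when the L2 norm of
# successive metric differences ||M(t+1) - M(t)|| stays below a threshold.
def has_converged(metric_history, tol=0.01, window=3):
    """metric_history: list of metric vectors M(t); require `window`
    consecutive successive differences below `tol`."""
    diffs = [np.linalg.norm(np.subtract(b, a))
             for a, b in zip(metric_history, metric_history[1:])]
    return len(diffs) >= window and all(d < tol for d in diffs[-window:])

# Illustrative trajectory mimicking the reported exponential decay.
M = [np.array([0.80, 0.70]) + 0.3 * np.exp(-0.25 * t) for t in range(21)]
print(has_converged(M, tol=0.02))  # -> True after the late, small updates
```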
Figure 5 summarizes the privacy–utility trade-off achieved by Phase 6 across all evaluated domains, integrating results from cybersecurity and healthcare tasks. Bar plots report the final accuracy reached after iterative refinement, while the superimposed line depicts the corresponding differential privacy budget ε. The model consistently operates in a strong-privacy regime (ε between 2.8 and 3.2) while preserving high predictive utility, with minimal degradation in clinically relevant scenarios. This unified visualization demonstrates the robustness of the Phase 6 methodology: human-guided refinement, symbolic–neural integration, and complex-time reasoning jointly enable superior privacy-preserving performance across heterogeneous application domains.
The comparative advantage map in Figure 6 offers a high-level synthetic view of how the four evaluated domains differ when the performance, privacy, and computational metrics introduced in Phase 6 are analyzed in relative rather than absolute terms. By normalizing each dimension to the range [0, 1] across tasks, the visualization highlights the intrinsic structural differences between domains, revealing latent performance profiles that are not immediately visible when considering raw metrics in isolation. NSL-KDD emerges as the domain with the strongest overall computational efficiency, combining high accuracy, a favorable privacy–efficiency ratio, and low resource consumption during training and inference. CICIDS2017, in contrast, displays a distinctive advantage in data scale and dimensionality, reflecting its more complex traffic patterns and modern threat landscape. The two MIMIC-III-based healthcare tasks demonstrate comparatively stronger privacy strength, consistent with the stringent constraints and ethical requirements associated with clinical data. Overall, the figure shows that Phase 6 maintains balanced, domain-sensitive improvements while preserving robust performance across heterogeneous operational contexts.
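The per-dimension normalization underlying the comparative advantage map can be sketched in a few lines. In the snippet below, only the two intrusion detection accuracy figures (98.7% and 97.3%) come from the reported results; all other entries are illustrative placeholders.
```python
import numpy as np

# Sketch of per-dimension min-max normalization to [0, 1] across domains,
# so profiles can be compared in relative rather than absolute terms.
raw = np.array([
    # accuracy, privacy strength, efficiency
    [98.7, 0.70, 0.95],   # NSL-KDD (accuracy from the text)
    [97.3, 0.65, 0.80],   # CICIDS2017 (accuracy from the text)
    [94.1, 0.90, 0.60],   # MIMIC Mortality (placeholder values)
    [93.5, 0.88, 0.62],   # MIMIC Length of Stay (placeholder values)
])
normalized = (raw - raw.min(axis=0)) / (raw.max(axis=0) - raw.min(axis=0))
print(np.round(normalized, 2))
```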
The temporal impact map in Figure 7 provides a conceptual and quantitative visualization of how each evaluated domain engages with the bidimensional temporal structure introduced in Sophimatics. The real component a corresponds to explicit chronological progression, whereas the imaginary component b reflects experiential depth, memory retention, and latent temporal dependencies. The map shows that CICIDS2017 occupies regions of high temporal density, consistent with the dataset’s long-range attack dynamics and complex behavioral patterns. The MIMIC-derived healthcare tasks are positioned in memory-intensive zones, reflecting the heavy reliance on historical physiological states and longitudinal patient evolution. In contrast, NSL-KDD lies in a region characterized by lower experiential depth and simpler temporal structures, matching its short-range, connection-level characteristics. This figure demonstrates that Phase 6 not only improves accuracy and privacy but also adapts to domain-specific temporal regimes, validating the theoretical underpinnings of complex-time reasoning.
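As an illustration of this bidimensional temporal representation, the following sketch (our own, with qualitative placeholder coordinates rather than values from the paper) encodes each domain as a complex-time point t = a + ib and derives a simple temporal-complexity magnitude.
```python
# Illustrative sketch of complex-time coordinates t = a + ib:
# a = chronological progression, b = experiential/memory depth.
# Positions are qualitative placeholders consistent with Figure 7.
domains = {
    "NSL-KDD":         complex(0.8, 0.2),  # short-range, low experiential depth
    "CICIDS2017":      complex(0.9, 0.7),  # long-range attack dynamics
    "MIMIC Mortality": complex(0.5, 0.9),  # memory-dominant clinical histories
}
for name, t in domains.items():
    # |t| gives an overall temporal-complexity magnitude for the domain.
    print(f"{name}: a={t.real:.1f}, b={t.imag:.1f}, |t|={abs(t):.2f}")
```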
The energy landscape of convergence in Figure 8 illustrates the dynamics of multi-objective optimization as Phase 6 progresses through its iterative refinement cycle. The surface represents the normalized residual energy, reflecting how the combined contributions of accuracy, privacy, contextual fidelity, ethical consistency, and computational efficiency evolve over time. The landscape exhibits a clear exponential decay, indicating that early iterations yield large corrective steps, while later iterations focus on fine-grained adjustments. The metric-dependent curvature shows that some dimensions converge more rapidly, whereas others retain higher residual energy until the final stages, consistent with the increasing complexity of contextual and ethical refinements. Around iteration 12–15, the system enters a low-energy basin, corresponding to the point at which symbolic updates, neural adjustments, and value-alignment weights achieve stable balance. This visualization confirms that Phase 6 refinement behaves as a structured descent through a high-dimensional optimization landscape.
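One plausible formalization of this residual energy, consistent with the description above though not stated explicitly in the paper, is a weighted squared distance from the target metric values: E(t) = Σ_k w_k (M_k* − M_k(t))², where M_k(t) is the value of metric k (accuracy, privacy, contextual fidelity, ethical consistency, computational efficiency) at iteration t, M_k* its target, and w_k a domain-dependent weight. The observed behavior then corresponds to an approximately exponential decay, E(t) ≈ E_0 e^(−λt), with the low-energy basin at iterations 12–15 corresponding to E(t) flattening near its minimum.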

5. Discussion

The experimental results presented above establish that Phase 6 human-in-the-loop iterative refinement transforms Sophimatics from a conceptual framework into a practical methodology capable of addressing contemporary challenges in privacy-preserving and security-critical artificial intelligence. However, numerical performance improvements alone provide insufficient insight into the deeper implications of these findings for artificial intelligence research and deployment. This discussion examines four critical dimensions: the distinctive characteristics of human–AI co-creation as implemented in Phase 6 compared to conventional machine learning development practices, the theoretical significance of observing convergence between explicit symbolic knowledge and implicit neural representations, the practical implications for deploying Sophimatics architectures in regulated domains with strict privacy and security requirements, and the limitations that constrain current implementation scope while suggesting productive directions for future research.
The human–AI co-creation paradigm implemented in Phase 6 differs fundamentally from conventional machine learning practices that treat human involvement primarily as a source of labeled training data or as external validators of final system outputs [70]. Instead, the Phase 6 methodology structures continuous collaboration across multiple architectural layers, where human expertise directly shapes symbolic knowledge bases, guides neural parameter optimization through qualitative feedback on output appropriateness, and calibrates value-system weights reflecting ethical priorities [71]. This architectural integration of human judgment addresses a persistent paradox in contemporary machine learning: systems trained exclusively on historical data necessarily perpetuate patterns present in that data, including both genuine domain regularities and spurious correlations or biased decision patterns, yet purely statistical discrimination of meaningful patterns from artifacts proves fundamentally underdetermined without external domain knowledge constraints [72]. Sophimatics Phase 6 resolves this paradox by enabling domain experts to impose semantic structure through symbolic knowledge base refinement while allowing neural components to discover patterns that human cognition might overlook, achieving genuine complementarity in which combined human–AI capabilities exceed those of either in isolation.
The observation that expert-provided symbolic rules frequently align with neural network learned representations constitutes more than merely reassuring convergence evidence; it suggests deep theoretical connections between explicit declarative knowledge formulations and implicit procedural knowledge acquisition through gradient descent optimization [73]. Cognitive science research has long grappled with relationships between symbolic and sub-symbolic cognition, debating whether human intelligence fundamentally operates through rule-like symbolic manipulations as classical cognitive architectures propose or through continuous distributed representations as connectionist frameworks suggest [74]. Sophimatics Phase 6 experimental results offer a compelling middle path: perhaps genuine intelligence requires both symbolic structures enabling compositional generalization and logical inference alongside neural mechanisms supporting pattern recognition and statistical learning, with iterative refinement serving as the process through which these complementary knowledge forms achieve mutual consistency [75]. This perspective suggests that the symbolic–neural integration architecture is not merely a pragmatic engineering compromise but rather reflects fundamental requirements for building intelligent systems capable of operating across diverse contexts with limited training data, precisely the characteristics that distinguish human cognition from current narrowly specialized machine learning systems.
From a privacy-preserving machine learning perspective, the superior differential privacy epsilon values and membership inference resistance achieved by refined Sophimatics models illuminate fundamental trade-offs between statistical learning and knowledge-guided reasoning [76]. Differential privacy mechanisms introduce noise to bound privacy loss from statistical inference, with privacy–utility trade-offs determined by how much noise can be tolerated while maintaining useful predictions [77]. Neural networks trained purely through gradient descent on sensitive data necessarily encode detailed statistical patterns from that data to achieve high accuracy, making them inherently vulnerable to privacy attacks and requiring substantial noise injection to satisfy strong privacy guarantees [78]. Sophimatics’ hybrid architecture reduces reliance on learning everything from data by incorporating domain structure through symbolic knowledge bases, enabling accurate predictions with less detailed statistical dependency on individual training records and consequently requiring less privacy-compromising noise injection [79]. This analysis suggests a general principle for privacy-preserving AI: systems that integrate structured domain knowledge alongside statistical learning achieve superior privacy–utility trade-offs compared to purely data-driven approaches, with human expert involvement in knowledge base refinement serving as a mechanism for importing privacy-protective domain structure.
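The noise-calibration intuition can be illustrated with the standard Laplace mechanism, a textbook ε-differential-privacy construction rather than the exact mechanism used in the Sophimatics experiments: noise scale grows as sensitivity/ε, so a smaller ε (stronger privacy) yields noisier, less useful answers, and architectures that depend less on individual records can reach the same utility at a stricter budget.
```python
import numpy as np

rng = np.random.default_rng(0)

def laplace_mechanism(true_value, sensitivity, epsilon):
    # Adding Laplace noise with scale sensitivity/epsilon gives epsilon-DP
    # for a single numeric query of the stated sensitivity.
    return true_value + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

count = 1200                   # e.g., number of records matching a query
for eps in (0.5, 2.8, 3.2):    # the paper reports final budgets of 2.8-3.2
    noisy = laplace_mechanism(count, sensitivity=1.0, epsilon=eps)
    print(f"epsilon={eps}: noisy count = {noisy:.1f}")
```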
Analysis of the temporal dynamics of human expert involvement reveals important patterns regarding the efficiency and learning characteristics of the Phase 6 iterative refinement process. Across all three datasets (NSL-KDD, CICIDS2017, MIMIC-III), experts invested approximately 60 h over 15 iterations per dataset, distributed as follows:
  • Iterations 1–5 (Initial Refinement): 28 h (47% of total time); average improvement per iteration, 2.4 percentage points. Expert effort: deep analysis of misclassifications and symbolic rule formulation.
  • Iterations 6–10 (Intermediate Convergence): 20 h (33% of total time); average improvement per iteration, 0.8 percentage points. Expert effort: fine-tuning rule weights and addressing edge cases.
  • Iterations 11–15 (Final Optimization): 12 h (20% of total time); average improvement per iteration, 0.3 percentage points. Expert effort: validation and minor parameter adjustments.
This distribution demonstrates clear diminishing marginal utility, with the improvement-to-effort ratio declining from 0.086 percentage points per expert hour in the early phase to 0.040 in the middle phase and 0.025 in the late phase; these ratios follow directly from dividing each phase's average per-iteration gain by its total expert hours, as in the sketch following this paragraph. Learning curve analysis: the system exhibits a characteristic learning curve in which approximately 70% of the total performance improvement occurs within the first eight iterations, while the remaining 30% requires an additional seven iterations. This pattern suggests that (1) core system capabilities emerge rapidly through initial expert guidance; (2) subsequent iterations provide diminishing but still valuable refinements; and (3) the human–AI collaboration becomes more efficient over time, with later iterations requiring less expert cognitive load. Implications for deployment efficiency: these findings suggest that subsequent deployments of refined Sophimatics models to related domains could achieve similar performance with reduced human intervention by transferring learned meta-parameters (α, β, γ) from previous domains, initializing symbolic rule bases from validated configurations, and focusing expert effort on domain-specific adaptations rather than foundational refinement, potentially reducing expert time requirements by 40–50% for similar tasks. The temporal analysis validates that Phase 6 represents not merely a one-time refinement process but a systematic methodology for establishing efficient human–AI collaboration patterns that generalize across deployments.
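A minimal sketch reproducing the ratio arithmetic from the figures reported above:
```python
# Improvement-to-effort ratios: per-iteration gain divided by phase hours.
# All figures are taken from the text.
phases = {
    "initial (1-5)":       {"hours": 28, "gain_per_iter": 2.4},
    "intermediate (6-10)": {"hours": 20, "gain_per_iter": 0.8},
    "final (11-15)":       {"hours": 12, "gain_per_iter": 0.3},
}
for name, p in phases.items():
    ratio = p["gain_per_iter"] / p["hours"]
    print(f"{name}: {ratio:.3f} percentage points per expert hour")
```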
The contextual fidelity results demonstrating appropriate behavior adaptation across diverse operational scenarios validate the Sophimatics Phase 4 context lattice formalism as more than merely theoretical elegance [4]. Real-world AI deployments inevitably operate across heterogeneous contexts varying in user roles, threat models, resource constraints, temporal urgencies, and regulatory requirements; systems that fail to adapt behavior appropriately to context produce either overly conservative decisions that sacrifice utility or overly permissive decisions that compromise security or privacy [80]. Conventional machine learning approaches handle context primarily through feature engineering that explicitly encodes contextual variables as inputs, an approach that scales poorly as context dimensions proliferate and fails to capture hierarchical relationships among contexts where specific scenarios inherit properties from more general categories [81]. The Phase 6 iterative refinement process enables systematic elaboration of context lattice structure through expert feedback specifying how decisions should differ across contexts, with symbolic rules explicitly conditioning behavior on context parameters and value-weight adjustments calibrating privacy–utility trade-offs differently for distinct operational scenarios [82]. This architectural provision for context-sensitive reasoning proves particularly critical in regulated domains where legal compliance demands demonstrating that system behavior appropriately adapts to jurisdictional differences, user consent configurations, and data sensitivity classifications.
The interpretability advantages demonstrated through expert alignment assessments and qualitative explanation evaluation address an urgent contemporary concern: as AI systems assume increasing responsibility for consequential decisions affecting privacy, security, health, and safety, purely statistical black-box models prove inadequate for establishing trust and enabling meaningful human oversight [83]. Regulatory frameworks, including the European Union AI Act, increasingly mandate that high-risk AI systems provide meaningful explanations for their decisions accessible to affected individuals and oversight authorities [84]. However, current explainable AI techniques predominantly offer post hoc rationalizations that approximate black-box model behavior through simpler interpretable surrogates, raising concerns about explanation fidelity and the possibility that explanations might misrepresent actual decision processes [85]. Sophimatics achieves genuine interpretability through architectural integration of symbolic reasoning components that explicitly implement human-understandable decision logic, with Phase 6 refinement enabling domain experts to directly inspect and modify this explicit reasoning structure, rather than relying on approximate explanations of opaque neural processes [86]. This capability proves especially valuable in security-critical applications where operators must validate that automated systems exhibit appropriate threat detection logic before trusting deployment in operational environments.
Despite the encouraging results presented above, Phase 6 methodology faces important limitations that constrain current applicability and suggest productive research directions. The requirement for sustained human expert involvement throughout iterative refinement necessarily limits scalability compared to fully automated machine learning pipelines; our experiments required approximately 60 person-hours of domain expert time per dataset to complete refinement across 12–15 iterations, an investment that may prove prohibitive for resource-constrained organizations or applications where suitable domain expertise proves scarce [87]. The symbolic knowledge base refinement process currently relies on manual expert inspection and rule formulation rather than automated knowledge extraction from human feedback, raising questions about whether more efficient human–AI interaction modalities might achieve comparable benefits with reduced expert time requirements [88]. Convergence guarantees remain empirical rather than theoretical; while we observed consistent convergence patterns across all experimental domains, formal analysis establishing conditions under which iterative refinement provably converges to optimal or near-optimal configurations represents an open theoretical challenge [89]. Privacy–utility trade-off calibration through value-weight adjustment, while demonstrably effective in our experiments, requires expert judgment regarding appropriate privacy–utility preferences that may vary across stakeholders and lack clear optimization criteria, suggesting the need for principled frameworks explicitly representing stakeholder preference heterogeneity rather than assuming consensus value functions [90].
Future research directions emerge naturally from current limitations and successful capabilities demonstrated in Phase 6 experiments. Automated knowledge extraction techniques employing natural language processing to parse expert feedback provided in unconstrained text form could reduce the manual effort currently required for translating qualitative observations into formal symbolic rules [91]. Active learning strategies for selecting informative examples to present during expert review sessions might accelerate convergence by focusing human attention on cases where expert judgment provides maximum benefit [92]. Federated implementation of Phase 6 refinement across multiple organizations, each possessing distinct datasets and expert teams, could enable collaborative development of shared Sophimatics knowledge bases while preserving local data privacy, analogous to federated learning but operating at the knowledge base level rather than gradient aggregation [93]. Integration with formal verification tools could provide automated validation that symbolic knowledge bases satisfy specified safety and privacy properties, complementing empirical evaluation with deductive guarantees [94]. Extension to reinforcement learning settings where Phase 6 refinement addresses not merely supervised classification, but sequential decision-making would broaden applicability to autonomous systems and interactive agents operating in dynamic environments [95]. Each of these research directions maintains fidelity to the core Sophimatics philosophy of integrating philosophical foundations with computational implementation while extending practical capabilities to address increasingly complex real-world challenges [94].
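As one concrete instance of the active learning direction, uncertainty (entropy-based) sampling is a standard selection strategy; the sketch below (our illustration with synthetic predictions, not part of the Phase 6 implementation) selects the most uncertain examples for expert review.
```python
import numpy as np

# Uncertainty sampling: route the examples about which the current model is
# least certain (maximum predictive entropy) to the human expert.
def entropy(probs):
    probs = np.clip(probs, 1e-12, 1.0)
    return -(probs * np.log(probs)).sum(axis=1)

def select_for_review(class_probs, budget=5):
    """class_probs: (n_samples, n_classes) predicted probabilities.
    Returns indices of the `budget` most uncertain samples."""
    return np.argsort(-entropy(class_probs))[:budget]

rng = np.random.default_rng(1)
probs = rng.dirichlet(alpha=[1.0, 1.0, 1.0], size=100)  # synthetic 3-class predictions
print(select_for_review(probs, budget=5))
```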
While the experimental results presented demonstrate strong performance on the evaluated datasets, several important considerations regarding generalizability and robustness warrant explicit discussion. Distribution Shift and Cross-Dataset Validation: The current evaluation focuses on three specific datasets (NSL-KDD, CICIDS2017, MIMIC-III) collected under particular conditions and time periods. The framework’s robustness to distribution shift—where input data characteristics change between training and deployment—remains an important area for future validation. We propose the following extensions to strengthen generalizability claims. (1) Cross-year validation: testing refined NSL-KDD models on more recent network traffic datasets to assess temporal generalization. (2) Cross-institutional healthcare validation: evaluating MIMIC-III refined models on data from different hospital systems with varying patient demographics and clinical protocols. (3) Cross-attack validation: assessing cybersecurity models against novel attack patterns not represented in training data. Adaptive Adversaries: In cybersecurity contexts, adversaries actively adapt their strategies to evade detection systems. The current evaluation does not explicitly test robustness against such adaptive behavior. Future work should include the following: adversarial robustness testing where attackers have partial knowledge of the detection system; evaluation of how quickly the Phase 6 refinement process can respond to emerging attack patterns; and analysis of whether symbolic rule rigidity creates exploitable vulnerabilities. Robustness to Noisy or Misleading Expert Feedback: The Phase 6 methodology assumes that human expert feedback is generally accurate and well-intentioned. However, real-world deployments may encounter the following: Unintentional errors in expert annotations due to fatigue, cognitive biases, or incomplete information; potentially adversarial feedback in scenarios where experts may have conflicting interests; and inconsistencies between multiple experts providing feedback. To address these concerns, we recommend the following future enhancements. (1) Feedback verification mechanisms: implementing statistical outlier detection to flag potentially erroneous expert inputs. (2) Multi-expert consensus protocols: requiring agreement between multiple experts for critical rule modifications. (3) Confidence-weighted expert contributions: tracking historical accuracy of expert feedback and weighting accordingly. (4) Controlled experiments with intentionally misleading feedback: systematically testing system resilience by introducing known errors in a controlled setting. Computational Efficiency and Scalability: While the current implementation handles the evaluated datasets effectively, questions remain about scalability to datasets orders of magnitude larger (billions rather than millions of examples), real-time performance requirements incompatible with iterative human-in-the-loop refinement, and computational costs of the complex-time CTRM architecture for resource-constrained deployments. These generalizability and robustness considerations do not diminish the contributions of Phase 6, but rather define important boundaries for its application and identify concrete directions for extending its applicability. The framework represents a significant advance in human–AI co-creation for high-stakes domains, while acknowledging that further validation across broader operational contexts remains valuable future work.
The Sophimatics Phase 6 framework demonstrates strong performance in structured, high-stakes environments such as cybersecurity and healthcare. However, explicit recognition of its operational boundaries is essential for guiding appropriate application and identifying domains where alternative approaches may be preferable.
Domains where Sophimatics currently excels: the framework is optimally designed for environments characterized by the following:
  • High-Stakes Decision-Making: Where errors carry significant consequences (safety-critical systems, medical diagnosis, financial fraud detection).
  • Availability of Expert Knowledge: Domains where human experts can articulate meaningful symbolic rules and constraints.
  • Interpretability Requirements: Applications where decisions must be explainable to regulators, stakeholders, or affected individuals.
  • Ethical Sensitivity: Contexts requiring explicit ethical reasoning and value alignment.
  • Structured Data Patterns: Scenarios where clear symbolic representations can be formulated.
Domains where Sophimatics may face limitations at present:
  • Highly Unstructured Environments: Creative tasks (art generation, narrative writing) where symbolic constraints are difficult to formalize; highly subjective evaluations lacking objective ground truth; and domains where “correct” answers depend heavily on individual preferences rather than universal principles.
  • Low-Stakes Applications: Contexts where the overhead of human-in-the-loop refinement cannot be justified by the severity of potential errors; applications requiring extremely rapid deployment cycles incompatible with iterative refinement; and scenarios where approximate solutions are sufficient and perfect accuracy is not critical.
  • Rapidly Changing Environments: Domains where data distributions shift faster than symbolic rules can be updated; contexts where yesterday’s expert knowledge becomes obsolete quickly; and applications in nascent fields where established principles have not yet emerged.
  • Real-Time Critical Systems: Scenarios requiring sub-millisecond decision latency, which is incompatible with symbolic reasoning overhead; embedded systems whose severe computational constraints cannot support the CTRM architecture; and applications where any human intervention in the decision loop is unacceptable.
Adaptations for Boundary Domains: For scenarios falling between clear operational boundaries, several modifications could extend applicability.
Soft Symbolic Constraints: Replacing rigid logical rules with probabilistic or fuzzy constraints that allow graceful degradation in uncertain contexts (a minimal sketch follows this list).
Automated Symbolic Discovery: Developing mechanisms for the system to propose candidate symbolic rules from data patterns, reducing dependence on explicit expert formulation.
Hybrid Operating Modes: Implementing a continuum between full human-in-the-loop refinement (high-stakes) and purely automated operation (low-stakes), with intermediate modes for different risk levels.
Adaptive Rule Updating: Creating mechanisms for symbolic knowledge to evolve automatically in response to distribution shift, with periodic human validation rather than continuous involvement.
Lightweight Architectural Variants: Developing computationally efficient versions of CTRM for resource-constrained deployments, potentially sacrificing some interpretability for performance.
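As a minimal sketch of the soft symbolic constraints adaptation (our illustration; the rule and threshold are hypothetical), a rigid logical rule can be relaxed into a sigmoid penalty that degrades gracefully near the rule boundary:
```python
import math

# Soft symbolic constraint: a rigid rule (violated -> reject) becomes a
# graded penalty whose strength rises smoothly across the rule boundary.
def soft_rule_penalty(value, threshold, steepness=10.0):
    """Sigmoid penalty in [0, 1]: ~0 well below threshold, ~1 well above."""
    return 1.0 / (1.0 + math.exp(-steepness * (value - threshold)))

# Hard rule: "flag any connection with failed-login ratio > 0.3".
# Soft version: the penalty rises smoothly around 0.3 instead of a hard cut.
for ratio in (0.1, 0.28, 0.3, 0.35, 0.6):
    print(f"ratio={ratio:.2f} -> penalty={soft_rule_penalty(ratio, 0.3):.2f}")
```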
Domain Applicability Assessment Framework:
To guide practitioners in determining whether Sophimatics is appropriate for their application, we propose evaluating:
  • Stakes Level: What are the consequences of system errors? (High/Medium/Low)
  • Expert Availability: Can domain experts provide meaningful symbolic knowledge? (Yes/Partial/No)
  • Interpretability Need: Must decisions be explainable? (Required/Preferred/Optional)
  • Ethical Sensitivity: Does the domain involve significant ethical considerations? (High/Medium/Low)
  • Data Structure: Can the domain be formalized symbolically? (Highly/Partially/Minimally)
  • Latency Requirements: What are acceptable response times? (Seconds/Milliseconds/Microseconds)
  • Resource Constraints: What computational resources are available? (Abundant/Moderate/Severe)
Applications scoring high on stakes, expert availability, interpretability need, and ethical sensitivity, while having moderate structure and reasonable latency/resource allowances, represent ideal Sophimatics use cases. Conversely, low-stakes, unstructured, real-time applications with minimal expert knowledge may be better served by alternative approaches. A minimal scoring sketch of this assessment follows.
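The sketch below operationalizes the assessment as a simple qualitative scoring function; the level weights and the 0.6 cutoff are our own illustrative choices, not a validated rubric.
```python
# Illustrative applicability scoring over the seven assessment dimensions.
LEVELS = {"high": 1.0, "medium": 0.5, "low": 0.0,
          "yes": 1.0, "partial": 0.5, "no": 0.0,
          "required": 1.0, "preferred": 0.5, "optional": 0.0,
          "highly": 1.0, "partially": 0.5, "minimally": 0.0,
          "seconds": 1.0, "milliseconds": 0.5, "microseconds": 0.0,
          "abundant": 1.0, "moderate": 0.5, "severe": 0.0}

def applicability_score(answers):
    """answers: dict mapping the seven dimensions to a qualitative level."""
    return sum(LEVELS[v] for v in answers.values()) / len(answers)

# Hypothetical financial fraud detection scenario, for illustration only.
fraud_detection = {
    "stakes": "high", "expert_availability": "yes", "interpretability": "required",
    "ethical_sensitivity": "high", "data_structure": "partially",
    "latency": "seconds", "resources": "moderate",
}
score = applicability_score(fraud_detection)
print(f"score = {score:.2f} -> "
      f"{'good Sophimatics fit' if score >= 0.6 else 'consider alternatives'}")
```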
This explicit delineation of operational boundaries serves two purposes: it prevents misapplication of the framework in unsuitable contexts, and it identifies specific research directions for extending capabilities to broader domains. The framework’s current scope represents a significant achievement in human–AI collaboration for high-stakes applications, while acknowledging that universal applicability across all AI domains is neither claimed nor expected.

6. Conclusions

Phase 6 of the Sophimatics framework successfully operationalizes a human-in-the-loop iterative refinement methodology, transforming philosophically grounded artificial intelligence from conceptual architecture into deployable systems for privacy-preserving and security-critical applications. Through systematic validation across network intrusion detection tasks using the NSL-KDD and CICIDS2017 datasets, alongside healthcare privacy preservation experiments with MIMIC-III-derived clinical data, we have demonstrated that structured collaboration between AI systems and human domain experts yields substantial improvements across multiple performance dimensions, including traditional accuracy metrics, formal privacy guarantees, interpretive alignment with expert knowledge, contextual fidelity across diverse operational scenarios, and ethical consistency with value frameworks. The three-layer feedback mechanism introduced in Phase 6 enables targeted refinement of symbolic knowledge bases, neural network parameters, and value-system weights, providing complementary pathways for translating qualitative human judgment into quantitative architectural improvements while maintaining theoretical coherence across the integrated symbolic–neural hybrid components established in earlier Sophimatics phases.
Experimental results establish clear empirical advantages for the Phase 6 refined Sophimatics architecture. Network intrusion detection accuracy reached 98.7% on NSL-KDD and 97.3% on CICIDS2017 while maintaining differential privacy epsilon values of 2.8 and 3.2, respectively, representing significant improvements over baseline systems in both detection performance and formal privacy protection. Healthcare privacy experiments achieved 97.2% sensitive attribute protection with only 2.1% clinical utility loss, demonstrating that privacy–accuracy trade-offs need not involve severe compromises when systems integrate structured domain knowledge alongside statistical learning. Interpretive accuracy assessment revealed 89.3% alignment between automated reasoning and expert judgment, substantially exceeding black-box neural network baselines and validating the architectural provision for human-understandable decision logic through symbolic components. Convergence analysis indicated stable performance after approximately 12–15 refinement iterations across all experimental domains, with total expert time investment of 50–70 person-hours per dataset, suggesting practical feasibility for real-world deployment scenarios where domain expertise availability necessarily constrains development processes.
Beyond numerical performance metrics, Phase 6 contributions illuminate fundamental principles regarding the relationship between human intelligence and artificial intelligence. The consistent observation that explicit symbolic knowledge formulated by domain experts frequently aligns with implicit representations learned by neural networks through gradient descent suggests deep theoretical connections between declarative and procedural knowledge acquisition, supporting integrated symbolic–neural architectures as more than merely pragmatic engineering compromises but rather as reflections of fundamental requirements for building genuinely intelligent systems. The superior privacy–utility trade-offs achieved by knowledge-augmented architectures compared to purely data-driven approaches establish a general principle: systems integrating structured domain knowledge alongside statistical learning inherently require less detailed dependency on individual training records, enabling stronger privacy protection with less utility sacrifice. The demonstrated capability for context-sensitive behavioral adaptation validates formal context modeling as essential for real-world AI deployment in heterogeneous operational environments subject to diverse regulatory requirements and stakeholder expectations.
The completion of Phase 6 establishes Sophimatics as a comprehensive research program extending from philosophical foundations through formal mathematical frameworks to practical deployment in consequential real-world applications. The six-phase progression, beginning with philosophical–historical surveys, advancing through conceptual mapping and hybrid architectural design, incorporating complex temporal representation and context modeling, integrating ethical reasoning and intentionality mechanisms, and culminating in human–AI collaborative refinement, represents a systematic methodology for building artificial intelligence systems that genuinely reason rather than merely recognize patterns, understand context rather than merely process features, and align with human values rather than merely optimize objective functions. While significant challenges remain, including scalability to larger knowledge bases, theoretical convergence guarantees, and extension to reinforcement learning settings, the empirical validation presented in this paper demonstrates that philosophically grounded AI augmented with structured human collaboration can achieve superior performance on the privacy-preserving and security-critical tasks that increasingly dominate contemporary AI application portfolios.
The implications extend beyond technical artificial intelligence research to encompass broader questions of human–technology relationships in an era of increasing automation. The Phase 6 methodology explicitly rejects the notion that human intelligence and artificial intelligence represent competing alternatives, where advancing one necessarily diminishes the role of the other; instead, the framework demonstrates that the most effective intelligent systems emerge through complementary collaboration, where human judgment guides automated reasoning while computational power amplifies human expertise. This collaborative paradigm proves especially vital in domains involving ethical considerations, contextual nuances, and consequences affecting fundamental human rights, where purely automated decision-making raises legitimate concerns about accountability, transparency, and value alignment. By providing structured mechanisms for integrating human oversight throughout system development and operation, rather than relegating human involvement to initial training or final validation, Sophimatics offers a pathway toward AI systems that enhance rather than replace human judgment in consequential domains that demand both computational sophistication and genuine wisdom.

Author Contributions

Investigation, G.I. (Gerardo Iovane) and G.I. (Giovanni Iovane); Mathematical Modeling, G.I. (Gerardo Iovane); Programming, G.I. (Giovanni Iovane); Writing—Review and Editing, G.I. (Gerardo Iovane) and G.I. (Giovanni Iovane); Supervision, G.I. (Gerardo Iovane). All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Iovane, G.; Iovane, G. Sophimatics: A New Bridge Between Philosophical Thought and Logic for an Emerging Post-Generative Artificial Intelligence; Aracne Editore: Rome, Italy, 2025; Volume I, 192p, ISBN 979-12-218-2180-2.
  2. Iovane, G.; Iovane, G. Bridging computational structures with philosophical categories in Sophimatics and data protection policy with AI reasoning. Appl. Sci. 2025, 15, 10879.
  3. Iovane, G.; Iovane, G. Super Time-Cognitive Neural Networks (Phase 3 of Sophimatics): Temporal-philosophical reasoning for security-critical AI applications. Appl. Sci. 2025, 15, 11876.
  4. Iovane, G.; Iovane, G. Sophimatics: A two-dimensional Temporal Cognitive Architecture for Paradox-Resilient Artificial Intelligence. Big Data Cogn. Comput. 2025, 9, 314.
  5. Iovane, G.; Iovane, G. From Complexity Theory to Computational Wisdom: Enhancing EEG–Neurotransmitter Models Through Sophimatics for Brain Data Analysis. Algorithms 2026, under final review.
  6. Voigt, P.; Von dem Bussche, A. The EU General Data Protection Regulation (GDPR): A Practical Guide; Springer International Publishing: Cham, Switzerland, 2017.
  7. Barocas, S.; Selbst, A.D. Big Data’s Disparate Impact. Calif. Law Rev. 2016, 104, 671–732.
  8. Dwork, C.; McSherry, F.; Nissim, K.; Smith, A. Calibrating Noise to Sensitivity in Private Data Analysis. In Proceedings of the Theory of Cryptography Conference; Springer: Berlin/Heidelberg, Germany, 2006; pp. 265–284.
  9. Dwork, C.; Roth, A. The Algorithmic Foundations of Differential Privacy. Found. Trends® Theor. Comput. Sci. 2014, 9, 211–407.
  10. Abadi, M.; Chu, A.; Goodfellow, I.; McMahan, H.B.; Mironov, I.; Talwar, K.; Zhang, L. Deep Learning with Differential Privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security; ACM: New York, NY, USA, 2016; pp. 308–318.
  11. Kairouz, P.; McMahan, H.B.; Avent, B.; Bellet, A.; Bennis, M.; Bhagoji, A.N.; Bonawitz, K.; Charles, Z.; Cormode, G.; Cummings, R.; et al. Advances and Open Problems in Federated Learning. Found. Trends Mach. Learn. 2021, 14, 1–210.
  12. McMahan, B.; Moore, E.; Ramage, D.; Hampson, S.; y Arcas, B.A. Communication-Efficient Learning of Deep Networks from Decentralized Data. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics; PMLR: Fort Lauderdale, FL, USA, 2017; pp. 1273–1282.
  13. Li, T.; Sahu, A.K.; Talwalkar, A.; Smith, V. Federated Learning: Challenges, Methods, and Future Directions. IEEE Signal Process. Mag. 2020, 37, 50–60.
  14. Shokri, R.; Stronati, M.; Song, C.; Shmatikov, V. Membership Inference Attacks Against Machine Learning Models. In Proceedings of the 2017 IEEE Symposium on Security and Privacy; IEEE: San Jose, CA, USA, 2017; pp. 3–18.
  15. Acar, A.; Aksu, H.; Uluagac, A.S.; Conti, M. A Survey on Homomorphic Encryption Schemes: Theory and Implementation. ACM Comput. Surv. 2018, 51, 1–35.
  16. Gentry, C. Fully Homomorphic Encryption Using Ideal Lattices. In Proceedings of the 41st Annual ACM Symposium on Theory of Computing; ACM: Bethesda, MD, USA, 2009; pp. 169–178.
  17. Brakerski, Z.; Vaikuntanathan, V. Efficient Fully Homomorphic Encryption from (Standard) LWE. SIAM J. Comput. 2014, 43, 831–871.
  18. Gilad-Bachrach, R.; Dowlin, N.; Laine, K.; Lauter, K.; Naehrig, M.; Wernsing, J. CryptoNets: Applying Neural Networks to Encrypted Data with High Throughput and Accuracy. In Proceedings of the 33rd International Conference on Machine Learning; PMLR: New York, NY, USA, 2016; pp. 201–210.
  19. Yao, A.C. Protocols for Secure Computations. In Proceedings of the 23rd Annual Symposium on Foundations of Computer Science; IEEE: Chicago, IL, USA, 1982; pp. 160–164.
  20. Cramer, R.; Damgård, I.B.; Nielsen, J.B. Secure Multiparty Computation and Secret Sharing; Cambridge University Press: Cambridge, UK, 2015.
  21. Mohassel, P.; Zhang, Y. SecureML: A System for Scalable Privacy-Preserving Machine Learning. In Proceedings of the 2017 IEEE Symposium on Security and Privacy; IEEE: San Jose, CA, USA, 2017; pp. 19–38.
  22. Mohassel, P.; Rindal, P. ABY3: A Mixed Protocol Framework for Machine Learning. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security; ACM: Toronto, ON, Canada, 2018; pp. 35–52.
  23. Wagh, S.; Gupta, D.; Chandran, N. SecureNN: 3-Party Secure Computation for Neural Network Training. Proc. Priv. Enhancing Technol. 2019, 2019, 26–49.
  24. Holzinger, A. Interactive Machine Learning for Health Informatics: When Do We Need the Human-in-the-Loop? Brain Inform. 2016, 3, 119–131.
  25. Settles, B. Active Learning Literature Survey; Computer Sciences Technical Report 1648; University of Wisconsin-Madison: Madison, WI, USA, 2009.
  26. Fails, J.A.; Olsen, D.R., Jr. Interactive Machine Learning. In Proceedings of the 8th International Conference on Intelligent User Interfaces; ACM: Miami, FL, USA, 2003; pp. 39–45.
  27. Dellermann, D.; Ebel, P.; Söllner, M.; Leimeister, J.M. Hybrid Intelligence. Bus. Inf. Syst. Eng. 2019, 61, 637–643.
  28. Mosqueira-Rey, E.; Hernández-Pereira, E.; Alonso-Ríos, D.; Bobes-Bascarán, J.; Fernández-Leal, Á. Human-in-the-Loop Machine Learning: A State of the Art. Artif. Intell. Rev. 2023, 56, 3005–3054.
  29. Arrieta, A.B.; Díaz-Rodríguez, N.; Del Ser, J.; Bennetot, A.; Tabik, S.; Barbado, A.; García, S.; Gil-López, S.; Molina, D.; Benjamins, R.; et al. Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges Toward Responsible AI. Inf. Fusion 2020, 58, 82–115.
  30. Ribeiro, M.T.; Singh, S.; Guestrin, C. ‘Why Should I Trust You?’: Explaining the Predictions of Any Classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; ACM: San Francisco, CA, USA, 2016; pp. 1135–1144.
  31. Lundberg, S.M.; Lee, S.I. A Unified Approach to Interpreting Model Predictions. In Advances in Neural Information Processing Systems 30; Curran Associates, Inc.: Long Beach, CA, USA, 2017; pp. 4765–4774.
  32. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. In Advances in Neural Information Processing Systems 30; Curran Associates, Inc.: Long Beach, CA, USA, 2017; pp. 5998–6008.
  33. Wachter, S.; Mittelstadt, B.; Russell, C. Counterfactual Explanations Without Opening the Black Box: Automated Decisions and the GDPR. Harv. J. Law Technol. 2017, 31, 841–887.
  34. Lipton, Z.C. The Mythos of Model Interpretability. Queue 2018, 16, 31–57.
  35. Dignum, V. Responsible Artificial Intelligence: How to Develop and Use AI in a Responsible Way; Springer International Publishing: Cham, Switzerland, 2019.
  36. Christiano, P.F.; Leike, J.; Brown, T.; Martic, M.; Legg, S.; Amodei, D. Deep Reinforcement Learning from Human Preferences. In Advances in Neural Information Processing Systems 30; Curran Associates, Inc.: Long Beach, CA, USA, 2017; pp. 4299–4307.
  37. Ziegler, D.M.; Stiennon, N.; Wu, J.; Brown, T.B.; Radford, A.; Amodei, D.; Christiano, P.; Irving, G. Fine-Tuning Language Models from Human Preferences. arXiv 2019, arXiv:1909.08593.
  38. Russell, S. Human Compatible: Artificial Intelligence and the Problem of Control; Viking Press: New York, NY, USA, 2019.
  39. Barocas, S.; Hardt, M.; Narayanan, A. Fairness and Machine Learning: Limitations and Opportunities; MIT Press: Cambridge, MA, USA, 2023.
  40. Hagendorff, T. The Ethics of AI Ethics: An Evaluation of Guidelines. Minds Mach. 2020, 30, 99–120.
  41. Manhaeve, R.; Dumancic, S.; Kimmig, A.; Demeester, T.; De Raedt, L. DeepProbLog: Neural Probabilistic Logic Programming. In Proceedings of the NeurIPS 2018, Montreal, QC, Canada, 3–8 December 2018; pp. 3749–3759.
  42. Serafini, L.; d’Avila Garcez, A. Logic Tensor Networks: Deep Learning and Logical Reasoning from Data and Knowledge. In Proceedings of the NeSy 2016, New York, NY, USA, 16–17 July 2016.
  43. Tavallaee, M.; Bagheri, E.; Lu, W.; Ghorbani, A.A. A Detailed Analysis of the KDD CUP 99 Data Set. In Proceedings of the 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications; IEEE: Ottawa, ON, Canada, 2009; pp. 1–6.
  44. Dhanabal, L.; Shantharajah, S.P. A Study on NSL-KDD Dataset for Intrusion Detection System Based on Classification Algorithms. Int. J. Adv. Res. Comput. Commun. Eng. 2015, 4, 446–452.
  45. Sharafaldin, I.; Lashkari, A.H.; Ghorbani, A.A. Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization. In Proceedings of the 4th International Conference on Information Systems Security and Privacy; SCITEPRESS: Funchal, Portugal, 2018; pp. 108–116.
  46. Panigrahi, R.; Borah, S. A Detailed Analysis of CICIDS2017 Dataset for Designing Intrusion Detection Systems. Int. J. Eng. Technol. 2018, 7, 479–482.
  47. Lashkari, A.H.; Draper Gil, G.; Mamun, M.S.I.; Ghorbani, A.A. Characterization of Tor Traffic Using Time Based Features. In Proceedings of the 3rd International Conference on Information Systems Security and Privacy; SCITEPRESS: Porto, Portugal, 2017; pp. 253–262.
  48. Johnson, A.E.W.; Pollard, T.J.; Shen, L.; Lehman, L.H.; Feng, M.; Ghassemi, M.; Moody, B.; Szolovits, P.; Celi, L.A.; Mark, R.G. MIMIC-III, a Freely Accessible Critical Care Database. Sci. Data 2016, 3, 160035.
  49. Beaulieu-Jones, B.K.; Wu, Z.S.; Williams, C.; Lee, R.; Bhavnani, S.P.; Byrd, J.B.; Greene, C.S. Privacy-Preserving Generative Deep Neural Networks Support Clinical Data Sharing. Circ. Cardiovasc. Qual. Outcomes 2019, 12, e005122.
  50. Yeom, S.; Giacomelli, I.; Fredrikson, M.; Jha, S. Privacy Risk in Machine Learning: Analyzing the Connection to Overfitting. In Proceedings of the 2018 IEEE 31st Computer Security Foundations Symposium; IEEE: Oxford, UK, 2018; pp. 268–282.
  51. Carvalho, D.V.; Pereira, E.M.; Cardoso, J.S. Machine Learning Interpretability: A Survey on Methods and Metrics. Electronics 2019, 8, 832.
  52. Dey, A.K. Understanding and Using Context. Pers. Ubiquitous Comput. 2001, 5, 4–7.
  53. McNamara, A.; Smith, J.; Murphy-Hill, E. Does ACM’s Code of Ethics Change Ethical Decision Making in Software Development? In Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering; ACM: Lake Buena Vista, FL, USA, 2018; pp. 729–733.
  54. Wu, C.J.; Raghavendra, R.; Gupta, U.; Acun, B.; Ardalani, N.; Maeng, K.; Chang, G.; Behram, F.A.; Huang, J.; Bai, C.; et al. Sustainable AI: Environmental Implications, Challenges and Opportunities. In Proceedings of the Machine Learning and Systems 2022; PMLR: Santa Clara, CA, USA, 2022; Volume 4, pp. 795–813.
  55. Ring, M.; Wunderlich, S.; Scheuring, D.; Landes, D.; Hotho, A. A Survey of Network-based Intrusion Detection Data Sets. Comput. Secur. 2019, 86, 147–167.
  56. Ferrag, M.A.; Maglaras, L.; Moschoyiannis, S.; Janicke, H. Deep Learning for Cyber Security Intrusion Detection: Approaches, Datasets, and Comparative Study. J. Inf. Secur. Appl. 2020, 50, 102419.
  57. Apruzzese, G.; Colajanni, M.; Ferretti, L.; Guido, A.; Marchetti, M. On the Effectiveness of Machine and Deep Learning for Cyber Security. In Proceedings of the 2018 10th International Conference on Cyber Conflict (CyCon); IEEE: Tallinn, Estonia, 2018; pp. 371–390.
  58. Papernot, N.; Abadi, M.; Erlingsson, U.; Goodfellow, I.; Talwar, K. Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data. In Proceedings of the 5th International Conference on Learning Representations; OpenReview.net: Toulon, France, 2017.
  59. Tramèr, F.; Boneh, D. Differentially Private Learning Needs Better Features (or Much More Data). In Proceedings of the 9th International Conference on Learning Representations; OpenReview.net: Alameda, CA, USA, 2021.
  60. Salem, A.; Zhang, Y.; Humbert, M.; Berrang, P.; Fritz, M.; Backes, M. ML-Leaks: Model and Data Independent Membership Inference Attacks and Defenses on Machine Learning Models. In Proceedings of the 2019 Network and Distributed System Security Symposium; Internet Society: San Diego, CA, USA, 2019.
  61. Tayebi Arasteh, S.; Ziller, A.; Kuhl, C.; Makowski, M.; Nebelung, S.; Braren, R.; Rueckert, D.; Truhn, D.; Kaissis, G. Preserving fairness and diagnostic accuracy in private large-scale AI models for medical imaging. Commun. Med. 2024, 4, 46.
  62. Obermeyer, Z.; Powers, B.; Vogeli, C.; Mullainathan, S. Dissecting Racial Bias in an Algorithm Used to Manage the Health of Populations. Science 2019, 366, 447–453.
  63. Rajkomar, A.; Hardt, M.; Howell, M.D.; Corrado, G.; Chin, M.H. Ensuring Fairness in Machine Learning to Advance Health Equity. Ann. Intern. Med. 2018, 169, 866–872.
  64. Guidotti, R.; Monreale, A.; Ruggieri, S.; Turini, F.; Giannotti, F.; Pedreschi, D. A Survey of Methods for Explaining Black Box Models. ACM Comput. Surv. 2019, 51, 1–42.
  65. Miller, T. Explanation in Artificial Intelligence: Insights from the Social Sciences. Artif. Intell. 2019, 267, 1–38.
  66. Davis, E.; Marcus, G. Commonsense Reasoning and Commonsense Knowledge in Artificial Intelligence. Commun. ACM 2015, 58, 92–103.
  67. Bottou, L.; Curtis, F.E.; Nocedal, J. Optimization Methods for Large-Scale Machine Learning. SIAM Rev. 2018, 60, 223–311.
  68. Zhang, C.; Bengio, S.; Hardt, M.; Recht, B.; Vinyals, O. Understanding Deep Learning Requires Rethinking Generalization. In Proceedings of the 5th International Conference on Learning Representations; OpenReview.net: Toulon, France, 2017.
  69. Amershi, S.; Weld, D.; Vorvoreanu, M.; Fourney, A.; Nushi, B.; Collisson, P.; Suh, J.; Iqbal, S.; Bennett, P.N.; Inkpen, K.; et al. Guidelines for Human-AI Interaction. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems; ACM: Glasgow, UK, 2019; pp. 1–13.
  70. Shneiderman, B. Human-Centered Artificial Intelligence: Reliable, Safe & Trustworthy. Int. J. Hum.–Comput. Interact. 2020, 36, 495–504.
  71. Bansal, G.; Nushi, B.; Kamar, E.; Weld, D.S.; Lasecki, W.S.; Horvitz, E. Updates in Human-AI Teams: Understanding and Addressing the Performance/Compatibility Tradeoff. In Proceedings of the AAAI Conference on Artificial Intelligence 2019; AAAI Press: Honolulu, HI, USA, 2019; Volume 33, pp. 2429–2437.
  72. Pearl, J.; Mackenzie, D. The Book of Why: The New Science of Cause and Effect; Basic Books: New York, NY, USA, 2018.
  73. Marcus, G. The Next Decade in AI: Four Steps Towards Robust Artificial Intelligence. arXiv 2020, arXiv:2002.06177.
  74. Lake, B.M.; Ullman, T.D.; Tenenbaum, J.B.; Gershman, S.J. Building Machines That Learn and Think Like People. Behav. Brain Sci. 2017, 40, e253.
  75. Garnelo, M.; Shanahan, M. Reconciling Deep Learning with Symbolic Artificial Intelligence. Curr. Opin. Behav. Sci. 2019, 29, 17–23.
  76. Jayaraman, B.; Evans, D. Evaluating Differentially Private Machine Learning in Practice. In Proceedings of the USENIX Security 2019, Santa Clara, CA, USA, 14–16 August 2019; pp. 1895–1912.
  77. Bagdasaryan, E.; Poursaeed, O.; Shmatikov, V. Differential Privacy Has Disparate Impact on Model Accuracy. In Proceedings of the NeurIPS 2019, Vancouver, BC, Canada, 8–14 December 2019; pp. 15453–15462.
  78. Carlini, N.; Liu, C.; Erlingsson, Ú.; Kos, J.; Song, D. The Secret Sharer: Evaluating and Testing Unintended Memorization in Neural Networks. In Proceedings of the USENIX Security 2019, Santa Clara, CA, USA, 14–16 August 2019; pp. 267–284.
  79. Chaudhuri, K.; Monteleoni, C.; Sarwate, A.D. Differentially Private Empirical Risk Minimization. J. Mach. Learn. Res. 2011, 12, 1069–1109.
  80. Schilit, B.; Adams, N.; Want, R. Context-Aware Computing Applications. In Proceedings of the Workshop on Mobile Computing Systems; IEEE: Piscataway, NJ, USA, 1994; pp. 85–90.
  81. Dourish, P. What We Talk About When We Talk About Context. Pers. Ubiquitous Comput. 2004, 8, 19–30.
  82. Baldauf, M.; Dustdar, S.; Rosenberg, F. A Survey on Context-Aware Systems. Int. J. Ad Hoc Ubiquitous Comput. 2007, 2, 263–277.
  83. Selbst, A.D.; Boyd, D.; Friedler, S.A.; Venkatasubramanian, S.; Vertesi, J. Fairness and Abstraction in Sociotechnical Systems. In Proceedings of the FAT* 2019, Atlanta, GA, USA, 29–31 January 2019; pp. 59–68.
  84. European Commission. Proposal for a Regulation on Artificial Intelligence (AI Act); COM(2021) 206 Final; European Commission: Brussels, Belgium, 2021.
  85. Rudin, C. Stop Explaining Black Box Machine Learning Models and Use Interpretable Models Instead. Nat. Mach. Intell. 2019, 1, 206–215.
  86. Adadi, A.; Berrada, M. Peeking Inside the Black-Box: A Survey on Explainable AI. IEEE Access 2018, 6, 52138–52160.
  87. Sculley, D.; Holt, G.; Golovin, D.; Davydov, E.; Phillips, T.; Ebner, D.; Chaudhary, V.; Young, M.; Crespo, J.-F.; Dennison, D. Hidden Technical Debt in Machine Learning Systems. In Proceedings of the NeurIPS 2015, Montreal, QC, Canada, 7–12 December 2015; pp. 2503–2511.
  88. Rader, E.; Cotter, K.; Cho, J. Explanations as Mechanisms for Supporting Algorithmic Transparency. In Proceedings of the CHI 2018, Montreal, QC, Canada, 21–26 April 2018; pp. 1–13.
  89. Boţ, R.; Dong, G.; Elbau, P.; Scherzer, O. Convergence Rates of First- and Higher-Order Dynamics for Solving Linear Ill-Posed Problems. Found. Comput. Math. 2022, 22, 1567–1629.
  90. Hadfield-Menell, D.; Russell, S.J.; Abbeel, P.; Dragan, A. Cooperative Inverse Reinforcement Learning. In Proceedings of the NeurIPS 2016, Barcelona, Spain, 5–10 December 2016; pp. 3909–3917.
  91. Wang, Z.; Qin, Y.; Zhou, W.; Yan, J.; Ye, Q.; Neves, L.; Liu, Z.; Ren, X. Learning from Explanations with Neural Execution Tree. In Proceedings of the ICLR 2020, Addis Ababa, Ethiopia, 30 April 2020.
  92. Cohn, D.; Atlas, L.; Ladner, R. Improving Generalization with Active Learning. Mach. Learn. 1994, 15, 201–221.
  93. Bonawitz, K.; Ivanov, V.; Kreuter, B.; Marcedone, A.; McMahan, H.B.; Patel, S.; Ramage, D.; Segal, A.; Seth, K. Practical Secure Aggregation for Privacy-Preserving Machine Learning. In Proceedings of the CCS 2017, Dallas, TX, USA, 30 October–3 November 2017; pp. 1175–1191.
  94. Katz, G.; Barrett, C.; Dill, D.L.; Julian, K.; Kochenderfer, M.J. Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks. In Proceedings of the CAV 2017, Heidelberg, Germany, 24–28 July 2017; pp. 97–117.
  95. Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction, 2nd ed.; MIT Press: Cambridge, MA, USA, 2018.
Figure 1. The general architecture of Sophimatics.
Figure 2. The Phase 6 iterative refinement architecture.
Figure 3. Knowledge Evolution Trajectory across the 15 refinement iterations of Phase 6.
Figure 4. Interpretability Composition Matrix for the four evaluated domains.
Figure 5. Privacy–utility trade-off across the four evaluated tasks (NSL-KDD, CICIDS2017, MIMIC Mortality, MIMIC LOS).
Figure 6. Comparative advantage profiles across the four application domains (NSL-KDD, CICIDS2017, MIMIC Mortality, MIMIC Length of Stay).
Figure 7. Temporal impact map showing the relative position of the four evaluated domains in the bidimensional complex-time space (a + ib). The horizontal axis represents the real/chronological component a, while the vertical axis encodes the imaginary/experiential component b. The background density field illustrates temporal-complexity intensity, highlighting how different domains occupy distinct temporal signatures: CICIDS2017 in regions of high temporal expansion, MIMIC-based tasks in memory-dominant zones, and NSL-KDD in areas of low experiential depth.
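To make the complex-time coordinates in Figure 7 concrete, the sketch below encodes each domain's temporal signature as a Python complex number t = a + ib and derives its intensity |t| and phase arg(t). The numeric coordinates are hypothetical placeholders chosen only to match the qualitative placement described in the caption, not values taken from the study.

```python
import cmath

# Hypothetical complex-time signatures t = a + ib for the four domains;
# a = chronological (real) component, b = experiential (imaginary) component.
signatures = {
    "NSL-KDD": complex(0.8, 0.2),          # low experiential depth
    "CICIDS2017": complex(1.5, 0.9),       # high temporal expansion
    "MIMIC Mortality": complex(0.6, 1.3),  # memory-dominant zone
    "MIMIC LOS": complex(0.7, 1.1),        # memory-dominant zone
}

for domain, t in signatures.items():
    # |t| gauges overall temporal-complexity intensity; arg(t) measures
    # how far the signature tilts toward the experiential axis.
    print(f"{domain:16s} a={t.real:.2f} b={t.imag:.2f} "
          f"|t|={abs(t):.2f} arg={cmath.phase(t):.2f} rad")
```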
Figure 8. Energy landscape of convergence for the Phase 6 refinement process. The surface plot represents the normalized residual energy across multiple metric dimensions as iterations progress. The landscape shows a characteristic exponential descent, with different metric components exhibiting distinct convergence rates. After approximately 12–15 iterations, the system approaches a low-energy basin corresponding to stable multi-objective optimization, confirming the theoretical expectations of the Complex-Time Recursive Model under human-in-the-loop refinement.
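The exponential descent in Figure 8 can be mimicked with a one-parameter residual-energy model E(k) = E₀e^(−λk). The decay rate below is a hypothetical value tuned so that the model reaches its low-energy basin in the reported 12–15 iteration range; it is a qualitative illustration, not a fit to the paper's data.

```python
import math

# Toy residual-energy model: E(k) = E0 * exp(-lam * k).
# E0 and lam are assumed values; tol marks the low-energy basin.
E0, lam, tol = 1.0, 0.35, 0.01

for k in range(1, 21):
    E = E0 * math.exp(-lam * k)
    print(f"iteration {k:2d}: residual energy = {E:.4f}")
    if E < tol:
        print(f"entered low-energy basin after {k} iterations")
        break  # converges at k = 14, within the reported 12-15 range
```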
Table 1. Comparative analysis against DeepProbLog and Logic Tensor Networks. Phase 6 provides unique complex temporal reasoning, formal privacy preservation, hierarchical context modeling, and systematic human–AI collaboration [1,2,3,4,5,41,42].

| Dimension | DeepProbLog | Logic Tensor Networks | Sophimatics Phase 6 |
|---|---|---|---|
| Symbolic–Neural | Probabilistic logic + neural predicates | First-order logic as graphs | Hybrid: KB + STCNN + CTRM + integration |
| Temporal Reasoning | Sequential rules only | Time as predicate | Complex t = a + ib (chronological + experiential) |
| Privacy | None | None | DP-SGD with ε ≈ 2.8–3.2 |
| Context Modeling | Limited predicates | Context as predicates | Hierarchical lattice Λ |
| Human-in-Loop | Manual logic editing | Manual formulas | Structured 3-layer feedback (K, N, W) |
| Interpretability | Logic traces | Formula satisfaction | Symbolic rules + neural + context |
| Ethical Reasoning | Can encode rules | Can encode constraints | Integrated value-weights W |
| Parameters | Varies | Depends on grounding | ~7–10 M (efficient) |
| Validation | Academic benchmarks | Semantic image tasks | Cybersecurity + healthcare |
Table 2. Differential privacy parameters. Target ε reflects privacy strength, δ sets negligible breach probability, C bounds per-example gradient influence, σ determines noise scale, and T indicates training steps before budget exhaustion.

| Dataset | Target ε | δ | Clipping C | Noise σ | Steps T |
|---|---|---|---|---|---|
| NSL-KDD | 2.8 | 10⁻⁵ | 1.0 | 0.8 | 10,000 |
| CICIDS2017 | 3.2 | 10⁻⁵ | 1.0 | 0.7 | 12,000 |
| MIMIC Mortality | 2.8 | 10⁻⁵ | 0.5 | 0.8 | 8000 |
| MIMIC LOS | 3.0 | 10⁻⁵ | 0.5 | 0.75 | 8500 |
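For context on how the Table 2 parameters enter training, the NumPy sketch below shows a single DP-SGD update in the style of Abadi et al.: each per-example gradient is clipped to L2 norm C, Gaussian noise with standard deviation σC is added to the sum, and the result is averaged over the batch. This is a minimal illustration under assumed shapes and learning rate, not the authors' implementation; the (ε, δ) values in Table 2 would be obtained from a privacy accountant run over the T training steps.

```python
import numpy as np

rng = np.random.default_rng(0)

def dp_sgd_step(params, per_example_grads, C=1.0, sigma=0.8, lr=0.1):
    """One DP-SGD update: clip each per-example gradient to L2 norm C,
    sum, add N(0, (sigma*C)^2) noise, average over the batch, step."""
    clipped = [g * min(1.0, C / (np.linalg.norm(g) + 1e-12))
               for g in per_example_grads]
    noisy_sum = np.sum(clipped, axis=0) + rng.normal(0.0, sigma * C,
                                                     size=params.shape)
    return params - lr * noisy_sum / len(per_example_grads)

# Toy usage with the NSL-KDD row of Table 2 (C = 1.0, sigma = 0.8).
params = np.zeros(4)
batch_grads = [rng.normal(size=4) for _ in range(8)]
params = dp_sgd_step(params, batch_grads, C=1.0, sigma=0.8)
print(params)
```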
Table 3. Dataset characteristics. NSL-KDD tests discriminative learning with interpretability; CICIDS2017 evaluates temporal reasoning on modern attacks; MIMIC-III assesses privacy–utility balance in clinical prediction.

| Dataset | Records | Features | Classes/Tasks | Domain | Key Challenges | Refs |
|---|---|---|---|---|---|---|
| NSL-KDD | 148 K (125 K train) | 41 | 5 classes | Network Intrusion | Class imbalance, interpretability | [43,44] |
| CICIDS2017 | ~2.8 M flows | 80+ | 15 classes | Modern Threats | Temporal patterns, Botnet | [45,46,47] |
| MIMIC-III | 40 K+ patients | Time-series + notes | Mortality, LOS | Healthcare Privacy | Sensitive attributes, ethics | [48,49] |
Table 4. Botnet detection error analysis. False positives dominate (58.6%), primarily from legitimate P2P and cloud service traffic mimicking command-and-control patterns. False negatives stem from encrypted or slow-interval C&C. Mitigation strategies include application-specific context rules and enhanced temporal analysis leveraging complex-time memory.

| Error Type | Count | % of Total | Primary Cause | Mitigation Strategy |
|---|---|---|---|---|
| False Positives | 342 | 58.6% | Legitimate P2P apps mimic C&C | Application-specific context rules |
| False Negatives | 242 | 41.4% | Encrypted C&C evades detection | Enhanced temporal pattern analysis |
| Cloud Services FP | 127 | 21.7% | Regular heartbeats resemble beaconing | Service identification layer |
| IoT Device FP | 98 | 16.8% | Limited traffic patterns similar to bots | Device profiling integration |
| Slow C&C FN | 156 | 26.7% | Long intervals reduce temporal signature | Extended memory window (b component) |
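All percentages in Table 4 are shares of the same 584-error total (342 false positives plus 242 false negatives); the last three rows are subcategories of the first two. The short check below reproduces the reported figures.

```python
# Consistency check for Table 4: each count as a share of the
# 584 total botnet detection errors (342 FP + 242 FN).
errors = {
    "False Positives": 342,
    "False Negatives": 242,
    "Cloud Services FP": 127,
    "IoT Device FP": 98,
    "Slow C&C FN": 156,
}
total = errors["False Positives"] + errors["False Negatives"]  # 584

for label, count in errors.items():
    print(f"{label:18s} {count:4d}  {100 * count / total:5.1f}% of total")
```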
Table 5. Concrete examples of symbolic rule evolution through iterative refinement.

| Domain | Initial Rule (Iteration 1) | Refined Rule (Iteration 8) | Refinement Rationale |
|---|---|---|---|
| Intrusion Detection | IF (connections > 100) THEN suspicious | IF (connections > 100 AND port_scan_pattern AND NOT cloud_service) THEN alert | Reduced FP from cloud heartbeats |
| Healthcare Privacy | ALWAYS protect demographics | IF (role = physician AND context = treatment) THEN disclose_age ELSE protect | Context-sensitive disclosure |
| Ethical Constraint | Maximize accuracy | BALANCE (accuracy, privacy, fairness) WITH weights = [0.4, 0.4, 0.2] WHEN high_stakes | Multi-objective optimization |
| Temporal Pattern | Detect rapid_succession events | IF (rapid_succession IN real_time AND anomaly_memory IN imaginary_time) THEN threat | Complex-time integration |
| Context Adaptation | Standard security_level | IF (context.threat_model = APT AND context.data_sensitivity = high) THEN security_level = maximum | Threat-aware adaptation |
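To show how rules like those in Table 5 might be carried through iterative refinement, here is a small Python encoding of the iteration-8 intrusion-detection rule as a predicate over a context dictionary. The Rule class and field names are our own illustration, not the framework's actual internal representation.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Rule:
    """Hypothetical container for a Table 5 symbolic rule: a named
    condition over a context dictionary plus the action it triggers."""
    name: str
    condition: Callable[[Dict], bool]
    action: str

# Refined intrusion-detection rule (Table 5, iteration 8): alert only on
# high connection counts that match a port-scan pattern and are not
# attributable to a known cloud service.
refined_rule = Rule(
    name="intrusion_detection_v8",
    condition=lambda ctx: (ctx["connections"] > 100
                           and ctx["port_scan_pattern"]
                           and not ctx["cloud_service"]),
    action="alert",
)

ctx = {"connections": 250, "port_scan_pattern": True, "cloud_service": False}
if refined_rule.condition(ctx):
    print(refined_rule.action)  # -> alert
```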
Table 6. Computational efficiency analysis.

| Operation | Dataset | Time/Throughput | Memory | Hardware |
|---|---|---|---|---|
| Training (per iteration) | NSL-KDD | 2.3 h | 8.2 GB | V100 GPU |
| Training (per iteration) | CICIDS2017 | 6.8 h | 14.1 GB | V100 GPU |
| Training (per iteration) | MIMIC-III | 4.1 h | 10.7 GB | V100 GPU |
| Inference | NSL-KDD | 12.4 ms/sample (80 samples/s) | 3.2 GB | V100 GPU |
| Inference | CICIDS2017 | 15.7 ms/sample (64 samples/s) | 3.8 GB | V100 GPU |
| Expert Review | All datasets | 4.2 h/iteration | N/A | Human time |
| Total Refinement | 14 iterations avg | ~140 GPU hours + 60 human hours | 8–14 GB peak | V100 GPU |
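Two quick consistency checks on Table 6: throughput follows directly from per-sample latency (1000/12.4 ≈ 80 samples/s and 1000/15.7 ≈ 64 samples/s), and 14 iterations at 4.2 h of expert review per iteration give roughly the 60 human hours reported. A minimal sketch:

```python
# Consistency checks for Table 6.
latency_ms = {"NSL-KDD": 12.4, "CICIDS2017": 15.7}
for dataset, ms in latency_ms.items():
    # 1000 ms divided by per-sample latency gives samples/sec.
    print(f"{dataset}: {1000 / ms:.1f} samples/sec")  # 80.6 and 63.7

# ~14 refinement iterations at 4.2 h of expert review each.
print(f"expert review total: ~{4.2 * 14:.0f} human hours")  # ~59
```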