Data Quality in the Age of AI: A Review of Governance, Ethics, and the FAIR Principles
Abstract
1. Introduction
2. Material and Methods
2.1. Review Design and Reporting
Rationale for Methodological Choices
2.2. Information Sources
2.3. Search Strategy
Search String Development and Validation
2.4. Eligibility Criteria
Inclusion criteria:
- (i) peer-reviewed articles, reviews, or conference papers that explicitly define, model, assess, or manage data quality;
- (ii) standards or technical reports describing data-quality frameworks or dimensions;
- (iii) sectoral applications (healthcare, business/CRM, public administration) illustrating implications for decision-making, governance, ethics, or FAIR/reproducibility.
Exclusion criteria:
- (i) opinion/editorial pieces without methodological detail;
- (ii) works focused solely on cybersecurity or privacy without a substantive data-quality component;
- (iii) non-scholarly web content lacking sources;
- (iv) duplicates and superseded versions;
- (v) articles in which “data quality” was addressed only as routine data cleaning within the methodology of a specific empirical study (e.g., dietary intake and cancer association studies) without contributing to broader conceptual, methodological, or framework-oriented discussions of data quality.
2.5. Screening and Selection
2.6. Data Extraction
2.7. Quality Appraisal and Source Credibility
2.8. Synthesis Methods
Software and Tools
2.9. Grey Literature and Standards Handling
2.10. Data Management and Reproducibility
3. Definitions and Frameworks of Data Quality
3.1. Standards and ISO 8000
3.2. Frameworks
3.3. Scope of Data Quality in Data Analytics
Role in Data Analytics
4. Dimensions of Data Quality
4.1. Accuracy
4.2. Completeness
4.3. Consistency
4.4. Timeliness
4.5. Relevance
4.6. Validity
4.7. Emerging Dimensions (Plausibility, Multifacetedness, Integrity)
5. Impact and Evidence: Consequences and Case Studies of Data Quality Management
5.1. Multidimensional Consequences of Poor Data Quality
5.1.1. Financial Impact
5.1.2. Operational and Strategic Consequences
5.1.3. Resource Waste and Operational Inefficiency
5.1.4. Reputational Damage and Stakeholder Trust
5.1.5. Missed Opportunities and Competitive Disadvantage
5.2. High-Profile Failures: Learning from Data Quality Disasters
5.2.1. Aerospace: NASA’s Mars Climate Orbiter
5.2.2. Technology: Unity Technologies’ Audience Pinpoint Error
5.2.3. Public Administration: Data Inaccuracy in Criminal Justice
5.2.4. Financial Sector: Data Silo Architecture and Systemic Risk
5.3. Success Stories: Data Quality as a Strategic Asset
5.3.1. E-Commerce: Netflix’s Data-Driven Success
5.3.2. Healthcare: Multicenter Clinical Database Excellence
5.3.3. Business Operations: CRM Implementation with Strong Governance
5.4. Cross-Sector Patterns: Where Data Quality Determines Outcomes
5.4.1. Healthcare Sector
5.4.2. Finance and Investment
5.4.3. E-Commerce and CRM
5.4.4. Public Administration
5.4.5. Small and Medium Enterprises (SMEs)
5.5. Synthesis: Why Data Quality Governance Matters
- Inadequate process design at the point of data collection
- Insufficient validation and verification mechanisms
- Organizational silos that prevent information sharing and consistency
- Weak leadership commitment to quality as a strategic priority
- Insufficient training and unclear accountability structures
- Delayed or absent monitoring and maintenance processes
- Clear governance structures with defined roles and accountability
- Quality mechanisms embedded throughout the data lifecycle (not retrofitted)
- Continuous monitoring and feedback loops
- Leadership commitment and resource allocation
- Regular training and stakeholder engagement
- Alignment of technology with organizational processes and objectives
6. Ensuring Data Quality: Methods and Best Practices
6.1. Data Collection Practices
6.2. Data Cleaning and Validation
6.3. Monitoring and Maintenance
6.4. Governance/Ethics
7. Challenges and Proposed Solutions
7.1. Technical Challenges
7.2. Organizational Challenges
7.3. Solutions
8. Data Quality in the AI Era: Paradigm Shifts and Governance Implications
8.1. How AI Reshapes the Definition of Data Quality
8.2. AI-Driven Challenges to Traditional Governance Frameworks
8.3. AI Ethics as an Integral Component of Data Quality
8.4. From “AI for Data Quality” to “Data Quality for AI”: A Bidirectional Relationship
- Rigorous preprocessing, profiling, and cleansing to eliminate inaccuracies, duplicates, and missing values [277]
- Bias detection and mitigation at the data sourcing stage to prevent algorithmic discrimination [278]
- Comprehensive metadata and provenance documentation to support explainability and accountability [281]
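The profiling step in the first item above can be illustrated with a minimal Python sketch that counts missing values per field and duplicate records for a small batch of data. The function name `profile_records`, the record layout, and the use of `id` as the duplicate key are assumptions made for illustration only; they are not taken from the reviewed sources, and production profiling would normally rely on a dedicated tool.

```python
from collections import Counter


def _is_missing(value):
    """Treat None and blank strings as missing (an illustrative convention)."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def profile_records(records, key_fields):
    """Return simple quality indicators for a list of dict records.

    Hypothetical helper: reports per-field missing counts, per-field
    completeness ratios, and the number of duplicate rows by key.
    """
    total = len(records)
    fields = sorted({name for row in records for name in row})

    # A field absent from a row counts as missing, like None or blank text.
    missing = {
        name: sum(1 for row in records if _is_missing(row.get(name)))
        for name in fields
    }

    # Every row that repeats an already-seen key tuple is one duplicate.
    key_counts = Counter(tuple(row.get(k) for k in key_fields) for row in records)
    duplicates = sum(count - 1 for count in key_counts.values())

    completeness = {
        name: (1 - missing[name] / total) if total else 0.0 for name in fields
    }
    return {
        "rows": total,
        "missing": missing,
        "duplicates": duplicates,
        "completeness": completeness,
    }


# Made-up sample data, for demonstration only.
records = [
    {"id": 1, "email": "a@example.org", "age": 34},
    {"id": 2, "email": "", "age": None},
    {"id": 1, "email": "a@example.org", "age": 34},
    {"id": 3, "email": "c@example.org"},
]
report = profile_records(records, key_fields=("id",))
print(report)
```

Indicators of this kind only flag candidate problems. Whether a blank field or a repeated key is an actual defect depends on the intended use of the data, which is why profiling is paired with the bias checks and provenance documentation listed above.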
8.5. Synthesis: Toward Integrated AI-Era Data Quality Governance
- Sociotechnical Approach: Technology alone—even sophisticated AI-driven validation systems—cannot resolve governance challenges. Sustainable quality requires integration of technical solutions with organizational design, clear accountability structures, adequate training, and committed leadership [309,310,311].
9. Discussion
9.1. Synthesis of Literature: What Is Agreed upon, What Is Debated
9.2. Emerging Trends and Lessons from Failures
9.3. Research Gaps
9.4. Implications for Practitioners and Policymakers
10. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
Abbreviations
| Abbreviation | Full Term |
| --- | --- |
| AI | Artificial Intelligence |
| BERT | Bidirectional Encoder Representations from Transformers |
| BI | Business Intelligence |
| CLV | Customer Lifetime Value |
| CRM | Customer Relationship Management |
| DQA | Data Quality Assessment |
| DQI | Data Quality Index |
| DQM | Data Quality Management |
| DSAN | Denoising Self-Attention Network |
| DSS | Decision Support Systems |
| ECCMA | Electronic Commerce Code Management Association |
| FAIR | Findability, Accessibility, Interoperability, and Reusability |
| FAIR4RS | FAIR for Research Software |
| GDPR | General Data Protection Regulation |
| IoT | Internet of Things |
| ISO | International Organization for Standardization |
| KNN | K-Nearest Neighbors |
| LLMs | Large Language Models |
| LSTM | Long Short-Term Memory |
| NATO | North Atlantic Treaty Organization |
| NCS | NATO Codification System |
| RFM | Recency, Frequency, and Monetary |
| SaaS | Software as a Service |
| SMEs | Small and Medium-sized Enterprises |
| WHO | World Health Organization |
References
- Bernardi, F.; Andrade Alves, D.; Crepaldi, N.Y.; Yamada, D.B.; Lima, V.; Costa Rijo, R.P.C.L. Data Quality in Health Research: A Systematic Literature Review. medRxiv 2022, 2022, 22275804.
- Elahi, E. Data Quality in Healthcare–Benefits, Challenges, and Steps for Improvement. Available online: https://dataladder.com/data-quality-in-healthcare-data-systems/ (accessed on 16 August 2024).
- Ali, S.M.; Naureen, F.; Noor, A.; Kamel Boulos, M.N.; Aamir, J.; Ishaq, M.; Anjum, N.; Ainsworth, J.; Rashid, A.; Majidulla, A.; et al. Data Quality: A Negotiator between Paper-Based and Digital Records in Pakistan’s TB Control Program. Data 2018, 3, 27.
- Chen, H.; Hailey, D.; Wang, N.; Yu, P. A Review of Data Quality Assessment Methods for Public Health Information Systems. Int. J. Environ. Res. Public Health 2014, 11, 5170–5207.
- ISO 8000-1:2022(en); Data Quality—Part 1: Overview. International Organization for Standardization (ISO): Geneva, Switzerland, 2022. Available online: https://www.iso.org/obp/ui/#iso:std:iso:8000:-1:ed-1:v1:en (accessed on 20 August 2025).
- ECCMA. What Is ISO 8000? Available online: https://eccma.org/what-is-iso-8000/ (accessed on 20 August 2025).
- ISO 8000-1:2011; Data Quality—Part 1: Overview. International Organization for Standardization (ISO): Geneva, Switzerland, 2011.
- ISO 8000-110:2009; Data Quality—Part 110: Master Data: Exchange of Characteristic Data: Syntax, Semantic Encoding, and Conformance to Data Specification. International Organization for Standardization (ISO): Geneva, Switzerland, 2009.
- ISO/IEC 25012:2008; Software Engineering—Software Product Quality Requirements and Evaluation (SQuaRE)—Data Quality Model. ISO: Geneva, Switzerland, 2008. Available online: https://www.iso.org/standard/35736.html (accessed on 27 October 2025).
- Petrović, M. Data Quality in Customer Relationship Management (CRM): Literature Review. Strateg. Manag. 2020, 25, 40–47.
- Henderson, D.; Earley, S.; Sykora, E.; Smith, E. DAMA-DMBOK Data Management Body of Knowledge, 2nd ed.; DAMA International: Basking Ridge, NJ, USA, 2017.
- Alshawi, S.; Missi, F.; Irani, Z. Organisational, Technical and Data Quality Factors in CRM Adoption—SMEs Perspective. Ind. Mark. Manag. 2011, 40, 376–383.
- Henderson, D.; Earley, S.; Sykora, E.; Smith, E. (Eds.) Data Quality. In DAMA-DMBOK Data Management Body of Knowledge; DAMA International: Basking Ridge, NJ, USA, 2017; pp. 551–611.
- Strong, D.M.; Lee, Y.W.; Wang, R.Y. Data Quality in Context. Commun. ACM 1997, 40, 103–110.
- Wang, R.Y.; Strong, D.M. Beyond Accuracy: What Data Quality Means to Data Consumers. J. Manag. Inf. Syst. 1996, 12, 5–33.
- Benson, P. NATO Codification System as the Foundation for ISO 8000, the International Standard for Data Quality. Oil IT J. 2008, 1, 1–4.
- Pipino, L.L.; Lee, Y.W.; Wang, R.Y. Data Quality Assessment. Commun. ACM 2002, 45, 211–218.
- Ehrlinger, L.; Wöß, W. A Survey of Data Quality Measurement and Monitoring Tools. Front. Big Data 2022, 5, 850611.
- Haug, A.; Zachariassen, F.; van Liempd, D. The Costs of Poor Data Quality. J. Ind. Eng. Manag. 2011, 4, 168–193.
- Vaknin, M.; Filipowska, A. Information Quality Framework for the Design and Validation of Data Flow Within Business Processes-Position Paper; Springer: Berlin/Heidelberg, Germany, 2017; pp. 158–168.
- Suh, Y. Exploring the Impact of Data Quality on Business Performance in CRM Systems for Home Appliance Business. IEEE Access 2023, 11, 116076–116089.
- Tamm, H.C.; Nikiforova, A. From Data Quality for AI to AI for Data Quality: A Systematic Review of Tools for AI-Augmented Data Quality Management in Data Warehouses. arXiv 2025, arXiv:2406.10940.
- Bernardo, B.M.V.; Mamede, H.S.; Barroso, J.M.P.; dos Santos, V.M.P.D. Data Governance & Quality Management—Innovation and Breakthroughs across Different Fields. J. Innov. Knowl. 2024, 9, 100598.
- Nguyen, T.; Nguyen, H.-T.; Nguyen-Hoang, T.-A. Data Quality Management in Big Data: Strategies, Tools, and Educational Implications. J. Parallel Distrib. Comput. 2025, 200, 105067.
- Nicholson, N.; Negrao Carvalho, R.; Štotl, I. A FAIR Perspective on Data Quality Frameworks. Data 2025, 10, 136.
- Lamprecht, A.-L.; Garcia, L.; Kuzak, M.; Martinez, C.; Arcila, R.; Martin Del Pico, E.; Dominguez Del Angel, V.; van de Sandt, S.; Ison, J.; Martinez, P.A.; et al. Towards FAIR Principles for Research Software. Data Sci. 2020, 3, 37–59.
- Lopes, C.S.; Silveira, D.S.D.; Araujo, J. Business Processes Fragments to Promote Information Quality. Int. J. Qual. Reliab. Manag. 2021, 38, 1880–1901.
- Oliychenko, I.; Ditkovska, M. Improving Information Quality in E-Government of Ukraine. Electron. Gov. Int. J. 2023, 19, 146.
- Xu, J.; Tang, J.; Ma, X.; Xu, B.; Shen, Y.; Qiao, Y. Objective Information Theory: A Sextuple Model and 9 Kinds of Metrics. Comput. Sci. Math. 2014, 2014, 793–802.
- Lian, H.; He, T.; Qin, Z.; Li, H.; Liu, J. Research on the Information Quality Measurement of Judicial Documents. In Proceedings of the 2018 IEEE International Conference on Software Quality, Reliability and Security Companion (QRS-C), Lisbon, Portugal, 16–20 July 2018; pp. 177–181.
- Chue Hong, N.P.; Aragon, S.; Hettrick, S.; Jay, C. The Future of Research Software Is the Future of Research. Patterns 2025, 6, 101322.
- Even, A.; Shankaranarayanan, G.; Berger, P.D. Evaluating a Model for Cost-Effective Data Quality Management in a Real-World CRM Setting. Decis. Support Syst. 2010, 50, 152–163.
- Foote, K. The Impact of Poor Data Quality (and How to Fix It). Available online: https://www.dataversity.net/the-impact-of-poor-data-quality-and-how-to-fix-it/ (accessed on 26 August 2024).
- Payton, F.C.; Zahay, D. Understanding Why Marketing Does Not Use the Corporate Data Warehouse for CRM Applications. J. Database Mark. Cust. Strategy Manag. 2003, 10, 315–326.
- Bidlack, C.; Wellman, M.P. Exceptional Data Quality Using Intelligent Matching and Retrieval. AI Mag. 2010, 31, 65–73.
- Schäffer, T.; Beckmann, H. Trendstudie Stammdatenqualität 2013: Erhebung der Aktuellen Situation zur Stammdatenqualität in Unternehmen und Daraus Abgeleitete Trends [Trend Study Master Data Quality 2013: Inquiry of the Current Situation of Master Data Quality in Companies and Derived Trends]; Steinbeis-Edition: Stuttgart, Germany, 2014.
- Fisher, C.W.; Lauria, E.J.M.; Matheus, C.C. An Accuracy Metric. J. Data Inf. Qual. 2009, 1, 1–21.
- Kelka, H. Supply Chain Resilience Navigating Disruptions Through Strategic Inventory Management. Bachelor’s Thesis, Metropolia University of Applied Sciences, Helsinki, Finland, 2024.
- Al-Harrasi, A.S.; Adarbah, H.Y.; Al-Badi, A.H.; Shaikh, A.K.; Al-Shihi, H.; Al-Barrak, A. Exploring the Adoption of Big Data Analytics in the Oil and Gas Industry: A Case Study. J. Bus. Commun. Technol. 2024, 3, 1–16.
- Joseph, M.; Kumar, D.P.; Keerthana, J.K. Stock Market Analysis and Portfolio Management. In Proceedings of the 2024 11th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO), Noida, India, 14–15 March 2024; pp. 1–6.
- Purohit, P.; Al Nuaimi, F.; Nakkolakkal, S. Data Governance, Privacy, Data Sharing Challenges. In Proceedings of the SPE Gas & Oil Technology Showcase and Conference, Dubai, United Arab Emirates, 7–9 May 2024.
- UTradeAlgos. The Importance of Real-Time Data in Algo Trading Software. Available online: https://utradealgos.com/blog/the-importance-of-real-time-data-in-algo-trading-software/ (accessed on 27 August 2024).
- Antwi, B.O.; Adelakun, B.O.; Eziefule, A.O. Transforming Financial Reporting with AI: Enhancing Accuracy and Timeliness. Int. J. Adv. Econ. 2024, 6, 205–223.
- Nwaimo, C.S.; Adegbola, A.E.; Adegbola, M.D.; Adeusi, K.B. Evaluating the Role of Big Data Analytics in Enhancing Accuracy and Efficiency in Accounting: A Critical Review. Financ. Account. Res. J. 2024, 6, 877–892.
- Judijanto, L.; Edtiyarsih, D.D. The Effect of Company Policy, Legal Compliance, and Information Technology on Audit Report Accuracy in the Textile Industry in Tangerang. West Sci. Account. Financ. 2024, 2, 287–298.
- Ehsani-Moghaddam, B.; Martin, K.; Queenan, J.A. Data Quality in Healthcare: A Report of Practical Experience with the Canadian Primary Care Sentinel Surveillance Network Data. Health Inf. Manag. J. 2021, 50, 88–92.
- Lorence, D. Measuring Disparities in Information Capture Timeliness Across Healthcare Settings: Effects on Data Quality. J. Med. Syst. 2003, 27, 425–433.
- Mashoufi, M.; Ayatollahi, H.; Khorasani-Zavareh, D.; Talebi Azad Boni, T. Data Quality in Health Care: Main Concepts and Assessment Methodologies. Methods Inf. Med. 2023, 62, 005–018.
- Wager, K.A.; Schaffner, M.J.; Foulois, B.; Swanson Kazley, A.; Parker, C.; Walo, H. Comparison of the Quality and Timeliness of Vital Signs Data Using Three Different Data-Entry Devices. CIN Comput. Inform. Nurs. 2010, 28, 205–212.
- Alzghoul, A.; Khaddam, A.A.; Abousweilem, F.; Irtaimeh, H.J.; Alshaar, Q. How Business Intelligence Capability Impacts Decision-Making Speed, Comprehensiveness, and Firm Performance. Inf. Dev. 2024, 40, 220–233.
- Kusumawardhani, F.K.; Ratmono, D.; Wibowo, S.T.; Darsono, D.; Widyatmoko, S.; Rokhman, N. The Impact of Digitalization in Accounting Systems on Information Quality, Cost Reduction and Decision Making: Evidence from SMEs. Int. J. Data Netw. Sci. 2024, 8, 1111–1116.
- GOV.UK. Hidden Costs of Poor Data Quality Tackling Data Quality Saves Money and Reduces Risk; Government Data Quality Hub: London, UK, 2021.
- Sattler, K.-U. Data Quality Dimensions. In Encyclopedia of Database Systems; Springer: New York, NY, USA, 2016; pp. 1–5.
- Enterprise Big Data Framework. Understanding Data Quality: Ensuring Accuracy, Reliability, and Consistency. Available online: https://www.bigdataframework.org/knowledge/understanding-data-quality/ (accessed on 27 August 2024).
- Chen, B. What is Data Relevance? Definition, Examples, and Best Practices. Available online: https://www.metaplane.dev/blog/data-relevance-definition-examples (accessed on 27 August 2024).
- IBM. What Is Data Quality? Available online: https://www.ibm.com/think/topics/data-quality (accessed on 25 October 2025).
- Okembo, C.; Morales, J.; Lemmen, C.; Zevenbergen, J.; Kuria, D. A Land Administration Data Exchange and Interoperability Framework for Kenya and Its Significance to the Sustainable Development Goals. Land 2024, 13, 435.
- Bammidi, T.; Gutta, L.; Kotagiri, A.; Samayamantri, L.; Vaddy, R. The Crucial Role of Data Quality in Automated Decision-Making Systems. Int. J. Manag. Educ. Sustain. Dev. 2024, 7, 1–22.
- Yandrapalli, V. AI-Powered Data Governance: A Cutting-Edge Method for Ensuring Data Quality for Machine Learning Applications. In Proceedings of the 2024 Second International Conference on Emerging Trends in Information Technology and Engineering (ICETITE), Vellore, India, 22–23 February 2024; pp. 1–6.
- Van Iddekinge, C.H.; Ployhart, R.E. Developments in the Criterion–related Validation of Selection Procedures: A Critical Review and Recommendations for Practice. Pers. Psychol. 2008, 61, 871–925.
- Redman, T.C. Bad Data Costs the U.S. $3 Trillion per Year. Available online: https://hbr.org/2016/09/bad-data-costs-the-u-s-3-trillion-per-year (accessed on 14 September 2025).
- Miller, R.; Whelan, H.; Chrubasik, M.; Whittaker, D.; Duncan, P.; Gregório, J. A Framework for Current and New Data Quality Dimensions: An Overview. Data 2024, 9, 151.
- Gartner, Inc. Data Quality: Why It Matters and How to Achieve It. Available online: https://www.gartner.com/en/data-analytics/topics/data-quality (accessed on 20 August 2025).
- Albrecht, R.; Overbeek, S.; van de Weerd, I. Designing a Data Quality Management Framework for CRM Platform Delivery and Consultancy. SN Comput. Sci. 2023, 4, 742.
- Nilashi, M.; Abumalloh, R.A.; Ahmadi, H.; Samad, S.; Alrizq, M.; Abosaq, H.; Alghamdi, A. The Nexus Between Quality of Customer Relationship Management Systems and Customers’ Satisfaction: Evidence from Online Customers’ Reviews. Heliyon 2023, 9, e21828.
- Nikiforova, A. Open Data Quality Evaluation: A Comparative Analysis of Open Data in Latvia. Balt. J. Mod. Comput. 2018, 6, 363–386.
- Southekal, P. Data Quality: Empowering Businesses with Analytics and AI; John Wiley & Sons: Hoboken, NJ, USA, 2023.
- Cornford, S.L.; Wheeler, A.; Feather, M.S.; Plante, J.F. Assurance Equations: A Cost and Criticality Model for Optimizing Quality Assurance Surveillance. In Proceedings of the 2022 IEEE Aerospace Conference (AERO), Big Sky, MT, USA, 5–12 March 2022; pp. 1–13.
- Moore, B. How Bad Data Is Ruining Personalized Customer Experiences–And What to Do About It. Available online: https://www.infoverity.com/en/blog/how-bad-data-is-ruining-personalized-customer-experiences-and-what-to-do-about-it/ (accessed on 7 November 2025).
- Theodorakopoulos, L.; Theodoropoulou, A. Leveraging Big Data Analytics for Understanding Consumer Behavior in Digital Marketing: A Systematic Review. Hum. Behav. Emerg. Technol. 2024, 2024, 3641502.
- Alves Gomes, M.; Meisen, T. A Review on Customer Segmentation Methods for Personalized Customer Targeting in E-Commerce Use Cases. Inf. Syst. E-Bus. Manag. 2023, 21, 527–570.
- Fu, Q.; Nicholson, G.L.; Easton, J.M. Understanding Data Quality in a Data-Driven Industry Context: Insights from the Fundamentals. J. Ind. Inf. Integr. 2024, 42, 100729.
- Sun, B. Data-Driven Personalized Marketing Strategy Optimization Based on User Behavior Modeling and Predictive Analytics: Sustainable Market Segmentation and Targeting. PLoS ONE 2025, 20, e0328151.
- The Information Difference Ltd.; Experian. The Data Quality Landscape–Q1 2023; The Information Difference Ltd.: York, UK, 2023.
- Validity. The State of CRM Data Management in 2024; Validity: Boston, MA, USA, 2024.
- Rahm, E.; Do, H. Data Cleaning: Problems and Current Approaches. IEEE Data Eng. Bull. 2000, 23, 3–14.
- Nagle, T.; Redman, T.C.; Sammon, D. Only 3% of Companies’ Data Meets Basic Quality Standards. Harv. Bus. Rev. 2017, 95, 2–5.
- Ahani, A.; Rahim, N.Z.A.; Nilashi, M. Forecasting Social CRM Adoption in SMEs: A Combined SEM-Neural Network Method. Comput. Hum. Behav. 2017, 75, 560–578.
- Delone, W.H.; McLean, E.R. The DeLone and McLean Model of Information Systems Success: A Ten-Year Update. J. Manag. Inf. Syst. 2003, 19, 9–30.
- Azeroual, O.; Saake, G.; Abuosba, M.; Schöpfel, J. Data Quality as a Critical Success Factor for User Acceptance of Research Information Systems. Data 2020, 5, 35.
- Redman, T.C. To Improve Data Quality, Start at the Source. Harv. Bus. Rev. 2020. Available online: https://hbr.org/2020/02/to-improve-data-quality-start-at-the-source (accessed on 27 July 2025).
- Gatzert, N. The Impact of Corporate Reputation and Reputation Damaging Events on Financial Performance: Empirical Evidence from the Literature. Eur. Manag. J. 2015, 33, 485–499.
- Peña-García, N.; Losada-Otálora, M.; Auza, D.P.; Cruz, M.P. Reviews, Trust, and Customer Experience in Online Marketplaces: The Case of Mercado Libre Colombia. Front. Commun. 2024, 9, 1460321.
- Rushing, B.; Xu, S.; Fairman, A. From Breach to Bias: Measuring Reputation Value and Trust Recovery after Cyber Incidents in Critical Infrastructure. Int. J. Crit. Infrastruct. Prot. 2025, 50, 100787.
- Açikgöz, F.Y.; Kayakuş, M.; Zăbavă, B.-Ș.; Kabas, O. Brand Reputation and Trust: The Impact on Customer Satisfaction and Loyalty for the Hewlett-Packard Brand. Sustainability 2024, 16, 9681.
- Nuortimo, K.; Harkonen, J.; Breznik, K. Exploring Corporate Reputation and Crisis Communication. J. Mark. Anal. 2024, 2024, 1–22.
- Nagalakshmi, M.; Sai Sri Charan, Y.; Farooq, B.; Gaur, S.; Saxena, R.; Soni, D. The Role of Brand Image in Strategy. Adv. Consum. Res. 2025, 2, 623–626.
- Barakat Ali, M.A. The Effect of Firm’s Brand Reputation on Customer Loyalty and Customer Word of Mouth: The Mediating Role of Customer Satisfaction and Customer Trust. Int. Bus. Res. 2022, 15, 30.
- La, S.; Choi, B. The Role of Customer Affection and Trust in Loyalty Rebuilding after Service Failure and Recovery. Serv. Ind. J. 2012, 32, 105–125.
- McCance, L. Fixing the Foundation: The State of Marketing Data Quality 2025; Adverity: Vienna, Austria, 2025; Available online: https://www.adverity.com/state-of-play-research-data-quality-2025 (accessed on 21 July 2025).
- Zhou, Y.; Shi, J.; Stein, R.; Liu, X.; Baldassano, R.N.; Forrest, C.B.; Chen, Y.; Huang, J. Missing Data Matter: An Empirical Evaluation of the Impacts of Missing EHR Data in Comparative Effectiveness Research. J. Am. Med. Inf. Assoc. 2023, 30, 1246–1256.
- Lewis, A.E.; Weiskopf, N.; Abrams, Z.B.; Foraker, R.; Lai, A.M.; Payne, P.R.O.; Gupta, A. Electronic Health Record Data Quality Assessment and Tools: A Systematic Review. J. Am. Med. Inf. Assoc. 2023, 30, 1730–1740.
- Heilbroner, S.P.; Carter, C.; Vidmar, D.M.; Mueller, E.T.; Stumpe, M.C.; Miotto, R. A Self-Supervised Framework for Laboratory Data Imputation in Electronic Health Records. Commun. Med. 2025, 5, 251.
- Kumar, P.; Gupta, V. AI-Driven Market Analysis and Business Intelligence. Int. J. Res. Manag. 2024, 6, 252–260.
- European Securities and Markets Authority. 2024 Report on Quality and Use of Data; European Securities and Markets Authority: Paris, France, 2024.
- Harish, A. When NASA Lost a Spacecraft Due to a Metric Math Mistake. Available online: https://www.simscale.com/blog/nasa-mars-climate-orbiter-metric/ (accessed on 6 November 2025).
- Euler, E.A.; Jolly, S.; Curtis, H.H. The Failures of the Mars Climate Orbiter and Mars Polar Lander: A Perspective from the People Involved (Paper AAS 01-074). In Proceedings of the 44th Annual American Astronautical Society Guidance, Navigation, and Control Conference, 2022, Harbin, China, 5–7 August 2022; American Astronautical Society: Breckenridge, CO, USA, 2001; pp. 2–22.
- Abdullah, F. A Case Study on the Mars Climate Orbiter and Mars Polar Lander Failures: What Is the Cost of Underestimating Testing. In Zenodo; Zenodo: Geneva, Switzerland, 2025.
- NASA Tangles with the Metric System. Science 1999, 286, 2241.
- Reichhardt, T. NASA Reworks Its Sums after Mars Fiasco. Nature 1999, 401, 517.
- Davidson, N. The Cost of Poor Data Quality on Business Operations. Available online: https://lakefs.io/blog/poor-data-quality-business-costs/ (accessed on 26 August 2024).
- Yackel, R. The Impact of Bad Data: A Case Study on Unity. Available online: https://www.ibm.com/think/insights/observability-data-benefits (accessed on 6 November 2025).
- Xie, J.; Sun, L.; Zhao, Y.F. On the Data Quality and Imbalance in Machine Learning-Based Design and Manufacturing—A Systematic Review. Engineering 2025, 45, 105–131.
- U.S. Government Accountability Office. Criminal History Records: Additional Actions Could Enhance the Completeness of Records Used for Employment-Related Background Checks; GAO: Washington, DC, USA, 2015.
- Lageson, S.; Stewart, R. The Problem with Criminal Records: Discrepancies between State Reports and Private–sector Background Checks. Criminology 2024, 62, 5–34.
- Goggins, B.; DeBacco, D. Survey of State Criminal History Information Systems, 2020; Bureau of Justice Statistics: Washington, DC, USA, 2022.
- Bureau of Justice Statistics. FY 2023 National Criminal History Improvement Program (NCHIP); Bureau of Justice Statistics: Washington, DC, USA, 2023.
- Wand, Y.; Wang, R. Anchoring Data Quality Dimensions in Ontological Foundations. Commun. ACM 1996, 39, 86–95.
- LaValle, C.R.; Haas, S.M.; Nolan, J.J. Testing the Validity of Demonstrated Imputation Methods on Longitudinal NIBRS Data; West Virginia Criminal Justice Statistical Analysis Center: Charleston, WV, USA, 2014.
- Prescott, J.J.; Starr, S.B. Expungement of Criminal Convictions: An Empirical Study. Harv. Law Rev. 2020, 133, 2460–2550.
- Redman, T. Data Driven: Profiting from Your Most Important Business Asset; Redman, T., Ed.; Harvard Business Review Press: Cambridge, MA, USA, 2013.
- Strom, K.J.; Smith, E.L. The Future of Crime Data. Criminol. Public Policy 2017, 16, 1027–1048.
- Mahendra, P.; Doshi, P.; Verma, A.; Shrivastava, S. A Comprehensive Review of AI and ML in Data Governance and Data Quality. In Proceedings of the 2025 3rd International Conference on Inventive Computing and Informatics (ICICI), Bangalore, India, 4–6 June 2025; pp. 1–6.
- Inmon, W.H. Building the Data Warehouse, 3rd ed.; John Wiley & Sons: Hoboken, NJ, USA, 2005.
- KPMG. Managing the Data Challenge in Banking; KPMG: London, UK, 2014.
- Jeleel-Ojuade, A. The Role of Information Silos: An Analysis of How the Categorization of Information Creates Silos within Financial Institutions, Hindering Effective Communication and Collaboration. SSRN Electron. J. 2024, 2014, 4881342.
- European Central Bank (SSM). Guide on Effective Risk Data Aggregation and Risk Reporting; European Central Bank (SSM): Frankfurt, Germany, 2024.
- Basel Committee on Banking Supervision. Principles for Effective Risk Data Aggregation and Risk Reporting (BCBS 239); Bank for International Settlements: Basel, Switzerland, 2013.
- Dehghani, Z. Data Mesh: Delivering Data-Driven Value at Scale; O’Reilly Media: Sebastopol, CA, USA, 2022.
- Capirossi, J.; Rabier, P. An Enterprise Architecture and Data Quality Framework; Springer: Berlin/Heidelberg, Germany, 2013; pp. 67–79.
- Taleb, I.; Serhani, M.A.; Bouhaddioui, C.; Dssouli, R. Big Data Quality Framework: A Holistic Approach to Continuous Quality Management. J. Big Data 2021, 8, 76. [Google Scholar] [CrossRef]
- Alaqla, M.F. The Impact of IT Governance and Administrative Information Quality on Decision-Making in the Banking Sector. Corp. Gov. Organ. Behav. Rev. 2023, 7, 171–182. [Google Scholar] [CrossRef]
- Weill, P.; Ross, J. IT Governance: How Top Performers Manage IT Decision Rights for Superior Results; Harvard Business School Press: Cambridge, MA, USA, 2004. [Google Scholar]
- Khatri, V.; Brown, C.V. Designing Data Governance. Commun. ACM 2010, 53, 148–152. [Google Scholar] [CrossRef]
- Storey, V.C.; Dewan, R.M.; Freimer, M. Data Quality: Setting Organizational Policies. Decis. Support Syst. 2012, 54, 434–442. [Google Scholar] [CrossRef]
- Yang, Y. Applications and Challenges of Big Data in Market Analytics. Trans. Econ. Bus. Manag. Res. 2024, 9, 450–458. [Google Scholar] [CrossRef]
- Gomez-Uribe, C.A.; Hunt, N. The Netflix Recommender System. ACM Trans. Manag. Inf. Syst. 2016, 6, 1–19. [Google Scholar] [CrossRef]
- Pajkovic, N. Algorithms and Taste-Making: Exposing the Netflix Recommender System’s Operational Logics. Converg. Int. J. Res. New Media Technol. 2022, 28, 214–235. [Google Scholar] [CrossRef]
- Gerber, C. A Consumer Perspective on Netflix’s Recommender System. A Qualitative Analysis. Master’s Thesis, Erasmus University Rotterdam, Rotterdam, The Netherlands, 2021. [Google Scholar]
- Dutta, A. Personalized Content Recommendation Impact on User Engagement of Netflix. Int. J. Res. Publ. Rev. 2025, 6, 10889–10892. [Google Scholar]
- Kahn, M.G.; Raebel, M.A.; Glanz, J.M.; Riedlinger, K.; Steiner, J.F. A Pragmatic Framework for Single-Site and Multisite Data Quality Assessment in Electronic Health Record-Based Clinical Research. Med. Care 2012, 50, S21–S29. [Google Scholar] [CrossRef]
- Daniel, C.; Kalra, D. Clinical Research Informatics: Contributions from 2018. Yearb. Med. Inf. 2019, 28, 203–205. [Google Scholar] [CrossRef] [PubMed]
- Qualls, L.G.; Phillips, T.A.; Hammill, B.G.; Topping, J.; Louzao, D.M.; Brown, J.S.; Curtis, L.H.; Marsolo, K. Evaluating Foundational Data Quality in the National Patient-Centered Clinical Research Network (PCORnet®). eGEMs (Gener. Evid. Methods Improv. Patient Outcomes) 2018, 6, 3. [Google Scholar] [CrossRef] [PubMed]
- Daniel Boie, S.; Meyer-Eschenbach, F.; Schreiber, F.; Giesa, N.; Barrenetxea, J.; Guinemer, C.; Haufe, S.; Krämer, M.; Brunecker, P.; Prasser, F.; et al. A Scalable Approach for Critical Care Data Extraction and Analysis in an Academic Medical Center. Int. J. Med. Inf. 2024, 192, 105611. [Google Scholar] [CrossRef]
- Ozonze, O.; Scott, P.J.; Hopgood, A.A. Automating Electronic Health Record Data Quality Assessment. J. Med. Syst. 2023, 47, 23. [Google Scholar] [CrossRef] [PubMed]
- WHO. Data Quality Assurance (DQA) Toolkit; WHO: Geneva, Switzerland, 2022. [Google Scholar]
- European Medicines Agency. Committee for Medicinal Products for Human Use (CHMP). Data Quality Framework for EU Medicines Regulation: Application to Real-World Data; European Medicines Agency: Amsterdam, The Netherlands, 2024. [Google Scholar]
- Hibbert, P.D.; Stewart, S.; Wiles, L.K.; Braithwaite, J.; Runciman, W.B.; Thomas, M.J.W. Improving Patient Safety Governance and Systems through Learning from Successes and Failures: Qualitative Surveys and Interviews with International Experts. Int. J. Qual. Health Care 2023, 35. [Google Scholar] [CrossRef]
- Oktaviana, R.S.; Handayani, P.W.; Hidayanto, A.N.; Siswanto, B.B. Healthcare Data Governance Assessment Based on Hospital Management Perspectives. Int. J. Inf. Manag. Data Insights 2025, 5, 100342. [Google Scholar] [CrossRef]
- Lighterness, A.; Adcock, M.; Scanlon, L.A.; Price, G. Data Quality–Driven Improvement in Health Care: Systematic Literature Review. J. Med. Internet Res. 2024, 26, e57615. [Google Scholar] [CrossRef]
- Payne, A.; Frow, P. Relationship Marketing: Looking Backwards towards the Future. J. Serv. Mark. 2017, 31, 11–15. [Google Scholar] [CrossRef]
- Choudhury, M.M.; Harrigan, P. CRM to Social CRM: The Integration of New Technologies into Customer Relationship Management. J. Strateg. Mark. 2014, 22, 149–176. [Google Scholar] [CrossRef]
- Becker, J.U.; Greve, G.; Albers, S. The Impact of Technological and Organizational Implementation of CRM on Customer Acquisition, Maintenance, and Retention. Int. J. Res. Mark. 2009, 26, 207–215. [Google Scholar] [CrossRef]
- Shum, P.; Bove, L.; Auh, S. Employees’ Affective Commitment to Change. Eur. J. Mark. 2008, 42, 1346–1371. [Google Scholar] [CrossRef]
- Adane, A.; Adege, T.M.; Ahmed, M.M.; Anteneh, H.A.; Ayalew, E.S.; Berhanu, D.; Berhanu, N.; Getnet, M.; Bishaw, T.; Busza, J.; et al. Exploring Data Quality and Use of the Routine Health Information System in Ethiopia: A Mixed-Methods Study. BMJ Open 2021, 11, e050356. [Google Scholar] [CrossRef] [PubMed]
- Tilahun, H.; Abate, B.; Belay, H.; Gebeyehu, A.; Ahmed, M.; Simanesew, A.; Ayele, W.; Mohammedsanni, A.; Knittel, B.; Wondarad, Y. Drivers and Barriers to Improved Data Quality and Data-Use Practices: An Interpretative Qualitative Study in Addis Ababa, Ethiopia. Glob. Health Sci. Pract. 2022, 10 (Suppl. 1), e2100689. [Google Scholar] [CrossRef]
- Tolera, A.; Firdisa, D.; Roba, H.S.; Motuma, A.; Kitesa, M.; Abaerei, A.A. Barriers to Healthcare Data Quality and Recommendations in Public Health Facilities in Dire Dawa City Administration, Eastern Ethiopia: A Qualitative Study. Front. Digit. Health 2024, 6, 1261031. [Google Scholar] [CrossRef]
- Gazi, M.A.I.; Al Mamun, A.; Al Masud, A.; Senathirajah, A.R.B.S.; Rahman, T. The Relationship between CRM, Knowledge Management, Organization Commitment, Customer Profitability and Customer Loyalty in Telecommunication Industry: The Mediating Role of Customer Satisfaction and the Moderating Role of Brand Image. J. Open Innov. Technol. Mark. Complex. 2024, 10, 100227. [Google Scholar] [CrossRef]
- Lee, Y.-C.; Wang, Y.-C.; Lu, S.-C.; Hsieh, Y.-F.; Chien, C.-H.; Tsai, S.-B.; Dong, W. An Empirical Research on Customer Satisfaction Study: A Consideration of Different Levels of Performance. SpringerPlus 2016, 5, 1577. [Google Scholar] [CrossRef]
- Guerola-Navarro, V.; Oltra-Badenes, R.; Gil-Gomez, H.; Gil-Gomez, J.A. Research Model for Measuring the Impact of Customer Relationship Management (CRM) on Performance Indicators. Econ. Res.-Ekon. Istraživanja 2021, 34, 2669–2691. [Google Scholar] [CrossRef]
- Eklof, J.; Podkorytova, O.; Malova, A. Linking Customer Satisfaction with Financial Performance: An Empirical Study of Scandinavian Banks. Total Qual. Manag. Bus. Excell. 2020, 31, 1684–1702. [Google Scholar] [CrossRef]
- Prasad, A. Impact of Poor Data Quality on Business Performance: Challenges, Costs, and Solutions. SSRN Electron. J. 2024. Available online: https://ssrn.com/abstract=4843991 (accessed on 29 July 2025). [CrossRef]
- Haverila, M.; Haverila, K.C.; Mohiuddin, M.; Su, Z. The Impact of Quality of Big Data Marketing Analytics (BDMA) on the Market and Financial Performance. J. Glob. Inf. Manag. 2022, 30, 1–21. [Google Scholar] [CrossRef]
- Redyuk, S.; Kaoudi, Z.; Markl, V.; Schelter, S. Automating Data Quality Validation for Dynamic Data Ingestion. In Proceedings of the 24th International Conference on Extending Database Technology, EDBT’21, Nicosia, Cyprus, 23–26 March 2021; pp. 61–72. [Google Scholar]
- Syed, R.; Eden, R.; Makasi, T.; Chukwudi, I.; Mamudu, A.; Kamalpour, M.; Kapugama Geeganage, D.; Sadeghianasl, S.; Leemans, S.J.J.; Goel, K.; et al. Digital Health Data Quality Issues: Systematic Review. J. Med. Internet Res. 2023, 25, e42615. [Google Scholar] [CrossRef]
- Barchard, K.A.; Freeman, A.J.; Ochoa, E.; Stephens, A.K. Comparing the Accuracy and Speed of Four Data-Checking Methods. Behav. Res. Methods 2020, 52, 97–115. [Google Scholar] [CrossRef]
- Perez-Castillo, R.; Carretero, A.G.; Caballero, I.; Rodriguez, M.; Piattini, M.; Mate, A.; Kim, S.; Lee, D. DAQUA-MASS: An ISO 8000-61 Based Data Quality Management Methodology for Sensor Data. Sensors 2018, 18, 3105. [Google Scholar] [CrossRef]
- Silva, M.D.S.T.; Correia, S.É.N.; de A. Machado, P.; de Oliveira, V.M. Adoption of Information Technology in Public Administration: A Focus on the Organizational Factors of a Brazilian Federal University. Teor. Prática Adm. 2020, 10, 138–153. [Google Scholar] [CrossRef]
- Yukhno, A. Digital Transformation: Exploring Big Data Governance in Public Administration. Public Organ. Rev. 2024, 24, 335–349. [Google Scholar] [CrossRef]
- Cerrillo-Martínez, A.; Casadesús-de-Mingo, A. Data Governance for Public Transparency. El Prof. Inf. 2021, 30, e300402. [Google Scholar] [CrossRef]
- Lutsenko, K. Digitalisation of Public Administration: Challenges and Prospects. Health Leadersh. Qual. Life 2024, 3, 434. [Google Scholar] [CrossRef]
- OECD. Developing Skills for Digital Government: A Review of Good Practices Across OECD Governments; OECD: Paris, France, 2024. [Google Scholar]
- Tawil, A.-R.; Mohamed, M.; Schmoor, X.; Vlachos, K.; Haidar, D. Trends and Challenges Towards an Effective Data-Driven Decision Making in UK SMEs: Case Studies and Lessons Learnt from the Analysis of 85 SMEs. arXiv 2023, arXiv:2305.15454. [Google Scholar] [CrossRef]
- Mohamed, M.; Weber, P. Trends of Digitalization and Adoption of Big Data & Analytics among UK SMEs: Analysis and Lessons Drawn from a Case Study of 53 SMEs. In Proceedings of the 2020 IEEE International Conference on Engineering, Technology and Innovation (ICE/ITMC), Cardiff, UK, 15–17 June 2020; pp. 1–6. [Google Scholar] [CrossRef]
- Gates, S. 5 Examples of Bad Data Quality in Business—And How to Avoid Them. Available online: https://www.montecarlodata.com/blog-bad-data-quality-examples/ (accessed on 26 August 2024).
- Federal Trade Commission. Report to Congress Under Section 319 of the Fair and Accurate Credit Transactions Act of 2003; Federal Trade Commission: Washington, DC, USA, 2015.
- Schroeder, P. US Consumer Bureau Fines Equifax $15 Million over Handling of Consumer Disputes. Reuters 2025. Available online: https://www.reuters.com/business/finance/us-consumer-bureau-fines-equifax-15-million-issues-fixing-consumer-disputes-2025-01-17/ (accessed on 26 August 2024).
- Mars Climate Orbiter Mishap Investigation Board. Mars Climate Orbiter Mishap Investigation Board Phase I Report; Mars Climate Orbiter Mishap Investigation Board: Washington, DC, USA, 1999.
- Data Ladder. Data Ladder Whitepapers|How Legacy Systems and Bad Data Quality Hinders a Digital Transformation Plan-Data Ladder. Available online: https://dataladder.com/whitepapers/how-legacy-systems-and-bad-data-quality-hinders-a-digital-transformation-plan/?imz_s=9nekd6omo7qd26qrf5hdkithi6%2F (accessed on 9 November 2025).
- Deepak Veeravalli, S. Legacy System Modernization: Guidelines for Migrating from Legacy Systems to Salesforce: Address Challenges and Implementing Best Practices with Reusable Integration Blueprints. Int. J. Comput. Sci. Inf. Technol. Res. 2022, 3, 133–144. [Google Scholar] [CrossRef]
- Hüner, M.K.; Schierning, A.; Otto, B.; Österle, H. Product Data Quality in Supply Chains: The Case of Beiersdorf. Electron. Mark. 2011, 21, 141–154. [Google Scholar] [CrossRef]
- Rigo, G.-E.; Pedron, C.D.; Caldeira, M.; De Araújo, C.C.S. CRM Adoption in a Higher Education Institution. J. Inf. Syst. Technol. Manag. 2016, 13, 45–60. [Google Scholar] [CrossRef]
- Weerts, D.J.; Ronca, J.M. Characteristics of Alumni Donors Who Volunteer at Their Alma Mater. Res. High. Educ. 2008, 49, 274–292. [Google Scholar] [CrossRef]
- Research Group of the Office of the Privacy Commissioner of Canada. The Age of Predictive Analytics: From Patterns to Predictions-Office of the Privacy Commissioner of Canada; Research Group of the Office of the Privacy Commissioner of Canada: Gatineau, QC, Canada, 2012. [Google Scholar]
- Biemer, P.P. Data Quality and Inference Errors. In Big Data and Social Science: Data Science Methods and Tools for Research and Practice; Foster, I., Ghani, R., Jarmin, R., Kreuter, F., Lane, J., Eds.; CRC: Boca Raton, FL, USA, 2020. [Google Scholar]
- Butler, D. When Google Got Flu Wrong. Nature 2013, 494, 155–156. [Google Scholar] [CrossRef]
- Lazer, D.; Kennedy, R.; King, G.; Vespignani, A. The Parable of Google Flu: Traps in Big Data Analysis. Science 2014, 343, 1203–1205. [Google Scholar] [CrossRef]
- Lazer, D.; Kennedy, R.; King, G.; Vespignani, A. Google Flu Trends Still Appears Sick: An Evaluation of the 2013–2014 Flu Season. SSRN Electron. J. 2014, 2014, 2408560. [Google Scholar] [CrossRef]
- Algemene Rekenkamer. Datagedreven Selectie van Aangiften Door de Belastingdienst|Rapport|Algemene Rekenkamer [Data-Driven Selection of Tax Returns by the Dutch Tax and Customs Administration|Report|Netherlands Court of Audit]; Algemene Rekenkamer: The Hague, The Netherlands, 2019. [Google Scholar]
- OECD. Tax Administration 3.0: The Digital Transformation of Tax Administration; OECD: Paris, France, 2020. [Google Scholar] [CrossRef]
- Aslett, J. Tax Administration; International Monetary Fund: Washington, DC, USA, 2024; Volume 2024. [Google Scholar] [CrossRef]
- WiredGov. The Damaging Impact of Poor Quality Data in the Public Sector|Official Press Release. Available online: https://www.wired-gov.net/wg/content.nsf/industrynews/The+damaging+impact+of+poor+quality+data+in+the+public+sector?open&id=BDEX-6ZFKSP (accessed on 9 November 2025).
- Marzullo, A.; Savevski, V.; Menini, M.; Schilirò, A.; Franchellucci, G.; Dal Buono, A.; Bezzio, C.; Gabbiadini, R.; Hassan, C.; Repici, A.; et al. Collecting and Analyzing IBD Clinical Data for Machine-Learning: Insights from an Italian Cohort. Data 2025, 10, 100. [Google Scholar] [CrossRef]
- Tlouyamma, J.; Mokwena, S. Automated Data Quality Control System in Health and Demographic Surveillance System. Sci. Eng. Technol. 2024, 4, 82–91. [Google Scholar] [CrossRef]
- Razzaghi, H.; Goodwin Davies, A.; Boss, S.; Bunnell, H.T.; Chen, Y.; Chrischilles, E.A.; Dickinson, K.; Hanauer, D.; Huang, Y.; Ilunga, K.T.S.; et al. Systematic Data Quality Assessment of Electronic Health Record Data to Evaluate Study-Specific Fitness: Report from the PRESERVE Research Study. PLoS Digit. Health 2024, 3, e0000527. [Google Scholar] [CrossRef] [PubMed]
- Wang, Y.; Hulstijn, J.; Tan, Y.-H. Data Quality Assurance in International Supply Chains: An Application of the Value Cycle Approach to Customs Reporting. Int. J. Adv. Logist. 2016, 5, 76–85. [Google Scholar] [CrossRef]
- Zovko, L. Digitalization in Health Systems in the European Union. Bachelor’s Thesis, University of Zagreb, Zagreb, Croatia, 2025. [Google Scholar]
- Gelashvili-Luik, T.; Vihma, P.; Pappel, I. Navigating the AI Revolution: Challenges and Opportunities for Integrating Emerging Technologies into Knowledge Management Systems. Systematic Literature Review. Front. Artif. Intell. 2025, 8, 1595930. [Google Scholar] [CrossRef] [PubMed]
- Masod, M.Y.B.; Zakaria, S.F. Artificial Intelligence Adoption in the Manufacturing Sector: Challenges and Strategic Framework. Int. J. Res. Innov. Soc. Sci. 2024, 8, 150–158. [Google Scholar] [CrossRef]
- Kapiki, S.; Pappa, A. Enhancing Healthcare Efficiency: Leveraging Advanced Maintenance Management for Optimal Staff Performance. J. Health Organ. Manag. 2025, 39, 398–418. [Google Scholar] [CrossRef]
- Davidson, P.L.; Hunt, J.; La Manna, A.; Luke, D.A. Editorial: Impact Evaluation Using the Translational Science Benefits Model Framework in the National Center for Advancing Translational Science Clinical and Translational Science Award Program. Front. Public Health 2025, 13, 1707595. [Google Scholar] [CrossRef] [PubMed]
- Ebenso, B.; Namisango, E.; Abejirinde, I.-O.; Allsop, M.J. Editorial: The Scale-up and Sustainability of Digital Health Interventions in Low- and Middle-Income Settings. Front. Digit. Health 2025, 7, 1634223. [Google Scholar] [CrossRef]
- Shi, Y.; Li, J.; Kou, G.; Tien, J.M.; Berg, D. Merging Artificial Intelligence and Business Applications: Preface for ITQM 2025. Procedia Comput. Sci. 2025, 266, 1–8. [Google Scholar] [CrossRef]
- Pykes, K. 10 Signs of Bad Data: How to Spot Poor Quality Data. Available online: https://www.datacamp.com/blog/10-signs-bad-data-quality (accessed on 26 August 2024).
- Fu, A.; Shen, T.; Roberts, S.B.; Liu, W.; Vaidyanathan, S.; Marchena-Romero, K.-J.; Lam, Y.Y.P.; Shah, K.; Mak, D.Y.F.; Chin, S.; et al. Optimizing the Efficiency and Effectiveness of Data Quality Assurance in a Multicenter Clinical Dataset. J. Am. Med. Inform. Assoc. 2025, 32, 835–844. [Google Scholar] [CrossRef]
- Haverila, M.J.; Haverila, K.C. The Influence of Quality of Big Data Marketing Analytics on Marketing Capabilities: The Impact of Perceived Market Performance! Mark. Intell. Plan. 2024, 42, 346–372. [Google Scholar] [CrossRef]
- Lee, D.-H.; Kim, H. A Self-Attention-Based Imputation Technique for Enhancing Tabular Data Quality. Data 2023, 8, 102. [Google Scholar] [CrossRef]
- Becerra, M.A.; Tobón, C.; Castro-Ospina, A.E.; Peluffo-Ordóñez, D.H. Information Quality Assessment for Data Fusion Systems. Data 2021, 6, 60. [Google Scholar] [CrossRef]
- MacDonald, L. Measuring Data Quality: Key Metrics, Processes, and Best Practices. Available online: https://www.montecarlodata.com/blog-measuring-data-quality-key-metrics-processes-and-best-practices/ (accessed on 27 August 2024).
- Karkošková, S. Data Governance Model to Enhance Data Quality in Financial Institutions. Inf. Syst. Manag. 2023, 40, 90–110. [Google Scholar] [CrossRef]
- Sluzki, N. 8 Data Quality Monitoring Techniques & Metrics to Watch. Available online: https://www.ibm.com/think/topics/data-quality-monitoring-techniques (accessed on 27 August 2024).
- Verma, P.; Kumar, V.; Mittal, A.; Rathore, B.; Jha, A.; Rahman, M.S. The Role of 3S in Big Data Quality: A Perspective on Operational Performance Indicators Using an Integrated Approach. TQM J. 2023, 35, 153–182. [Google Scholar] [CrossRef]
- Woods, C.; Selway, M.; Bikaun, T.; Stumptner, M.; Hodkiewicz, M. An Ontology for Maintenance Activities and Its Application to Data Quality. Semant. Web 2024, 15, 319–352. [Google Scholar] [CrossRef]
- Stepanenko, R. Data Stewardship Explained: The Backbone of Data Management. Available online: https://recordlinker.com/data-stewardship-explained/ (accessed on 14 September 2025).
- Jatin, B. Data Governance for Quality: Policies Ensuring Reliable Data. Available online: https://www.decube.io/post/data-quality-data-governance (accessed on 27 August 2024).
- Hanna, M.G.; Pantanowitz, L.; Jackson, B.; Palmer, O.; Visweswaran, S.; Pantanowitz, J.; Deebajah, M.; Rashidi, H.H. Ethical and Bias Considerations in Artificial Intelligence/Machine Learning. Mod. Pathol. 2025, 38, 100686. [Google Scholar] [CrossRef]
- Duggireddy, G.B.R. Integrated Data and AI Governance Framework: A Lifecycle Approach to Responsible AI Implementation. J. Comput. Sci. Technol. Stud. 2025, 7, 771–777. [Google Scholar]
- Papagiannidis, E.; Mikalef, P.; Conboy, K. Responsible Artificial Intelligence Governance: A Review and Research Framework. J. Strateg. Inf. Syst. 2025, 34, 101885. [Google Scholar] [CrossRef]
- Floridi, L.; Taddeo, M. What Is Data Ethics? Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. 2016, 374, 20160360. [Google Scholar] [CrossRef]
- Pahune, S.; Akhtar, Z.; Mandapati, V.; Siddique, K. The Importance of AI Data Governance in Large Language Models. Big Data Cogn. Comput. 2025, 9, 147. [Google Scholar] [CrossRef]
- Forrest, S. Study Examines Accuracy of Arrest Data in FBI’s NIBRS Crime Database. Available online: https://phys.org/news/2022-02-accuracy-fbi-nibrs-crime-database.html (accessed on 15 August 2024).
- Labarrère, N.; Costa, L.; Lima, R.M. Data Science Project Barriers—A Systematic Review. Data 2025, 10, 132. [Google Scholar] [CrossRef]
- Illinois Criminal Justice Information Authority. Annual Audit Report for 1982–1983: Data Quality of Computerized Criminal Histories; National Criminal Justice Reference Service (NCJRS): Rockville, MD, USA, 1983.
- Bosse, R.C.; Jino, M.; de Franco Rosa, F. A Study on Data Quality and Analysis in Business Intelligence; Springer: Berlin/Heidelberg, Germany, 2024; pp. 249–253. [Google Scholar] [CrossRef]
- Sienkiewicz, M. From Data Silos to Data Mesh: A Case Study in Financial Data Architecture. In Proceedings of the 36th International Conference, DEXA 2025, Bangkok, Thailand, 25–27 August 2025; pp. 3–20. [Google Scholar] [CrossRef]
- Senguttuvan, K.R. Multi-Agent Based Automated Data Quality Engineering. Master’s Thesis, Fordham University, New York, NY, USA, 2025. [Google Scholar]
- Stamkou, C.; Saprikis, V.; Fragulis, G.F.; Antoniadis, I. User Experience and Perceptions of AI-Generated E-Commerce Content: A Survey-Based Evaluation of Functionality, Aesthetics, and Security. Data 2025, 10, 89. [Google Scholar] [CrossRef]
- Vanam, R.R.; Pingili, R.; Myadaboyina, S.G. AI-Based Data Quality Assurance for Business Intelligence and Decision Support Systems. Int. J. Emerg. Trends Comput. Sci. Inf. Technol. 2025, 6, 21–29. [Google Scholar] [CrossRef]
- Elouataoui, W.; El Mendili, S.; Gahi, Y. An Automated Big Data Quality Anomaly Correction Framework Using Predictive Analysis. Data 2023, 8, 182. [Google Scholar] [CrossRef]
- Pasupuleti, S. AI-Augmented Data Pipelines: Integrating Machine Learning for Intelligent Data Processing. J. Comput. Sci. Technol. Stud. 2025, 7, 276–283. [Google Scholar]
- Dhanekula, A. AI-Driven Business Intelligence Framework for Predictive Decision-Making and Strategic Resource Optimization. Int. J. Bus. Econ. Insights 2025, 5, 1238–1270. [Google Scholar] [CrossRef]
- Tomar, S.; Kadaverugu, R. Trend Analysis of Long-Term Temperature Data for Prediction of Heat Waves Through Statistical Analysis Using Extreme Value Theory for Climate Disaster Management; Springer: Singapore, 2025; pp. 91–106. [Google Scholar] [CrossRef]
- Cinar, R.F.; Yuksek, G.; Lale, T.; Ekinci, S.; Izci, D.; Ma’arif, A. SHAP-Based Framework for Temporal Detection of Sensor Drift in Gas Sensor Arrays. J. Robot. Control 2025, 6, 2592–2601. [Google Scholar]
- Shafaghat, A. Integrating Artificial Intelligence and Machine Learning to Forecast Air Pollution Impacts on Climate Variability and Public Health. bioRxiv 2025, 2025, 685968. [Google Scholar] [CrossRef]
- Ermilov, A.; Tveritnev, A.; Trusova, A. New Role of Technical Specialists to Enable Digital Transformation in the Petroleum Industry: A Petrophysicist-Based Proof of Concept. In Proceedings of the Abu Dhabi International Petroleum Exhibition and Conference, Abu Dhabi, United Arab Emirates, 3–6 November 2025; SPE: Washington, DC, USA, 2025. [Google Scholar] [CrossRef]
- ISO 8000-8:2015; Data Quality—Part 8: Information and Data Quality: Concepts and Measuring. ISO: Geneva, Switzerland, 2015.
- Abhishek, A.; Erickson, L.; Bandopadhyay, T. Data and AI Governance: Promoting Equity, Ethics, and Fairness in Large Language Models. arXiv 2025, arXiv:2508.03970. [Google Scholar] [CrossRef]
- Angwin, J.; Larson, J.; Mattu, S.; Kirchner, L.; ProPublica. Machine Bias—ProPublica. Available online: https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing (accessed on 1 November 2025).
- Obermeyer, Z.; Powers, B.; Vogeli, C.; Mullainathan, S. Dissecting Racial Bias in an Algorithm Used to Manage the Health of Populations. Science 2019, 366, 447–453. [Google Scholar] [CrossRef]
- Batool, A.; Zowghi, D.; Bano, M. AI Governance: A Systematic Literature Review. AI Ethics 2025, 5, 3265–3279. [Google Scholar] [CrossRef]
- Gebru, T.; Morgenstern, J.; Vecchione, B.; Vaughan, J.W.; Wallach, H.; Daumé, H., III; Crawford, K. Datasheets for Datasets. Commun. ACM 2021, 64, 86–92. [Google Scholar] [CrossRef]
- Franklin, G.; Stephens, R.; Piracha, M.; Tiosano, S.; Lehouillier, F.; Koppel, R.; Elkin, P.L. The Sociodemographic Biases in Machine Learning Algorithms: A Biomedical Informatics Perspective. Life 2024, 14, 652. [Google Scholar] [CrossRef]
- Leslie, D. Understanding Artificial Intelligence Ethics and Safety; The Alan Turing Institute: London, UK, 2019. [Google Scholar]
- Belenguer, L. AI Bias: Exploring Discriminatory Algorithmic Decision-Making Models and the Application of Possible Machine-Centric Solutions Adapted from the Pharmaceutical Industry. AI Ethics 2022, 2, 771–787. [Google Scholar] [CrossRef]
- Cross, T.P.; Wagner, A.; Bibel, D. The Accuracy of Arrest Data in the National Incident-Based Reporting System (NIBRS). Crime Delinq. 2023, 69, 2484–2507. [Google Scholar] [CrossRef]
- Bayram, F.; Ahmed, B.S. Towards Trustworthy Machine Learning in Production: An Overview of the Robustness in MLOps Approach. ACM Comput. Surv. 2025, 57, 1–35. [Google Scholar] [CrossRef]
- Kore, A.; Abbasi Bavil, E.; Subasri, V.; Abdalla, M.; Fine, B.; Dolatabadi, E.; Abdalla, M. Empirical Data Drift Detection Experiments on Real-World Medical Imaging Data. Nat. Commun. 2024, 15, 1887. [Google Scholar] [CrossRef] [PubMed]
- Wilkinson, M.D.; Dumontier, M.; Aalbersberg, I.J.; Appleton, G.; Axton, M.; Baak, A.; Blomberg, N.; Boiten, J.-W.; da Silva Santos, L.B.; Bourne, P.E.; et al. The FAIR Guiding Principles for Scientific Data Management and Stewardship. Sci. Data 2016, 3, 16001. [Google Scholar] [CrossRef]
- Mons, B.; Schultes, E.; Liu, F.; Jacobsen, A. The FAIR Principles: First Generation Implementation Choices and Challenges. Data Intell. 2020, 2, 1–9. [Google Scholar] [CrossRef]
- Stvilia, B.; Pang, Y.; Lee, D.J.; Gunaydin, F. Data Quality Assurance Practices in Research Data Repositories—A Systematic Literature Review. An Annual Review of Information Science and Technology (ARIST) Paper. J. Assoc. Inf. Sci. Technol. 2025, 76, 238–261. [Google Scholar] [CrossRef]
- Open Data Institute. A Framework for AI-Ready Data; Open Data Institute: London, UK, 2025. [Google Scholar]
- Publications Office of the European Union. Principles and Recommendations to Make Data.Europa.Eu Data More Reusable; Publications Office of the European Union: Luxembourg, 2022. [Google Scholar]
- Clark, T.; Caufield, H.; Parker, J.A.; Al Manir, S.; Amorim, E.; Eddy, J.; Gim, N.; Gow, B.; Goar, W.; Haendel, M.; et al. AI-Readiness for Biomedical Data: Bridge2AI Recommendations. bioRxiv 2024, 2024, 619844. [Google Scholar] [CrossRef]
- Hiniduma, K.; Byna, S.; Bez, J.L.; Madduri, R. AI Data Readiness Inspector (AIDRIN) for Quantitative Assessment of Data Readiness for AI. In Proceedings of the 36th International Conference on Scientific and Statistical Database Management, Rennes, France, 10–12 July 2024; ACM: New York, NY, USA, 2024; pp. 1–12. [Google Scholar] [CrossRef]
- Ravi, N.; Chaturvedi, P.; Huerta, E.A.; Liu, Z.; Chard, R.; Scourtas, A.; Schmidt, K.J.; Chard, K.; Blaiszik, B.; Foster, I. FAIR Principles for AI Models with a Practical Application for Accelerated High Energy Diffraction Microscopy. Sci. Data 2022, 9, 657. [Google Scholar] [CrossRef] [PubMed]
- ISO/IEC 42001:2023; Information Technology—Artificial Intelligence—Management System. International Organization for Standardization: Geneva, Switzerland, 2023.
- Morshed, A. Ensuring Trust in Sustainability Financial Reports: The Role of AI and Blockchain in Metadata Standardization. Manag. Sustain. Arab Rev. 2025, 2025, 1–24. [Google Scholar] [CrossRef]
- Kamisetty, N.S. Intelligent Cloud-Based KNN Model for Enhancing Data Quality in SAP Financial Systems. Int. J. Res. Appl. Innov. (IJRAI) 2025, 8, 12909–12914. [Google Scholar]
- European Parliament; European Council. Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 Laying down Harmonized Rules on Artificial Intelligence and Amending Certain Legislative Acts (Artificial Intelligence Act); European Union: Brussels, Belgium, 2024. [Google Scholar]
- Loza Corera, M. Data and Data Governance and Connections to Data Protection Principles in Article 10 of the Artificial Intelligence Act. In The European Union Artificial Intelligence Act; Cotino Hueso, L., Galetta, D., Eds.; Editoriale Scientifica: Napoli, Italy, 2025; pp. 595–626. [Google Scholar]
- National Institute of Standards and Technology (NIST). Artificial Intelligence Risk Management Framework (AI RMF 1.0). NIST AI 100-1; NIST: Gaithersburg, MD, USA, 2023.
- Rodríguez Valencia, L.; Ochoa Arellano, M.J.; Gutiérrez Figueroa, S.A.; Mur Nuño, C.; Monsalve Piqueras, B.; Corrales Paredes, A.D.V.; Bemposta Rosende, S.; López López, J.M.; Puertas Sanz, E.; Levi Alfaroviz, A. A Systematic Review of Artificial Intelligence Applied to Compliance: Fraud Detection in Cryptocurrency Transactions. J. Risk Financ. Manag. 2025, 18, 612. [Google Scholar] [CrossRef]
- Alotaibi, K.O. Developing a Comprehensive Financial Reporting Governance Framework Using AI Techniques. Eng. Technol. Appl. Sci. Res. 2025, 15, 29202–29207. [Google Scholar]
- Santos, W.D.S.; Coutinho, J.R.; Baião, F.; Miranda Spyrides, G.; Vieira Lopes, H.C. Enhancing Declarative Business Process Management Availability through Generative AI. Process Sci. 2025, 2, 21. [Google Scholar] [CrossRef]
- Ali, S.; Rehman, T.; Saira, S. Exploring Pakistan’s Legal Challenges in Artificial Intelligence Regulation: A Data-Driven Approach. Crit. Rev. Soc. Sci. Stud. 2025, 3, 1096–1108. [Google Scholar] [CrossRef]
- Bayram, F.; Ahmed, B.S.; Hallin, E. Adaptive Data Quality Scoring Operations Framework Using Drift-Aware Mechanism for Industrial Applications. J. Syst. Softw. 2024, 217, 112184. [Google Scholar] [CrossRef]
- Cheong, B.C. Transparency and Accountability in AI Systems: Safeguarding Wellbeing in the Age of Algorithmic Decision-Making. Front. Hum. Dyn. 2024, 6, 1421273. [Google Scholar] [CrossRef]
- Bayram, S.B.; Caliskan, N. Effect of a Game-Based Virtual Reality Phone Application on Tracheostomy Care Education for Nursing Students: A Randomized Controlled Trial. Nurse Educ. Today 2019, 79, 25–31. [Google Scholar] [CrossRef]
- High-Level Expert Group on AI (AI HLEG). Ethics Guidelines for Trustworthy AI; European Union: Brussels, Belgium, 2019. [Google Scholar]
- Eubanks, V. Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor; St. Martin’s Press: New York, NY, USA, 2018. [Google Scholar]
- Lyu, Q.; Tan, J.; Zapadka, M.E.; Ponnatapura, J.; Niu, C.; Myers, K.J.; Wang, G.; Whitlow, C.T. Translating Radiology Reports into Plain Language Using ChatGPT and GPT-4 with Prompt Learning: Results, Limitations, and Potential. Vis. Comput. Ind. Biomed. Art 2023, 6, 9. [Google Scholar] [CrossRef]
- Kazlaris, I.; Antoniou, E.; Diamantaras, K.; Bratsas, C. From Illusion to Insight: A Taxonomic Survey of Hallucination Mitigation Techniques in LLMs. AI 2025, 6, 260. [Google Scholar] [CrossRef]
- Anh-Hoang, D.; Tran, V.; Nguyen, L.-M. Survey and Analysis of Hallucinations in Large Language Models: Attribution to Prompting Strategies or Model Behavior. Front. Artif. Intell. 2025, 8, 1622292. [Google Scholar] [CrossRef] [PubMed]
- Farquhar, S.; Kossen, J.; Kuhn, L.; Gal, Y. Detecting Hallucinations in Large Language Models Using Semantic Entropy. Nature 2024, 630, 625–630. [Google Scholar] [CrossRef]
- Polyzotis, N.; Zinkevich, M.; Roy, S.; Breck, E.; Whang, S. Data Validation for Machine Learning. In Proceedings of the Second Conference on Machine Learning and Systems, SysML 2019, Stanford, CA, USA, 31 March–2 April 2019. [Google Scholar]
- Gautam, A.R. Impact of High Data Quality on LLM Hallucinations. Int. J. Comput. Appl. 2025, 187, 35–39. [Google Scholar] [CrossRef]
- Chelli, M.; Descamps, J.; Lavoué, V.; Trojani, C.; Azar, M.; Deckert, M.; Raynier, J.-L.; Clowez, G.; Boileau, P.; Ruetsch-Chelli, C. Hallucination Rates and Reference Accuracy of ChatGPT and Bard for Systematic Reviews: Comparative Analysis. J. Med. Internet Res. 2024, 26, e53164. [Google Scholar] [CrossRef]
- Park, S.; Nan, X. Generative AI and Misinformation: A Scoping Review of the Role of Generative AI in the Generation, Detection, Mitigation, and Impact of Misinformation. AI Soc. 2025, 1–15. [Google Scholar] [CrossRef]
- Simon, F.M.; Altay, S.; Mercier, H. Misinformation Reloaded? Fears about the Impact of Generative AI on Misinformation Are Overblown. Harv. Kennedy Sch. Misinform. Rev. 2023. [Google Scholar] [CrossRef]
- Ferrara, C.; Sellitto, G.; Ferrucci, F.; Palomba, F.; De Lucia, A. Fairness-Aware Machine Learning Engineering: How Far Are We? Empir. Softw. Eng. 2024, 29, 9. [Google Scholar] [CrossRef] [PubMed]
- Ali, S.; Abuhmed, T.; El-Sappagh, S.; Muhammad, K.; Alonso-Moral, J.M.; Confalonieri, R.; Guidotti, R.; Del Ser, J.; Díaz-Rodríguez, N.; Herrera, F. Explainable Artificial Intelligence (XAI): What We Know and What Is Left to Attain Trustworthy Artificial Intelligence. Inf. Fusion 2023, 99, 101805. [Google Scholar] [CrossRef]
- Lahusen, C.; Maggetti, M.; Slavkovik, M. Trust, Trustworthiness and AI Governance. Sci. Rep. 2024, 14, 20752. [Google Scholar] [CrossRef] [PubMed]
- Jarmakovica, A. Machine Learning-Based Strategies for Improving Healthcare Data Quality: An Evaluation of Accuracy, Completeness, and Reusability. Front. Artif. Intell. 2025, 8, 1621514. [Google Scholar] [CrossRef]
- Seabra, A.; Cavalcante, C.; Ruberg, N.; Lifschitz, S. AI-Driven Semantic Data Quality Assessment and Scoring for Relational Databases; Springer: Berlin/Heidelberg, Germany, 2025; pp. 199–206. [Google Scholar] [CrossRef]
- Lesouple, J.; Baudoin, C.; Spigai, M.; Tourneret, J.-Y. Generalized Isolation Forest for Anomaly Detection. Pattern Recognit. Lett. 2021, 149, 109–119. [Google Scholar] [CrossRef]
- Mohammed, S.; Budach, L.; Feuerpfeil, M.; Ihde, N.; Nathansen, A.; Noack, N.; Patzlaff, H.; Naumann, F.; Harmouch, H. The Effects of Data Quality on Machine Learning Performance on Tabular Data. Inf. Syst. 2025, 132, 102549. [Google Scholar] [CrossRef]
- Mowla, N.I. A Guide to Data Quality Testing for AI Applications Based on Standards; RISE Research Institutes of Sweden: Gothenburg, Sweden, 2024. [Google Scholar]
- EU FRA. Data Quality and Artificial Intelligence–Mitigating Bias and Error to Protect Fundamental Rights; European Union Agency for Fundamental Rights: Vienna, Austria, 2019.
- Pulicharla, M.R. Detecting and Addressing Model Drift: Automated Monitoring and Real-Time Retraining in ML Pipelines. World J. Adv. Res. Rev. 2019, 3, 147–152.
- Patchipala, S.G. Tackling Data and Model Drift in AI: Strategies for Maintaining Accuracy during ML Model Inference. Int. J. Sci. Res. Arch. 2023, 10, 1198–1209.
- Poppy, D. Data Governance Frameworks for AI-Driven Orgs. dbt Labs. Available online: https://www.getdbt.com/blog/data-governance-frameworks-ai?utm_source=chatgpt.com (accessed on 10 November 2025).
- Troyanskaya, O.; Cantor, M.; Sherlock, G.; Brown, P.; Hastie, T.; Tibshirani, R.; Botstein, D.; Altman, R.B. Missing Value Estimation Methods for DNA Microarrays. Bioinformatics 2001, 17, 520–525.
- Li, W.; Wu, Y.; Huang, W.; Zhou, F.; Ou, W.; Wang, H.; Deng, L. System Log Anomaly Detection Based on Contrastive Learning and Retrieval Augmented. Sci. Rep. 2025, 15, 38370.
- Hansen, H.T. Intelligent Cloud-Native DevOps Architecture for Enterprise Transformation Leveraging Blockchain, BERT Models, and AI-Powered Financial Cryptosystems. Int. J. Res. Publ. Eng. Technol. Manag. 2025, 8, 1–5. Available online: https://ijrpetm.com/index.php/IJRPETM/article/view/172/168 (accessed on 2 November 2025).
- Meng, T.; Jing, X.; Yan, Z.; Pedrycz, W. A Survey on Machine Learning for Data Fusion. Inf. Fusion 2020, 57, 115–129.
- Ziv, L.; Nakash, M. Behind the Algorithm: International Insights into Data-Driven AI Model Development. Mach. Learn. Knowl. Extr. 2025, 7, 122.
- Dibouliya, A. Unified Data Governance Framework for AI-Enabled Data Warehouses in Banking. Eur. Mod. Stud. J. 2025, 9, 67–76.
- Wendt, D.W. Continuous Improvement. In AI Strategy and Security; Apress: Berkeley, CA, USA, 2025; pp. 175–184.
- Grant, B.; Welch, M.; Deutschman, C.; McElcheran, C.; Badzynski, A.; Bell, J.A.H.; Hope, A.; Grant, R.C.; Truong, T.; Lane, K.; et al. Abstract PR-04: A Practical Framework for Operationalizing Responsible and Equitable AI in Healthcare: Tackling Bias, Inequity, and Implementation Challenges. Clin. Cancer Res. 2025, 31 (Suppl. 13), PR-04.
- Bhosale, A.M. Implementing PowerBI Reporting for Quality Analysis in Decision Making Processes. Master’s Thesis, Politecnico di Torino, Turin, Italy, 2025.
- Verma, R.K. Digital Twin Technology for Process Optimization and Smart Manufacturing Systems. Int. J. Res. Publ. Eng. Technol. Manag. 2025, 8, 12699–12701.
- Raji, I.D.; Smart, A.; White, R.N.; Mitchell, M.; Gebru, T.; Hutchinson, B.; Smith-Loud, J.; Theron, D.; Barnes, P. Closing the AI Accountability Gap. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, Barcelona, Spain, 27–30 January 2020; ACM: New York, NY, USA, 2020; pp. 33–44.
- Floridi, L.; Cowls, J.; Beltrametti, M.; Chatila, R.; Chazerand, P.; Dignum, V.; Luetge, C.; Madelin, R.; Pagallo, U.; Rossi, F.; et al. AI4People—An Ethical Framework for a Good AI Society: Opportunities, Risks, Principles, and Recommendations. Minds Mach. 2018, 28, 689–707.
- Selbst, A.D.; Boyd, D.; Friedler, S.A.; Venkatasubramanian, S.; Vertesi, J. Fairness and Abstraction in Sociotechnical Systems. In Proceedings of the Conference on Fairness, Accountability, and Transparency, Atlanta, GA, USA, 29–31 January 2019; ACM: New York, NY, USA, 2019; pp. 59–68.
- Mehrabi, N.; Morstatter, F.; Saxena, N.; Lerman, K.; Galstyan, A. A Survey on Bias and Fairness in Machine Learning. ACM Comput. Surv. 2022, 54, 1–35.
- Morley, J.; Floridi, L.; Kinsey, L.; Elhalal, A. From What to How: An Initial Review of Publicly Available AI Ethics Tools, Methods and Research to Translate Principles into Practices. In Ethics, Governance, and Policies in Artificial Intelligence; Floridi, L., Ed.; Springer Nature: Cham, Switzerland, 2021; pp. 144–153.
- Mitchell, M.; Wu, S.; Zaldivar, A.; Barnes, P.; Vasserman, L.; Hutchinson, B.; Spitzer, E.; Raji, I.D.; Gebru, T. Model Cards for Model Reporting. In Proceedings of the Conference on Fairness, Accountability, and Transparency, Atlanta, GA, USA, 29–31 January 2019; ACM: New York, NY, USA, 2019; pp. 220–229.
- Jobin, A.; Ienca, M.; Vayena, E. The Global Landscape of AI Ethics Guidelines. Nat. Mach. Intell. 2019, 1, 389–399.
- Whittlestone, J.; Nyrup, R.; Alexandrova, A.; Cave, S. The Role and Limits of Principles in AI Ethics. In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society; ACM: New York, NY, USA, 2019; pp. 195–200.
- Goellner, S.; Tropmann-Frick, M.; Brumen, B. Responsible Artificial Intelligence: A Structured Literature Review. arXiv 2024.
- Zeng, Y.; Lu, E.; Huangfu, C. Linking Artificial Intelligence Principles. arXiv 2018.
- Lu, J.; Liu, A.; Dong, F.; Gu, F.; Gama, J.; Zhang, G. Learning under Concept Drift: A Review. IEEE Trans. Knowl. Data Eng. 2018, 31, 2346–2363.
- Adepoju, A.S. Adaptive Program Management Strategies for AI-Based Cyber Defense Deployments in Critical Infrastructure and Enterprise Digital Transformation Initiatives. Int. J. Res. Publ. Rev. 2025, 6, 5599–5615.
- Widmer, G.; Kubat, M. Learning in the Presence of Concept Drift and Hidden Contexts. Mach. Learn. 1996, 23, 69–101.
- Sayles, J. Designing a Well-Governed AI Lifecycle Model. In Principles of AI Governance and Model Risk Management; Apress: Berkeley, CA, USA, 2024; pp. 85–111.
- Park, C. Addressing Challenges for the Effective Adoption of Artificial Intelligence in the Energy Sector. Sustainability 2025, 17, 5764.
- Mulyaningsih, S.R.; Ghoffar, A.; Mufrihah, A.; Hasanah, I.; Aun, A. Self-Adaptive Systems: Redefining Best Practices in AI and Big Data in Recruitment. In Emerging Technologies for Recruitment Strategy and Practice; Vasudevan, K., Vasudevan, S.K., Sudha, M., Eds.; IGI Global: Hershey, PA, USA, 2023; pp. 77–103.
- Baum, S.D.; Owe, A. Artificial Intelligence Needs Environmental Ethics. Ethics Policy Environ. 2023, 26, 139–143.
- Morley, J.; Kinsey, L.; Elhalal, A.; Garcia, F.; Ziosi, M.; Floridi, L. Operationalising AI Ethics: Barriers, Enablers and Next Steps. AI Soc. 2023, 38, 411–423.
- Asokan, D.R.; Smith, C.M.; Huq, F.A. Digitalisation as a Catalyst for Supplier Diversity, Equity and Inclusion. Int. J. Oper. Prod. Manag. 2025, 2025, 1–26.
- Mahler, S. Building Trust in Workplace AI: Why Governance Outweighs Employee Co-Creation in Building Trust. Ph.D. Thesis, Vorarlberg University of Applied Sciences, Dornbirn, Austria, 2025.
- Sagona, M.; Dai, T.; Macis, M.; Darden, M. Trust in AI-Assisted Health Systems and AI’s Trust in Humans. npj Health Syst. 2025, 2, 10.
- Agate, J. Artificial Intelligence Methods and Approaches to Improve Data Quality in Healthcare Data. Artif. Intell. Life Sci. 2025, 8, 100135.
- Stoudt, S.; Jernite, Y.; Marshall, B.; Marwick, B.; Sharan, M.; Whitaker, K.; Danchev, V. Ten Simple Rules for Building and Maintaining a Responsible Data Science Workflow. PLoS Comput. Biol. 2024, 20, e1012232.
- Korbmacher, M.; Azevedo, F.; Pennington, C.R.; Hartmann, H.; Pownall, M.; Schmidt, K.; Elsherif, M.; Breznau, N.; Robertson, O.; Kalandadze, T.; et al. The Replication Crisis Has Led to Positive Structural, Procedural, and Community Changes. Commun. Psychol. 2023, 1, 3.
- Dudda, L.; Kormann, E.; Kozula, M.; DeVito, N.J.; Klebel, T.; Dewi, A.P.M.; Spijker, R.; Stegeman, I.; Van den Eynden, V.; Ross-Hellauer, T.; et al. Open Science Interventions to Improve Reproducibility and Replicability of Research: A Scoping Review. R. Soc. Open Sci. 2025, 12, 242057.
- MacMaster, S.; Sinistore, J. Testing the Use of a Large Language Model (LLM) for Performing Data Quality Assessment. Int. J. Life Cycle Assess. 2024, 1–12.
- WHO. Overview of the Data Quality Review (DQR) Framework and Methodology; WHO: Geneva, Switzerland, 2020.
- Patra, P.; Di Pompeo, D.; Di Marco, A. An Evaluation Framework for the FAIR Assessment Tools in Open Science. arXiv 2025, arXiv:2503.15929.

| Source/Standard | Definition of Data Quality | Key Dimensions/Emphasis | Notes |
|---|---|---|---|
| Wang & Strong [15] | Data that are fit for use by data consumers. | Intrinsic (accuracy, objectivity); contextual (relevance, timeliness, completeness); representational (interpretability, consistency); accessibility (access, security). | Highly influential conceptual framework; consumer-oriented. |
| Strong, Lee & Wang [14] | Emphasizes data quality as “fitness for use” in operations, decision-making, and planning. | Same four categories as above. | Extends the earlier framework with an organizational perspective. |
| Pipino, Lee & Wang [17] | Defines data quality through measurable attributes that reflect accuracy, completeness, consistency, and timeliness. | Quantitative measures for core dimensions. | Introduces practical tools for data quality assessment. |
| Ehrlinger & Wöß [18] | Data quality is a multidimensional construct influenced by context and use. | Highlights timeliness, completeness, plausibility, integrity, and multifacetedness. | Extends beyond classical dimensions and focuses on big data. |
| Haug, Zachariassen [19] | Suggests that “perfect” data quality is neither achievable nor optimal; instead, the right level balances costs of maintenance vs. costs of poor data. | Trade-off between quality maintenance effort and business impact. | Cost-oriented perspective. |

| Dimension | Definition | Practical Example | Measurement Approach |
|---|---|---|---|
| Accuracy | The degree to which data correctly describes the real-world object or event. | Patient’s recorded blood pressure matches the actual measurement. | Comparison against an authoritative source or ground truth. |
| Completeness | The extent to which all required data is present. | The customer database contains contact details for all clients. | Ratio of available values to required values; percentage of missing fields. |
| Consistency | Absence of contradictions within and across datasets. | A patient’s birthdate is consistent across both electronic health records and insurance records. | Cross-field and cross-database validation checks. |
| Timeliness | The degree to which data is up to date and available when needed. | Stock market prices updated in real time. | Lag time between data generation and availability for use. |
| Validity | Degree to which data conforms to defined formats, rules, or ranges. | Postal codes follow the official national standard. | Validation rules, format checks, and range constraints. |
| Relevance | Appropriateness of data for the intended use. | Including clinical trial data when evaluating a new treatment. | Expert judgment; alignment with analytical or decision-making needs. |
| Uniqueness | The degree to which data is free of duplicate records. | Each patient has a single unique medical record number. | Duplicate detection and record linkage algorithms. |
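
The measurement approaches listed for completeness, validity, uniqueness, and timeliness can be illustrated with a minimal Python sketch. It is not taken from the reviewed sources: the sample records, the field names, and the five-digit postal-code rule are hypothetical assumptions, and uniqueness is approximated here as the share of distinct identifiers rather than by record linkage. Accuracy and consistency are omitted because they require an authoritative reference source or a second dataset.

```python
import re
from datetime import datetime

# Hypothetical sample records; field names and values are illustrative only.
RECORDS = [
    {"id": "P001", "postal_code": "31008", "email": "a@example.org",
     "generated": "2025-01-01T08:00:00", "available": "2025-01-01T08:30:00"},
    {"id": "P002", "postal_code": "3100X", "email": None,
     "generated": "2025-01-01T09:00:00", "available": "2025-01-01T11:00:00"},
    {"id": "P002", "postal_code": "28001", "email": "c@example.org",
     "generated": "2025-01-01T10:00:00", "available": "2025-01-01T10:15:00"},
    {"id": "P004", "postal_code": None, "email": "d@example.org",
     "generated": "2025-01-01T12:00:00", "available": "2025-01-01T12:45:00"},
]
REQUIRED_FIELDS = ("id", "postal_code", "email")
POSTAL_CODE_RULE = re.compile(r"\d{5}")  # assumed rule: exactly five digits


def is_present(value):
    """Treat None and the empty string as missing."""
    return value not in (None, "")


def completeness(records, required):
    """Ratio of available required values to all required values."""
    total = len(records) * len(required)
    present = sum(1 for r in records for f in required if is_present(r.get(f)))
    return present / total if total else 1.0


def validity(records, field, rule):
    """Share of non-missing values in `field` that fully match `rule`."""
    values = [r[field] for r in records if is_present(r.get(field))]
    if not values:
        return 1.0
    return sum(1 for v in values if rule.fullmatch(v)) / len(values)


def uniqueness(records, key):
    """Ratio of distinct key values to records (1.0 means no duplicates)."""
    keys = [r[key] for r in records]
    return len(set(keys)) / len(keys) if keys else 1.0


def mean_lag_minutes(records):
    """Average delay between data generation and availability, in minutes."""
    lags = [
        (datetime.fromisoformat(r["available"])
         - datetime.fromisoformat(r["generated"])).total_seconds() / 60
        for r in records
    ]
    return sum(lags) / len(lags) if lags else 0.0


if __name__ == "__main__":
    print(f"completeness: {completeness(RECORDS, REQUIRED_FIELDS):.2f}")
    print(f"validity (postal_code): {validity(RECORDS, 'postal_code', POSTAL_CODE_RULE):.2f}")
    print(f"uniqueness (id): {uniqueness(RECORDS, 'id'):.2f}")
    print(f"mean lag (minutes): {mean_lag_minutes(RECORDS):.1f}")
```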

| Consequence | Definition | Practical Example | Impact/Cost |
|---|---|---|---|
| Faulty decision-making | Wrong or suboptimal choices based on inaccurate data. | A hospital prescribes inappropriate treatment due to errors in lab data. | Patient harm, liability risks, loss of trust. |
| Financial losses | Direct or indirect costs from incorrect, incomplete, or duplicated data. | A bank suffers multimillion-dollar losses due to flawed credit risk models. | Wasted resources, loss of revenue. |
| Operational inefficiencies | Processes slowed or disrupted due to unreliable information. | Logistics companies misroute deliveries due to inaccurate addresses. | Increased workload, delays, and higher costs. |
| Reputational damage | Erosion of trust from stakeholders, customers, or the public. | Data breaches and reporting errors damage a company’s brand. | Customer attrition, lower market share. |
| Regulatory and legal risks | Non-compliance with laws and standards due to poor data. | A pharmaceutical firm fails an audit due to inconsistent records. | Fines, sanctions, reputational harm. |
| Missed opportunities | Failure to identify insights or innovations. | Retailer loses potential sales due to incomplete CRM data. | Reduced competitiveness, slower growth. |
| Misleading analytics | Models or reports based on flawed inputs lead to invalid results. | Overestimation of flu outbreaks by Google Flu Trends. | Misallocation of resources; loss of credibility. |

| Case/Organization | Domain | Data Quality Issue | Consequence |
|---|---|---|---|
| Failures | | | |
| Equifax [165] | Finance/Credit reporting | Inaccurate and poorly managed consumer credit data [166] | Erosion of public trust; legal and financial consequences [167] |
| NASA Mars Climate Orbiter [101] | Aerospace/Engineering | Unit mismatch (imperial vs. metric) not reconciled in data systems [168] | Spacecraft loss (~$125 million) |
| Mid-sized enterprise (CRM migration) [35] | Business/CRM | Data quality challenges during migration from legacy systems [23,64,169,170] | Errors, inconsistent formats, and disruption in customer management |
| Large home appliance business [21] | Retail/CRM | Low completeness, timeliness, and accuracy of customer data [21,171] | Ineffective campaigns, reduced loyalty, and weak predictive performance |
| University fundraising CRM [32] | Education/Fundraising | Outdated, incomplete, and inaccurate alumni data [172,173] | Reduced donor identification, inefficient fundraising, wasted resources |
| Target [126] | Retail/CRM | Predictive analytics revealed sensitive customer information [174] | Public backlash over privacy intrusion |
| Google Flu Trends (2008–2013) [175,176,177] | Public health analytics | Overfitting and reliance on biased signals | Overestimation of flu cases; credibility loss [178] |
| Amsterdam Tax Office [35] | Public sector | Duplicate and inconsistent taxpayer records [179,180,181] | Inefficient operations; reduced compliance [182] |
| Healthcare organizations [34,183,184] | Healthcare/CRM | Incomplete or inconsistent patient data in electronic health records [131,185] | Medical errors, patient safety risks |
| Successes of Data Quality | | | |
| Netflix Recommendation System [126,127,128,129,130] | Entertainment/Business | Leveraging high-quality behavioral data for personalization [127] | Recommendations drive 80% of content consumption; increased engagement and revenue |
| Freight forwarding industry [20] | Logistics/Freight forwarding | Workflow-embedded quality checks across logistics processes [186] | Improved coordination, fewer customs delays, and reduced correction costs |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Guillen-Aguinaga, M.; Aguinaga-Ontoso, E.; Guillen-Aguinaga, L.; Guillen-Grima, F.; Aguinaga-Ontoso, I. Data Quality in the Age of AI: A Review of Governance, Ethics, and the FAIR Principles. Data 2025, 10, 201. https://doi.org/10.3390/data10120201

