Article

Mining User Perspectives: Multi Case Study Analysis of Data Quality Characteristics

1 Centre for Society and Policy, Indian Institute of Science, Bangalore 560012, India
2 Department of Management Studies, Indian Institute of Science, Bangalore 560012, India
* Author to whom correspondence should be addressed.
Information 2025, 16(10), 920; https://doi.org/10.3390/info16100920
Submission received: 28 August 2025 / Revised: 7 October 2025 / Accepted: 15 October 2025 / Published: 21 October 2025

Abstract

With the growth of digital economies, data quality is a key factor in enabling use and delivering value. Existing research defines quality through technical benchmarks or provider-led frameworks; our study shifts the focus to actual users. The thirty-seven distinct data quality dimensions identified through a comprehensive review of the literature offer limited actionable guidance for practitioners. To address this gap, in-depth interviews were conducted with senior professionals from 25 organizations, representing sectors such as computer science and technology, finance, environmental, social, and governance (ESG), and urban infrastructure. Data were analysed using content analysis with two-level coding, supported by NVivo R1 software. Several newer perspectives emerged. Firstly, data quality is not simply about accuracy or completeness; rather, it depends on suitability for real-world tasks. Secondly, trust grows with data transparency: knowing where the data comes from and how it has been processed matters as much as the data per se. Thirdly, users are open to paying for data, provided the data is clean, reliable, and ready to use. These and other results suggest that data users focus on a narrower, more practical set of priorities considered essential in actual workflows. Rethinking quality from a consumer’s perspective offers a practical path to building credible and accessible data ecosystems. This study is particularly useful for data platform designers, policymakers, and organisations aiming to strengthen data quality and trust in data exchange ecosystems.

Graphical Abstract

1. Introduction

Data quality (DQ) forms the foundational element for effective decision-making, innovation, and analytics. High-quality datasets streamline operations and provide organisations with a competitive advantage, particularly in rapidly evolving markets [1]. Historically, DQ has been conceptualised as a multidimensional construct encompassing intrinsic, contextual, and representational attributes. However, the exponential increase in global data generation, with International Data Corporation (IDC) predicting the Global Datasphere growth from 33 zettabytes (ZB) in 2018 to 175 ZB by 2025 [2], presents significant challenges in ensuring sustained DQ.
Data-sharing platforms have become crucial in managing the data surge. The growth of the data economy is driven by interactions between data providers, consumers, and platform intermediaries [3]. Data-sharing platforms form a central part of the ecosystem, facilitating access to large, diverse datasets [4]. However, most platforms focus on supply-side priorities such as volume, variety, and pricing, often neglecting consumer needs like traceability, usability, and relevance [5].
In India, the challenges of fragmented datasets, limited platform standardisation, and immature regulatory frameworks remain significant barriers to effective implementation [6]. For example, recent analyses of India’s Smart City Mission highlight how the lack of unified data platforms hampers DQ interoperability [6]. Furthermore, smart city governance emphasizes the need for robust data frameworks, standardized data exchange platforms, and clear regulatory policies [6,7]. The Organization for Economic Cooperation and Development (OECD) report [6] notes that only a small fraction of generated Internet of Things (IoT) data is retained or utilized, underscoring the urgency of establishing national data strategies and governance mechanisms. The United Nations Habitat World Smart Cities Outlook [8] further stresses that fragmented data platforms and varying digital maturity levels impede citizen-centric services and limit consumer influence. Moreover, recent evaluations of smart city projects in India reveal that weak regulatory enforcement and immature data governance structures limit the realization of real-time decision-making capabilities [6].
Furthermore, worldwide, most research on DQ has focused on theoretical frameworks, technical solutions, and provider-centric assessments, in the attempt to capture the multifaceted complexity of quality. The literature shows limited empirical exploration of how data consumers—those who utilise data in everyday decision-making—perceive, experience, and act upon DQ. The gap becomes particularly evident in the Indian context, where the data ecosystem is diverse and characterised by varying levels of digital maturity and limited standardisation.
While our review led to the identification of 37 distinct DQ dimensions (see Appendix A), such comprehensive listings can be overwhelming and do not inherently indicate which aspects are most critical for real-world applications. These theoretical parameters raise a key practical question: out of all possible dimensions, which ones truly matter most to data users in everyday work? To address this gap, this study adopts a consumer-centric approach to explore how Indian data consumers interpret, experience, and act upon DQ in platform-based data exchange environments.
This study aims to move beyond theory by empirically investigating how data consumers prioritise and interpret DQ, to develop practical guidance on which aspects matter most for real-world applications. The 37 identified dimensions provide a conceptual baseline. Adopting a multiple-case study approach, following the methodology of Urbinati et al., 2019 [9], in-depth interviews were conducted with 25 senior data users across sectors such as artificial intelligence, financial services, environmental, social, and governance (ESG) consulting, and urban mobility to surface the lived experiences and practical definitions of DQ. Semi-structured interviews were conducted using Microsoft Teams between June 2023 and January 2024, with the interview protocol focusing on the following dimensions: definitions and relevance of DQ for business goals; challenges in sourcing, cleaning, and integrating datasets; expectations from data providers and platforms; views on metadata, documentation, and standards; perceptions of pricing models, licensing, and monetisation strategies; and trust mechanisms, validation practices, and legal concerns in data sharing. The qualitative inquiry allows us to connect the general dimensions with actionable, context-driven priorities, and to determine consensus among users on what makes data “fit for use.”
Despite the extensive literature on DQ (see Appendix A), current frameworks remain provider-centric, privileging technical specifications over lived user needs. These approaches have limited capacity to explain why data may be “usable” yet not “used.” This creates a research gap: consumer perspectives on DQ remain underexplored, particularly in emerging economies such as India where platform ecosystems are rapidly evolving. Accordingly, the central research question guiding this study is: Which DQ dimensions matter most to Indian data consumers in everyday practice, and how do these priorities align with or diverge from existing theoretical frameworks?
This study offers four key contributions. First, this study reframes DQ assessments by shifting the focus from provider-led definitions to consumer-driven evaluations, highlighting the practical realities faced by Indian data consumers. Second, the paper provides sector-specific insights into the distinct usability, trust, and governance challenges encountered across industries. Third, we identify both social and technical barriers to effective use and integration of external datasets. Finally, this study presents actionable recommendations for the development of more responsive, consumer-informed, and trust-oriented data platforms.
While prior studies on consumer-centric governance and monetisation have largely focused on Europe and North America, this paper contributes originality by situating consumer perspectives within the Indian digital ecosystem. India is both a major hub for global data services and a representative Global South setting, where infrastructural constraints, regulatory frameworks, and market dynamics differ substantially from those in the Global North. By analysing how Indian data consumers evaluate DQ, usability, governance, and monetisation, this study extends platform governance debates beyond their dominant geographies and surfaces newer, context-specific quality concerns not captured in earlier work.
The remainder of this paper is structured as follows. Section 2 provides the background and reviews the relevant literature. Section 3 outlines the materials and methods, describing the case selection, participant profile, and procedures for data collection and analysis. Section 4 presents the results, highlighting consumer perspectives on fit-for-purpose quality, usability and format frictions, transparency and governance requirements, and monetisation concerns. Section 5 discusses the implications of these findings, reframing DQ as a consumer-driven construct and exploring platform frictions and governance challenges. Finally, Section 6 concludes this study, summarising the contributions and suggesting directions for future research.

2. Background

Table 1 summarizes the reviewed studies. Four dimensions are investigated, namely, DQ assessment frameworks and methods; consumer-centric governance in data platforms; monetisation models and consumer participation; and clinical and business protocols: lessons for DQ.
Table 1 provides a consolidated overview of the prior literature, highlighting the objectives, benefits, and persistent barriers across different domains. This summary underscores the fragmented nature of existing approaches and frames the need for a consumer-centric perspective adopted in this study.

2.1. DQ Assessment Frameworks and Methods

DQ research has evolved from conceptual frameworks to quantitative models and automated tools aiming to assess and improve DQ systematically. Historical contributions by Hoare (1975), Chapple (1976), Fox et al. (1994), and Ballou and Pazer (1985) [45,46,47,48] laid the foundation for defining and modelling data and process quality. De Amicis et al., 2006 [49] developed an analytical framework to assess interdependencies among DQ dimensions, which remains relevant for modern ecosystems. Wang and Strong, 1996 [1] introduced a multidimensional framework categorising DQ as intrinsic, contextual, representational, and accessibility-based. Later studies [11,12] expanded these models, though most remain provider-focused and insufficiently tested in complex, real-world environments.
Several studies have developed objective metrics and scalable tools to assess DQ [13,14,15,16,17]. Studies [18,19] further advanced automation for streaming and big data environments. However, these approaches often rely on clean, stable, and well-structured datasets, which are rarely available.
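To make concrete the kind of objective, rule-based measurement such tools automate, the short sketch below computes two common dimensions, completeness and timeliness, in the spirit of the metric definitions popularised by Pipino et al. [13]. The field names, decay horizon, and sample records are illustrative assumptions rather than part of any cited tool.

```python
from datetime import date

def completeness(records, fields):
    """Share of non-missing cells across the given fields (0..1)."""
    total = len(records) * len(fields)
    filled = sum(1 for r in records for f in fields if r.get(f) not in (None, ""))
    return filled / total if total else 0.0

def timeliness(last_updated, max_age_days=365, today=None):
    """Age-based decay: 1.0 when fresh, falling to 0.0 at max_age_days."""
    today = today or date.today()
    age_days = (today - last_updated).days
    return max(0.0, 1.0 - age_days / max_age_days)

# Illustrative usage with hypothetical district-level records.
records = [
    {"district": "Bengaluru Urban", "population": 13193000},
    {"district": "Mysuru", "population": None},
]
print(completeness(records, ["district", "population"]))       # 0.75
print(timeliness(date(2023, 1, 15), today=date(2024, 1, 15)))  # 0.0
```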
Emerging research advocates flexible, real-time assessment methods which can adapt to evolving data environments [2,4,20]. Sadiq and Indulska, 2017 [21] and Cai and Zhu, 2015 [22] have laid the groundwork for more dynamic quality assessment, a concept further advanced by recent studies. For instance, Abedjan et al., 2016 [23] have called for more emphasis on uniqueness and task-independent evaluation, critical for platform-based systems. This aligns with broader discussions on data governance in socio-technical systems [50].

2.2. Consumer-Centric Governance in Data Platforms

Platforms increasingly recognise the need to prioritise consumer needs such as transparency, traceability, privacy, and fairness [27], and DQ frameworks like Consumer-Oriented Data Control and Auditability (CODCA) [25] provide mechanisms for user-driven transparency and auditability, enhancing trust in data custodians. The automated governance framework [24] emphasizes how consumer perceptions and technical realities of DQ can be aligned through automated validation and governance processes, thereby supporting participatory and trustworthy data ecosystems. Complementing this, Acev et al., 2025 [26] highlight foundational governance mechanisms, such as provenance tracking, access control definitions, accountability structures, and interoperability standards, that underpin consumer trust and elevate DQ within platforms facilitating shared data use.
Otto and Jarke (2019) [27] propose participatory ecosystems where consumers contribute directly to platform rules and data validation mechanisms. Veltri et al. (2020) [28] demonstrate that greater platform transparency significantly influences consumer trust and decision-making. Ducuing and Reich (2023) [29] explore governance tensions in emerging digital product passport frameworks, underscoring the need for balanced oversight and consumer influence. Agahari et al., 2022 [30] emphasise trust principles and participatory mechanisms as critical for sustainable data ecosystems. Dynamic consent models [44] offer practical governance solutions by enabling ongoing consumer control over data sharing, consent, and platform participation. These frameworks provide pathways for real-time consumer involvement and highlight the need for participatory governance.

2.3. Monetisation Models and Consumer Participation

Understanding monetisation models and willingness to pay in the context of data platforms remains a significant gap in the literature; pricing opacity and the poor quality of available services frequently diminish that willingness. Yu and Zhang, 2017 [31] proposed a data-pricing strategy based on DQ, employing a bi-level programming model and a multi-version approach to enhance market segmentation, profitability, and consumer utility. Yang et al., 2019 [32] developed subscription models to jointly optimise DQ and consumer willingness to pay. Zhang et al., 2018 [33] introduced fairness-based pricing mechanisms, while Pei [34] and Miao et al., 2023 [35] offered comprehensive reviews of data-pricing strategies, including quality-driven, feature-based, and query-based pricing. Zhang et al., 2023 [4] provided a taxonomy of pricing models, further emphasising the importance of transparent, consumer-centric monetisation frameworks.
Malieckal et al., 2024 [37] extend the argument by demonstrating how pricing can be dynamically linked to concrete DQ dimensions such as timeliness, completeness, and contextual relevance through forecasting-based and AI-enabled mechanisms. The approach operationalises the conceptual link between quality and value, ensuring that monetisation directly reflects consumers’ priorities. Despite these advances, limited empirical work has explored how consumer preferences shape pricing expectations in real-world data platforms. This study highlights the importance of tiered pricing, DQ-linked service level agreements (SLAs), and participatory pricing mechanisms.
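As a hedged illustration of how such quality-linked, tiered pricing might be operationalised, the sketch below maps a weighted composite of DQ dimensions to price tiers with an SLA-style rebate when quality falls short. The weights, thresholds, and tier multipliers are our own assumptions, not values taken from the cited pricing models.

```python
def quality_score(completeness, timeliness, relevance, weights=(0.4, 0.35, 0.25)):
    """Weighted composite of three DQ dimensions, each scored in [0, 1].
    Weights are illustrative, not drawn from any cited model."""
    return sum(w * d for w, d in zip(weights, (completeness, timeliness, relevance)))

def tiered_price(base_price, score):
    """Map the composite DQ score to a pricing tier; lower tiers act as an
    SLA-style rebate when delivered quality falls below the promised level."""
    if score >= 0.9:
        return base_price          # premium tier: full price
    if score >= 0.7:
        return 0.8 * base_price    # standard tier
    return 0.5 * base_price        # degraded quality: discounted

score = quality_score(completeness=0.95, timeliness=0.8, relevance=0.9)
print(round(score, 3), tiered_price(base_price=1000.0, score=score))  # 0.885 800.0
```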

2.4. Clinical and Business Protocols: Lessons for DQ

In clinical research, the SPIRIT 2013 guidelines [38] demonstrate how structured, detailed documentation and active stakeholder participation can significantly improve data usability and trust. Juddoo et al. (2018) [39] and Facile et al., 2022 [40] further reinforce the role of Clinical Data Interchange Standards Consortium (CDISC) protocols and key DQ dimensions such as accuracy, completeness, and timeliness in ensuring interoperability and traceability across clinical data systems.
Patient-centred studies [41] reveal strong preferences for transparent data sharing practices, semantic clarity, and participatory governance mechanisms. The dynamic consent model [44] and good clinical data management practices [40] reinforce the importance of traceability, auditability, and consumer governance in sensitive data environments. Etheredge (2007) [43] introduced the concept of a learning health system, which emphasises continuous improvement through consumer feedback and iterative data-driven development. These clinical governance structures provide useful frameworks to improve trust and usability in broader data-sharing platforms.

3. Materials and Methods

This study adopts a multiple-case study approach, following the methodology of Urbinati et al., 2019 [9], to explore how Indian data consumers experience and interpret DQ in real-world settings.

3.1. Case Selection and Participant Profile

Participants were selected using purposive sampling to ensure they met the following criteria:
Regular interaction with third-party or external datasets.
Active use of data for decision-making, modelling, or product development.
Organisational responsibility for data-related strategy, integration, or evaluation.
A total of 25 cases were included, each representing an individual data consumer or a data-driven function within an organisation. The sample size of twenty-five cases was determined based on thematic saturation. Thematic saturation was reached after the seventeenth interview, when no substantially new insights emerged from subsequent data collection. At this stage, additional interviews yielded confirmatory rather than novel information. Nevertheless, further interviews were conducted to ensure adequate sectoral representation and to capture potential contextual variations across domains. This extended sampling strategy enhances the analytical generalisability and robustness of cross-case comparisons. Such an approach aligns with qualitative research conventions recognising that saturation is typically achieved between twelve and twenty interviews in focused studies, though additional cases are often collected in heterogeneous, multi-sector research to consolidate and verify emergent themes [51,52]. The sample included participants from the following sectors:
Computer Science and Technology
Financial Services and Banking
ESG Consulting and Workplace Strategy
Urban Infrastructure and Transport Analytics
Public Research and Statistical Modelling
Participants held senior positions such as CEOs, Product Owners, Data Architects, and Senior Managers (Table 2). Organisations ranged from startups and multinational firms to public sector bodies and consulting companies. The sectoral and organisational diversity of the sample allowed for rich cross-case comparison. The methodological choice is motivated by a desire to go beyond comprehensive but abstract DQ frameworks and to identify quality dimensions according to data users themselves.
While the sampling framework grouped participants into five broad sectors, the thematic analysis in the Results section further differentiated these into domain-specific categories. For example, “Financial Services and Banking” was analysed in greater detail as Finance and Analytics, Finance and Data Science, and Banking and FinTech. Similarly, “ESG Consulting and Workplace Strategy” corresponds to the Sustainability and General Business categories, while “Urban Infrastructure and Transport Analytics” aligns directly with Urban/Smart Cities. Additional categories such as Healthcare, Energy/Oil and Gas, and Retail/Manufacturing emerged inductively from participants’ job profiles and sector-specific discussions. This expanded classification allowed for richer sectoral comparisons in the results.

3.2. Data Collection

Semi-structured interviews were conducted using Microsoft Teams between June 2023 and January 2024. All interviews were conducted in English, which was the working language of participants, and lasted between 45 and 60 min and followed an interview protocol focusing on the following dimensions:
Definitions and relevance of DQ for business goals.
Challenges in sourcing, cleaning, and integrating datasets.
Expectations from data providers and platforms.
Views on metadata, documentation, and standards.
Perceptions of pricing models, licensing, and monetisation strategies.
Trust mechanisms, validation practices, and legal concerns in data sharing.
Participants provided consent prior to the interviews, and the interview guide was shared upon request. All interviews were audio-recorded, transcribed verbatim, anonymised, and manually cleaned for accuracy and completeness.

Informed Consent Statement and Approvals

All participants provided informed consent at the start of each interview. They were informed about the purpose of the research, the voluntary nature of participation, confidentiality assurances, and how the data would be used. Participants were explicitly told that, at any point, they could request the audio recording to be turned off and their response kept undocumented. They retained the right to withdraw their participation at any stage without consequence. No personally identifiable information (such as names or company details) has been disclosed without explicit permission.

3.3. Data Analysis

The cleaned transcripts were analysed using NVivo R1 (Release 1.7.2) software. A hybrid coding approach was applied. To ensure analytical reliability, the primary researcher conducted the coding of all transcripts. Coding decisions and emerging interpretations were periodically discussed with the co-author to review consistency and consolidate shared understanding. These discussions served as an informal reliability check and helped refine the thematic structure and definitions of key categories.
Deductive codes were developed based on established DQ frameworks [1,13] covering completeness, accuracy, timeliness, consistency, traceability, and documentation (Figure 1).
Inductive codes were derived from participant narratives, including terms and concerns specific to the Indian data ecosystem, such as ‘data freshness’, ‘lineage visibility’, ‘manual cleaning burden’, ‘version control gaps’, and ‘APIs without schema’.
The dual coding approach allowed validation of established DQ dimensions while surfacing novel, practice-driven themes from the Indian context. Table 3 and Table 4 present a summary of the themes and the number of participants who raised them. The emerging thematic clusters are (1) Fit-for-Purpose Quality, (2) Usability and Format Frictions, (3) Transparency, Traceability, and Governance, and (4) Monetisation and Sharing Models. In this paper, “extracts” refer to coded segments of interview transcripts, ranging from short phrases (e.g., “missing values issue”) to longer verbatim quotations that capture participants’ concerns in context. Each extract is a discrete passage linked to a theme or sub-theme during coding.
Illustrative quotes are extracted to support thematic validity and analytic depth. The approach follows methodological precedents from similar digital transformation studies [9], enabling both domain-specific insight and theory extension.
To systematically assess the strength and prominence of each domain’s engagement with identified DQ themes, we developed a theme engagement rating (Table 5). The coding tables (Table 3, Table 4 and Table 5) provide structured transparency by showing how raw interview data were systematically aggregated into first- and second-order codes and then synthesised into higher-level conceptual categories. These tables are not merely descriptive; they operationalise the methodological logic of the analysis by illustrating the progression from empirical excerpts to abstract themes. The theme engagement rating (Table 5) further strengthens methodological rigour by quantifying the relative emphasis of each theme across sectors, following the comparative approach outlined by Urbinati et al. (2019) [9]. The method classifies intensity levels on a four-point scale: High, Medium, Low, and None, based on:
The number of participants from each domain who discussed the theme,
The depth of the participant engagement (detail, multi-point extracts), and
The diversity of concerns (number of distinct sub-themes or sub-codes addressed).
While Appendix A catalogues 37 dimensions from decades of literature, our coding distilled them into four higher-order consumer-centric themes (see Table 3). This shows convergence around contextual usability and divergence on governance and monetization. The four emergent themes (Fit-for-Purpose Quality, Usability Challenges, Governance/Trust, and Monetisation) draw primarily from contextual (relevance, timeliness), representational (consistency, documentation), and accessibility (traceability, usability) dimensions, while placing limited emphasis on intrinsic measures like precision or minimality. This empirical condensation demonstrates the practical narrowing of theoretical constructs in lived contexts.
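As an illustration of the rating logic described above, the sketch below combines the three criteria, participant count, extract depth, and sub-theme diversity, into the four-point scale. The specific cut-offs are illustrative assumptions, not the exact thresholds applied during coding.

```python
def engagement_rating(n_participants, n_extracts, n_subthemes):
    """Combine breadth (participants), depth (extracts), and diversity
    (sub-themes) into a High/Medium/Low/None rating. Thresholds are illustrative."""
    if n_participants == 0:
        return "None"
    points = (n_participants >= 3) + (n_extracts >= 5) + (n_subthemes >= 3)
    return {0: "Low", 1: "Low", 2: "Medium", 3: "High"}[points]

# Hypothetical tallies per sector for a single theme.
sectors = {"IT/Tech and AI": (5, 12, 4), "Energy/Oil and Gas": (2, 3, 1)}
for sector, counts in sectors.items():
    print(sector, engagement_rating(*counts))  # High and Low, respectively
```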

4. Results

Although four initial clusters emerged during first-level coding—Fit-for-Purpose Quality, Usability and Format Frictions, Transparency and Governance, and Monetisation—the final analysis concentrates on the first three dominant themes. These three capture the most frequent, conceptually rich, and cross-sectorally consistent concerns, representing over 95% of all coded extracts. The Monetisation category, while relevant, appeared sparsely discussed and is therefore referenced briefly within the discussion section. These categories emerged consistently across diverse sectors and organisational types. While the extracts illustrate sectoral diversity, the emphasis hereafter shifts from description to analysis, showing how recurring patterns across cases translate into theoretical constructs of consumer-centric data quality.

4.1. Fit-for-Purpose Quality: Needs Expressed by Data Consumers

The theme of fit-for-purpose quality captures the critical requirements data consumers have regarding their specific operational contexts. Across sectors, participants emphasise that data must be correct enough to support particular business decisions. The diversity of application domains—from financial services to AI and urban mobility—further illustrated how DQ forms a nuanced and dynamic concept, contingent on domain-specific workflows and decision-making needs.
Across all sectors, participants consistently emphasise the importance of data being internally consistent, timely, credible, and complete. Appropriate documentation was particularly stressed as foundational to data usability (Table 6, Table 7 and Table 8).
Table 6. Participant statements on Fit-for-Purpose Quality.
Statement | Participant
“If even if, let’s say the accuracy is low, however, we measure the accuracy as long as they help in making business decisions.” | AI Startup (P19)
“I think accuracy and newness of data matters because what happens is like there is data available, but those surveys have been done in 2020. Know how relevant they are in 2023 is questionable.” | Mental Health Startup (P20)
“There needs to be some reasonable level of accuracy, but the accuracy need not be 100% as long as the data is good enough to generate business decisions, that’s fine.” | AI Startup (P19)
“In the open source systems, not really the paid version because the paid version didn’t have all the variety… when we want to buy data, we have to see the quality.” | IT Consulting (P1)
“these are the basic things, to basically trust the data it has to be internally consistent.” | AI Startup (P19)
“If you take the names of districts in India, it should be consistent within the same data sources and across data sources.” | AI Startup (P19)
“Data validation is first thing… the context always changes with my digital footprint.” | Technology Private Firm (P6)
“If it has been used widely, all of that adds to credibility… if you cannot trust the data then you can’t do anything from it.” | Oil and Gas Multinational (P24)
“Data comes in some file format which is not yet standardized. It could be Excel files without exact metadata about when it was collected or what standards it follows. There’s no description of the columns, inconsistent nomenclature, and the data is mostly static.” | Urban Transport Analytics (P7)
“World Wide Web Consortium has given guidelines on metadata standards… all G20 countries adopted it except India.” | Finance/Law Non-Profit (P18)
“The ability to look at how data has transformed from source till date… whole transformation lineage is critical.” | Independent Consultant (P4)
“Publishing data methodology or data cookbook along with the dataset.” | Finance/Law Non-Profit (P18)
Table 7. Sector-wise needs expressed.
Sector | Participants (IDs) | Key Needs/Issues
IT/Tech and AI | P1, P6, P10, P19, P22 | API documentation, accuracy for decisions, mission-critical data relevance
Finance and Analytics | P4, P5, P8, P14 | Data freshness, lineage tracking, real-time updates
Urban/Smart Cities | P7, P12, P16 | Standardized formats, metadata completeness, context-aware data
Healthcare | P17, P20, P25 | Historical data accuracy, medical subcode specificity
Energy/Oil and Gas | P11, P24 | Instrument reliability, credibility vs. real-world observations
Retail/Manufacturing | P15, P21 | Large-volume processing, label consistency (e.g., product sizes)
Sustainability | P9, P23 | Cross-departmental data cleanliness, triangulation validation
Banking/FinTech | P13 | Truthfulness verification, transaction data integrity
Table 8. Sector-wise engagement rating for Theme 1.
Sector | Participants (IDs) | Depth/Diversity | Rating
IT/Tech and AI | P1, P6, P10, P19, P22 | Detailed, multi-point | High
Finance and Analytics | P4, P5, P8, P14 | Detailed, diverse issues | High
Urban/Smart Cities | P7, P12, P16 | Moderate | High
Healthcare | P17, P20, P25 | Moderate | High
Energy/Oil and Gas | P11, P24 | Brief | Low
Retail/Manufacturing | P15, P21 | Brief | Low
Sustainability | P9, P23 | Some diversity | Medium
Banking/FinTech | P13 | Brief | Low
The quotes reflect a pragmatic acceptance among some users that absolute technical precision is sometimes secondary to usability and decision support. The reflections underscore that quality is judged in relation to the task rather than against generic metrics, highlighting the importance of contextual relevance. The participants also stress core attributes of internal consistency and accuracy, demonstrating how these attributes build confidence in data-driven processes. The concern resonates strongly in sectors like finance and healthcare, where faulty decisions carry high stakes.
These insights emphasise the role of transparent documentation and lineage in accelerating user adoption and fostering trust. Particularly in advanced analytics and AI, understanding data provenance forms a prerequisite for reliable modelling. Overall, the analysis shows accuracy and freshness as paramount. Furthermore, participants evaluate DQ situationally, prioritizing task relevance and ease of evaluation through documentation and metadata. Thus, platform designs should focus on comprehensive metadata provision, clear lineage tracking, and domain-specific quality indicators to serve heterogeneous consumer needs effectively.
Participants from nearly all sectors described nuanced needs relating to business relevance, timely access, and documentation. For example, those in the IT/Tech and AI and Finance and Analytics domains spoke in detail about requirements for API documentation, data accuracy, and freshness, while sectors like Sustainability and Retail raised more limited or specialized concerns. Table 7 summarizes these patterns, while Table 8 captures the intensity of engagement for each domain, measured by the number of participants, the detail and diversity of the comments, and the overall prominence of the engagement.
The sectoral contrasts observed in Table 6, Table 7 and Table 8 illustrate how domain-specific priorities shape interpretations of fit-for-purpose quality. Finance and AI sectors emphasise accuracy, freshness, and lineage, driven by compliance obligations and automation-dependent decision systems. Urban infrastructure and sustainability domains foreground standardisation and metadata completeness to support interoperability and collaborative governance across multiple stakeholders and data custodians. In comparison, healthcare and ESG actors stress credibility and documentation, reflecting regulatory scrutiny and ethical accountability. These distinctions demonstrate that data quality is contextually enacted, contingent on sectoral workflows, risk exposure, and governance frameworks.

4.2. Usability and Format Frictions: Common Problems Encountered

Participants commonly reported significant usability challenges related to data integration, cleaning, and inconsistencies caused by multi-format inflows and missing values. These frictions not only increased operational overheads but also contributed to mistrust when data providers failed to ensure completeness or transparency. The burden of cleaning and organizing data forms a recurring concern, especially in sectors processing large volumes of heterogeneous inputs (Table 9, Table 10 and Table 11).
Table 9. Participant statements on Usability and Format Frictions.
Statement | Participant
“Eighty percent of my team’s effort is spent cleaning data and making sure it is all right—so much time wasted.” | AI Startup (P19)
“Data cleaning takes up a lot of time.” | Software (P3)
“The data cleaning process is definitely there… more than 50% [of time] goes into data cleaning only.” | Financial Services (P8)
“Absence of data as well as missing information.” | IT Consulting (P1)
“I do not know whether it is an applicable data source or not… there are missing values.” | AI Startup (P19)
“Saving data in an organized format and handling multi-format data are some of the issues that we are facing when it comes to raw data handling.” | Banking (P13)
“Standardization is needed to scale and explore and move into the market quickly.” | Urban Transport Analytics (P7)
“The issue in most of the manufacturing area in India is that they’re not labelled the same way.” | Banking/FinTech (P25)
Table 10. Sector-wise usability issues and format frictions.
Sector | Participants (IDs) | Common Frictions
IT/Tech and AI | P3, P10, P19 | 80% of effort spent cleaning data, missing values, formatting overheads
Finance and Data Science | P5, P8, P13, P14 | Multi-format chaos (CSV/Excel), manual noise reduction
Urban/Smart Cities | P7, P16 | Non-standardized files, static datasets without metadata
Healthcare | P17, P25 | Unfilled fields, inconsistent medical labels (e.g., “Size L” ambiguity)
Energy/Oil and Gas | P11 | Sensor drift, instrument calibration issues
Retail/Manufacturing | P15, P21 | Harmonization gaps (e.g., product IDs across vendors)
General Business | P1, P6, P9 | Duplicate data, lack of documentation
Table 11. Sector-wise engagement rating for Theme 2.
Sector | Participants (IDs) | Depth/Diversity | Rating
IT/Tech and AI | P3, P10, P19 | Heavy cleaning, missing values | High
Finance and Data Science | P5, P8, P13, P14 | Multi-format issues, manual fixes | High
Urban/Smart Cities | P7, P16 | Non-standardized files, metadata | Medium
Healthcare | P17, P25 | Unfilled fields, inconsistent labels | Medium
Energy/Oil and Gas | P11 | Sensor drift, calibration | Low
Retail/Manufacturing | P15, P21 | Harmonization gaps | Low
General Business | P1, P6, P9 | Duplication, lack of documentation | High
The quotes highlight the heavy manual labour involved in preparing data for use, a critical barrier to scale and efficiency. Such extensive cleaning efforts reduce capacity for analysis and value extraction, impacting product timelines and innovation cycles. Participants also underscore how even common DQ issues such as missing values propagate through workflows, undermining reliability and trust. The impact was felt acutely in modelling-intensive sectors like technology and finance.
The quotes also reflect a broad call for standardization to reduce format frictions, enabling interoperability and faster onboarding. Further, the analysis points to how fragmented data ecosystems increase costs and delay deployment. The analysis reveals usability and format challenges as central impediments to efficient data consumption, suggesting that platform designs should prioritize automated cleaning aids, format standardization, and interoperability tools. Building trust also requires transparent communication about DQ issues from providers upfront.
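A minimal sketch of the kind of automated cleaning aid participants asked for is given below: it ingests CSV or Excel inflows into a single tabular form and reports missing values and inconsistently labelled columns upfront. The file names, column conventions, and checks are illustrative assumptions rather than a description of any existing platform feature.

```python
from pathlib import Path
import pandas as pd

def load_any(path):
    """Load common multi-format inflows (CSV or Excel) into one tabular form."""
    suffix = Path(path).suffix.lower()
    if suffix == ".csv":
        return pd.read_csv(path)
    if suffix in (".xls", ".xlsx"):
        return pd.read_excel(path)
    raise ValueError(f"Unsupported format: {suffix}")

def quality_report(df):
    """Surface the frictions reported most often: missing values and
    inconsistently labelled columns."""
    return {
        "rows": len(df),
        "missing_share_by_column": df.isna().mean().round(2).to_dict(),
        "non_standard_columns": [
            c for c in df.columns if c != c.strip().lower().replace(" ", "_")
        ],
    }

# Illustrative usage with a hypothetical provider file:
# df = load_any("district_footfall.xlsx")
# print(quality_report(df))
```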
Major usability pain points, such as extensive cleaning burdens and difficulties with multi-format data, were most vividly described by participants from IT, Finance, and General Business backgrounds, with others echoing similar but less frequent or less elaborate challenges. Table 11 presents domain-level engagement intensity.
The sectoral contrasts captured here reveal that usability frictions intensify in ecosystems characterised by high data heterogeneity. Finance, technology and AI, and urban analytics sectors report substantial integration burdens due to multi-format inflows and variable schema quality, while retail and energy domains encounter more bounded operational challenges such as labelling inconsistencies or calibration issues. These differences affirm that cleaning overheads and interoperability gaps are structurally embedded within sectoral data architectures, reinforcing the importance of shared standards and collaborative governance mechanisms to streamline integration.

4.3. Platform Requirements: Transparency, Traceability, and Governance

Consumer expectations for data transparency and governance revolve around certification, authentication, standardization, and traceability guarantees from data providers. Participants expressed a need for verifiable quality credentials and common standards to reduce assessment burdens and increase trust. Governance mechanisms providing provenance and transformation histories become essential, particularly for sensitive or critical data (Table 12, Table 13 and Table 14).
The insights highlight the importance of quality metrics and standards, facilitating easier evaluation and comparability across providers. They also accentuate the role of governance frameworks in operationalizing quality definitions. The participants emphasize the necessity for detailed lineage and governance artifacts to ensure data integrity over time, especially in domains with complex data flows or regulatory oversight.
Overall, transparency and governance emerged as essential for platform credibility and consumer trust. Implementing standardized certification, metadata to support lineage and version control, and clear governance policies will be indispensable design considerations for data marketplaces or platforms aiming to cater to complex consumer ecosystems.
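A minimal sketch of the kind of lineage-and-provenance record such governance could rest on is shown below; the field names are our own illustrative choices, loosely inspired by catalogue vocabularies such as DCAT, rather than a standard prescribed by the participants or the literature.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class TransformationStep:
    description: str   # e.g., "deduplicated on trip_id"
    performed_by: str  # team or pipeline that applied the step
    timestamp: str     # ISO 8601

@dataclass
class DatasetRecord:
    dataset_id: str
    source: str                  # original provider or collection method
    version: str                 # explicit version for change tracking
    certified_by: Optional[str]  # certifying body, if any
    lineage: List[TransformationStep] = field(default_factory=list)

# Hypothetical record for an urban mobility dataset.
record = DatasetRecord(
    dataset_id="urban-mobility-2024-q1",
    source="City transport department GPS feed",
    version="1.2.0",
    certified_by=None,
    lineage=[TransformationStep("removed duplicate trips",
                                "ingestion pipeline", "2024-02-01T10:00:00Z")],
)
```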
Transparency-related needs, such as the demand for certification, transformation logs, and auditable processing, were discussed in moderate detail by participants across several sectors. Table 14 captures these differences by domain.
The governance patterns summarised in Table 12, Table 13 and Table 14 reflect differentiated trust imperatives across sectors. Finance and legal domains prioritise version control and audit trails to align with stringent compliance regimes, whereas sustainability and public-sector actors emphasise methodology transparency and source validation to enable collaborative governance and stakeholder confidence. These variations highlight how governance mechanisms are selectively configured, shaped by regulatory maturity, institutional mandates, and the relational nature of trust within each domain.

5. Discussion and Implications

The discussion moves beyond descriptive reporting of user concerns to interpret findings through established theoretical lenses. Drawing on frameworks such as Wang and Strong’s (1996) [1] multidimensional model, Human–Data Interaction (HDI), and socio-technical systems theory, the analysis conceptualises data quality as a contextually negotiated construct shaped by both technical affordances and organisational governance. This theoretical integration clarifies how consumer-driven notions of “fit-for-purpose” quality reshape the meaning of data reliability and usability in platform contexts. Rather than a fixed attribute, quality is co-produced through users’ interactions with data, platforms, and governance mechanisms.

5.1. DQ as a Consumer-Driven Construct

This study contributes to the growing shift in the data governance literature from provider-led to consumer-led interpretations of DQ. Traditionally, DQ has been dominated by provider-centric frameworks with intrinsic measures such as correctness, consistency, and completeness. However, these approaches often fall short of capturing what actually matters to users in practice. Our findings challenge the orthodoxy: participants rarely described quality in abstract terms. Instead, quality was framed in relation to the decisions, workflows, and risks embedded in specific contexts of use.
Quality, from the consumer perspective, is judged by fitness for purpose and the ability to integrate seamlessly into operational routines. Attributes such as contextual relevance, timeliness, and documentation aligned with consumer needs emerged as essential. Across sectors such as artificial intelligence, oil and gas, and infrastructure, we observed varied tolerances for “imperfections,” contingent on the risk and stakes of the task. These findings echo recent critiques of static, universally applied quality metrics [27] and reinforce the idea that quality is not an inherent property of the dataset, but something constructed and enacted through use. The findings substantiate key tenets of Human–Data Interaction (HDI) [53,54] by demonstrating that data quality is experienced through usability, traceability, and contextual relevance rather than defined solely by intrinsic metrics. This experiential framing aligns with HDI’s principles of legibility, agency, and negotiability, wherein users actively interpret and shape data’s value in practice. In this view, participants’ emphasis on metadata and lineage reveals how data quality becomes a relational construct co-produced through ongoing human–data engagements rather than a static technical property.
Moreover, participants consistently called for richer, more customizable metadata, explicit lineage information, and dynamic documentation. These features allow consumers not only to assess but also to shape and co-produce quality standards in real time. The shift indicates data platforms should move beyond fixed metrics and instead design tools to empower consumers for negotiating quality for specific purposes.

5.2. Platform Frictions: Usability as a Neglected Priority

The persistent usability barriers reported in the analysis point to a significant disconnect between technical platform design and consumer priorities. While platforms often compete on volume, velocity, and cataloguing, participants from both technical and business backgrounds consistently described the bulk of effort going into cleaning, validating, and integrating external datasets before they became operationally useful.
Issues such as inconsistent formats, incomplete documentation, schema mismatches, and limited update transparency are seen as substantial productivity drains. Participants also expressed a recurring distrust of provider claims, due to poor interoperability and opaque preprocessing steps. These frictions reveal a critical gap in platform design thinking. From a socio-technical systems perspective, these frictions arise from the misalignment between human work practices and the technological infrastructures of data platforms. Data quality and usability are therefore emergent properties of interaction between people, processes, and tools rather than outcomes of system design alone. Documentation deficits, inconsistent schemas, and manual cleaning burdens highlight the need to re-engineer socio-technical configurations that support mutual adaptation between users and data technologies—an alignment central to sustainable data-quality management [50].
These findings echo prior research on the barriers posed by poor interoperability and fragmented governance [3]. For platforms to support scalable and trusted data exchange, usability must be treated as a fundamental design goal. Investments in schema standardization, automated validation tools, robust documentation practices, and clear communication of known limitations will evidently be required.
Platforms must enable rich metadata and lineage documentation to address fit-for-purpose needs while reducing cleaning and integration frictions through standardization and automation. Certification and transparent governance mechanisms establish trust and facilitate efficient data exchange. Addressing these interconnected challenges holistically will be essential to developing robust, user-centred data platforms capable of serving diverse sectors and evolving use cases.
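One way such automated validation and upfront communication of known limitations could look is sketched below: each record is checked against a provider-declared schema contract, and violations are returned as readable messages rather than silent failures. The schema, field names, and expected types are hypothetical.

```python
REQUIRED_SCHEMA = {      # hypothetical provider-declared contract
    "district": str,
    "recorded_on": str,  # ISO date expected
    "pm25": float,
}

def validate(record, schema=REQUIRED_SCHEMA):
    """Return human-readable violations so known limitations can be
    communicated to consumers upfront."""
    issues = []
    for name, expected in schema.items():
        if name not in record:
            issues.append(f"missing field: {name}")
        elif not isinstance(record[name], expected):
            issues.append(f"{name}: expected {expected.__name__}, "
                          f"got {type(record[name]).__name__}")
    return issues

print(validate({"district": "Mysuru", "pm25": "42"}))
# ['missing field: recorded_on', 'pm25: expected float, got str']
```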

5.3. The Central Role of Transparency and Governance

Trust in data platforms is strongly linked to DQ per se as well as governance mechanisms surrounding the platforms. Participants expected platforms to move beyond passive hosting and instead take active responsibility for ensuring data lineage, authentication, and accountability. Features such as detailed transformation logs, version control, certification, and complete provenance were repeatedly described as essential for building confidence, particularly in regulated or high-risk environments such as finance and energy.
This aligns with calls in the literature for participatory and consumer-centred data governance [5]. Frameworks such as the CODCA model provide conceptual grounding for these insights by emphasizing auditability, traceability, and user oversight as foundations of trust. The parallels are clear: participants’ demands for certification, lineage visibility, and verifiable quality credentials mirror CODCA’s design principles, suggesting that participatory governance is not merely normative but a functional requirement for trustworthy data-sharing ecosystems.
However, this study also draws attention to a deeper structural tension. While platforms face growing pressure for interoperability and integration across systems, expectations regarding transparency and certification remain sector-specific. The absence of shared standards and escalation mechanisms forces consumers to navigate fragmented and sometimes opaque governance environments. Participants voiced frustration at the lack of formal avenues to raise concerns or challenge provider claims—underscoring that trust is not given but actively sustained through visible, responsive, and participatory governance practices.
Collectively, these findings highlight the interdependence of data-quality assurance and governance design. Embedding audit trails, certification protocols, and user co-governance mechanisms would enable platforms to operationalize transparency and accountability while reinforcing consumer confidence across domains.

5.4. Toward Consumer-Centric Data Ecosystems

Taken together, the findings suggest a need to reimagine data platforms through a consumer-centric perspective. Rather than seeing users as passive recipients of data, platforms should recognize them as active stakeholders with distinct needs and the capacity to shape governance, standards, and value.
Realizing the shift will require several key changes:
Governance structures to allow users to shape DQ standards,
Platform features to promote traceability, offer certification, and provide clear avenues for raising concerns,
Transparent and flexible pricing systems to reflect demonstrable quality and relevance, and
Sustained investments in documentation, metadata tools, and automated usability enhancements.
The approach reframes DQ as something negotiated and dependent on context. Quality is shaped through everyday workflows, verification processes, and governance structures. Future innovations should focus on building participatory systems, such as user feedback mechanisms, stakeholder councils, or context-sensitive certification programs, involving consumers more directly in the lifecycle of data products.
These findings also converge with platform-governance research [27,28] that conceptualizes platforms as multi-sided socio-technical arenas where governance, value creation, and data quality are co-constituted. Users in our study articulated governance expectations such as traceability, certification, and grievance channels—that mirror the accountability mechanisms proposed in the literature. Embedding such user co-governance structures would operationalize the transition from compliance-oriented to participatory platform governance, where data quality, trust, and value evolve through iterative human–data and institutional interaction.
Evidently, DQ is not simply about accuracy or completeness; rather, it depends on data being timely, well documented, and suitable for real-world tasks. Trust grows through data transparency. In other words, knowing where the data comes from, the nature of processing, and the recency of updates matter as much as the data per se. Users are open to paying for clean, reliable, and ready-to-use data. These and other results suggest that rethinking quality from a consumer’s perspective offers a more practical path to building credible and accessible data ecosystems.

6. Conclusions

This study has explored how DQ is perceived, prioritised, and operationalised by Indian data consumers across diverse sectors. The applied consumer-centric perspective highlights a strong user consensus around certain core dimensions of DQ, particularly those directly impacting efficiency and decision-making. Despite the theoretical proliferation of possible quality indicators, only a subset is consistently valued and used. Our findings underscore the essential need for user-informed prioritization in designing data platforms and quality assessment tools.
The findings indicate the limits of technical accuracy alone. Consumers value transparency about where the data comes from, how data changes, and whether the provider has taken steps to maintain data integrity. Documentation, version tracking, and clear governance structures become essential. Yet, many users still face routine struggles with messy formats, schema mismatches, and inconsistent updates, thus holding back the promise of data platforms.
More broadly, consumers continue to bear the burden of validating, cleaning, and adapting datasets to their workflows or use cases. The gap erodes trust and efficiency and highlights the limits of current data supply chains, where responsibility for quality assurance is unevenly distributed. Sustainable data-sharing ecosystems must give consumers a stronger role in shaping quality checks and governance processes. Transparent, flexible pricing models and visible quality improvements become just as important as the data per se.
By reframing DQ as something built jointly by providers, platforms, and users, this study offers a path towards more reliable and consumer-friendly data markets. The analysis underscores the need for institutional mechanisms to formalise user input, whether through participatory standards-setting, routine feedback channels, or sector-specific certification regimes. Future work should explore ways to bring consumer voices directly into platform design and build shared quality practices that work across different sectors and data types. Policy interventions should likewise recognise data governance as not just a technical or economic issue, but a question of equity, accountability, and inclusion.
This study, however, has certain limitations. First, it draws on a sample of senior professionals in India, which may not fully represent the perspectives of data consumers in other regions or organizational levels. Second, the focus on sectors such as Computer Science and Technology, Finance, ESG, and Urban Infrastructure may overlook distinct challenges in other domains.
Future research should expand the scope by conducting comparative cross-country studies and large-scale quantitative surveys to test the prevalence of identified themes. Further exploration of automated usability tools, participatory governance models, and pricing mechanisms linked to data quality would provide actionable pathways for the design of next-generation data platforms.

Author Contributions

Conceptualization, A.G. and M.M.; methodology, M.M. and A.G.; software, M.M.; writing—original draft, M.M.; writing—review and editing, A.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by OMIDYAR NETWORK (Grant No. SP/OMNI-20-0001.06) and the Centre for Data for Public Good (CDPG), India. The APC was funded by OMIDYAR NETWORK.

Institutional Review Board Statement

The study does not involve human biological materials, or human biological data. Therefore, it does not fall under the purview of ethics approval requirements as outlined by the Institutional Human Ethics Committee (IHEC) at the Indian Institute of Science (IISc).

Informed Consent Statement

All participants provided informed consent at the start of each interview. They were informed about the purpose of the research, the voluntary nature of participation, confidentiality assurances, and how the data would be used. Participants were explicitly told that, at any point, they could request the audio recording to be turned off and their response kept undocumented. They retained the right to withdraw their participation at any stage without consequence. No personally identifiable information (such as names or company details) has been disclosed.

Data Availability Statement

The data presented in this study are available on request from the corresponding author to maintain the privacy of the participants.

Acknowledgments

The authors would like to gratefully acknowledge the Centre for Data for Public Good (CDPG), India, for their support through multiple discussions on practical ways of pricing data. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest. The funding sponsors had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in the manuscript:
AI: Artificial Intelligence
API: Application Programming Interface
CDISC: Clinical Data Interchange Standards Consortium
CODCA: Consumer-Oriented Data Control and Auditability
CEO: Chief Executive Officer
CTO: Chief Technology Officer
DCAT: Data Catalog Vocabulary
DQ: Data Quality
ESG: Environmental, Social, and Governance
IoT: Internet of Things
IT: Information Technology
ML: Machine Learning
OECD: Organisation for Economic Co-operation and Development
SLA: Service Level Agreement
SPIRIT 2013: Standard Protocol Items: Recommendations for Interventional Trials 2013

Appendix A

Appendix A is retained to demonstrate the conceptual breadth of the data quality literature; however, this study operationalises only the most user-salient dimensions, evidencing a move from theoretical comprehensiveness to practical sufficiency.
Table A1. Data quality dimensions.
Sources surveyed (columns of the original matrix): Fisher and Kingma, 2001 [55]; Pipino et al., 2002 [13]; Parssian et al., 2004 [56]; Shankarnarayanan and Chi, 2006 [57]; Batini et al., 2009 [58]; Sidi et al., 2009 [59]; Even et al., 2010 [60]; Moges et al., 2013 [61]; Woodall et al., 2013 [62]; Behkamal et al., 2014 [12]; Becker et al., 2015 [63]; Batini et al., 2015 [64]; Rao et al., 2015 [65]; Guo and Liu, 2015 [66]; Taleb et al., 2016 [67]; Vetro et al., 2016 [68]; Taleb and Serhani, 2017 [69]; Zhang et al., 2018 [33]; Jesiļevska, 2017 [70]; Färber et al., 2018 [71]; Ardagna et al., 2018 [72]; Cichy and Rass, 2019 [73]; Byabazaire et al., 2020 [74]; Gómez-Omella et al., 2022 [18]; Makhoul, 2022 [75].
Dimensions, Part A: Accessibility; Accuracy; Alignment; Believability; Completeness; Compactness; Concise Representation; Correctness; Consistency; Clarity; Cohesion; Confidentiality; Conformity; Distinctness; Ease of Manipulation; Free of Error; Interpretability; Minimality; Objectivity.
Dimensions, Part B: Precision; Pertinence; Pedigree; Relevance; Reliability; Reputation; Redundancy; Simplicity; Security; Timeliness; Traceability; Trust; Understandability; Usability; Uniqueness; Value-Added; Validity; Volume.
Note: In the original table, ✔ indicates that the corresponding source considers the quality dimension.

References

  1. Wang, R.Y.; Strong, D.M. Beyond Accuracy: What Data Quality Means to Data Consumers. J. Manag. Inf. Syst. 1996, 12, 5–33. [Google Scholar] [CrossRef]
  2. Reinsel, D.; Gantz, J.; Rydning, J. The Digitization of the World From Edge to Core. Framingham: International Data Corporation 2018, 16, 1–28. [Google Scholar]
  3. Richter, H.; Slowinski, P.R. The Data Sharing Economy: On the Emergence of New Intermediaries. IIC Int. Rev. Intellect. Prop. Compet. Law 2019, 50, 4–29. [Google Scholar] [CrossRef]
  4. Zhang, M.; Beltran, F.; Liu, J. A Survey of Data Pricing for Data Marketplaces. IEEE Trans. Big Data 2023, 9, 1038–1056. [Google Scholar] [CrossRef]
  5. Koutroumpis, P.; Leiponen, A.; Thomas, L.D.W. Markets for Data. Ind. Corp. Change 2020, 29, 645–660. [Google Scholar] [CrossRef]
  6. OECD. Smart City Data Governance; OECD: Paris, France, 2023; ISBN 9789264910201. [Google Scholar]
  7. Johnson, J.; Hevia, A.; Yergin, R.; Karbassi, S.; Levine, A. Data Governance Frameworks for Smart Cities: Key Considerations for Data Management and Use; Covington: Palo Alto, CA, USA, 2022; Volume 2022. [Google Scholar]
  8. Gerli, P.; Mora, L.; Neves, F.; Rocha, D.; Nguyen, H.; Maio, R.; Jansen, M.; Serale, F.; Albin, C.; Biri, F.; et al. World Smart Cities Outlook 2024; ESCAP: Bangkok, Thailand, 2024. [Google Scholar]
  9. Urbinati, A.; Bogers, M.; Chiesa, V.; Frattini, F. Creating and Capturing Value from Big Data: A Multiple-Case Study Analysis of Provider Companies. Technovation 2019, 84–85, 21–36. [Google Scholar] [CrossRef]
  10. Ghosh, D.; Garg, S. India’s Data Imperative: The Pivot towards Quality; 2025. Available online: https://www.niti.gov.in/sites/default/files/2025-06/FTH-Quaterly-Insight-june.pdf (accessed on 27 August 2025).
  11. Batini, C.; Scannapieca, M. Data Quality: Concepts, Methodologies and Techniques; Springer: Berlin/Heidelberg, Germany, 2006. [Google Scholar]
  12. Behkamal, B.; Kahani, M.; Bagheri, E.; Jeremic, Z. A Metrics-Driven Approach for Quality Assessment of Linked Open Data. J. Theor. Appl. Electron. Commer. Res. 2014, 9, 64–79. [Google Scholar] [CrossRef]
  13. Pipino, L.L.; Lee, Y.W.; Wang, R.Y. Data Quality Assessment. Commun. ACM 2002, 45, 211–218. [Google Scholar] [CrossRef]
  14. Bovee, M.; Srivastava, R.P.; Mak, B. A Conceptual Framework and Belief-Function Approach to Assessing Overall Information Quality. Int. J. Intell. Syst. 2003, 18, 51–74. [Google Scholar] [CrossRef]
  15. Ehrlinger, L.; Werth, B.; Wöß, W. Automated Continuous Data Quality Measurement with QuaIIe. Int. J. Adv. Softw. 2018, 11, 400–417. [Google Scholar]
  16. Chug, S.; Kaushal, P.; Kumaraguru, P.; Sethi, T. Statistical Learning to Operationalize a Domain Agnostic Data Quality Scoring. arXiv 2021, arXiv:2108.08905. [Google Scholar] [CrossRef]
  17. Liu, Q.; Feng, G.; Tayi, G.K.; Tian, J. Managing Data Quality of the Data Warehouse: A Chance-Constrained Programming Approach. Inf. Syst. Front. 2021, 23, 375–389. [Google Scholar] [CrossRef]
  18. Gómez-Omella, M.; Sierra, B.; Ferreiro, S. On the Evaluation, Management and Improvement of Data Quality in Streaming Time Series. IEEE Access 2022, 10, 81458–81475. [Google Scholar] [CrossRef]
  19. Taleb, I.; Serhani, M.A.; Bouhaddioui, C.; Dssouli, R. Big Data Quality Framework: A Holistic Approach to Continuous Quality Management. J. Big Data 2021, 8, 76. [Google Scholar] [CrossRef]
  20. Alwan, A.A.; Ciupala, M.A.; Brimicombe, A.J.; Ghorashi, S.A.; Baravalle, A.; Falcarin, P. Data Quality Challenges in Large-Scale Cyber-Physical Systems: A Systematic Review. Inf. Syst. 2022, 105, 101951. [Google Scholar] [CrossRef]
  21. Sadiq, S.; Indulska, M. Open Data: Quality over Quantity. Int. J. Inf. Manag. 2017, 37, 150–154. [Google Scholar] [CrossRef]
  22. Cai, L.; Zhu, Y. The Challenges of Data Quality and Data Quality Assessment in the Big Data Era. Data Sci. J. 2015, 14, 2. [Google Scholar] [CrossRef]
  23. Abedjan, Z.; Golab, L.; Naumann, F. Data Profiling. In Proceedings of the 2017 ACM International Conference on Management of Data, Chicago, IL, USA, 14–19 May 2017; ACM: New York, NY, USA, 2017; pp. 1747–1751. [Google Scholar]
  24. Podobnikar, T. Bridging Perceived and Actual Data Quality: Automating the Framework for Governance Reliability. Geosciences 2025, 15, 117. [Google Scholar] [CrossRef]
  25. Tapsell, J.; Akram, R.N.; Markantonakis, K. Consumer Centric Data Control, Tracking and Transparency—A Position Paper. In Proceedings of the 2018 17th IEEE International Conference on Trust, Security and Privacy in Computing and Communications/12th IEEE International Conference on Big Data Science and Engineering (TrustCom/BigDataSE), New York, NY, USA, 1–3 August 2018. [Google Scholar]
  26. Acev, D.; Biyani, S.; Rieder, F.; Aldenhoff, T.T.; Blazevic, M.; Riehle, D.M.; Wimmer, M.A. Systematic Analysis of Data Governance Frameworks and Their Relevance to Data Trusts. In Management Review Quarterly; Springer: Berlin/Heidelberg, Germany, 2025. [Google Scholar] [CrossRef]
  27. Otto, B.; Jarke, M. Designing a Multi-Sided Data Platform: Findings from the International Data Spaces Case. Electron. Mark. 2019, 29, 561–580. [Google Scholar] [CrossRef]
  28. Veltri, G.A.; Lupiáñez-Villanueva, F.; Folkvord, F.; Theben, A.; Gaskell, G. The Impact of Online Platform Transparency of Information on Consumers’ Choices. Behav. Public Policy 2023, 7, 55–82. [Google Scholar] [CrossRef]
  29. Ducuing, C.; Reich, R.H. Data Governance: Digital Product Passports as a Case Study. Compet. Regul. Netw. Ind. 2023, 24, 3–23. [Google Scholar] [CrossRef]
  30. Agahari, W.; Ofe, H.; de Reuver, M. It Is Not (Only) about Privacy: How Multi-Party Computation Redefines Control, Trust, and Risk in Data Sharing. Electron. Mark. 2022, 32, 1577–1602. [Google Scholar] [CrossRef]
  31. Yu, H.; Zhang, M. Data Pricing Strategy Based on Data Quality. Comput. Ind. Eng. 2017, 112, 1–10. [Google Scholar] [CrossRef]
  32. Yang, J.; Zhao, C.; Xing, C. Big Data Market Optimization Pricing Model Based on Data Quality. Complexity 2019, 2019, 1–10. [Google Scholar] [CrossRef]
  33. Zhang, D.; Wang, H.; Ding, X.; Zhang, Y.; Li, J.; Gao, H. On the Fairness of Quality-Based Data Markets. arXiv 2018, arXiv:1808.01624. [Google Scholar] [CrossRef]
  34. Pei, J. A Survey on Data Pricing: From Economics to Data Science. IEEE Trans. Knowl. Data Eng. 2022, 34, 4586–4608. [Google Scholar] [CrossRef]
  35. Miao, X.; Peng, H.; Huang, X.; Chen, L.; Gao, Y.; Yin, J. Modern Data Pricing Models: Taxonomy and Comprehensive Survey. arXiv 2023, arXiv:2306.04945. [Google Scholar] [CrossRef]
  36. Majumdar, R.; Gurtoo, A.; Maileckal, M. Developing a Data Pricing Framework for Data Exchange. Future Bus. J. 2025, 11, 4. [Google Scholar] [CrossRef]
  37. Malieckal, M.; Gurtoo, A.; Majumdar, R. Data Pricing for Data Exchange: Technology and AI. In Proceedings of the 2024 7th Artificial Intelligence and Cloud Computing Conference, Tokyo, Japan, 14–16 December 2024; Association for Computing Machinery: New York, NY, USA, 2025; pp. 392–401. [Google Scholar]
  38. Chan, A.W.; Tetzlaff, J.M.; Gøtzsche, P.C.; Altman, D.G.; Mann, H.; Berlin, J.A.; Dickersin, K.; Hróbjartsson, A.; Schulz, K.F.; Parulekar, W.R.; et al. SPIRIT 2013 Explanation and Elaboration: Guidance for Protocols of Clinical Trials. BMJ 2013, 346, e7586. [Google Scholar] [CrossRef]
  39. Juddoo, S.; George, C.; Duquenoy, P.; Windridge, D. Data Governance in the Health Industry: Investigating Data Quality Dimensions within a Big Data Context. Appl. Syst. Innov. 2018, 1, 43. [Google Scholar] [CrossRef]
  40. Facile, R.; Muhlbradt, E.E.; Gong, M.; Li, Q.; Popat, V.; Pétavy, F.; Cornet, R.; Ruan, Y.; Koide, D.; Saito, T.I.; et al. Use of Clinical Data Interchange Standards Consortium (CDISC) Standards for Real-World Data: Expert Perspectives From a Qualitative Delphi Survey. JMIR Med. Inform. 2022, 10, e30363. [Google Scholar] [CrossRef] [PubMed]
  41. Aggarwal, R.; Farag, S.; Martin, G.; Ashrafian, H.; Darzi, A. Patient Perceptions on Data Sharing and Applying Artificial Intelligence to Health Care Data: Cross-Sectional Survey. J. Med. Internet Res. 2021, 23, e26162. [Google Scholar] [CrossRef]
  42. Miller, R.; Whelan, H.; Chrubasik, M.; Whittaker, D.; Duncan, P.; Gregório, J. A Framework for Current and New Data Quality Dimensions: An Overview. Data 2024, 9, 151. [Google Scholar] [CrossRef]
  43. Etheredge, L.M. A Rapid-Learning Health System. Health Aff. 2007, 26, w107–w118. [Google Scholar] [CrossRef] [PubMed]
  44. Kaye, J.; Whitley, E.A.; Lund, D.; Morrison, M.; Teare, H.; Melham, K. Dynamic Consent: A Patient Interface for Twenty-First Century Research Networks. Eur. J. Hum. Genet. 2015, 23, 141–146. [Google Scholar] [CrossRef]
  45. Hoare, C.A.R. Data Reliability. ACM Sigplan Not. 1975, 10, 528–533. [Google Scholar] [CrossRef]
  46. Chapple, J.N. Business Systems Techniques: For the Systems Professional; Longman: London, UK, 1976. [Google Scholar]
  47. Fox, C.; Levitin, A.; Redman, T. The Notion of Data and Its Quality Dimensions. Inf. Process Manag. 1994, 30, 9–19. [Google Scholar] [CrossRef]
  48. Ballou, D.P.; Pazer, H.L. Modeling Data and Process Quality in Multi-Input, Multi-Output Information Systems. Manag. Sci. 1985, 31, 150–162. [Google Scholar] [CrossRef]
  49. Amicis, F.; Barone, D.; Batini, C. An Analytical Framework to Analyze Dependencies Among Data Quality Dimensions. In Proceedings of the 2006 International Conference on Information Quality, ICIQ 2006, Cambridge, MA, USA, 10–12 November 2006; pp. 369–383. [Google Scholar]
  50. Chatfield, A.T.; Reddick, C.G. A Framework for Internet of Things-Enabled Smart Government: A Case of IoT Cybersecurity Policies and Use Cases in U.S. Federal Government. Gov. Inf. Q. 2019, 36, 346–357. [Google Scholar] [CrossRef]
  51. Guest, G.; Bunce, A.; Johnson, L. How Many Interviews Are Enough? An Experiment with Data Saturation and Variability. Field Methods 2006, 18, 59–82. [Google Scholar] [CrossRef]
  52. Hennink, M.M.; Kaiser, B.N.; Marconi, V.C. Code Saturation Versus Meaning Saturation: How Many Interviews Are Enough? Qual. Health Res. 2016, 27, 591–608. [Google Scholar] [CrossRef]
  53. Mortier, R.; Haddadi, H.; Henderson, T.; McAuley, D.; Crowcroft, J. Human-data interaction: The human face of the data-driven society. arXiv 2014, arXiv:1412.6159. [Google Scholar] [CrossRef]
  54. Sailaja, N.; Jones, R.; McAuley, D. Designing for human data interaction in data-driven media experiences. In Proceedings of the Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, Yokohama, Japan, 8–13 May 2021; pp. 1–7. [Google Scholar]
  55. Fisher, C.W.; Kingma, B.R. Criticality of Data Quality as Exemplified in Two Disasters. Inf. Manag. 2001, 39, 109–116. [Google Scholar] [CrossRef]
  56. Parssian, A.; Sarkar, S.; Jacob, V.S. Assessing Data Quality for Information Products: Impact of Selection, Projection, and Cartesian Product. Manag. Sci. 2004, 50, 967–982. [Google Scholar] [CrossRef]
  57. Shankaranarayanan, G.; Cai, Y. Supporting Data Quality Management in Decision-Making. Decis. Support Syst. 2006, 42, 302–317. [Google Scholar] [CrossRef]
  58. Batini, C.; Cappiello, C.; Francalanci, C.; Maurino, A. Methodologies for Data Quality Assessment and Improvement. ACM Comput. Surv. 2009, 41, 1–52. [Google Scholar] [CrossRef]
  59. Sidi, F.; Shariat Panahy, P.H.; Affendey, L.S.; Jabar, M.A.; Ibrahim, H.; Mustapha, A. Data Quality: A Survey of Data Quality Dimensions. In Proceedings of the 2012 International Conference on Information Retrieval & Knowledge Management, Kuala Lumpur, Malaysia, 13–15 March 2012; IEEE: New York, NY, USA, 2012; pp. 300–304. [Google Scholar]
  60. Even, A.; Shankaranarayanan, G.; Berger, P.D. Evaluating a Model for Cost-Effective Data Quality Management in a Real-World CRM Setting. Decis. Support. Syst. 2010, 50, 152–163. [Google Scholar] [CrossRef]
  61. Moges, H.-T.; Dejaeger, K.; Lemahieu, W.; Baesens, B. A Multidimensional Analysis of Data Quality for Credit Risk Management: New Insights and Challenges. Inf. Manag. 2013, 50, 43–58. [Google Scholar] [CrossRef]
  62. Woodall, P.; Koronios, A. An Investigation of How Data Quality Is Affected by Dataset Size in the Context of Big Data Analytics. In Proceedings of the International Conference on Information Quality, Xi’an, China, 1–3 August 2014. [Google Scholar]
  63. Becker, D.; King, T.D.; McMullen, B. Big Data, Big Data Quality Problem. In Proceedings of the 2015 IEEE International Conference on Big Data (Big Data), Santa Clara, CA, USA, 29 October–1 November 2015; IEEE: New York, NY, USA, 2015. [Google Scholar]
  64. Batini, C.; Rula, A.; Scannapieco, M.; Viscusi, G. From Data Quality to Big Data Quality. J. Database Manag. 2015, 26, 60–82. [Google Scholar] [CrossRef]
  65. Rao, D.; Gudivada, V.N.; Raghavan, V.V. Data Quality Issues in Big Data. In Proceedings of the 2015 IEEE International Conference on Big Data (Big Data), Santa Clara, CA, USA, 29 October–1 November 2015; IEEE: New York, NY, USA, 2015. [Google Scholar]
  66. Guo, J.; Liu, F. Automatic Data Quality Control of Observations in Wireless Sensor Network. IEEE Geosci. Remote Sens. Lett. 2015, 12, 716–720. [Google Scholar] [CrossRef]
  67. Taleb, I.; Kassabi, H.T.E.; Serhani, M.A.; Dssouli, R.; Bouhaddioui, C. Big Data Quality: A Quality Dimensions Evaluation. In Proceedings of the 2016 Intl IEEE Conferences on Ubiquitous Intelligence & Computing, Advanced and Trusted Computing, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People, and Smart World Congress (UIC/ATC/ScalCom/CBDCom/IoP/SmartWorld), Toulouse, France, 18–21 July 2016. [Google Scholar]
  68. Vetrò, A.; Canova, L.; Torchiano, M.; Minotas, C.O.; Iemma, R.; Morando, F. Open Data Quality Measurement Framework: Definition and Application to Open Government Data. Gov. Inf. Q. 2016, 33, 325–337. [Google Scholar] [CrossRef]
  69. Taleb, I.; Serhani, M.A. Big Data Pre-Processing: Closing the Data Quality Enforcement Loop. In Proceedings of the 2017 IEEE 6th International Congress on Big Data, BigData Congress 2017, Honolulu, HI, USA, 25–30 June 2017; Institute of Electrical and Electronics Engineers Inc.: New York, NY, USA, 2017; pp. 498–501. [Google Scholar]
  70. Jesiļevska, S. Data Quality Dimensions to Ensure Optimal Data Quality. Rom. Econ. J. 2017, 20, 89–103. [Google Scholar]
  71. Färber, M.; Bartscherer, F.; Menne, C.; Rettinger, A. Linked Data Quality of DBpedia, Freebase, OpenCyc, Wikidata, and YAGO. Semant. Web 2018, 9, 77–129. [Google Scholar] [CrossRef]
  72. Ardagna, D.; Cappiello, C.; Samá, W.; Vitali, M. Context-Aware Data Quality Assessment for Big Data. Future Gener. Comput. Syst. 2018, 89, 548–562. [Google Scholar] [CrossRef]
  73. Cichy, C.; Rass, S. An Overview of Data Quality Frameworks. IEEE Access 2019, 7, 24634–24648. [Google Scholar] [CrossRef]
  74. Byabazaire, J.; O’Hare, G.; Delaney, D. Using trust as a measure to derive data quality in data shared IoT deployments. In Proceedings of the 2020 29th International Conference on Computer Communications and Networks (ICCCN), Honolulu, HI, USA, 3–6 August 2020; pp. 1–9. [Google Scholar]
  75. Makhoul, N. Review of Data Quality Indicators and Metrics, and Suggestions for Indicators and Metrics for Structural Health Monitoring. Adv. Bridge Eng. 2022, 3, 17. [Google Scholar] [CrossRef]
Figure 1. Coding structure employed.
Table 1. Summary of the literature studied.
Domain of Analysis | Objectives and Benefits | Barriers and Problems | References
DQ in Platforms in India (including Smart City and Real-Time Systems) | Improved access, usability, traceability; real-time, high-quality data for decision-making; context-specific governance and consumer empowerment | Fragmented datasets, poor documentation, limited consumer input; complex real-time governance; regulatory immaturity and disempowerment | Richter & Slowinski 2018 [3]; Koutroumpis et al., 2020 [5]; Ghosh & Garg 2025 [10]
DQ Assessment Frameworks and Methods | Multidimensional models; scalable, automated tools; real-time assessment adaptable to fragmented ecosystems | Limited real-world validation; provider-centric; requires structured datasets; consumer-friendly tools are largely missing | Wang & Strong 1996 [1]; Batini & Scannapieco 2006 [11]; Behkamal et al., 2014 [12]; Pipino et al., 2002 [13]; Bovee et al., 2003 [14]; Ehrlinger et al., 2018 [15]; Chug et al., 2021 [16]; Liu et al., 2021 [17]; Gómez-Omella et al., 2022 [18]; Taleb et al., 2021 [19]; Zhang et al., 2023 [4]; Alwan et al., 2022 [20]; Sadiq and Indulska, 2017 [21]; Cai and Zhu, 2015 [22]; Abedjan et al., 2016 [23]
Consumer-Centric Governance in Data Platforms | Traceability, fairness, transparency, participatory governance, consumer trust | Weak operationalisation, limited consumer influence, regulatory immaturity | Podobnikar 2025 [24]; Tapsell et al., 2018 [25]; Acev et al., 2025 [26]; Otto & Jarke 2019 [27]; Veltri et al., 2020 [28]; Ducuing & Reich 2023 [29]; Agahari et al., 2022 [30]
Monetisation Models and Consumer Participation | Quality-driven, tiered pricing linked to consumer willingness to pay | Pricing opacity, inconsistent service quality, limited research on consumer preferences | Yu and Zhang, 2017 [31]; Yang et al., 2019 [32]; Zhang et al., 2018 [33]; Pei 2020 [34]; Miao et al., 2023 [35]; Zhang et al., 2023 [4]; Majumdar et al., 2024 [36]; Malieckal et al., 2024 [37]
Clinical and Business Protocols | Interoperability, continuous improvement, user-centred governance and consent | Regulatory complexity, sector specificity, limited applicability to open platforms | Chan et al., 2017 [38]; Juddoo et al., 2018 [39]; Facile et al., 2022 [40]; Aggarwal et al., 2021 [41]; Miller et al., 2024 [42]; Etheredge 2007 [43]; Kaye et al., 2015 [44]
Table 2. Participant profiles.
Case ID | Role | Sector | Organization Type
P1 | Partner and Tech Evangelist | IT Consulting | Large Firm
P2 | Director | Patent Analytics | Mid-Size Enterprise
P3 | Product Owner | Software | Multinational Corporation
P4 | Vice President | Independent Consultant | Independent
P5 | Senior Data Scientist | Cross-Industry (Retail, Manufacturing) | Independent Consultant
P6 | CEO | Technology | Private Firm
P7 | Data Architect | Urban Transport Analytics | Government-Affiliated
P8 | Quantitative Analyst | Financial Services | Multinational Corporation
P9 | CEO | ESG and Workplace Strategy | Consulting Firm
P10 | Founder Director | Computer Science and Technology | Startup
P11 | Statistical Modelling Specialist | Oil and Gas Consulting | Consulting Firm
P12 | Knowledge Management Consultant | Infrastructure | Large Firm
P13 | Senior Manager | Banking | Multinational Corporation
P14 | Consulting Partner | Financial Analytics | Large IT Firm
P15 | Program Manager | Retail and Consumer Goods | IT Services Company
P16 | Product Head for Traffic Solutions | Smart Cities/Urban Mobility | Private Firm
P17 | CEO | Healthcare | Startup
P18 | Founder | Finance and Law | Non-Profit
P19 | Founder | Artificial Intelligence | Startup
P20 | Founder | Mental Health | Startup
P21 | Founder | Human Resources | Startup
P22 | Head of Delivery Excellence | Cloud/Tech | Multinational Corporation
P23 | CEO, Co-founder | Sustainability | Social Enterprise
P24 | Opportunities Evaluation Manager | Oil and Gas | Multinational Corporation
P25 | Co-Founder and CTO | Banking/FinTech | Startup
Table 3. First-level coding.
Quality Theme | Key Concepts and Keywords | Participants | Number of Extracts
Fit-for-Purpose Quality: Needs Expressed by Data Consumers | Consistency in Formats, Data Accuracy, Relevance to Task, Volume, Freshness, Frequency of Updating, Credibility, Completeness, Expert Validation, Documentation, API Documentation, Metadata, Data Handbook, Lineage | 23 (92%) | 92
Usability and Format Frictions: Common Problems Encountered | Missing Data, Time Spent on Data Cleaning, Faulty Instruments, Data Duplication, Multi-Format Inflows (data/file), Lack of Documentation | 19 (76%) | 52
Transparency, Traceability, and Governance: Consumer Expectations and Platform Requirements | Certification, Authentication, Provider-Side Quality Checks, Establishing Common Standards, Version Control, Source Traceability, Transformation Logs | 10 (40%) | 22
Monetisation and Willingness to Pay: Pricing Concerns | Willingness to Pay Linked to Better Quality, High-Quality Data Essential for Monetization | 3 (12%) | 3
Note: Percentages represent the proportion of participants who mentioned each theme and do not sum to 100%, as a single participant may have contributed to multiple themes.
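The participant percentages in Table 3 are simple coverage ratios: the number of distinct interviewees contributing at least one extract to a theme, divided by the 25 participants. The sketch below shows one way such a tally could be computed; the extract records and field names are illustrative placeholders, not the format of the authors' NVivo export.

```python
from collections import defaultdict

# Each coded extract links one participant to one first-level theme.
# These records are illustrative placeholders, not the study's actual data.
extracts = [
    {"participant": "P1", "theme": "Fit-for-Purpose Quality"},
    {"participant": "P1", "theme": "Usability and Format Frictions"},
    {"participant": "P7", "theme": "Transparency, Traceability, and Governance"},
    {"participant": "P25", "theme": "Monetisation and Willingness to Pay"},
]

TOTAL_PARTICIPANTS = 25  # interviewees across the 25 organisations


def theme_summary(extracts, total=TOTAL_PARTICIPANTS):
    """Distinct participants, coverage %, and extract counts per theme (cf. Table 3)."""
    people = defaultdict(set)
    n_extracts = defaultdict(int)
    for e in extracts:
        people[e["theme"]].add(e["participant"])
        n_extracts[e["theme"]] += 1
    return {
        theme: {
            "participants": len(ids),
            "coverage_pct": round(100 * len(ids) / total),
            "extracts": n_extracts[theme],
        }
        for theme, ids in people.items()
    }


print(theme_summary(extracts))
```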
Table 4. Second-level coding.
Code | Subcode | Keywords and Concepts | Participants | Number of Extracts
Fit-for-Purpose Quality | Contextual Relevance | Accuracy, Consistency, Data Hygiene, Usefulness, Timeliness, Completeness, Version Control, Metadata, Standards, Lineage, Transparency | 23 | 80
Fit-for-Purpose Quality | Temporal Dimensions | Freshness, Relevance, Timeliness, Real-Time Data, Historical Data, Mission-Critical | 9 | 12
Fit-for-Purpose Quality | Credibility Mechanisms | Authenticity, Standardization, Traceability, Veracity, Certification, Validation | 7 | 16
Usability and Format Frictions | Cleaning Burden | Data Cleaning, Missing Values, Formatting Issues, Time-Consuming, Accuracy, Duplication, Multi-Format Data, Resource Drain, Value Addition | 10 | 21
Usability and Format Frictions | Integration Issues | Data Formats, Synchronization, Interoperability, Labelling, Vendor Lock-In, Standardization, Scalability, Harmonization, IoT Limitations | 9 | 19
Usability and Format Frictions | Documentation Gaps | Metadata, Standardization, Missing Information, Data Culture, Unstructured Data, Sampling Issues, Publication Details, Supplementary Fields | 9 | 12
Transparency, Traceability, and Governance | Verification Needs | Authentication, Certification, Triangulation, Data Validation, Trust, Credentials | 5 | 6
Transparency, Traceability, and Governance | Traceability Requirements | Version Control, Data Lineage, Processing History, Transparency, Documentation, Relevance | 2 | 3
Transparency, Traceability, and Governance | Standardization Gaps | Multiple Standards, Interoperability, DQ, Sector-Specific, Stakeholder Collaboration, Metadata Standards, Market Scalability | 6 | 13
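Because the second-level codes in Table 4 nest under the first-level themes, the scheme can be expressed as a small hierarchy. The sketch below reproduces that structure (themes, subcodes, and abridged keyword lists only; counts omitted) as a convenience representation, not the authors' NVivo codebook format.

```python
# Second-level coding scheme of Table 4 as a nested mapping:
# theme -> subcode -> representative keywords (abridged).
CODEBOOK = {
    "Fit-for-Purpose Quality": {
        "Contextual Relevance": ["Accuracy", "Consistency", "Timeliness", "Metadata", "Lineage"],
        "Temporal Dimensions": ["Freshness", "Real-Time Data", "Historical Data"],
        "Credibility Mechanisms": ["Authenticity", "Traceability", "Certification", "Validation"],
    },
    "Usability and Format Frictions": {
        "Cleaning Burden": ["Data Cleaning", "Missing Values", "Duplication"],
        "Integration Issues": ["Data Formats", "Interoperability", "Vendor Lock-In"],
        "Documentation Gaps": ["Metadata", "Unstructured Data", "Sampling Issues"],
    },
    "Transparency, Traceability, and Governance": {
        "Verification Needs": ["Authentication", "Certification", "Trust"],
        "Traceability Requirements": ["Version Control", "Data Lineage", "Processing History"],
        "Standardization Gaps": ["Multiple Standards", "Metadata Standards"],
    },
}

# Example query: which subcodes touch on lineage or processing history?
for theme, subcodes in CODEBOOK.items():
    for subcode, keywords in subcodes.items():
        if any(k in ("Lineage", "Data Lineage", "Processing History") for k in keywords):
            print(f"{theme} -> {subcode}")
```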
Table 5. Theme engagement rating criteria.
Intensity | Criteria
High | ≥3 participants from the domain, with detailed or multi-point discussion (several sub-themes addressed)
Medium | 1–2 participants with moderate-to-deep engagement and moderate diversity
Low | 1 participant with a brief mention, or 2+ participants but only surface-level comments (minimal diversity/detail)
None | No participants from the domain discussed the theme
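The rubric in Table 5 amounts to a decision rule over two inputs: how many participants from a domain discussed the theme, and how deep the discussion was. The sketch below encodes that rule; the depth labels ("detailed", "moderate", "surface") are assumed simplifications, since the study applies the rubric qualitatively.

```python
def engagement_rating(num_participants: int, depth: str) -> str:
    """Apply the Table 5 rubric to one domain-theme pair.

    depth is a qualitative judgement of the discussion: "detailed"
    (multi-point, several sub-themes), "moderate", or "surface".
    """
    if num_participants == 0:
        return "None"
    if num_participants >= 3 and depth == "detailed":
        return "High"
    if 1 <= num_participants <= 2 and depth in ("moderate", "detailed"):
        return "Medium"
    # Remaining cases: a single brief mention, several surface-level comments,
    # or combinations the rubric leaves implicit; rated conservatively as Low.
    return "Low"


# Example: two finance-sector participants with moderate engagement (cf. Table 14).
print(engagement_rating(2, "moderate"))   # -> Medium
print(engagement_rating(1, "surface"))    # -> Low
```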
Table 12. Participant statements on Platform Transparency, Traceability, and Governance.
Statement | Participant
“I need the certification of the data authentication of the data.” | Cloud/Tech (P22)
“standardization is needed to scale and explore and move into the market quickly or develop the market quickly.” | Urban Transport Analytics (P7)
“the verification of the data is the most important thing.” | ESG and Workplace Strategy (P9)
“Checking data quality becomes easier based on fixed parameters of data quality, because those can only be defined data.” | Finance and Law Non-Profit (P18)
“Is not easy because if you take any domain, not just one standard, even if you see sensor for sensors only, at least I have come across more than three standards.” | Software Multinational Corporation (P3)
“data standardization and the normalization are another challenge that you know I’ve been facing for in case of a data integration kind of thing” | Smart Cities/Urban Mobility (P16)
“now that becomes challenging, maintaining that entire, uh, like the history of the version control of the data and understanding what are all the changes that that has gone through over the course in time.” | Finance and Law Non-Profit (P18)
“One thing which when we’re talking about version control of the data, one thing which people do often tend to ignore is the sort of processing that has gone from the raw to the finalized data.” | Finance and Law Non-Profit (P18)
Table 13. Sector-wise platform requirements.
Sector | Participants (IDs) | Core Expectations
IT/Tech and Consulting | P1, P22 | Provider-side certification (e.g., API governance checks)
Finance and Legal | P4, P14 | Version control, transformation logs for auditability
Urban/Smart Cities | P7, P12 | Adoption of DCAT standards, consortium-led quality parameters
Sustainability | P9, P23 | Methodology transparency, source validation protocols
Healthcare | P10 | Raw-to-finalized data processing history
Cross-Industry | P5 | Lineage tracking for regulatory compliance
Table 14. Sector-wise engagement rating for Theme 3.
Sector | Participants (IDs) | Depth/Diversity | Rating
IT/Tech and Consulting | P1, P22 | Certification, API governance checks | Medium
Finance and Legal | P4, P14 | Version control, audit logs, compliance | Medium
Urban/Smart Cities | P7, P12 | Standards (DCAT), quality parameters | Medium
Sustainability | P9, P23 | Methodology transparency, source validation | Medium
Healthcare | P10 | Data processing history | Low
Cross-Industry | P5 | Lineage tracking | Low
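Across Tables 12, 13 and 14, the recurring platform expectations are version control, source traceability, and a visible raw-to-finalized processing history. The sketch below illustrates what a lineage record exposed to data consumers could look like; the field names, dataset identifiers, and hashing choice are assumptions for illustration, not a standard referenced by the participants.

```python
import hashlib
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


@dataclass
class TransformationStep:
    """One entry in a dataset's raw-to-finalized processing history."""
    description: str   # e.g., "dropped rows with missing sensor IDs"
    tool: str          # e.g., "pandas 2.2"
    performed_at: str  # UTC timestamp, ISO 8601


@dataclass
class LineageRecord:
    """Illustrative lineage metadata a data consumer could inspect before use."""
    dataset_id: str
    version: str
    source: str                          # original provider, for source traceability
    steps: list = field(default_factory=list)

    def add_step(self, description: str, tool: str) -> None:
        self.steps.append(TransformationStep(
            description=description,
            tool=tool,
            performed_at=datetime.now(timezone.utc).isoformat(),
        ))

    def fingerprint(self) -> str:
        """Hash of the full record so consumers can verify it has not changed."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


# Hypothetical dataset identifier and processing steps, for illustration only.
record = LineageRecord("traffic-counts-city", "v2.1", "municipal sensor network")
record.add_step("removed duplicate timestamps", "pandas 2.2")
record.add_step("aggregated counts to 15-minute intervals", "pandas 2.2")
print(record.version, record.fingerprint())
```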
