Determinants of Data Quality Dimensions for Assessing Highway Infrastructure Data Using Semiotic Framework
Abstract
1. Introduction
2. Objectives
- To establish the data quality dimensions necessary for determining the data quality of highway infrastructure data to facilitate effective decision-making.
- To determine the importance of data quality dimensions at each level of decision-making.
- To determine the priority of dimensions within the semiotic framework categories.
3. Literature Review
3.1. Data and Data Quality
3.2. Data Quality Assessment Framework
3.3. Semiotic Framework
4. Methodology
- Step 1: Identification of data quality dimensions of highway infrastructure data using the semiotic framework.
- Step 2: Data Collection
- Step 3: Data Analysis
4.1. Identification of Critical Data Quality Dimensions of Highway Infrastructure Data
4.2. Importance of Data Quality Dimensions at Respective Decision-Making Levels
4.3. Ranking of Data Quality Dimensions within the Semiotic Framework
5. Results and Discussion
5.1. Critical Dimensions at Each Level of Decision-Making Hierarchy
5.1.1. Strategic Level
5.1.2. Network Level
5.1.3. Program Level
5.1.4. Project Selection Level
5.1.5. Project Level
5.2. Ranking of Dimensions within the Semiotic Framework Categories
5.2.1. Syntactics Category
5.2.2. Empiric Category
5.2.3. Semantic Category
5.2.4. Pragmatic Category
6. Conclusions
Practical Engineering and Real-World Applications of Semiotic Framework
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Press Information Bureau. NHAI Becomes the First Construction Sector Organisation to Go Fully Digital. 2020. Available online: https://pib.gov.in/indexd.aspx (accessed on 12 June 2020).
- Snyder, J.; Menard, A.; Spare, N. Big Data = Big Questions for the Engineering and Construction Industry; White Paper; First Myanmar Investment (FMI): Yangon, Myanmar, 2019.
- Thomas, E.; Schott, P.; Bowman, J.; Snyder, J.; Spare, N. Construction Disconnected: Rethinking the Management of Project Data and Mobile Collaboration to Reduce Costs and Improve Schedules; Plan Grid; First Myanmar Investment (FMI): Yangon, Myanmar, 2018.
- Deibe, D.; Amor, M.; Doallo, R. Big Data Geospatial Processing for Massive Aerial LiDAR Datasets. Remote Sens. 2020, 12, 719.
- Pierce, L.M.; McGovern, G.; Zimmerman, K.A. Practical Guide for Quality Management of Pavement Condition Data Collection; FHWA: Washington, DC, USA, 2013.
- Oh, E.; Lee, H. An Imbalanced Data Handling Framework for Industrial Big Data Using a Gaussian Process Regression-Based Generative Adversarial Network. Symmetry 2020, 12, 669.
- Zhang, Y.; Kim, C.-W.; Zhang, L.; Bai, Y.; Yang, H.; Xu, X.; Zhang, Z. Long Term Structural Health Monitoring for Old Deteriorated Bridges: A Copula-ARMA Approach. Smart Struct. Syst. Int. J. 2020, 25, 285–299.
- Zhang, Z.; Liu, M.; Liu, X.; Wang, X.; Zhang, Y. Model Identification of Durability Degradation Process of Concrete Material and Structure Based on Wiener Process. Int. J. Damage Mech. 2021, 30, 537–558.
- Batini, C.; Rula, A.; Scannapieco, M.; Viscusi, G. From Data Quality to Big Data Quality. J. Database Manag. 2015, 26, 60–82.
- Lee, I. Big Data: Dimensions, Evolution, Impacts, and Challenges. Bus. Horiz. 2017, 60, 293–303.
- Sadiq, S.; Papotti, P. Big Data Quality-Whose Problem Is It? In Proceedings of the IEEE 32nd International Conference on Data Engineering (ICDE), Helsinki, Finland, 16–20 May 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 1446–1447.
- Saha, B.; Srivastava, D. Data Quality: The Other Face of Big Data. In Proceedings of the IEEE 30th International Conference on Data Engineering, Chicago, IL, USA, 31 March–4 April 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 1294–1297.
- Taleb, I.; el Kassabi, H.T.; Serhani, M.A.; Dssouli, R.; Bouhaddioui, C. Big Data Quality: A Quality Dimensions Evaluation. In Proceedings of the 2016 Intl IEEE Conferences on Ubiquitous Intelligence & Computing, Advanced and Trusted Computing, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People, and Smart World Congress (UIC/ATC/ScalCom/CBDCom/IoP/SmartWorld), Toulouse, France, 18–21 July 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 759–765.
- Elouataoui, W.; el Alaoui, I.; el Mendili, S.; Gahi, Y. An Advanced Big Data Quality Framework Based on Weighted Metrics. Big Data Cogn. Comput. 2022, 6, 153.
- Cai, L.; Zhu, Y. The Challenges of Data Quality and Data Quality Assessment in the Big Data Era. Data Sci. J. 2015, 14, 2.
- Ghasemaghaei, M.; Calic, G. Can Big Data Improve Firm Decision Quality? The Role of Data Quality and Data Diagnosticity. Decis. Support Syst. 2019, 120, 38–49.
- Haug, A.; Zachariassen, F.; van Liempd, D. The Costs of Poor Data Quality. J. Ind. Eng. Manag. 2011, 4, 168–193.
- Laranjeiro, N.; Soydemir, S.N.; Bernardino, J. A Survey on Data Quality: Classifying Poor Data. In Proceedings of the IEEE 21st Pacific Rim International Symposium on Dependable Computing (PRDC), Zhangjiajie, China, 18–20 November 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 179–188.
- Sadiq, S.; Yeganeh, K.; Indulska, M. Cross-Disciplinary Collaborations in Data Quality Research. ECIS Proc. 2011, 78, 1–13.
- Sidi, F.; Ishak, I.; Affendey, L.S.; Jaya, M.I.; Suriani Affendey, L.; Jabar, M.A. A Review of Data Quality Research in Achieving High Data Quality Within Organization. J. Theor. Appl. Inf. Technol. 2017, 30, 12.
- Yonke, C.L.; Walenta, C.; Talburt, J.R. The Job of the Information/Data Quality Professional; International Association for Information and Data Quality (IAIDQ): Baltimore, MD, USA, 2011.
- Ballou, D.P.; Pazer, H.L. Modeling Data and Process Quality in Multi-Input, Multi-Output Information Systems. Manag. Sci. 1985, 31, 150–162.
- Ballou, D.; Wang, R.; Pazer, H.; Tayi, G.K. Modeling Information Manufacturing Systems to Determine Information Product Quality. Manag. Sci. 1998, 44, 462–484.
- Wand, Y.; Wang, R.Y. Anchoring Data Quality Dimensions in Ontological Foundations. Commun. ACM 1996, 39, 86–95.
- English, L.P. Information Quality Applied: Best Practices for Improving Business Information, Processes and Systems; Wiley Publishing: Hoboken, NJ, USA, 2009; ISBN 047013447X.
- Redman, T.C. Data Quality for the Information Age; Artech House, Inc.: Norwood, MA, USA, 1997; ISBN 0890068836.
- Coleman, C. Managing Information Quality: Increasing the Value of Information in Knowledge-Intensive Products and Processes; Springer: Berlin/Heidelberg, Germany, 2007.
- Tan, S.G.; Cheng, D. Quality Assurance of Performance Data for Pavement Management Systems. In Design, Analysis, and Asphalt Material Characterization for Road and Airfield Pavements; ASCE: Reston, VA, USA, 2014; pp. 163–169.
- Price, R.; Shanks, G. Chapter 4 Data Quality and Decision Making. In Handbook on a Decision Support System; Springer: Berlin/Heidelberg, Germany, 2008; pp. 65–82.
- Samitsch, C. Data Quality and Its Impacts on Decision-Making: How Managers Can Benefit from Good Data; Springer: Berlin/Heidelberg, Germany, 2014; ISBN 3658082003.
- Krogstie, J. A Semiotic Approach to Data Quality. In Proceedings of the Lecture Notes in Business Information Processing; Springer: Berlin/Heidelberg, Germany, 2013; Volume 147, pp. 395–410.
- Huang, H. Big Data to Knowledge–Harnessing Semiotic Relationships of Data Quality and Skills in Genome Curation Work. J. Inf. Sci. 2018, 44, 785–801.
- Long, J.A.; Seko, C.E. A New Method for Database Data Quality Evaluation at the Canadian Institute for Health Information (CIHI). In Proceedings of the 7th International Conference on Information Quality (IQ 2002), Tempe, AZ, USA, 24–28 February 2002; pp. 238–250.
- Lee, Y.W.; Strong, D.M.; Kahn, B.K.; Wang, R.Y. AIMQ: A Methodology for Information Quality Assessment. Inf. Manag. 2002, 40, 133–146.
- Pipino, L.L.; Lee, Y.W.; Wang, R.Y.; Yang, R.Y. Data Quality Assessment. Commun. ACM 2002, 45, 211–218.
- Sukumar, S.R.; Natarajan, R.; Ferrell, R.K. Quality of Big Data in Health Care. Int. J. Health Care Qual. Assur. 2015, 28, 621–634.
- Jankalová, M.; Jankal, R. How to Characterise Business Excellence and Determine the Relation between Business Excellence and Sustainability. Sustainability 2020, 12, 6198.
- Wang, R.Y. A Product Perspective on Total Data Quality Management. Commun. ACM 1998, 41, 58–65.
- Del Pilar Angeles, M.; García-Ugalde, F. A Data Quality Practical Approach. Int. J. Adv. Softw. 2009, 1, 259–299.
- Sebastian-Coleman, L. Measuring Data Quality for Ongoing Improvement: A Data Quality Assessment Framework; Elsevier: Waltham, MA, USA, 2012; ISBN 0123977541.
- Vaziri, R.; Mohsenzadeh, M.; Habibi, J. TBDQ: A Pragmatic Task-Based Method to Data Quality Assessment and Improvement. PLoS ONE 2016, 11, e0154508.
- Valverde, C.; Marotta, A.; Panach, J.I.; Vallespir, D. Towards a Model and Methodology for Evaluating Data Quality in Software Engineering Experiments. Inf. Softw. Technol. 2022, 151, 107029.
- Liebenau, J.; Backhouse, J. Understanding Information: An Introduction; Palgrave Macmillan: London, UK, 1990; ISBN 0333536800.
- Azeroual, O.; Jha, M.; Nikiforova, A.; Sha, K.; Alsmirat, M.; Jha, S. A Record Linkage-Based Data Deduplication Framework with DataCleaner Extension. Multimodal Technol. Interact 2022, 6, 27.
- Abedjan, Z.; Chu, X.; Deng, D.; Fernandez, R.C.; Ilyas, I.F.; Ouzzani, M.; Papotti, P.; Stonebraker, M.; Tang, N. Detecting Data Errors: Where Are We and What Needs to Be Done? Proc. VLDB Endow. 2016, 9, 993–1004.
- Wang, R.Y.; Strong, D.M. Beyond Accuracy: What Data Quality Means to Data Consumers. J. Manag. Inf. Syst. 1996, 12, 5–33.
- Crosby, P.B. Quality Is Free: The Art of Making Quality Certain; Signet Book: West Bengal, India, 1980; Volume 2247, ISBN 0451622472.
- Fu, Q.; Easton, J.M. Understanding Data Quality: Ensuring Data Quality by Design in the Rail Industry. In Proceedings of the IEEE International Conference on Big Data (Big Data), Boston, MA, USA, 11–14 December 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 3792–3799.
- Ramasamy, A.; Chowdhury, S. Big Data Quality Dimensions: A Systematic Literature Review. J. Inf. Syst. Technol. Manag. 2020, 17.
- Madnick, S.; Zhu, H. Improving Data Quality through Effective Use of Data Semantics. Data Knowl. Eng. 2006, 59, 460–475.
- English, L.P. Improving Data Warehouse and Business Information Quality: Methods for Reducing Costs and Increasing Profits; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 1999; ISBN 0471253839.
- Redman, T.C. Data Quality: The Field Guide; Digital Press: Oxford, UK, 2001; ISBN 1555582516.
- Batini, C.; Cappiello, C.; Francalanci, C.; Maurino, A. Methodologies for Data Quality Assessment and Improvement. ACM Comput. Surv. 2009, 41, 1–52.
- Gao, B.; Zhou, Q.; Deng, Y. BIM-AFA: Belief Information Measure-Based Attribute Fusion Approach in Improving the Quality of Uncertain Data. Inf. Sci. 2022, 608, 950–969.
- Madnick, S.; Wang, R.; Dravis, F.; Chen, X. Improving the Quality of Corporate Household Data: Current Practices and Research Directions. SSRN Electron. J. 2000, 365180.
- Redman, T.C. Improve Data Quality for Competitive Advantage. MIT Sloan Manag. Rev. 1995, 36, 99.
- Hassenstein, M.J.; Vanella, P. Data Quality—Concepts and Problems. Encyclopedia 2022, 2, 498–510.
- Gabr, M.I.; Helmy, Y.M.; Elzanfaly, D.S. Data Quality Dimensions, Metrics, and Improvement Techniques. Future Comput. Inf. J. 2021, 6, 25–44.
- Jesiļevska, S. Data Quality Dimensions to Ensure Optimal Data Quality. Rom. Econ. J. 2017, 20, 63.
- Gyulgyulyan, E.; Ravat, F.; Astsatryan, H.; Aligon, J. Data Quality Impact in Business Intelligence. In Proceedings of the 2018 Ivannikov Memorial Workshop, (IVMEM), Yerevan, Armenia, 3–4 May 2018; IEEE: Piscataway, NJ, USA, 2019; pp. 47–51.
- Loshin, D. Enterprise Knowledge Management: The Data Quality Approach; Morgan Kaufmann: London, UK, 2001; ISBN 0124558402.
- Cappiello, C.; Ficiaro, P.; Pernici, B. HIQM: A Methodology for Information Quality Monitoring, Measurement, and Improvement. In Proceedings of the International Conference on Conceptual Modeling; Springer: Berlin/Heidelberg, Germany, 2006; pp. 339–351.
- Batini, C.; Cabitza, F.; Cappiello, C.; Francalanci, C.; di Milano, P. A Comprehensive Data Quality Methodology for Web and Structured Data. In Proceedings of the 2006 1st International Conference on Digital Information Management, Bangalore, India, 6–8 December 2006; IEEE: Piscataway, NJ, USA, 2006; pp. 448–456.
- Moraga, C.; Moraga, M.Á.; Caro, A.; Calero, C. SPDQM: SQuaRE-Aligned Portal Data Quality Model. In Proceedings of the 9th International Conference on Quality Software, QSIC, Jeju, Republic of Korea, 24–25 August 2009.
- Carlo, B.; Daniele, B.; Federico, C.; Simone, G. A Data Quality Methodology for Heterogeneous Data. Int. J. Database Manag. Syst. 2011, 3, 60–79.
- Falkenberg, E.D. A Framework of Information System Concepts; The FRISCO Report (Web Edition); University of Leiden, Department of Computer Science: Leiden, The Netherlands, 1998; ISBN 3901882014.
- Kahn, M.G.; Raebel, M.A.; Glanz, J.M.; Riedlinger, K.; Steiner, J.F. A Pragmatic Framework for Single-Site and Multisite Data Quality Assessment in Electronic Health Record-Based Clinical Research. Med. Care 2012, 50, S21–S29.
- Knoke, D.; Yang, S. Social Network Analysis; SAGE Publication: Thousand Oaks, CA, USA, 2019.
- Lee, Y.W.; Strong, D.M. Knowing-Why about Data Processes and Data Quality. J. Manag. Inf. Syst. 2003, 20, 13–39.
- Alshikhi, O.A.; Abdullah, B.M. Information Quality: Definitions, Measurement, Dimensions, And Relationship with Decision Making. Eur. J. Bus. Innov. Res. 2018, 6, 36–42.
- Jayawardene, V.; Sadiq, S.; Indulska, M. An Analysis of Data Quality Dimensions; The University of Queensland: St Lucia, Australia, 2015; pp. 1–31.
- Tejay, G.; Dhillon, G.; Goyal Chin, A. Data Quality Dimensions for Information Systems Security: A Theoretical Exposition. In Security Management, Integrity, and Internal Control in Information Systems; IFIP TC-11 WG 11.1 &WG 11.5 Joint Working Conference 7; Springer: Berlin/Heidelberg, Germany, 2005; pp. 21–39.
- Tobler, E. A Needs Assessment of Arizona Agricultural Education Equine Science Curriculum. Ph.D. Dissertation, Utah State University, Logan, UT, USA, 2018.
- Johari, S.; Jha, K. Determinants of Workmanship: Defining Quality in Construction Industry. In Proceedings of the 35th Annual Conference; Leeds Beckett University: Leeds, UK, 2019; p. 761.
- Tripathi, K.K.; Jha, K.N. An Empirical Study on Performance Measurement Factors for Construction Organizations. KSCE J. Civ. Eng. 2018, 22, 1052–1066.
- Assistant Librarian, D.S. Application of Garret Ranking Technique: Practical Approach. Int. J. Libr. Inf. Stud. 2016, 6, 135–140.
- Garrett, H.E.; Woodworth, R.S. Statistics in Psychology and Education; Vakils, Feffer and Simons Private Ltd.: Bombay, India, 1969; p. 329.
S. No. | Framework | Dimensions | References |
---|---|---|---|
1 | TDQM: Total Data Quality Management | Accuracy, objectivity, believability, reputation, access, security, relevance, value-added, timeliness, completeness, amount of data, interpretability, ease of understanding, concise representation, and consistent representation. | [38] |
2 | TIQM: Total Information Quality Management | Definition conformance, completeness, validity, accuracy, precision, non-duplication, the equivalence of redundant or distributed data, accessibility, timeliness, contextual clarity, derivation integrity, usability, and rightness. | [51] |
3 | COLDQ: Cost-effect of Low Data Quality | Data model: Clarity of definition, comprehensiveness, flexibility, robustness, essentialness, attribute granularity, the precision of domains, homogeneity, naturalness, identifiability, obtainability, relevance, simplicity, and semantic and structural consistency. Data values: Accuracy, completeness, consistency, currency, null values, and timeliness. Information Policy: Accessibility, metadata, privacy, redundancy, security, and unit cost. Presentation: Appropriateness, correct interpretation, flexibility, format precision, portability, consistent representation, representation of null value, and use of storage. | [61] |
4 | AIMQ: A Methodology for Information Quality Assessment | Accessibility, appropriate amount, believability, completeness, concise representation, consistent representation, ease of operation, free-of-error, interpretability, objectivity, relevancy, reputation, security, timeliness, and understandability. | [34] |
5 | DQA: Data Quality Assessment | Accessibility, appropriate data, objectivity, believability, reputation, security, relevancy, value-added, timeliness, completeness, interpretability, ease of manipulation, understandability, concise representation, consistent representation, and free-of-error. | [35] |
6 | HIQM: Hybrid Information Quality Management | Accuracy, completeness, consistency, and timeliness. | [62] |
7 | CDQ: Comprehensive Methodology for Data Quality Management | Structured: Accuracy, completeness, and currency. Unstructured: Currency, relevance, and reliability. | [63] |
8 | DQPA: A Data Quality Assessment Framework | Accuracy, completeness, consistency, timeliness, uniqueness, and volatility. | [39] |
9 | SPDQM: Square-Aligned Portal Data Quality Model | Accuracy, traceability, correctness, expiration, completeness, consistency, accessibility, compliance, confidentiality, efficiency, precision, and understandability. Availability, accessibility, verifiability, confidentiality, portability, and recoverability. Validity, value-added, relevancy, specialisation, usefulness, efficiency, effectiveness, traceability, compliance, precision, concise representation, consistent representation, attractiveness, and readability. | [64] |
10 | HDQM: A Data Quality Methodology for Heterogeneous Data | Accuracy and currency. | [65] |
11 | DQAF: Data Quality Assessment Framework | Completeness, timeliness, validity, consistency, and integrity. | [40] |
12 | TBDQ: Task-Based Data Quality Method | Accuracy, completeness, consistency, and timeliness. | [41] |
13 | OODADQ: The Observe-Orient-Decide-Act Methodology | Speed and volume. | [36] |
14 | Semiotic Approach Data Quality-SESP model | Accuracy, consistent representation, unbiased, accessibility, up-to-date, traceability, security, believability, interpretability, ease of manipulation, understandability, completeness, appropriate amount of information, relevancy, concise representation, value-added, and reputation. | [32] |

Semiotic Level | DQ Dimension | DQ Dimension Perspective |
---|---|---|
Empiric: It addresses issues that arise when data are utilised repeatedly. This level focuses on developing means of communication and data handling. | Accessibility | Accessibility implies that data must be obtainable or retrievable when needed. |
Empiric | Timeliness | Timeliness is concerned with the age of data and whether data are current. It is achieved if the recorded value is not out of date. |
Empiric | Security | As a dimension, security involves securing data and limiting access to it. |
Syntactic: It focuses on the structures and formats of data. It deals with the physical form of data rather than their content. | Accuracy | The accuracy dimension is concerned with the conformity of the recorded value with the actual value. It implies that data are accurate, flawless, trustworthy, and error-free. |
Syntactic | Completeness | Completeness concerns capturing all values for a specific variable and preventing data loss. It implies that the data must have adequate breadth, depth, and scope for the given task. |
Syntactic | Conciseness | Conciseness is a well-organised and condensed representation of data. |
Syntactic | Consistency | Consistency is achieved when data are always represented in the same format and are compatible with previous data. |
Syntactic | Ease of operation | Ease of operation implies that data can be manipulated, integrated, customised, and utilised for multiple purposes. It is similar to flexibility. |
Syntactic | Integrity | Integrity measures correctness and consists of semantic and physical integrity. Semantic integrity measures consistency and completeness with respect to the rules of the description language. Physical integrity measures the correctness of implementation details. |
Syntactic | Structure | Format or structure implies that data are in the correct format and structure. |
Semantic: At the semantic level, dimensions are connected with information rather than data. Information is selected data to which meaning has been assigned in a particular context. It is concerned with meaning. | Ambiguity | Ambiguity arises from improper representation and occurs when data can be interpreted in more than one way. |
Semantic | Believability | Believability is concerned with whether data can be believed or regarded as credible. |
Semantic | Interpretability | Interpretability means that data should be interpretable; that is, they should be defined clearly and represented appropriately. |
Semantic | Definition | Meaningfulness or definition is concerned with the interpretation of data. The failure of this dimension results in meaningless data. |
Semantic | Reliability | Reliability is defined in terms of concepts drawn from the field of quality control. |
Semantic | Understandability | Understandability concerns whether data are clear, readable, unambiguous, and easily comprehensible. |
Semantic | Validity | Data are valid when verified as genuine and satisfying appropriate standards related to other dimensions. |
Pragmatic: It focuses on how individuals use information. It concerns the relationship between data, information, and behaviour in each context. | Appropriateness | Appropriateness as a data quality dimension means that data must be appropriate to the task at hand. |
Pragmatic | Relevant | Relevancy is concerned with the applicability of data to the task at hand. It becomes crucial when the data do not address the customer’s needs or the customer finds the data inadequate. |
Pragmatic | Value | Value is a dimension that addresses the benefits and advantages of using data. |
S. No. | Data Quality Dimensions | Mean | Std. Deviation | Rank |
---|---|---|---|---|
1 | Accuracy | 4.52 | 0.64 | 1 |
2 | Accessibility | 4.40 | 0.70 | 2 |
3 | Completeness | 4.36 | 0.76 | 3 |
4 | Consistency | 4.28 | 0.67 | 4 |
5 | Timeliness | 4.27 | 0.68 | 5 |
6 | Structure | 3.90 | 0.90 | 6 |
7 | Ambiguity | 3.90 | 0.98 | 7 |
8 | Integrity | 3.83 | 0.85 | 8 |
9 | Value | 3.72 | 0.88 | 9 |
10 | Validity | 3.63 | 1.04 | 10 |
11 | Reliability | 3.58 | 1.12 | 11 |
12 | Appropriateness | 3.58 | 1.12 | 12 |
13 | Relevant | 3.58 | 0.85 | 13 |
14 | Definition | 3.50 | 0.81 | 14 |
15 | Interpretability | 3.38 | 0.96 | 15 |
16 | Understandability | 3.38 | 1.10 | 16 |
17 | Believability | 3.36 | 1.01 | 17 |
18 | Ease of Operation | 3.35 | 0.99 | 18 |
19 | Security | 3.35 | 0.90 | 19 |
20 | Conciseness | 3.29 | 0.83 | 20 |
S. No. | Data Quality Attributes | Strategic Level | Network Level | Program Level | Project Selection Level | Project Level |
---|---|---|---|---|---|---|
1 | Accuracy | 4.6 | 4.4 | 4.7 | 4.8 | 4.4 |
2 | Consistency | 4.8 | 4.6 | 4.1 | 4.3 | 4.2 |
3 | Completeness | 4.8 | 4.4 | 4.2 | 4.6 | 4.3 |
4 | Structure | 4.8 | 4.3 | 3.6 | 4.2 | 3.7 |
5 | Integrity | 4.8 | 4.0 | 3.6 | 4.1 | 3.7 |
6 | Conciseness | 3.0 | 3.4 | 3.4 | 3.3 | 3.2 |
7 | Ease of Operation | 3.6 | 3.3 | 3.3 | 3.2 | 3.4 |
8 | Accessibility | 4.2 | 4.6 | 4.7 | 4.4 | 4.3 |
9 | Timeliness | 4.4 | 4.6 | 4.4 | 3.9 | 4.2 |
10 | Security | 3.8 | 3.4 | 3.4 | 3.2 | 3.3 |
11 | Definition | 4.2 | 3.8 | 3.6 | 3.8 | 3.3 |
12 | Ambiguity | 4.8 | 4.4 | 4.2 | 4.1 | 3.5 |
13 | Believability | 3.6 | 3.0 | 3.2 | 4.0 | 3.4 |
14 | Interpretability | 3.4 | 3.4 | 3.2 | 3.7 | 3.4 |
15 | Reliability | 4.2 | 3.4 | 3.5 | 2.9 | 3.7 |
16 | Understandability | 2.8 | 3.3 | 3.4 | 3.4 | 3.4 |
17 | Validity | 3.0 | 4.0 | 3.6 | 3.0 | 3.7 |
18 | Relevant | 4.2 | 3.8 | 3.6 | 3.7 | 3.4 |
19 | Value | 4.2 | 3.8 | 3.9 | 4.0 | 3.6 |
20 | Appropriateness | 3.8 | 3.4 | 3.6 | 3.9 | 3.6 |
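
As a reading aid for the level-wise mean scores above, the following minimal sketch lists the dimensions tied for the highest score at each decision-making level. The names `LEVELS`, `SCORES`, and `top_scoring` are hypothetical, and "tied for the highest mean" is only an illustrative criterion assumed here, not necessarily the criterion the study uses to call a dimension critical.

```python
# Mean importance scores per decision-making level, copied from the table above.
# Column order: Strategic, Network, Program, Project Selection, Project.
LEVELS = ["Strategic", "Network", "Program", "Project Selection", "Project"]

SCORES = {
    "Accuracy":          [4.6, 4.4, 4.7, 4.8, 4.4],
    "Consistency":       [4.8, 4.6, 4.1, 4.3, 4.2],
    "Completeness":      [4.8, 4.4, 4.2, 4.6, 4.3],
    "Structure":         [4.8, 4.3, 3.6, 4.2, 3.7],
    "Integrity":         [4.8, 4.0, 3.6, 4.1, 3.7],
    "Conciseness":       [3.0, 3.4, 3.4, 3.3, 3.2],
    "Ease of Operation": [3.6, 3.3, 3.3, 3.2, 3.4],
    "Accessibility":     [4.2, 4.6, 4.7, 4.4, 4.3],
    "Timeliness":        [4.4, 4.6, 4.4, 3.9, 4.2],
    "Security":          [3.8, 3.4, 3.4, 3.2, 3.3],
    "Definition":        [4.2, 3.8, 3.6, 3.8, 3.3],
    "Ambiguity":         [4.8, 4.4, 4.2, 4.1, 3.5],
    "Believability":     [3.6, 3.0, 3.2, 4.0, 3.4],
    "Interpretability":  [3.4, 3.4, 3.2, 3.7, 3.4],
    "Reliability":       [4.2, 3.4, 3.5, 2.9, 3.7],
    "Understandability": [2.8, 3.3, 3.4, 3.4, 3.4],
    "Validity":          [3.0, 4.0, 3.6, 3.0, 3.7],
    "Relevant":          [4.2, 3.8, 3.6, 3.7, 3.4],
    "Value":             [4.2, 3.8, 3.9, 4.0, 3.6],
    "Appropriateness":   [3.8, 3.4, 3.6, 3.9, 3.6],
}


def top_scoring(level):
    """Return (highest mean, dimensions tied at that mean) for one level."""
    col = LEVELS.index(level)
    best = max(row[col] for row in SCORES.values())
    return best, [name for name, row in SCORES.items() if row[col] == best]


for level in LEVELS:
    best, names = top_scoring(level)
    print(f"{level}: {best} -> {', '.join(names)}")
```

Under this illustrative criterion, several dimensions tie at the strategic level, while accuracy stands alone at the project selection and project levels.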
Rank | Percentage Position | Garrett Score |
---|---|---|
1 | 7.14 | 79 |
2 | 21.43 | 66 |
3 | 35.71 | 57 |
4 | 50.00 | 50 |
5 | 64.29 | 43 |
6 | 78.57 | 34 |
7 | 92.86 | 22 |
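
The percentage positions in the table above are consistent with the standard Garrett conversion. The symbols below are assumptions introduced for illustration rather than the authors' notation: $R_{ij}$ is the rank given to factor $i$ by respondent $j$, and $N_j$ is the number of factors ranked by respondent $j$.

$$P_{ij} = \frac{100\,(R_{ij} - 0.5)}{N_j}$$

With $N_j = 7$, rank 1 gives $100 \times 0.5 / 7 \approx 7.14$ and rank 4 gives $100 \times 3.5 / 7 = 50.00$, matching the tabulated values. Each percentage position is then converted to a score by lookup in Garrett's conversion table, which is where the scores 79, 66, 57, 50, 43, 34, and 22 come from.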

S. No. | Factors | Rank 1 | Rank 2 | Rank 3 | Rank 4 | Rank 5 | Rank 6 | Rank 7 | Total Number of Stakeholders | Total Score | Mean Score | Rank |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Accuracy | 52 | 16 | 6 | 6 | 11 | 9 | 5 | 105 | 6695 | 63.76 | 1 |
2 | Consistency | 12 | 40 | 18 | 19 | 3 | 6 | 7 | 105 | 6051 | 57.63 | 2 |
3 | Completeness | 12 | 25 | 42 | 6 | 5 | 5 | 10 | 105 | 5897 | 56.16 | 3 |
4 | Structure | 6 | 5 | 15 | 45 | 24 | 6 | 4 | 105 | 5233 | 49.84 | 4 |
5 | Integrity | 8 | 6 | 9 | 16 | 49 | 10 | 7 | 105 | 4942 | 47.07 | 5 |
6 | Conciseness | 8 | 7 | 5 | 6 | 8 | 44 | 27 | 105 | 4113 | 39.17 | 6 |
7 | Ease of Operation | 7 | 6 | 10 | 7 | 5 | 25 | 45 | 105 | 3924 | 37.37 | 7 |
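
The totals and means in the syntactic-category table above can be reproduced by weighting each rank count with the corresponding Garrett score. The sketch below is a minimal illustration, not the authors' implementation; the names `GARRETT_SCORES`, `RANK_COUNTS`, `garrett_total`, and `garrett_ranking` are hypothetical.

```python
# Garrett scores for ranks 1..7, copied from the seven-rank conversion table.
GARRETT_SCORES = [79, 66, 57, 50, 43, 34, 22]

# Number of stakeholders assigning each rank (1..7) to each syntactic dimension,
# copied from the table above.
RANK_COUNTS = {
    "Accuracy":          [52, 16, 6, 6, 11, 9, 5],
    "Consistency":       [12, 40, 18, 19, 3, 6, 7],
    "Completeness":      [12, 25, 42, 6, 5, 5, 10],
    "Structure":         [6, 5, 15, 45, 24, 6, 4],
    "Integrity":         [8, 6, 9, 16, 49, 10, 7],
    "Conciseness":       [8, 7, 5, 6, 8, 44, 27],
    "Ease of Operation": [7, 6, 10, 7, 5, 25, 45],
}


def garrett_total(counts, scores=GARRETT_SCORES):
    """Sum of (stakeholders giving a rank) x (Garrett score of that rank)."""
    return sum(c * s for c, s in zip(counts, scores))


def garrett_ranking(rank_counts):
    """Return (name, total score, mean score) tuples, highest total first."""
    rows = []
    for name, counts in rank_counts.items():
        total = garrett_total(counts)
        mean = total / sum(counts)  # divide by the number of stakeholders
        rows.append((name, total, round(mean, 2)))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


for position, (name, total, mean) in enumerate(garrett_ranking(RANK_COUNTS), start=1):
    print(position, name, total, mean)
```

For accuracy, for example, the weighted sum is 6695 over 105 stakeholders, a mean of 63.76, which places it first in the category.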
Rank | Percentage Position | Garrett Score |
---|---|---|
1 | 16.67 | 69.00 |
2 | 50.00 | 50.00 |
3 | 83.33 | 31.00 |

S. No. | Factors | Rank 1 | Rank 2 | Rank 3 | Total Number of Stakeholders | Total Score | Mean Score | Rank |
---|---|---|---|---|---|---|---|---|
1 | Accessibility | 46 | 33 | 26 | 105 | 5630 | 53.62 | 1 |
2 | Timeliness | 38 | 47 | 20 | 105 | 5592 | 53.26 | 2 |
3 | Security | 21 | 25 | 59 | 105 | 4528 | 43.12 | 3 |

S. No. | Factors | Rank 1 | Rank 2 | Rank 3 | Rank 4 | Rank 5 | Rank 6 | Rank 7 | Total Number of Stakeholders | Total Score | Mean Score | Rank |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Ambiguity | 51 | 16 | 7 | 6 | 10 | 9 | 6 | 105 | 6652 | 63.35 | 1 |
2 | Definition | 12 | 25 | 41 | 7 | 5 | 5 | 10 | 105 | 5890 | 56.10 | 3 |
3 | Believability | 7 | 6 | 10 | 7 | 6 | 24 | 45 | 105 | 3933 | 37.46 | 7 |
4 | Interpretability | 8 | 7 | 9 | 16 | 48 | 10 | 7 | 105 | 4965 | 47.29 | 5 |
5 | Reliability | 12 | 39 | 17 | 19 | 5 | 6 | 7 | 105 | 6014 | 57.28 | 2 |
6 | Understandability | 8 | 7 | 5 | 6 | 8 | 45 | 26 | 105 | 4125 | 39.29 | 6 |
7 | Validity | 7 | 5 | 16 | 44 | 23 | 6 | 4 | 105 | 5276 | 50.25 | 4 |

S. No. | Factors | Rank 1 | Rank 2 | Rank 3 | Total Number of Stakeholders | Total Score | Mean Score | Rank |
---|---|---|---|---|---|---|---|---|
1 | Relevant | 33 | 49 | 23 | 105 | 5440 | 51.81 | 2 |
2 | Value | 44 | 30 | 31 | 105 | 5497 | 52.35 | 1 |
3 | Appropriateness | 28 | 26 | 51 | 105 | 4813 | 45.84 | 3 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Krishna, C.M.; Ruikar, K.; Jha, K.N. Determinants of Data Quality Dimensions for Assessing Highway Infrastructure Data Using Semiotic Framework. Buildings 2023, 13, 944. https://doi.org/10.3390/buildings13040944