Towards Data-Driven Decisions in Agriculture—A Proposed Data Quality Framework for Grains Trials Research
Abstract
1. Introduction
- What are the fundamental principles of data quality and how can they be represented within a data quality framework (DQF) for grains trial research?
- Which data quality dimensions are the most useful for assessing and reporting data quality in grains trial research and how do we define those within a DQF?
- How can these quality dimensions be translated into practical tools, e.g., a trial data quality test or data quality statement that can facilitate improved data reuse and decision support from contributors and other end-users?
- How can the proposed DQF and its tools be implemented in a platform such as Online Farm Trials (https://www.farmtrials.com.au/, accessed on 6 January 2026) and what can we learn from these use cases about their impact?
2. Data Quality: Its Dimensions and Principles for Grains Trials and Research
3. DQF Conceptualisation for Grains Trial Research
4. Action Case Study Using Online Farm Trials to Demonstrate Data Quality Assessment and Reporting
- Inconsistent and incomplete trial data submissions. Many trials have missing metadata or incomplete documentation, impacting their utility and trustworthiness for research and decision-making purposes [41,42]. Many legacy trial reports, for instance, do not contain all the necessary information to generate searchable metadata within OFT [20]. An interview participant noted, “Sometimes the data isn’t complete that you access, sometimes it’s hard to get the exact parameters for which the data is gone in …” [41].
- Lack of standardisation across trial contributors. Contributors vary widely in how they collect, structure, report and label data, which impedes comparison and synthesis across trials [39,42]. In an interview conducted regarding OFT, a participant said “I think that, before you can use the data from OFT to make decisions or include it in planning, you need to have some way of making a choice about what the evidence is. Some way to indicate clearly level of quality and how reliable it really is. Then you can use it or not use it or use it with conditions” [38]. Another interview participant highlighted, “Good scientific rigour is that you should question all the data and make sure the statistics are right and the methodology makes sense to you to know yourself…you’ve got to take responsibility to know whether you trust the data or not” [42].
- Difficulty in assessing comparability and scientific validity. Users face challenges in comparing trials due to inconsistencies in methodologies, limited access to raw data and the absence of clear, uniform quality ratings [38]. As one interviewee explained, “Unless there is the capacity to go back to the raw data then how do you check the scientific validity of what is being available. You need this to be sure in what you are deciding and whether what you are looking at in the summaries captures what the raw data found” [38].
- Perceived trust and quality issues. Users repeatedly expressed concern about the accuracy and reliability of the available data, which affects their willingness to use OFT for critical decisions [38,42]. While enabling contributors to directly enter data improves efficiency, it has introduced challenges in maintaining consistent quality control, especially for non-mandatory metadata fields, as these may not be adequately monitored before publication [20].
- Challenges with metadata and FAIR Principles. While OFT aligns with FAIR data principles, the implementation varies, particularly in metadata quality and interoperability across systems [41,44]. Enhancing the FAIRness of data, especially by improving interoperability and advancing along the machine-readability continuum, remains a significant challenge, yet it also offers a valuable opportunity for OFT and the wider grains and agricultural research community to boost data accessibility, integration and impact [43].
4.1. Proposed Reporting on Trial Data Quality
4.1.1. Trial Data Quality Test
4.1.2. Trial Data Quality Statement
4.1.3. Process for Improving Data Quality in OFT
- Expand the minimum trial metadata required for admission of a trial into OFT to include additional mandatory fields that enable detailed and uniform assessment of trial utility. These additional data entry fields include the following: improved trial location coordinates; crop variety information (compliant with industry standards); trial design details (experimental plot design, number of replicates, blocking, plot randomisation and plot size); harvest and sowing dates; soil sampling details (date sampled, tests performed and sample depth); sowing details (tillage type and depth, row spacing, sowing depth); harvest details; and treatment details (herbicide/insecticide/fungicide used, application timing and rate of application).
- Provide more information about the trial contributor to check against the institutional environment assessment criteria.
- Identify trial locations with a unique identifier, so multiple trials conducted at one site can be linked.
- Implement common and globally accepted vocabulary and data standards for trial data (e.g., sowing rate as kg/ha rather than use of legacy measures and observations such as lb/acre).
- Automate data input from key technologies (e.g., geographic information systems (GIS), GPS, online structured data collection forms) to remove transcription errors, enable real-time data processing and reduce the likelihood of other quality issues that arise during data collection.
- Validate dates and their logical sequence (e.g., sowing date prior to harvest date).
- Assess trials according to their classification. A key consideration for grains trials is that not all trials have been designed with multiple users and purposes in mind, and a framework should not penalise a trial for serving its intended purpose. For example, quality assessments will differ for demonstration trials compared with research trials.
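Several of the improvement steps above (unit standardisation, logical date sequence, whole-number replicates) can be sketched as a small validation routine. The record layout, field names and conversion handling below are illustrative assumptions for this sketch, not the OFT schema:

```python
from datetime import date

# Hypothetical trial record; field names are illustrative, not the OFT schema.
trial = {
    "sowing_date": date(2024, 5, 14),
    "harvest_date": date(2024, 12, 2),
    "sow_rate": "80 lb/acre",  # legacy unit to be standardised to kg/ha
    "replicates": 3,
}

LB_PER_ACRE_TO_KG_PER_HA = 1.12085  # 0.453592 kg/lb divided by 0.404686 ha/acre

def validate_trial(t):
    """Return a list of data quality issues for one trial record.

    Standardises legacy sow-rate units in place as a side effect.
    """
    issues = []
    # Logical date sequence: sowing must precede harvest.
    if t["sowing_date"] >= t["harvest_date"]:
        issues.append("sowing_date must precede harvest_date")
    # Standardise legacy sow-rate units (lb/acre) to kg/ha.
    value, unit = t["sow_rate"].split()
    if unit == "lb/acre":
        t["sow_rate"] = f"{float(value) * LB_PER_ACRE_TO_KG_PER_HA:.1f} kg/ha"
    # Replicates must be a positive whole number.
    if not (isinstance(t["replicates"], int) and t["replicates"] >= 1):
        issues.append("replicates must be a positive whole number")
    return issues

print(validate_trial(trial))  # prints: []
print(trial["sow_rate"])      # prints: 89.7 kg/ha
```

In a production system, such checks would run at data entry time so that contributors can correct issues before publication, rather than after ingestion.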
4.1.4. Validation and Refinement of the Proposed DQF
5. Discussion
5.1. Lessons Learned
5.2. Limitations
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A. Table of Data Quality Dimensions Used (X) in Key Publications on Data Quality (1983–2015)
| Data quality dimension | Bailey and Pearson (1983) [64] | Ives et al. (1983) [65] | Ballou and Pazer (1985) [66] | DeLone and McLean (1992) [67] | Wand and Wang (1996) [30] | Wang and Strong (1996) [68] | Redman (1997) [69] | Jarke (2002) [70] | Veregin (1999) [23] | Bovee (2003) [71] | Fisher and Kingma (2001) [72] | Pipino et al. (2002) [73] | Chapman (2005) [31] | Herzog et al. (2007) [74] | ABS (2009) [34] | Moges et al. (2013) [75] | Jayawardene et al. (2015) [76] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ACCESSIBILITY | X | X | X | X | X | X | X | X | |||||||||
| ACCURACY AND PRECISION | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | ||
| ALIGNMENT | X | ||||||||||||||||
| APPROPRIATE AMOUNT OF DATA | X | X | X | X | X | ||||||||||||
| AVAILABILITY | X | X | |||||||||||||||
| BELIEVABILITY | X | X | X | ||||||||||||||
| COMPARABILITY | X | ||||||||||||||||
| COMPLETENESS | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | |
| CONCISE REPRESENTATION | X | X | X | ||||||||||||||
| CONSISTENCY/REPRESENTATIONAL CONSISTENCY | X | X | X | X | X | X | X | X | X | X | X | X | |||||
| CREDIBILITY | X | X | X | ||||||||||||||
| EASE OF UNDERSTANDING | X | X | X | ||||||||||||||
| FITNESS FOR USE | X | ||||||||||||||||
| FREE OF ERROR | X | ||||||||||||||||
| INTERPRETABILITY | X | X | X | X | X | X | X | X | X | ||||||||
| OBJECTIVITY | X | X | |||||||||||||||
| PORTABILITY | X | ||||||||||||||||
| RELEVANCY/RELEVANCE | X | X | X | X | X | X | X | X | X | X | X | ||||||
| RELIABILITY | X | X | X | ||||||||||||||
| REPUTATION | X | X | X | ||||||||||||||
| RESPONSIVENESS/RESPONSE TIME | X | ||||||||||||||||
| SECURITY/ACCESS SECURITY | X | X | X | ||||||||||||||
| TIME-RELATED DIMENSIONS (TIMELINESS) | X | X | X | X | X | X | X | X | X | X | X | X | X | X | X | ||
| TRACEABILITY | X | ||||||||||||||||
| VALUE-ADDED | X | X |
Appendix B. Questions for Assessment of Trial Data Quality
Within variable: Within variable tests a data value for its compliance with a domain (e.g., does a numerical value fall within a defined range; does a categorical value comply with a pre-existing reference list or standard), including when a missing value is appropriate.

| Trial dataset | Data quality questions | Data quality tests |
|---|---|---|
| 1. Data resource | 1a. Does a disclaimer exist for the trial, including independence of the organisation? | 1a i. Disclaimer (Yes/No). 1a ii. Organisation classification. |
| | 1b. Has all treatment data been published/disclosed? | 1b i. Tick box (Yes/No). |
| | 1c. Does the organisation have satisfactory data quality control procedures and trained personnel? | 1c i. Organisation quality control procedures (Yes/No). 1c ii. Organisation with statistically trained personnel (Yes/No). |
| | 1d. Does the organisation have procedures in place to manage confidentiality and change in trial status? | 1d i. Confidentiality procedure (Yes/No). |
| 2. Trial dates, date loaded, published and last processed | 2a. What is the reference period for the trial (e.g., 2017 winter crop) and when was it loaded into OFT? | 2a. Difference between growing season/year and date loaded into OFT (<5 = H, 5–10 = M, >10 = L). |
| | 2b. Is there likely to be subsequent data from the same trial project (completeness)? | 2b. Grouped trials or subsequent trial. |
| | 2c. Date published on OFT. | 2c. Published/Not published. |
| | 2d. Have there been modifications or revisions of the trial data? | 2d. Has data against a trial in OFT changed (Yes/No). |
| | 2e. Is the date sequence logical? | 2e. Date last processed > date loaded >= growing year. |
| 3. Trial number | 3a. Cross-check system assigned trial numbers with published number, as each trial on OFT is assigned a trial number. | 3a. Does system trial number = published number. |
| 4. Access and display level | 4a. Is the trial data publicly accessible via OFT? | 4a. Tick box (Yes/No). |
| | 4b. Is the trial data accessible to organisations/individuals with different levels of permitted access? | 4b. Organisation control/grant permission access, e.g., login controlled profile. |
| | 4c. If the data is not publicly accessible, how is that represented/presented to users? | 4c. Directs to the trial contributor with their contact details. |
| 5. Trial project code | 5a. Is the project code a valid project code for the funding organisation or trial contributor? | 5a. Cross-check funding organisation’s project code or trial contributor’s project code. |
| 6. Trial project name | 6a. Does the project name align with the project code? | 6a. Check with the funding organisation/trial contributor’s project code and project name. |
| | 6b. Is the project name a valid project name for the funding organisation or trial contributor? | 6b. Cross-check funding organisation’s project name or trial contributor’s project name. |
| 7. Grouped trials | 7a. Trials should be grouped if trials are assigned the same project code. | 7a. If project code = project code, group trials. |
| 8. Trial site name | 8a. If a trial site is used for >1 year, the trial site is to be allocated to both with a year postfix. | 8a. Example: location_a_year1, location_a_year2. |
| | 8b. If a trial site is used for ≤1 year, all corner co-ordinates of the plot are documented. A sequentially assigned location identifier is added to the documentation to support cause-and-effect analysis. | 8b. System check for co-ordinates of all corners of the plot. |
| 9. Locality | 9a. Does the locality exist? Is the spelling correct? | 9a. System check of locality name. |
| 10. Co-ordinates | 10a. Latitude: Check for entry error in latitude, including what datum is used for this set of coordinates? | 10a. System check for latitude. |
| | 10b. Longitude: Check for entry error in longitude, including what datum is used for this set of coordinates? | 10b. System check for longitude. |
| 11. Trial site accuracy | 11a. Check that a class has been designated to reflect the accuracy of the trial site (i.e., nearest town or centre of the region). | 11a. Check for missing data. |
| 12. State or territory | 12a. Spatial checks and queries against coordinates. These can be auto populated. | 12a. Spatial reference can be auto populated. |
| 13. GRDC region | 13a. Check that a GRDC region has been allocated. | 13a. Check for missing data (e.g., Southern region). |
| 14. GRDC sub-region | 14a. Check that a GRDC sub-region has been allocated. | 14a. Check for missing data (e.g., Mallee). |
| 15. Researchers | 15a. Confirm key researchers. | 15a. Cross-reference with the funding organisation/trial contributor. |
| | 15b. Confirm spelling of researchers’ names. | 15b. Researcher spell check. |
| 16. Related programs | 16a. Check if any related program is listed. | 16a. All trial projects under a related program should be grouped together. |
| 17. Lead research organisation | 17a. Check against the organisation database. | 17a. Cross-reference with organisation database. |
| | 17b. Check funding body/trial contributor database or contracts. | 17b. Cross-reference with funding body/trial contributor database. |
| 18. Host research organisation | 18a. Check against the organisation database. | 18a. Cross-reference with organisation database. |
| | 18b. Does the organisation have satisfactory data quality control procedures and trained personnel? | 18b i. Organisation quality control procedures (Yes/No). 18b ii. Organisation with statistically trained personnel (Yes/No). |
| | 18c. Does the organisation have procedures in place to manage confidentiality and change in trial status? | 18c i. Confidentiality procedure (Yes/No). |
| 19. Other trial partners | 19a. Check against the organisation database. | 19a. Cross-reference with organisation database. |
| 20. Funding sources | 20a. Check against funding sources. | 20a. Cross-reference with funding body/trial contributor database. |
| | 20b. Sum of funding allocation of all sources = 100%. | 20b. Sum of % test = 100 (Yes/No). |
| 21. Trial aim | 21a. Trial aim must be populated and spell-checked. | 21a i. Populated (Yes/No). 21a ii. Spell check. |
| | 21b. Cross-reference with funding body or trial contributor reports/database to confirm the aim/research question is comparable. | 21b. Cross-reference with funding body/trial contributor reports/database. |
| 22. Key message | 22a. Key message must be populated and spell-checked. | 22a i. Populated (Yes/No). 22a ii. Spell check. |
| 23. Acknowledgements | 23a. Spell check of inserted text. | 23a. Spell check. |
| 24. Internal notes (login-controlled profile) | 24a. Spell check of inserted text. | 24a. Spell check. |
| 25. Public notes | 25a. Spell check of inserted text. | 25a. Spell check. |
| 26. Hyperlink | 26a. Regular test to confirm hyperlink/website and web content still exists. | 26a. Web link test. |
| 27. Trial affected by adverse factors | 27a. Check if the trial has been affected by adverse factors (e.g., frost, flooding). | 27a. Populated (Yes/No). If yes, test 27b should be conducted. |
| | 27b. What adverse factors affected the trial and the details of events that may influence the reliability of results (dates, damage estimate, crop growth stage at time of damage, etc.). | 27b. Spell check of inserted text. |
| 28. Linked trials | 28a. Cross-reference for trials at the same trial site in the same growing year. | 28a. Trials at same site (co-ordinates) in the same growing year. |
| | 28b. Cross-reference for trials of the same program and project code. | 28b. Trials of the same program/project. |
| | 28c. Cross-reference for trials at the same trial site through time. | 28c. Trials at same site (co-ordinates) but with different growing years. |
| 29. Feature trial | 29a. Feature trial must be in the set of linked trials. | 29a. Feature trial = in listed linked trials. |
| 30. Crop type | 30a. Crop type = one or more from the defined list of crop types (the defined list should reflect global standards). | 30a. Populated (Yes/No). |
| 31. Crop type variety | 31a. Variety information entered for crop type. | 31a. Populated (Yes/No). |
| | 31b. Crop type variety matches existing national and international standards, e.g., BrAPI, NVT, PBR. | 31b. Cross-check variety with BrAPI, NVT, etc. |
| | 31c. ‘Unreleased’ varieties linked to a variety vocabulary (managed by GRDC). | 31c. Cross-reference unreleased varieties with variety library (https://www.ipaustralia.gov.au/plant-breeders-rights (accessed on 6 January 2026)). |
| 32. Treatment type | 32a. Treatment type(s) entered for the trial. | 32a. Populated (Yes/No). |
| | 32b. Do treatment types comply with global standards? | 32b. Cross-check BrAPI and Crop Ontology standards. |
| 33. Sow rate | 33a. Sow rate entered for each crop type. Should be a required field. | 33a. Populated (Yes/No). |
| | 33b. Values are numeric and in kg/ha. | 33b. Validation for numeric characters. |
| 34. Target density | 34a. Values are numeric and in plants/m2 (note: method to assess plant density is to be recorded in metadata, e.g., quadrats). | 34a i. Validation for numeric characters. 34a ii. Metadata note regarding the method used to assess plant density. |
| 35. Sowing machinery | 35a. Sowing practice classified using international standards, e.g., direct drill, disc seeder, conventional tyne, point type, harrow chains, press wheel, etc. | 35a. Populated (Yes/No). |
| 36. Sowing date | 36a. Sowing date entered for the trial. If trial does not have a sowing date, then “not applicable” should be entered. | 36a. Populated (Yes/No). |
| | 36b. Values for sowing date in standard time–date format (dd-mm-yyyy). | 36b. Validation for date format. |
| 37. Sowing depth | 37a. Values are numeric and “mm depth from the soil surface”. | 37a i. Validation for numeric characters. 37a ii. Values converted from other measurement units, e.g., cm or inches to mm. |
| 38. Harvest date | 38a. Harvest date entered for the trial. If trial does not have a harvest date, then “not applicable” should be entered. | 38a. Populated (Yes/No). |
| | 38b. Values for harvest date in standard time–date format (dd-mm-yyyy). | 38b. Validation for date format. |
| | 38c. Harvest date should be after sowing date. | 38c. Harvest date > sowing date. |
| 39. Plot size | 39a. Dimensions of plots in numeric metres, e.g., width = 2.40 m, length = 12.00 m. | 39a. Validation for numeric characters up to 2 decimal places. |
| 40. Plot replication | 40a. Number of replicates (numeric, as a whole number). | 40a. Validation for numeric characters and whole number. |
| 41. Plot randomisation | 41a. Yes/No test. If the trial uses blocking and a multifactorial design, randomisation is required. | 41a. Populated (Yes/No). |
| 42. Plot blocking | 42a. Number of blocks (numeric as a whole number). | 42a. Validation for numeric characters and whole number. |
| 43. Trial design | 43a. Trial design layout file provided as attached file in OFT. | 43a. Trial design layout file attached (Yes/No). |
| 44. Paddock history | 44a. Year by treatment (crop, fertiliser, etc.). | 44a. Populated (Yes/No). |
| 45. Details of fertiliser(s) used | 45a. Fertiliser used (Yes/No). | 45a. Populated (Yes/No). If yes, test 45b should be conducted. |
| | 45b. Details of fertiliser(s) used: fertiliser type(s), rate(s) of application, date(s) of application, method of application. | 45b. Standard fields—fertiliser type, rate of application (kg/ha), date of application (dd-mm-yyyy), application method. |
| 46. Details of fungicide(s) used | 46a. Fungicide used (Yes/No). | 46a. Populated (Yes/No). If yes, test 46b should be conducted. |
| | 46b. Details of fungicide(s) used: fungicide type(s), rate(s) of application, date(s) of application, method of application. | 46b. Standard fields—fungicide type, rate of application (kg/ha), date of application (dd-mm-yyyy), application method. |
| 47. Details of herbicide(s) used | 47a. Herbicide used (Yes/No). | 47a. Populated (Yes/No). If yes, test 47b should be conducted. |
| | 47b. Details of herbicide(s) used: herbicide type(s), rate(s) of application, date(s) of application, method of application. | 47b. Standard fields—herbicide type, rate of application (kg/ha), date of application (dd-mm-yyyy), application method. |
| 48. Details of insecticide(s) used | 48a. Insecticide used (Yes/No). | 48a. Populated (Yes/No). If yes, test 48b should be conducted. |
| | 48b. Details of insecticide(s) used: insecticide type(s), rate(s) of application, date(s) of application, method of application. | 48b. Standard fields—insecticide type, rate of application (kg/ha), date of application (dd-mm-yyyy), application method. |
| 49. Soil amelioration | 49a. Soil amelioration (Yes/No). | 49a. Populated (Yes/No). If yes, test 49b should be conducted. |
| | 49b. Details of soil amelioration: year/date, treatment type, rate (kg/ha), ingredient qualities (e.g., Deep MAP—nutrients kg/ha; lime with ENV%). | 49b. Standard fields—treatment date (dd-mm-yyyy), treatment type, rate of application (kg/ha), ingredient quality. |
| 50. Seed treatment | 50a. Any seed treatment performed (Yes/No). | 50a. Populated (Yes/No). If yes, test 50b should be conducted. |
| | 50b. Product listed with Australian Pesticides and Veterinary Medicines Authority (APVMA). | 50b. Product matches APVMA standards. |
| 51. Inoculant | 51a. Any inoculant used (Yes/No). | 51a. Populated (Yes/No). If yes, test 51b should be conducted. |
| | 51b. Product listed with APVMA. | 51b. Product matches APVMA standards. |
| 52. Tillage | 52a. Any tillage performed on the paddock (Yes/No). | 52a. Populated (Yes/No). If yes, test 52b should be conducted. |
| | 52b. Details of tillage performed: date (dd-mm-yyyy), type, depth estimate. | 52b. Standard fields—treatment date (dd-mm-yyyy), tillage type, depth of tillage (numeric value). |
| 53. Trial results | 53a. Do results and data types comply with measurement types? | 53a. Standard field for result data (numeric, text, etc.). |
Between variables: Between variables tests the data value between two variables for standard values and logical consistency (e.g., oil content and crop type, where oil content of barley = non-logical).

| Trial dataset | Data quality questions | Data quality tests |
|---|---|---|
| 54. Are dates between variables sequential? | 54a. Do the dates for treatments, sowing, harvest, etc. all comply with a date sequence? | 54a. Check for sequence of the dates (e.g., sowing date < harvest date). |
| 55. Measurement type data value | 55a. Does the measurement type result value comply with the crop type or treatment type? | 55a. The measurement value should comply with the crop and measurement type (e.g., grain yield is measured as kg/ha and not L/ha). |
| 56. Trial site location | 56a. Do the trial location details (coordinates) comply with other designated information (e.g., state, GRDC zone)? | 56a. Verification checks for correct country, state and region. The same town name can exist in multiple states or countries, so the trial site location should be selected correctly. |
| 57. Corrupt trial result | 57a. Is the trial result entered corrupt? | 57a. System checks to test the trial result is entered in a machine-readable format and accessible to viewers. |
Between records: Between records tests data where a proper sequence can be expected, where missing records may occur, or where data for a measurement type are compared across records.

| Trial dataset | Data quality questions | Data quality tests |
|---|---|---|
| 58. Are dates between records sequential? | 58a. Do the dates for treatments, sowing, harvest, etc. all comply with a date sequence? | 58a. Check for sequence of the dates (e.g., sowing date < harvest date). |
| 59. Trial site numbers | 59a. This test checks that trial site numbers are retained and that new numbers are assigned to new trial sites. | 59a. System check to validate each trial site has a unique trial site number (1 trial site number cannot be assigned to more than 1 trial site). |
| 60. Grouping trials | 60a. Have trials been successfully grouped and links between trials maintained? | 60a. System checks to validate all the sub-trials are linked to the trial project. |
| 61. Trial site names | 61a. This checks that trial site names are unique and that trial sites with the same name are identified and corrections are made. | 61a. System check to validate that trial site names are unique and that trial sites with the same name are identified and exceptions are flagged. |
Between tables: Between tables tests data values across two or more tables where relationships across tables can be violated (e.g., labels or measurement types do not match trial metadata and key messages). This could include dates, incorrect attachment of the trial report, incorrectly linked or grouped trials, trial contributor and researcher inconsistencies, incorrect site location, etc.

| Trial dataset | Data quality questions | Data quality tests |
|---|---|---|
| 62. Link identifiers integrity retained | 62a. Have the permanent identifiers for tables in OFT been maintained, and are they free of corruption? | 62a. Check that the integrity of the links between tables is maintained. |
| 63. Are dates between tables sequential? | 63a. Do the dates for treatments, sowing, harvest, etc. all comply with a date sequence? | 63a. Check for sequence of the dates (e.g., sowing date < harvest date). |
Between systems: Between systems tests data values across two or more systems where relationships between systems can be violated. This could include trials not being discoverable through external search engines, mismatches with data sourced from interoperable external systems (e.g., Bureau of Meteorology data) or broken links to funding body/trial contributor reports.

| Trial dataset | Data quality questions | Data quality tests |
|---|---|---|
| 64. Web referencing | 64a. Is the trial discoverable online through search engines (e.g., Google search, Bing)? | 64a. Check for search engine optimisation for each trial on OFT. |
| 65. Interoperability test | 65a. Are data sourced from external interoperable systems for the same year as the trial (e.g., Bureau of Meteorology (BOM) data, CSIRO DAP)? 65b. Does the trial link with funding body/ trial contributor’s final reports? | 65a. System check to validate the year of data export, corresponding to a trial site. 65b. Check for funding body/trial contributor’s project codes to link. |
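The tiered structure of these tests (within variable, between variables, between records) can be composed into a simple test runner. The record layout, field names and reference lists below are illustrative assumptions for this sketch, not OFT's implementation; the checks loosely mirror tests 30a, 54a, 55a and 28a above:

```python
from datetime import date

# Illustrative trial records; keys are assumptions for this sketch, not OFT field names.
trials = [
    {"site_id": "S1", "year": 2023, "crop": "wheat",
     "sowing_date": date(2023, 5, 20), "harvest_date": date(2023, 12, 10),
     "yield_unit": "kg/ha"},
    {"site_id": "S1", "year": 2023, "crop": "barley",
     "sowing_date": date(2023, 6, 1), "harvest_date": date(2023, 5, 1),
     "yield_unit": "lt/ha"},
]

def within_variable(t):
    # Categorical value complies with a reference list (analogue of test 30a).
    return [] if t["crop"] in {"wheat", "barley", "canola", "oats"} else ["unknown crop type"]

def between_variables(t):
    issues = []
    # Date sequence test (54a analogue): sowing date < harvest date.
    if t["sowing_date"] >= t["harvest_date"]:
        issues.append("harvest_date must follow sowing_date")
    # Measurement type test (55a analogue): grain yield reported in kg/ha.
    if t["yield_unit"] != "kg/ha":
        issues.append("grain yield must be reported in kg/ha")
    return issues

def between_records(ts):
    # Linked trials test (28a analogue): flag multiple trials at one site-year for grouping.
    seen = {}
    for t in ts:
        seen.setdefault((t["site_id"], t["year"]), []).append(t["crop"])
    return {k: v for k, v in seen.items() if len(v) > 1}

report = {i: within_variable(t) + between_variables(t) for i, t in enumerate(trials)}
print(report)
print(between_records(trials))  # trials sharing a site-year, candidates for linking
```

Running the sketch flags the second record on both between-variable tests and identifies the two trials at site S1 in 2023 as candidates for grouping; a fuller implementation would extend this pattern with the between-tables and between-systems tiers.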
References
- Fraser, E.D.; Campbell, M. Agriculture 5.0: Reconciling production with planetary health. One Earth 2019, 1, 278–280.
- OECD; FAO. Cereals. In OECD-FAO Agricultural Outlook 2023–2032; OECD Publishing: Paris, France, 2023.
- Anderson, R.; Bayer, P.E.; Edwards, D. Climate change and the need for agricultural adaptation. Curr. Opin. Plant Biol. 2020, 56, 197–202.
- Briat, J.-F.; Gojon, A.; Plassard, C.; Rouached, H.; Lemaire, G. Reappraisal of the central role of soil nutrient availability in nutrient management in light of recent advances in plant nutrition at crop and molecular levels. Eur. J. Agron. 2020, 116, 126069.
- Qaim, M. Role of New Plant Breeding Technologies for Food Security and Sustainable Agricultural Development. Appl. Econ. Perspect. Policy 2020, 42, 129–150.
- Khan, N.; Ray, R.L.; Sargani, G.R.; Ihtisham, M.; Khayyam, M.; Ismail, S. Current Progress and Future Prospects of Agriculture Technology: Gateway to Sustainable Agriculture. Sustainability 2021, 13, 4883.
- Awasthi, G.; Nagar, V.; Mandzhieva, S.; Minkina, T.; Sankhla, M.S.; Pandit, P.P.; Aseri, V.; Awasthi, K.K.; Rajput, V.D.; Bauer, T.; et al. Sustainable Amelioration of Heavy Metals in Soil Ecosystem: Existing Developments to Emerging Trends. Minerals 2022, 12, 85.
- Gerhards, R.; Andujar Sanchez, D.; Hamouz, P.; Peteinatos, G.G.; Christensen, S.; Fernandez-Quintanilla, C. Advances in site-specific weed management in agriculture—A review. Weed Res. 2022, 62, 123–133.
- Weersink, A.; Fraser, E.; Pannell, D.; Duncan, E.; Rotz, S. Opportunities and challenges for big data in agricultural and environmental analysis. Annu. Rev. Resour. Econ. 2018, 10, 19–37.
- Araújo, S.O.; Peres, R.S.; Barata, J.; Lidon, F.; Ramalho, J.C. Characterising the Agriculture 4.0 Landscape—Emerging Trends, Challenges and Opportunities. Agronomy 2021, 11, 667.
- Osinga, S.; Paudel, D.; Mouzakitis, S.; Athanasiadis, I. Big data in agriculture: Between opportunity and solution. Agric. Syst. 2022, 195, 103298.
- Fenz, S.; Neubauer, T.; Johannes, H.; Jurgen, F.; Wohlmuth, M.-L. AI- and data-driven pre-crop values and crop rotation matrices. Eur. J. Agron. 2023, 150, 126949.
- Nativi, S.; Mazzetti, P.; Santoro, M.; Papeschi, F.; Craglia, M.; Ochiai, O. Big data challenges in building the global earth observation system of systems. Environ. Model. Softw. 2015, 68, 1–26.
- Nandyala, C.S.; Kim, H.-K. Big and meta data management for U-Agriculture mobile services. Int. J. Softw. Eng. Its Appl. 2016, 10, 257–270.
- Jouanjean, M.-A.; Casalini, F.; Wiseman, L.; Gray, E. Issues around data governance in the digital transformation of agriculture. In OECD Food, Agriculture and Fisheries Papers, No. 146; OECD Publishing: Paris, France, 2020.
- Provost, F.; Fawcett, T. Data science and its relationship to big data and data-driven decision making. Big Data 2013, 1, 51–59.
- Kernecker, M.; Busse, M.; Knierim, A. Exploring actors, their constellations, and roles in digital agricultural innovations. Agric. Syst. 2021, 186, 102952.
- Gonzalez-Vidal, A.; Ramallo-González, A.P.; Skarmeta, A.F. Intrinsic and extrinsic quality of data for open data repositories. ICT Express 2022, 8, 328–333.
- Tenopir, C.; Rice, N.; Allard, S.; Baird, L.; Borycz, J.; Christian, L.; Grant, B.; Olendorf, R.; Sandusky, R. Data sharing, management, use, and re-use: Practices and perceptions of scientists worldwide. PLoS ONE 2020, 15, e0229003.
- Walters, J.; Light, K.; Robinson, N. Using agricultural metadata: A novel investigation of trends in sowing date in on-farm research trials using the Online Farm Trials database. F1000Research 2021, 9, 1305.
- Nicholson, N.; Negrao Carvalho, R.; Štotl, I. A FAIR Perspective on Data Quality Frameworks. Data 2025, 10, 136.
- Benos, L.; Tagarakis, A.C.; Dolias, G.; Berruto, R.; Kateris, D.; Bochtis, D. Machine Learning in Agriculture: A Comprehensive Updated Review. Sensors 2021, 21, 3758.
- Veregin, H. Data quality parameters. Geogr. Inf. Syst. 1999, 1, 177–189.
- Phillips, P.W.B.; Relf-Eckstein, J.-A.; Jobe, G.; Wixted, B. Configuring the new digital landscape in western Canadian agriculture. NJAS-Wagening. J. Life Sci. 2019, 90–91, 100295.
- Hatanaka, M.; Konefal, J.; Strube, J.; Glenna, L.; Conner, D. Data-Driven Sustainability: Metrics, Digital Technologies, and Governance in Food and Agriculture. Rural Sociol. 2022, 87, 206–230.
- Roitsch, T.; Cabrera-Bosquet, L.; Fournier, A.; Ghamkhar, K.; Jimenez-Berni, J.; Pinto, F.; Ober, E. Review: New sensors and data-driven approaches—A path to next generation phenomics. Plant Sci. 2019, 282, 2–10.
- Robinson, N.; Thompson, H.; Milne, R.; Wills, B.; Feely, P.; MacLeod, A.; Parker, J.; Walters, J. Online Farm Trials: Data Quality Framework for OFT Trial Resources; CeRDI Internal Report; Centre for eResearch and Digital Innovation (CeRDI), Federation University Australia: Ballarat, VIC, Australia, 2018; 60p.
- ISO 9001:2015; Quality Management Systems. Requirements. International Organization for Standardization: Geneva, Switzerland, 2015.
- Earley, S.; Henderson, D.; Sebastian-Coleman, L. DAMA-DMBOK: Data Management Body of Knowledge, 2nd ed.; Technics Publications, LLC: Bradley Beach, NJ, USA, 2017.
- Wand, Y.; Wang, R.Y. Anchoring data quality dimensions in ontological foundations. Commun. ACM 1996, 39, 86–95.
- Chapman, A.D. Principles of Data Quality; Report for the Global Biodiversity Information Facility; Global Biodiversity Information Facility: Copenhagen, Denmark, 2005.
- Redman, T.C. Data quality management past, present, and future: Towards a management system for data. In Handbook of Data Quality: Research and Practice; Sadiq, S., Ed.; Springer: Berlin/Heidelberg, Germany, 2013; pp. 15–40.
- ISO 8000-2; Data Quality. Vocabulary. International Organization for Standardization: Geneva, Switzerland, 2022.
- Australian Bureau of Statistics (ABS). 1520.0—ABS Data Quality Framework. 2009. Available online: https://www.abs.gov.au/ausstats/abs@.nsf/mf/1520.0 (accessed on 14 August 2023).
- Guillen-Aguinaga, M.; Aguinaga-Ontoso, E.; Guillen-Aguinaga, L.; Guillen-Grima, F.; Aguinaga-Ontoso, I. Data Quality in the Age of AI: A Review of Governance, Ethics, and the FAIR Principles. Data 2025, 10, 201.
- Fan, W.; Geerts, F. Foundations of Data Quality Management; Springer Nature: Cham, Switzerland, 2022.
- Wilkinson, M.D.; Dumontier, M.; Aalbersberg, I.J.; Appleton, G.; Axton, M.; Baak, A.; Blomberg, N.; Boiten, J.-W.; da Silva Santos, L.B.; Bourne, P.E. The FAIR Guiding Principles for scientific data management and stewardship. Sci. Data 2016, 3, 1–9.
- Murphy, A.; McKenna, K.; Corbett, J.; Taylor, M. Online Farm Trials Impact Research (First Wave) Extended Timeframe Research Study; Centre for eResearch and Digital Innovation, Federation University Australia: Ballarat, VIC, Australia, 2016; 106p.
- Walters, J.; Milne, R.; Thompson, H. Online Farm Trials: A national web-based information source for Australian grains research, development and extension. Rural Ext. Innov. Syst. J. 2018, 14, 117–123.
- Robinson, N.; Dahlhaus, P.; Feely, P.; Light, K.; MacLeod, A.; Milne, R.; Parker, J.; Thompson, H.; Walters, J.; Wills, B. Online Farm Trials (OFT)—The past, present and future. In Cells to Satellites, Proceedings of the 19th Australian Society of Agronomy Conference, Wagga Wagga, NSW, Australia, 25–29 August 2019; Australian Society of Agronomy: Winthrop, WA, Australia, 2019.
- Ollerenshaw, A.; Robinson, N.; Chadha, A.; Channon, J. A smart agriculture information system delivering research data for the adoption by the Australian grains industry. Smart Agric. Technol. 2024, 9, 100610.
- Ollerenshaw, A.; Murphy, A.; Walters, J.; Robinson, N.; Thompson, H. Use of digital technology for research data and information transfer within the Australian grains sector: A case study using Online Farm Trials. Agric. Syst. 2023, 206, 103591.
- Wills, B.; Parker, J.; Robinson, N.; Wong, M. Improving the FAIRness of Australia’s grains research sector data. In Cells to Satellites, Proceedings of the 19th Australian Society of Agronomy Conference, Wagga Wagga, NSW, Australia, 25–29 August 2019; Australian Society of Agronomy: Winthrop, WA, Australia, 2019.
- Walters, J.R.; Light, K. The Australian digital Online Farm Trials database increases the quality of systematic reviews and meta-analyses in grains crop research. Crop Pasture Sci. 2021, 72, 789–800.
- Nousak, P.; Phelps, R. A scorecard approach to improving data quality. In Proceedings of the Data Warehousing and Enterprise Solutions, SUGI 27, Orlando, FL, USA, 14–17 April 2002.
- Government of NSW. NSW Government Standard for Data Quality Reporting. 2015. Available online: https://www.finance.nsw.gov.au/ict/resources/data-quality-standard (accessed on 28 March 2018).
- Cichy, C.; Rass, S. An overview of data quality frameworks. IEEE Access 2019, 7, 24634–24648.
- Borgman, C.L. The conundrum of sharing research data. J. Am. Soc. Inf. Sci. Technol. 2012, 63, 1059–1078.
- van Vlokhoven, H. The effect of open access on research quality. J. Informetr. 2019, 13, 751–756.
- Moore, E.K.; Kriesberg, A.; Schroeder, S.; Geil, K.; Haugen, I.; Barford, C.; Johns, E.M.; Arthur, D.; Sheffield, M.; Ritchie, S.M. Agricultural data management and sharing: Best practices and case study. Agron. J. 2022, 114, 2624–2634.
- Wolfert, S.; Ge, L.; Verdouw, C.; Bogaardt, M.-J. Big data in smart farming—A review. Agric. Syst. 2017, 153, 69–80.
- Chergui, N.; Kechadi, M.T. Data analytics for crop management: A big data view. J. Big Data 2022, 9, 1–37.
- Gupta, N.; Patel, H.; Afzal, S.; Panwar, N.; Mittal, R.S.; Guttula, S.; Jain, A.; Nagalapatti, L.; Mehta, S.; Hans, S. Data Quality Toolkit: Automatic assessment of data quality and remediation for machine learning datasets. arXiv 2021, arXiv:2108.05935.
- Bhat, S.A.; Huang, N.-F. Big data and AI revolution in precision agriculture: Survey and challenges. IEEE Access 2021, 9, 110209–110222.
- Rouhani, S.; Deters, R. Data trust framework using blockchain technology and adaptive transaction validation. IEEE Access 2021, 9, 90379–90391.
- Juddoo, S. Overview of data quality challenges in the context of Big Data. In Proceedings of the International Conference on Computing, Communication and Security (ICCCS), Pointe aux Piments, Mauritius, 4–6 December 2015.
- Gudivada, V.; Apon, A.; Ding, J. Data quality considerations for big data and machine learning: Going beyond data cleaning and transformations. Int. J. Adv. Softw. 2017, 10, 1–20.
- ISO 8000-8; Data Quality. Information and Data Quality: Concepts and Measuring. International Organization for Standardization: Geneva, Switzerland, 2015.
- Susarla, A.; Gopal, R.; Thatcher, J.B.; Sarker, S. The Janus Effect of Generative AI: Charting the Path for Responsible Conduct of Scholarly Activities in Information Systems. Inf. Syst. Res. 2023, 34, 399–408.
- Sikorska, J.; Bradley, S.; Hodkiewicz, M.; Fraser, R. DRAT: Data risk assessment tool for university–industry collaborations. Data-Centric Eng. 2020, 1, e17.
- Higgins, S.; Schellberg, J.; Bailey, J.S. Improving productivity and increasing the efficiency of soil nutrient management on grassland farms in the UK and Ireland using precision agriculture technology. Eur. J. Agron. 2019, 106, 67–74.
- Charlebois, S.; Latif, N.; Ilahi, I.; Sarker, B.; Music, J.; Vezeau, J. Digital Traceability in Agri-Food Supply Chains: A Comparative Analysis of OECD Member Countries. Foods 2024, 13, 1075.
- DAFF. National Agricultural Traceability Strategy 2023 to 2033; Department of Agriculture, Fisheries and Forestry (DAFF): Canberra, Australia, 2023.
- Bailey, J.E.; Pearson, S.W. Development of a tool for measuring and analyzing computer user satisfaction. Manag. Sci. 1983, 29, 530–545.
- Ives, B.; Olson, M.H.; Baroudi, J.J. The measurement of user information satisfaction. Commun. ACM 1983, 26, 785–793.
- Ballou, D.P.; Pazer, H.L. Designing information systems to optimize the accuracy-timeliness tradeoff. Inf. Syst. Res. 1995, 6, 51–72.
- DeLone, W.H.; McLean, E.R. Information systems success: The quest for the dependent variable. Inf. Syst. Res. 1992, 3, 60–95.
- Wang, R.Y.; Strong, D.M. Beyond accuracy: What data quality means to data consumers. J. Manag. Inf. Syst. 1996, 12, 5–33.
- Redman, T.C. Data Quality for the Information Age; Artech House, Inc.: Norwood, MA, USA, 1997.
- Jarke, M.; Lenzerini, M.; Vassiliou, Y.; Vassiliadis, P. Fundamentals of Data Warehouses; Springer Science & Business Media: Berlin, Germany, 2002.
- Bovee, M.; Srivastava, R.P.; Mak, B. A conceptual framework and belief-function approach to assessing overall information quality. Int. J. Intell. Syst. 2003, 18, 51–74.
- Fisher, C.W.; Kingma, B.R. Criticality of data quality as exemplified in two disasters. Inf. Manag. 2001, 39, 109–116.
- Pipino, L.L.; Lee, Y.W.; Wang, R.Y. Data quality assessment. Commun. ACM 2002, 45, 211–218.
- Herzog, T.N.; Scheuren, F.J.; Winkler, W.E. Data Quality and Record Linkage Techniques; Springer: New York, NY, USA, 2007.
- Moges, H.-T.; Dejaeger, K.; Lemahieu, W.; Baesens, B. A multidimensional analysis of data quality for credit risk management: New insights and challenges. Inf. Manag. 2013, 50, 43–58.
- Jayawardene, V.; Sadiq, S.; Indulska, M. An Analysis of Data Quality Dimensions; School of Information Technology and Electrical Engineering, The University of Queensland: Brisbane City, Australia, 2015.


| Data Quality Dimension | Tests and Reporting Assessments |
|---|---|
| Accessibility | Accessibility to the public; accessibility of data products |
| Accuracy | Coverage error; sample error; non-response error; response error; other sources of error; revisions to data |
| Coherence | Changes to data items; comparisons across data items; comparisons with previous releases; comparison with other available products |
| Institutional environment | Impartiality and objectivity; professional independence; data collection mandate; adequacy of resources; quality commitment; statistical confidentiality |
| Interpretability | Availability of information regarding the data; presentation of information |
| Relevance | Scope and coverage; reference period; geographic detail; main data outputs; classification and statistical standards; estimate variable types |
| Timeliness | Time lag between trial implementation and data availability; frequency of surveys/trials |
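The dimension-to-assessment mapping in the table above can be read as the schema for a per-trial data quality statement. The sketch below is a minimal, hypothetical illustration of that idea in Python: the dimension names follow the table, but the `QualityStatement` class and its 0–3 scoring scale are assumptions for illustration, not part of the OFT platform or the proposed DQF tooling.

```python
from dataclasses import dataclass, field

# The seven quality dimensions from the table above (ABS-style DQF).
DIMENSIONS = (
    "institutional_environment",
    "relevance",
    "timeliness",
    "accuracy",
    "coherence",
    "interpretability",
    "accessibility",
)

@dataclass
class QualityStatement:
    """Hypothetical per-trial quality statement.

    Each dimension is scored 0-3 (0 = not met, 3 = fully met),
    with a free-text note recording the evidence for the score.
    """
    scores: dict = field(default_factory=dict)

    def rate(self, dimension: str, score: int, note: str = "") -> None:
        # Reject dimensions outside the framework and out-of-range scores.
        if dimension not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {dimension}")
        if not 0 <= score <= 3:
            raise ValueError("score must be between 0 and 3")
        self.scores[dimension] = {"score": score, "note": note}

    def overall(self) -> float:
        """Mean score over the dimensions that were actually assessed."""
        rated = [entry["score"] for entry in self.scores.values()]
        return sum(rated) / len(rated) if rated else 0.0

# Example: a trial report with two assessed dimensions.
qs = QualityStatement()
qs.rate("accuracy", 2, "sample error reported; non-response error unknown")
qs.rate("timeliness", 3, "data published within one season of harvest")
print(qs.overall())  # → 2.5
```

Averaging only the assessed dimensions (rather than all seven) keeps unassessed dimensions visible as gaps instead of silently dragging the score down; a real implementation would also need to decide how to weight dimensions and report the notes alongside the number.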
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Chadha, A.; Robinson, N.; Channon, J. Towards Data-Driven Decisions in Agriculture—A Proposed Data Quality Framework for Grains Trials Research. Data 2026, 11, 19. https://doi.org/10.3390/data11010019

