Proceeding Paper

A Review of Global Microplastic (MP) Databases: A Study on the Challenges and Opportunities for Data Integration in the Context of MP Pollution †

by Hussain Ahamed 1, Marwa Al-Ani 2, Ala Al-Ardah 1 and Noora Al-Qahtani 3

1 College of Arts and Science, Qatar University, Doha 2713, Qatar
2 College of Engineering, Qatar University, Doha 2713, Qatar
3 Center for Advanced Materials (CAM), Office of the Vice President for Research & Graduate Studies (VPRGS), Qatar University, Doha 2713, Qatar
† Presented at the 2025 11th International Conference on Advanced Engineering and Technology, Incheon, Republic of Korea, 21–23 March 2025.
Mater. Proc. 2025, 22(1), 6; https://doi.org/10.3390/materproc2025022006
Published: 21 July 2025

Abstract

Microplastic (MP) pollution is an escalating global environmental concern, with a growing body of research addressing diverse dimensions of this issue. Despite this progress, the field remains hindered by the generation of large, heterogeneous datasets that follow inconsistent reporting standards, resulting in fragmented and often incompatible databases. While various databases on MPs have been developed, they primarily operate in isolation, limiting the accessibility and cross-comparison of data. This study presents a foundational approach to aggregating and accessing existing MP pollution datasets. A comprehensive review of the currently available databases was conducted to evaluate their integration potential. It revealed key challenges such as non-standardized data formats, limited accessibility, and difficulty performing comparative analyses across sources. To address these barriers, a prototype web-based platform was developed that enables unified access to MP datasets. The architecture includes a smart standardization layer that harmonizes inputs from disparate sources. The integration of Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG) techniques was proposed to facilitate natural language querying. This would enable researchers to interact with the platform intuitively and extract meaningful insights more efficiently. The proposed system aims to enhance data discoverability, promote interoperability, and support robust, data-driven environmental research, paving the way toward more informed policy-making and scientific collaboration in the fight against MP pollution. By making the tools needed to address this global issue more accessible, the platform also opens up potential for new discoveries.

1. Introduction

Microplastic (MP) pollution is not just a growing global ecological concern; it is an urgent one. Contamination has been reported in both aquatic and terrestrial ecosystems, including across Africa [1,2]. The small size of MPs means that lower-trophic organisms often ingest them, leading to physiological damage such as intestinal blockages [3]. Over time, MPs degrade and enter food chains through water systems and direct ingestion, resulting in bioaccumulation and biomagnification that impact wildlife and humans. Despite the widespread issue of plastic waste mismanagement, there remains a significant lack of data on MPs in terrestrial and aquaculture environments [4,5]. Standardized methodologies are critical to accurately quantify MPs and trace their pollution sources [6]. However, fragmented datasets and inconsistent sampling, preparation, and identification methods hinder cross-study comparisons and broader assessments [7,8]. Efforts to harmonize methodologies, including rescaling approaches that align data from different methods [9] and reporting guidelines [10], aim to improve the consistency of data. Centralized repositories, such as NOAA’s marine MP portal, and the application of FAIR (Findable, Accessible, Interoperable, and Reusable) data principles [11], offer potential for better data integration. This study proposes a web-based platform that consolidates diverse MP datasets using advanced visualization and analytical tools. Such a platform would support policy development, data-driven decision-making, and cross-disciplinary collaboration in the fight against MP pollution.

2. Overview of Existing MP Databases

As research into MP contamination has progressed, multiple databases have been developed to facilitate data sharing and synthesis. These databases vary widely in scope, data availability, update frequency, and user accessibility. A comparative overview of the major MP databases is presented in Table 1 [12,13,14,15,16,17,18,19,20].

2.1. Marine-Focused MP Databases

Marine MP pollution remains one of the most extensively studied areas of MP research. Key databases include the following:
- NOAA NCEI Marine MP Database: This database includes 22,266 data points from global studies. It offers a GIS-based mapping interface and is updated annually [12].
- LITTERBASE: This database contains ~3000 entries on macro- and microplastics. It is freely accessible but updated irregularly [13].
- One Earth–One Ocean (OEOO) MP Pollution Map: This open access interactive map visualizes pollution hotspots [14].
- Adventure Scientists Freshwater MP Dataset: This dataset includes 2681 data points, mainly from North America. It is not publicly available [15].
- European Marine MP Database: This database contains substantial data, but access and transparency are limited [16].

2.2. Other MP-Related Databases

Several additional databases target non-marine environments or specific analytical aspects of MP contamination:
- Southern California Coastal Water Research Project (SCCWRP): Focuses on toxicity and organism exposure. Access is limited, and standardized metadata are lacking [17].
- International Pellet Watch (IPW): Tracks resin pellets and associated PAHs. The update schedule is unspecified, and access is limited [18].
- SLoPP/SLoPP-E and FLOPP/FLOPP-E Spectral Libraries: Provide 261 Raman spectra (SLoPP, SLoPP-E) and 381 FTIR spectra (FLOPP, FLOPP-E) for MP identification [19].
- MP-SED Sediment Database: Contains 1064 entries on soil and sediment MP contamination [20].
Table 1. Comparative overview of major MP databases [12,13,14,15,16,17,18,19,20].

| Database Name | Category | Datapoints | User Accessibility |
|---|---|---|---|
| NOAA NCEI Marine MPs Database | Marine | 22,266 | Open Access, GIS Interface |
| LITTERBASE | Marine | 3071 | Open Access, GIS Interface |
| One Earth One Ocean (OEOO) MP Map | Marine | 3028 | Open Access, Visual Map |
| Adventure Scientists Freshwater MPs Data | Freshwater | 2681 | Not Publicly Accessible |
| European MPs Database | Marine | Not specified ¹ | Restricted |
| Southern California Coastal Water Research Project | Marine | Not specified ¹ | Open Access |
| International Pellet Watch (IPW) | Resin Pellet | Not specified ¹ | Open Access |
| SLoPP and SLoPP-E Spectral Libraries | Raman Spectral | 261 | Open Access |
| FTIR Spectral Databases (FLOPP, FLOPP-E) | FTIR Spectral | 381 | Open Access |
| MP-SED MP Database for Sediments | Sediments | 1064 | Open Access |
| European Marine Observation and Data Network (EMODnet) | Marine | Not specified ¹ | Restricted |

¹ Note: Datapoints not specified are either access-restricted or not explicitly disclosed in the source material.

3. Proposed Materials and Development Methods

A comprehensive assessment of global MP databases was conducted to inform the creation of a unified platform for MP data. This involved a two-step approach: manual web searches and the use of AI tools. Ten databases were screened, evaluating each for thematic coverage, accessibility, structural design, and functionality. After this initial assessment, a gap analysis was performed to identify key limitations related to the geographical scope, data standardization, and interoperability between the datasets.
To visualize this methodology, Figure 1 illustrates the full workflow developed by the authors, which includes four main stages: materials and development methods, data collection, data analysis, and platform development.
Based on the gap analysis results, a web platform using Python 3.10 for the backend was developed. The platform is designed to facilitate the integration of different MP datasets through standardized importers, enabling easier access and analysis.
- Data Collection: The data collection phase focused on identifying and compiling existing global MP databases. A dual approach was employed, which included targeted manual web searches of known repositories and AI-powered tools to uncover additional relevant data sources.
- Data Analysis: Each of the ten screened databases was evaluated for its overall utility and compatibility with integration into the proposed platform. Key factors considered during the evaluation included data coverage, ease of access, data structure, and available functionalities. Following this, a gap analysis identified specific challenges such as limited geographic representation, inconsistent data standards, and issues with interoperability across different datasets.
- Platform Development: Informed by the gap analysis, the platform was developed to overcome these challenges and provide a unified system for data integration.
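The idea of a standardized importer can be illustrated with a minimal sketch. It is not the platform's actual code: the unified schema (`MPRecord`), the source names, the column mappings, and the unit factors are all hypothetical and chosen only to show how source-specific columns and units might be mapped onto one common record type. The standard-library `csv` module is used here to keep the example dependency-free.

```python
import csv
import io
from dataclasses import dataclass


# Hypothetical unified schema; field names are illustrative assumptions.
@dataclass(frozen=True)
class MPRecord:
    source: str
    latitude: float
    longitude: float
    concentration_per_m3: float


# Hypothetical per-source specs: which column holds which field, and the
# factor that converts the reported concentration to particles per m^3.
IMPORTERS = {
    "source_a": {
        "columns": {"lat": "Latitude", "lon": "Longitude", "conc": "Measurement"},
        "to_per_m3": 1.0,  # assumed to be reported in particles/m^3 already
    },
    "source_b": {
        "columns": {"lat": "y", "lon": "x", "conc": "particles_per_L"},
        "to_per_m3": 1000.0,  # 1 m^3 = 1000 L
    },
}


def import_csv(source: str, text: str) -> list[MPRecord]:
    """Parse CSV text from one source into unified MPRecord objects."""
    spec = IMPORTERS[source]
    cols = spec["columns"]
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        records.append(
            MPRecord(
                source=source,
                latitude=float(row[cols["lat"]]),
                longitude=float(row[cols["lon"]]),
                concentration_per_m3=float(row[cols["conc"]]) * spec["to_per_m3"],
            )
        )
    return records


# Made-up sample rows with different column names and units.
SAMPLE_A = "Latitude,Longitude,Measurement\n25.3,51.5,12.0\n"
SAMPLE_B = "x,y,particles_per_L\n51.6,25.4,0.5\n"

unified = import_csv("source_a", SAMPLE_A) + import_csv("source_b", SAMPLE_B)
for record in unified:
    print(record)
```

Keeping the per-source differences in a declarative mapping, rather than in separate parsing functions, means that adding a new database would mostly involve adding one dictionary entry, provided its data fit the unified schema.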

4. Results

Development of the Web Application

A dedicated web application was developed to support the integration, exploration, and interpretation of MP data. Figure 2 presents the system architecture of the platform, showing how various components work together to facilitate data ingestion, standardization, and user interaction. This platform enables researchers to efficiently browse, filter, analyze, and download datasets, thereby streamlining the path from raw data to actionable insights. The application uses Streamlit, a Python framework well suited to rapidly prototyping data-driven applications. Its backend reads datasets from CSV files that are defined in a configuration dictionary. The frontend interface is designed to be intuitive and efficient, lowering the technical entry barrier for end users. Future plans to enhance data utility and scalability include migrating to a Relational Database Management System (RDBMS). This transition aims to support larger datasets, improve performance under concurrent user loads, and enhance data security. The platform applies a standardization layer that normalizes and integrates diverse data inputs into a unified resource. Additionally, the integration of Large Language Models (LLMs) is planned to enable natural language querying, thereby allowing researchers to interact with the datasets more intuitively.
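A CSV backend driven by a configuration dictionary could look roughly like the following sketch. The dataset names, file names, columns, and values are invented for illustration and do not reflect the platform's real configuration. Only the general pattern (a dictionary that maps dataset keys to CSV paths, read with Pandas) follows the description above.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Write two tiny CSV files so the sketch is self-contained.
# File names, columns, and values are made up for illustration.
data_dir = Path(tempfile.mkdtemp())
(data_dir / "marine.csv").write_text("polymer,count\nPE,4\nPP,2\n")
(data_dir / "sediment.csv").write_text("polymer,count\nPET,7\nPE,1\n")

# Hypothetical configuration dictionary: one entry per dataset.
DATASETS = {
    "marine": {"path": data_dir / "marine.csv", "label": "Marine samples"},
    "sediment": {"path": data_dir / "sediment.csv", "label": "Sediment samples"},
}


def load_dataset(name: str) -> pd.DataFrame:
    """Read one configured CSV file and tag each row with its dataset key."""
    frame = pd.read_csv(DATASETS[name]["path"])
    frame["dataset"] = name
    return frame


def load_all() -> pd.DataFrame:
    """Concatenate every configured dataset into a single table."""
    return pd.concat([load_dataset(name) for name in DATASETS], ignore_index=True)


combined = load_all()

# Example filter: all polyethylene (PE) rows across datasets.
pe_rows = combined[combined["polymer"] == "PE"]
print(pe_rows)
```

In a Streamlit application, a loader such as `load_dataset` would typically be wrapped with `st.cache_data` so that files are not re-read on every interaction. Whether the platform does this is not stated in the source.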
The platform comprises several integrated modules that are tailored specifically for MP data analysis. Figure 3 displays the user interface of the application, highlighting the main interactive modules designed for streamlined data exploration and interpretation:
- Data Explorer: Facilitates interactive dataset browsing and filtering.
- FTIR Viewer: Processes uploaded FTIR spectra, leveraging the SciPy library for peak detection and aiding in polymer material identification.
- Polymer Search: Enables users to search for specific polymers across datasets.
- Dashboard: Displays essential metrics such as the dataset volume and update frequency.
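The peak detection step of the FTIR Viewer can be sketched with `scipy.signal.find_peaks`. The source does not say which SciPy routine or parameters the platform uses, so the choice of `find_peaks`, the `prominence` threshold, and the synthetic spectrum below (two Gaussian bands on a flat baseline) are assumptions made purely for illustration.

```python
import numpy as np
from scipy.signal import find_peaks

# Synthetic absorbance spectrum on a 1 cm^-1 grid, ordered from high to low
# wavenumber as FTIR spectra are commonly displayed. Values are made up.
wavenumber = np.linspace(4000.0, 400.0, 3601)


def gaussian(x, center, width, height):
    """Gaussian band used to build the synthetic spectrum."""
    return height * np.exp(-0.5 * ((x - center) / width) ** 2)


absorbance = (
    0.02  # flat baseline
    + gaussian(wavenumber, 2915.0, 15.0, 0.80)
    + gaussian(wavenumber, 1470.0, 10.0, 0.40)
)

# Keep only peaks that stand out from their surroundings by at least 0.1
# absorbance units; this threshold is an illustrative assumption.
peak_idx, props = find_peaks(absorbance, prominence=0.1)
peak_positions = wavenumber[peak_idx]

for position, prominence in zip(peak_positions, props["prominences"]):
    print(f"peak at {position:.0f} cm^-1, prominence {prominence:.2f}")
```

Filtering by prominence rather than by absolute height makes the detection less sensitive to baseline offsets. Real spectra would usually need baseline correction and smoothing first, and the detected positions would then be compared against reference spectra such as those in the spectral libraries listed in Table 1.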
The application is developed entirely in Streamlit and uses core Python libraries, including Pandas for data handling, Plotly 6.2.0 for dynamic visualizations, and SciPy for spectral data processing. The open-source application is published on GitHub under the MIT License. The repository does not follow a fixed versioning scheme, but semantic versioning is supported through release tags. This approach promotes transparency, reusability, and community collaboration.
Building upon the existing prototype, a novel architecture integrating LLMs with Retrieval-Augmented Generation (RAG) is proposed. This approach enriches natural language output by retrieving relevant contextual information from standardized datasets, ensuring precise and verifiable research results. The architecture integrates a data standardization module, consolidating inputs into a unified repository. Users can submit queries in natural language, and results are returned in plain language via the LLM, guided by RAG principles. This enhancement is expected to reduce research time and increase productivity, particularly in areas requiring the rapid synthesis of large volumes of environmental data.
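The retrieval half of such a RAG pipeline can be sketched without any external service. In the example below, the records, the bag-of-words cosine similarity, and the prompt template are illustrative assumptions. A production system would more likely use embedding vectors and a vector index, and the final call to an LLM is deliberately left out.

```python
import math
import re
from collections import Counter

# Hypothetical standardized records rendered as short text snippets.
DOCUMENTS = [
    "Marine surface water sample, polyethylene fragments, North Atlantic.",
    "River sediment sample, polyester fibers, urban catchment.",
    "Beach sand sample, polypropylene pellets, Arabian Gulf coast.",
]


def tokenize(text: str) -> Counter:
    """Lowercase bag-of-words representation of a text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k records most similar to the query."""
    q = tokenize(query)
    ranked = sorted(DOCUMENTS, key=lambda d: cosine(q, tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, k: int = 2) -> str:
    """Assemble the text that would be sent to an LLM (the call is omitted)."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


prompt = build_prompt("Which samples contain polypropylene pellets on a beach?", k=1)
print(prompt)
```

Because the answer is generated from retrieved records rather than from the model's memory alone, each statement can in principle be traced back to a specific dataset entry, which is the verifiability benefit described above.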

5. Limitations and Future Work

While the current platform demonstrates promise, several limitations have been identified:
- File-Based Backend: The use of CSV files limits the scalability, hinders performance, and restricts support for concurrent multi-user access.
- Dataset Standardization: There is a lack of well-defined inclusion criteria and consistent formatting across integrated datasets.
- Data Harmonization Issues: Inconsistent data formats and units challenge seamless integration.
- Geographical Imbalance: The current datasets lack representation from the Global South, limiting the platform’s global relevance.
To address these limitations, the following actions are proposed:
- Database Migration: Transition to a relational or NoSQL database system to enhance performance, scalability, and multi-user support.
- Advanced Data Harmonization Framework: Develop and deploy algorithms to automatically standardize formats, nomenclature, and units.
- Global Partnership Initiative: Collaborate with institutions in underrepresented regions to improve inclusivity and develop localized data collection protocols.
- Case Study Demonstrations: Conduct and publish case studies to illustrate the platform’s analytical accuracy and usability in real-world applications.
These priorities form part of the next development phase, with database migration and data harmonization being scheduled for implementation in the upcoming cycle.
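As a rough illustration of the planned database migration, the sketch below loads CSV rows into a relational table using Python's built-in `sqlite3` module. SQLite is used only because it needs no server. The source does not name a target database system, and the table name, columns, and sample rows are hypothetical.

```python
import csv
import io
import sqlite3

# Made-up CSV content standing in for one of the flat files.
CSV_TEXT = "polymer,region,particle_count\nPE,Atlantic,4\nPP,Gulf,2\nPE,Gulf,3\n"

conn = sqlite3.connect(":memory:")

# Hypothetical schema: typed columns plus a basic validity constraint,
# neither of which a plain CSV file can enforce.
conn.execute(
    """
    CREATE TABLE observations (
        id INTEGER PRIMARY KEY,
        polymer TEXT NOT NULL,
        region TEXT NOT NULL,
        particle_count INTEGER NOT NULL CHECK (particle_count >= 0)
    )
    """
)
# An index speeds up the polymer lookups that a search module would issue.
conn.execute("CREATE INDEX idx_observations_polymer ON observations (polymer)")

rows = [
    (r["polymer"], r["region"], int(r["particle_count"]))
    for r in csv.DictReader(io.StringIO(CSV_TEXT))
]

# The connection context manager commits on success and rolls back on error.
with conn:
    conn.executemany(
        "INSERT INTO observations (polymer, region, particle_count) VALUES (?, ?, ?)",
        rows,
    )

totals = dict(
    conn.execute(
        "SELECT polymer, SUM(particle_count) FROM observations "
        "GROUP BY polymer ORDER BY polymer"
    ).fetchall()
)
print(totals)
```

Moving aggregation into SQL means the application no longer has to load whole files into memory for each query, which is one of the scalability concerns listed under the file-based backend limitation.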

6. Conclusions

MP contamination continues to pose a pressing global threat, compounded by the absence of standardized data collection, integration, and dissemination protocols. This study introduces a dedicated web-based platform to centralize MP datasets and provide an accessible, standardized interface for exploration and analysis. Through its user-centric design and open-source framework, the platform is intended to enhance research collaboration and productivity. Current constraints, including the reliance on flat-file data sources, the lack of integrated data harmonization, and limited analytical tools, are being actively addressed. Future work will focus on transitioning to a scalable database architecture, expanding data harmonization efforts, and building global partnerships to ensure broader representation and improve the platform’s scientific utility.

Author Contributions

Conceptualization, H.A. and M.A.-A.; methodology, A.A.-A.; software and formal analysis, H.A.; validation, M.A.-A.; investigation, M.A.-A.; resources, N.A.-Q.; data curation, H.A. and M.A.-A.; writing—original draft preparation, H.A., M.A.-A., A.A.-A. and N.A.-Q.; writing—review and editing, M.A.-A., A.A.-A. and N.A.-Q.; visualization, H.A.; supervision, N.A.-Q.; project administration, N.A.-Q.; funding acquisition, N.A.-Q. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Qatar National Research Fund, a member of Qatar Foundation, through the UREP award [UREP31-074-1-016], which made this research possible. Additional support was provided by Qatar University through a student grant (QUST-1-CAM-2025-239), which enabled the completion and publication of this work. Statements made herein are solely the responsibility of the authors. Open access funding was provided by the Qatar National Library.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All data used are from open access sources. A preview of the developed platform is accessible at https://microplastics.streamlit.app/ (accessed on 8 July 2025).

Acknowledgments

The authors gratefully acknowledge the support of the Center for Advanced Materials (CAM) at Qatar University for their valuable contributions to this research.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Alimi, O.S.; Fadare, O.O.; Okoffo, E.D. Microplastics in African ecosystems: Current knowledge, abundance, associated contaminants, techniques, and research needs. Sci. Total. Environ. 2021, 755, 142422.
  2. He, D.; Luo, Y.; Lu, S.; Liu, M.; Song, Y.; Lei, L. Microplastics in soils: Analytical methods, pollution characteristics and ecological risks. TrAC Trends Anal. Chem. 2018, 109, 163–172.
  3. Debnath, R.; Prasad, G.S.; Amin, A.; Malik, M.M.; Ahmad, I.; Abubakr, A.; Borah, S.; Rather, M.A.; Impellitteri, F.; Tabassum, I.; et al. Understanding and addressing microplastic pollution: Impacts, mitigation, and future perspectives. J. Contam. Hydrol. 2024, 266, 104399.
  4. Ali, N.; Khan, M.H.; Ali, M.; Sidra; Ahmad, S.; Khan, A.; Nabi, G.; Ali, F.; Bououdina, M.; Kyzas, G.Z. Insight into microplastics in the aquatic ecosystem: Properties, sources, threats, and mitigation strategies. Sci. Total. Environ. 2024, 913, 169489.
  5. Miao, C.; Zhang, J.; Jin, R.; Li, T.; Zhao, Y.; Shen, M. Microplastics in aquaculture systems: Occurrence, ecological threats and control strategies. Chemosphere 2023, 340, 139924.
  6. Zambrano-Pinto, M.V.; Tinizaray-Castillo, R.; Riera, M.A.; Maddela, N.R.; Luque, R.; Díaz, J.M.R. Microplastics as vectors of other contaminants: Analytical determination techniques and remediation methods. Sci. Total. Environ. 2024, 908, 168244.
  7. Guo, Z.; Boeing, W.J.; Xu, Y.; Borgomeo, E.; Mason, S.A.; Zhu, Y.-G. Global meta-analysis of microplastic contamination in reservoirs with a novel framework. Water Res. 2021, 207, 117828.
  8. Joyce, H.; Nash, R.; Frias, J.; White, J.; Cau, A.; Carreras-Colom, E.; Kavanagh, F. Monitoring microplastic pollution: The potential and limitations of Nephrops norvegicus. Ecol. Indic. 2023, 154, 110441.
  9. Koelmans, A.A.; Kooi, M.; Redondo-Hasselerharm, P.E.; Mohamed Nor, N.H. Solving the nonalignment of methods and approaches used in microplastic research to consistently characterize risk. Environ. Sci. Technol. 2020, 54, 12307–12315.
  10. Cowger, W.; Booth, A.M.; Hamilton, B.M.; Thaysen, C.; Primpke, S.; Munno, K.; Lusher, A.L.; Dehaut, A.; Vaz, V.P.; Liboiron, M.; et al. Reporting guidelines to increase the reproducibility and comparability of research on microplastics. Appl. Spectrosc. 2020, 74, 1066–1077.
  11. Nyadjro, E.S.; Webster, J.A.B.; Boyer, T.P.; Cebrian, J.; Collazo, L.; Kaltenberger, G.; Larsen, K.; Lau, Y.H.; Mickle, P.; Toft, T.; et al. The NOAA NCEI marine microplastics database. Sci. Data 2023, 10, 726.
  12. NOAA National Centers for Environmental Information. Marine Microplastics Database. 2025. Available online: https://www.ncei.noaa.gov/products/microplastics (accessed on 17 November 2024).
  13. Alfred Wegener Institute Helmholtz Centre for Polar and Marine Research. LITTERBASE: Online Portal for Marine Litter. Available online: https://litterbase.awi.de/ (accessed on 17 March 2025).
  14. One Earth–One Ocean e.V. Microplastic Pollution Map (in collaboration with UNEP GPML). 2023. Available online: https://data.unep.org/app/dataset/gpml-one-earth-one-ocean--oeoo--microplastic-pollution-map (accessed on 17 March 2025).
  15. Adventure Scientists. Global Microplastics Initiative. 2023. Available online: https://www.adventurescientists.org/microplastics.html (accessed on 17 March 2025).
  16. EMODnet Chemistry. Microplastics Dataset Portal. 2023. Available online: https://www.emodnet-chemistry.eu/ (accessed on 17 March 2025).
  17. Southern California Coastal Water Research Project. Microplastics Health Effects Database. 2023. Available online: https://microplastics.sccwrp.org/ (accessed on 17 March 2025).
  18. International Pellet Watch, Tokyo University of Agriculture and Technology. Global Monitoring of POPs Using Beach Plastic Resin Pellets. 2023. Available online: http://www.pelletwatch.org/ (accessed on 17 March 2025).
  19. Rochman Lab. Spectral Libraries for Microplastics Research. 2025. Available online: https://rochmanlab.wordpress.com/spectral-libraries-for-microplastics-research/ (accessed on 17 March 2025).
  20. U.S. Army Corps of Engineers, Dredging Operations and Environmental Research Program. Microplastic Database. 2025. Available online: https://doer.el.erdc.dren.mil/microplasticdatabase.html (accessed on 17 March 2025).
Figure 1. Workflow developed by the authors for the assessment and integration of microplastic (MP) databases. The process includes four main stages: (1) Proposed Materials and Development Methods, (2) Data Collection using manual and AI-assisted searches, (3) Data Analysis through evaluation and gap analysis, and (4) Platform Development for unified data integration.
Figure 2. Architecture of the MP data integration platform developed by the authors. The diagram illustrates the modular structure of the system, including data ingestion, standardization, user interface components, and analytical tools designed to support microplastic data exploration and interpretation.
Figure 3. User interface displaying key features of the MP data platform developed by the authors. The interface includes modules for dataset exploration, polymer search, FTIR spectrum analysis, and dashboard metrics, all integrated into a streamlined and user-friendly web application.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
