Prompt Design through ChatGPT’s Zero-Shot Learning Prompts: A Case of Cost-Sensitive Learning on a Water Potability Dataset
Abstract
:1. Introduction
2. Literature Review
3. Materials and Methods
4. Results and Discussion
Overview of Classifier Metrics for Different Generated Code Snippets
5. Discussions
6. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A
References
- Yaroshenko, I.; Kirsanov, D.; Marjanovic, M.; Lieberzeit, P.A.; Korostynska, O.; Mason, A.; Frau, I.; Legin, A. Real-time water quality monitoring with chemical sensors. Sensors 2020, 20, 3432. [Google Scholar] [CrossRef]
- Ahuja, L.R.; Ma, Q.L.; Rojas, K.W.; Boesten, J.J.; Farahani, H.J. A field test of root zone water quality model—Pesticide and bromide behavior. Pestic. Sci. 1996, 48, 101–108. [Google Scholar] [CrossRef]
- Shrestha, S.; Kazama, F. Assessment of surface water quality using multivariate statistical techniques: A case study of the Fuji river basin, Japan. Environ. Model. Softw. 2007, 22, 464–475. [Google Scholar] [CrossRef]
- Singh, K.P.; Malik, A.; Sinha, S. Water quality assessment and apportionment of pollution sources of Gomti river (India) using multivariate statistical techniques—A case study. Anal. Chim. Acta 2005, 538, 355–374. [Google Scholar] [CrossRef]
- Smeti, E.M.; Thanasoulias, N.C.; Kousouris, L.P.; Tzoumerkas, P.C. An approach for the application of statistical process control techniques for quality improvement of treated water. Desalination 2007, 213, 273–281. [Google Scholar] [CrossRef]
- Yang, Y.J.; Haught, R.C.; Goodrich, J.A. Real-time contaminant detection and classification in a drinking water pipe using conventional water quality sensors: Techniques and experimental results. J. Environ. Manag. 2009, 90, 2494–2506. [Google Scholar] [CrossRef]
- Chaves, P.; Tsukatani, T.; Kojiri, T. Operation of storage reservoir for water quality by using optimization and artificial intelligence techniques. Math. Comput. Simul. 2004, 67, 419–432. [Google Scholar] [CrossRef]
- Gevrey, M.; Rimet, F.; Park, Y.S.; Giraudel, J.L.; Ector, L.; Lek, S. Water quality assessment using diatom assemblages and advanced modelling techniques. Freshw. Biol. 2004, 49, 208–220. [Google Scholar] [CrossRef]
- Letcher, R.A.; Jakeman, A.J.; Calfas, M.; Linforth, S.; Baginska, B.; Lawrence, I. A comparison of catchment water quality models and direct estimation techniques. Environ. Model. Softw. 2002, 17, 77–85. [Google Scholar] [CrossRef]
- Hedger, R.D.; Atkinson, P.M.; Malthus, T.J. Optimizing sampling strategies for estimating mean water quality in lakes using geostatistical techniques with remote sensing. Lakes Reserv. Res. Manag. 2001, 6, 279–288. [Google Scholar] [CrossRef]
- Allinson, M.; Shiraishi, F.; Kamata, R.; Kageyama, S.; Nakajima, D.; Goto, S.; Allinson, G. A pilot study of the water quality of the Yarra River, Victoria, Australia, using in vitro techniques. Bull. Environ. Contam. Toxicol. 2011, 87, 591–596. [Google Scholar] [CrossRef]
- Nyende-Byakika, S.; Ndambuki, J.M.; Onyango, M.S.; Morake, L. Potability analysis of raw water from Bospoort dam, South Africa. Water Pract. Technol. 2016, 11, 634–643. [Google Scholar] [CrossRef]
- Pehlivan, R.; Emre, H. Potability and hydrogeochemisty of the Sarma Stream water, Duzce, Turkey. Water Resour. 2017, 44, 315–330. [Google Scholar] [CrossRef]
- Achio, S.; Kutsanedzie, F.; Ameko, E. Comparative analysis on the effectiveness of various filtration methods on the potability of water. Water Qual. Res. J. Can. 2016, 51, 42–46. [Google Scholar] [CrossRef]
- Elizabeth, K.M.; Rajpramikh, K.E. Potability of Water among the Tribals of Vizianagaram Sub-plan Area, Andhra Pradesh: Microbiological and Physico-Chemical Analysis. Anthropologist 2000, 2, 181–184. [Google Scholar] [CrossRef]
- Spackman, C.; Burlingame, G.A. Sensory politics: The tug-of-war between potability and palatability in municipal water production. Soc. Stud. Sci. 2018, 48, 350–371. [Google Scholar] [CrossRef]
- Mahajan, M.; Bhardwaj, K. Potability analysis of drinking water in various regions of Ludhiana District, Punjab, India. Int. Res. J. Pharm. 2017, 8, 87–90. [Google Scholar] [CrossRef]
- Lvova, L.; Di Natale, C.; Paolesse, R. Chemical sensors for water potability assessment. Bottled Packag. Water 2019, 4, 177–208. [Google Scholar]
- Abanyie, S.K.; Boateng, A.; Ampofo, S. Investigating the potability of water from dug wells: A case study of the Bolgatanga Township, Ghana. Afr. J. Environ. Sci. Technol. 2016, 10, 307–315. [Google Scholar]
- Opafola, O.T.; Oladepo, K.T.; Ajibade, F.O.; David, A.O. Potability assessment of packaged sachet water sold within a tertiary institution in southwestern Nigeria. J. King Saud Univ. Sci. 2020, 32, 1999–2004. [Google Scholar] [CrossRef]
- Chauhan, J.S.; Badwal, T.; Badola, N. Assessment of potability of spring water and its health implication in a hilly village of Uttarakhand, India. Appl. Water Sci. 2020, 10, 201. [Google Scholar] [CrossRef]
- Arulnangai, R.; Sihabudeen, M.M.; Vivekanand, P.A.; Kamaraj, P. Influence of physico chemical parameters on potability of ground water in ariyalur area of Tamil Nadu, India. Mater. Today Proc. 2021, 36, 923–928. [Google Scholar] [CrossRef]
- An, H.; Li, X.; Huang, Y.; Wang, W.; Wu, Y.; Liu, L.; Ling, W.; Li, W.; Zhao, H.; Lu, D.; et al. A new ChatGPT-empowered, easy-to-use machine learning paradigm for environmental science. Eco-Environ. Health 2024, 3, 131–136. [Google Scholar] [CrossRef]
- Barberio, A. Large Language Models in Data Preparation: Opportunities and Challenges; Scuola di Ingegneria Industriale e dell’Informazione: Milan, Italy, 2022. [Google Scholar]
- Hassani, H.; Silva, E.S. The role of ChatGPT in data science: How ai-assisted conversational interfaces are revolutionizing the field. Big Data Cogn. Comput. 2023, 7, 62. [Google Scholar] [CrossRef]
- Roumeliotis, K.I.; Tselikas, N.D. ChatGPT and Open-AI Models: A Preliminary Review. Future Internet 2023, 15, 192. [Google Scholar] [CrossRef]
- Mujahid, M.; Rustam, F.; Shafique, R.; Chunduri, V.; Villar, M.G.; Ballester, J.B.; Diez, I.d.l.T.; Ashraf, I. Analyzing sentiments regarding ChatGPT using novel BERT: A machine learning approach. Information 2023, 14, 474. [Google Scholar] [CrossRef]
- Lubiana, T. Ten Quick Tips for Harnessing the Power of ChatGPT. GPT-4 in Computational Biology. PLOS Comput. Biol. 2023, 19, e1011319. [Google Scholar] [CrossRef]
- OpenAI. ChatGPT [3.5]. 2024. Available online: https://chat.openai.com/c/53c0468f-e40d-439c-a90b-e224d64afdc8 (accessed on 2 February 2024).
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need, Carlifornia. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017. [Google Scholar]
Cost-Sensitive Learning Approach | Zero-Shot Learning Prompts on ChatGPT | Metric Results of the Human-Written Code Using Logistic Regressor as a Base Classifier NB: Cost-Sensitive Learning Not Catered for. | Metric Results of the ChatGPT-Generated Code Snippets [Both Logistic Regressor and Gradient Boosting Classifiers] (Appendix A) NB: Cost-Sensitive Learning Is Catered for. |
---|---|---|---|
1. Feature Engineering | Prompt 1 | Logistic regressor is a base model: | Logistic regressor: Gradient boosting: |
2. Handling Imbalanced Classes | Prompt 2 | Logistic regressor is a base model: | Logistic regressor: Gradient boosting: |
3. Regularization | Prompt 3 | Logistic regressor is a base model: | Logistic regressor: Gradient boosting: |
4. Hyperparameter Tuning | Prompt 4 | Logistic regressor is a base model: | Logistic regressor: Gradient boosting: |
5. Ensemble Methods | Prompt 5 | Logistic regressor is a base model: | Logistic regressor: Gradient boosting: |
6. Cross-Validation | Prompt 6 | Logistic regressor is a base model: | Logistic regressor: Gradient boosting: |
7. Feature Selection | Prompt 7 | Logistic regressor is a base model: | Logistic regressor: Gradient boosting: |
8. Optimizing Decision Threshold | Prompt 8 | Logistic regressor is a base model: | Logistic regressor: Gradient boosting: |
Metrics | Result | Description |
---|---|---|
Precision | 0% | A precision of 0% indicates that none of the positive predictions made by the model were correct. |
Recall | 0% | A recall of 0% indicates that the model failed to correctly identify any of the actual positive instances. |
F1-Score | 0% | An F1-score of 0% indicates that both precision and recall are extremely low. |
ROC AUC | 49% | ROC AUC is close to 0.5, indicating that the model’s ability to distinguish between classes is almost equivalent to random chance. |
Model | Precision | Recall | F1-Score | ROC AUC |
---|---|---|---|---|
Bagging | 0% | 0% | 0% | 50% |
Adaboost | 0% | 0% | 0% | 50% |
Stacking | 56% | 49% | 52% | 60% |
Model | Precision | Recall | F1-Score | ROC AUC |
---|---|---|---|---|
Selected Features | 0% | 0% | 0% | 50% |
Top Features | 100% | 0.5% | 1.1% | 50% |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Phorah, K.; Sibiya, M.; Sumbwanyambe, M. Prompt Design through ChatGPT’s Zero-Shot Learning Prompts: A Case of Cost-Sensitive Learning on a Water Potability Dataset. Informatics 2024, 11, 27. https://doi.org/10.3390/informatics11020027
Phorah K, Sibiya M, Sumbwanyambe M. Prompt Design through ChatGPT’s Zero-Shot Learning Prompts: A Case of Cost-Sensitive Learning on a Water Potability Dataset. Informatics. 2024; 11(2):27. https://doi.org/10.3390/informatics11020027
Chicago/Turabian StylePhorah, Kokisa, Malusi Sibiya, and Mbuyu Sumbwanyambe. 2024. "Prompt Design through ChatGPT’s Zero-Shot Learning Prompts: A Case of Cost-Sensitive Learning on a Water Potability Dataset" Informatics 11, no. 2: 27. https://doi.org/10.3390/informatics11020027
APA StylePhorah, K., Sibiya, M., & Sumbwanyambe, M. (2024). Prompt Design through ChatGPT’s Zero-Shot Learning Prompts: A Case of Cost-Sensitive Learning on a Water Potability Dataset. Informatics, 11(2), 27. https://doi.org/10.3390/informatics11020027