Official Statistics and Big Data Processing with Artificial Intelligence: Capacity Indicators for Public Sector Organizations
Abstract
1. Introduction
2. Methodology
2.1. Survey Methodology
- IP—Basic Information Panel
- OS—Official Statistics Production Information
- BD—Big data use in Official Statistics
- RC—Rationalization of Statistical Cadre
2.2. Data Collection
2.3. Official Statistics and Big Data Processing Capacity Measures
2.3.1. Official Statistics Capacity Measures
- (a) Data collection/recording for dissemination
- (b) Liaison with other departments on data
- (c) Data privacy
2.3.2. Big Data Processing Capacity Measures
- (a) Big data 3Vs
- (b) Big data literacy
- (c) Big data workings
- (d) Big data skills
2.4. Official Statistics Capacity Indicator
- Big data 3Vs (Volume, Velocity, Variety);
- Big data literacy;
- Big data workings;
- Big data skills.
3. Results and Findings
3.1. Key Descriptive Findings
- OUs record/collect data for official or public use regularly 83% (142/171)
- OUs collect/record data for:
- Administrative use only 29% (40/139)
- Statistical use only 10% (14/139)
- Both (Administrative/Statistical) 57% (79/139)
- OUs have electronic data recording 93% (150/160)
- Production of data products by the collected/recorded DADs 68% (97/142)
- OUs disseminate their data products officially or publicly 87% (89/102)
- OUs supply data to statistical organizations periodically 48% (68/142)
- OUs routinely obtain data from statistics organizations 23% (38/168)
- OUs have to conduct their collection to fulfill data needs 39% (66/169)
- OUs reported a confidentiality level
- Low 30% (40/135)
- Medium 36% (49/135)
- High confidentiality level 34% (46/135)
- OUs have an internal policy of 57% (78/137)
- OUs reported unreported data sources (data gaps) 27% (47/171)
- OUs consider unreported data sources important for POS 27% (47/171)
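Each finding above pairs a rounded percentage with the counts behind it. The snippet below is a minimal sketch of how such a share can be recomputed from its counts; the helper name `pct` and the choice of whole-number rounding are assumptions made for illustration, not something the survey report specifies.

```python
def pct(numerator: int, denominator: int) -> int:
    """Return numerator/denominator as a whole-number percentage."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return round(100 * numerator / denominator)


# (stated percentage, numerator, denominator) copied from the list above.
reported = {
    "regular data recording/collection": (83, 142, 171),
    "data supplied to statistical organizations": (48, 68, 142),
    "data obtained from statistics organizations": (23, 38, 168),
}

for label, (stated, num, den) in reported.items():
    print(f"{label}: stated {stated}%, recomputed {pct(num, den)}%")
```

Note that the denominators differ from item to item, so each percentage should be read against its own base of responding OUs.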
Module II (Big Data Use in Official Statistics)
- (a) Data recording and storage system
- (b) 3Vs of data
- (c) Big data literacy
- (d) Current and upcoming work with big data
- Administrative Data (72%, 115/160)
- Behavioral Data (26%, 41/160)
- Communication/Tracking Devices data (25%, 40/160)
- Sensors Data (16%, 26/160)
- Commercial/Transactional Records (14%, 22/160)
- Opinion Records (19%, 31/160)
- (e) IT personnel to handle big data
- (f) IT infrastructure to deal with big data
- (g) Statistical capacity of IT human resources
- (h) Training needs of IT human resources
- (i) IT and statistical outsourcing
- (j) Liaison with other departments on data
- (k) Reasons for lacking big data use
3.2. Selection of Optimum Data Reduction Approach
3.2.1. Official Statistics Relative Capacity Indicator (OSRCI)
3.2.2. Big Data Processing Relative Capacity Indicator (BDRCI)
4. Conclusions
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Code | Description |
---|---|
OS1 | Collection/recording of data in an organization |
OS2 | Data collection for statistical purpose |
OS3 | Production of statistics from data |
OS5 | Dissemination of Data products officially/publicly |
OS6 | Data supply to statistical organizations |
OS7 | Have a framework to deal with privacy-related issues |
OS9 | Conduct self-data collection through surveys |
OS10 | Acquire data from statistical organizations periodically |
OS11 | Have unreported data sources |
OS13 | Unreported data sources are important |
BD1 | Electronic data recording |
BD2J | Big data Recording |
BD2JA | Accessible data storage |
BD3J | Big data production |
BD4 | Big data awareness |
BD5 | Big data importance for POS |
BD6J | Big data value in POS |
BD7 | Big data working |
BD9 | Plan to work with big data in future |
BD10J | Potential big data is an administrative source |
BD11 | Well equipped with IT and have enough resources |
BD12 | Have well-trained IT staff |
BD13J | Have enough data processing IT skills |
BD14J | IT human resources have statistical skills |
BD15J | Usage of Big data processing tools |
BD16 | Training needs for IT staff |
BD17J | Training needs for Big data processing skills |
BD18 | Public-private partnership over data solution needs |
BD20 | Mutual data interests with other public departments |
BD22 | Public-public liaison over data solution needs |
RC1 | Statistical or data processing post exists in the department |
Velocity of data | Volume of data: <1 GB (%) | Volume of data: 1–10 GB (%) | Volume of data: >10 GB (%) | Total (%) |
---|---|---|---|---|
Daily | 42.6 | 17.6 | 2.0 | 62.2 |
Monthly | 15.5 | 12.8 | 0.7 | 29.1 |
Yearly | 6.1 | 2.0 | 0.7 | 8.8 |
Total | 64.2 | 32.4 | 3.4 | 100.0 |
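
The cross-tabulation of data velocity by data volume reports row and column totals alongside the cells. As a quick sanity check, the sketch below recomputes those marginals from the cell values; the tolerance `TOL` is an assumed allowance for one-decimal rounding in the published figures.

```python
# Cell percentages from the table: rows are velocity, columns are volume
# (<1 GB, 1-10 GB, >10 GB).
cells = {
    "Daily": [42.6, 17.6, 2.0],
    "Monthly": [15.5, 12.8, 0.7],
    "Yearly": [6.1, 2.0, 0.7],
}
reported_row_totals = {"Daily": 62.2, "Monthly": 29.1, "Yearly": 8.8}
reported_col_totals = [64.2, 32.4, 3.4]

TOL = 0.15  # assumed tolerance for rounding to one decimal place


def close(a: float, b: float, tol: float = TOL) -> bool:
    """True when two percentages agree within the rounding tolerance."""
    return abs(a - b) <= tol


rows_ok = all(close(sum(v), reported_row_totals[k]) for k, v in cells.items())
col_sums = [sum(col) for col in zip(*cells.values())]
cols_ok = all(close(s, r) for s, r in zip(col_sums, reported_col_totals))

print("row totals consistent:", rows_ok)
print("column totals consistent:", cols_ok)
```

The Monthly row sums to 29.0 against a reported 29.1, a difference that is consistent with rounding of the individual cells.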
Dimensions for Capacity Measures | Measures Used (Nos.) | Deviance Explained (%): Exponential Family PCA | Deviance Explained (%): Logistic PCA | Deviance Explained (%): Convex Logistic PCA |
---|---|---|---|---|
Overall Capacity Indicator | 31 | 18 | 16 | 39 |
Official Statistics Capacity Indicator | 13 | 28 | 24 | 55 |
OS Sub-dimension 1. Data Collection and Dissemination | 5 | 63 | 52 | 83 |
OS Sub-dimension 2. Liaison with other Depts. on Data | 4 | 52 | 41 | 89 |
OS Sub-dimension 3. Data Privacy | 5 | 52 | 40 | 85 |
Big Data Processing Capacity Indicator | 18 | 23 | 21 | 50 |
BD Sub-dimension 1. Big data 3Vs | 4 | 77 | 70 | 92 |
BD Sub-dimension 2. Big data Literacy | 3 | 64 | 49 | 95 |
BD Sub-dimension 3. Big data Workings | 6 | 44 | 36 | 77 |
BD Sub-dimension 4. Big data Skills | 7 | 42 | 33 | 78 |
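
The table above compares three dimension-reduction methods for binary data by the percentage of deviance each explains. The sketch below illustrates one common way to compute that quantity for binary responses, as one minus the ratio of the fitted model's Bernoulli deviance to that of a null model. The use of a column-means null model, the function names, and the toy data are all assumptions for illustration; the fitted probabilities would in practice come from whichever PCA variant is being evaluated, which this sketch does not implement.

```python
import math


def bernoulli_deviance(x, p, eps=1e-12):
    """Bernoulli deviance (-2 * log-likelihood) of 0/1 data x under probabilities p."""
    total = 0.0
    for x_row, p_row in zip(x, p):
        for xi, pi in zip(x_row, p_row):
            pi = min(max(pi, eps), 1.0 - eps)  # keep log() finite
            total += xi * math.log(pi) + (1 - xi) * math.log(1.0 - pi)
    return -2.0 * total


def deviance_explained(x, p_model):
    """Share of null deviance removed by the model (assumed column-means null)."""
    n = len(x)
    col_means = [sum(col) / n for col in zip(*x)]
    p_null = [col_means for _ in x]
    d_null = bernoulli_deviance(x, p_null)
    if d_null == 0.0:
        raise ValueError("null deviance is zero; every column is constant")
    return 1.0 - bernoulli_deviance(x, p_model) / d_null


# Hypothetical 4 OUs x 2 yes/no measures, with hypothetical fitted probabilities.
x = [[1, 0], [0, 1], [1, 1], [0, 0]]
p_null_like = [[0.5, 0.5] for _ in x]
p_better = [[0.8, 0.2], [0.2, 0.8], [0.8, 0.8], [0.2, 0.2]]

print(deviance_explained(x, p_null_like))  # model no better than the null
print(deviance_explained(x, p_better))     # model closer to the observed data
```

Under this definition a value near 0 means the reduced representation adds little beyond the per-measure averages, and a value near 1 means it reproduces the yes/no responses almost exactly.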
Department Name | OSRCI | Sub-D1 RCI | Sub-D2 RCI | Sub-D3 RCI |
---|---|---|---|---|
State Bank of Pakistan | 100 | 100 | 100 | 100 |
PPARC Establishment Division | 87.1 | 100 | 82.3 | 75.8 |
Pakistan Council for Science & Technology (PCST) | 80.7 | 100 | 77.1 | 75.8 |
FBISE Islamabad | 76.9 | 100 | 39.1 | 75.8 |
Gwadar Port Authority | 64.6 | 75.7 | 46.7 | 100 |
Pakistan Bureau of Statistics (ACO Wing) | 64.6 | 100 | 5.2 | 75.8 |
Directorate of Research and Statistics FBR | 61.3 | 63.4 | 48.1 | 45.9 |
Ministry of Information Technology | 56.2 | 100 | 5.2 | 85.3 |
Capital Development Authority | 54.9 | 71.5 | 43.2 | 85.3 |
Civil Services Academy | 54.5 | 34.8 | 77.1 | 37.9 |
Department Name | OSRCI | Sub-D1 RCI | Sub-D2 RCI | Sub-D3 RCI |
---|---|---|---|---|
Bureau of Statistics Punjab | 100 | 100 | 100 | 100 |
Provincial Disaster Management Authority Punjab | 92.8 | 100 | 100 | 75.8 |
Crop Reporting Service | 85.7 | 100 | 100 | 49.9 |
Directorate of Industries (IPWM) | 72.2 | 100 | 33.9 | 74.1 |
Punjab Vocational Training Council | 71.8 | 100 | 5.2 | 100 |
Directorate General of Monitoring & Evaluation | 70.0 | 100 | 77.1 | 75.8 |
Population Welfare Department Punjab | 66.6 | 34.8 | 100 | 37.9 |
Literacy & Non-Formal Basic Education Department | 64.7 | 100 | 43.2 | 100 |
Faisalabad Institute of Cardiology Faisalabad | 64.6 | 100 | 5.2 | 75.8 |
Excise Taxation and Narcotics Control Department | 64.1 | 100 | 48.1 | 75.8 |
Department Name | BDRCI | Sub-D1 RCI | Sub-D2 RCI | Sub-D3 RCI | Sub-D4 RCI |
---|---|---|---|---|---|
State Bank of Pakistan | 100 | 100 | 100 | 100 | 100 |
National Disaster Management Authority | 89.1 | 100 | 100 | 94 | 79.7 |
Department of Auditor General of Pakistan | 77.9 | 100 | 80.5 | 81.6 | 65.1 |
Provincial Election Commissioner Baluchistan | 66.1 | 60.1 | 100 | 34.9 | 100 |
Gwadar Port Authority | 58.1 | 100 | 34.8 | 25.8 | 67.2 |
Pakistan Bureau of Statistics (ACO Wing) | 58 | 70.6 | 34.8 | 20.4 | 85 |
FBISE Islamabad | 57.6 | 100 | 100 | 49.8 | 26.7 |
Pakistan Health Research Council | 53.9 | 100 | 100 | 20.4 | 47.4 |
Directorate of Research and Statistics FBR | 52.9 | 60.1 | 100 | 42.9 | 52.6 |
PPARC Establishment Division | 49.5 | 18.3 | 100 | 45.8 | 44.7 |
Capital Development Authority | 45.6 | 100 | 100 | 36.9 | 20.3 |
Department Name | BDRCI | Sub-D1 RCI | Sub-D2 RCI | Sub-D3 RCI | Sub-D4 RCI |
---|---|---|---|---|---|
Provincial Disaster Management Authority Punjab | 100 | 100 | 100 | 100 | 100 |
DG Public Relations Punjab Lahore | 80.8 | 100 | 100 | 49.8 | 90.4 |
PITB Citizen Feedback Monitoring Program | 79.1 | 70.6 | 100 | 49.8 | 100 |
Punjab Proc. Regularity Authority (PPRA) | 77.7 | 89.8 | 100 | 68.2 | 85 |
Director General Health Punjab | 75.9 | 94.5 | 100 | 72.2 | 67.2 |
Crop Reporting Service | 71.1 | 28.8 | 100 | 94 | 49.9 |
Pakistan Kidney & Liver Institute | 70.8 | 60.1 | 100 | 84.7 | 67.2 |
Livestock and Dairy Development Department | 67.5 | 100 | 100 | 100 | 12.1 |
Literacy & Non-Formal Basic Education Dept. | 65.7 | 100 | 80.5 | 66.2 | 52.6 |
Bureau of Statistics Punjab | 63.6 | 60.1 | 80.5 | 78.7 | 67.2 |
Excise Taxation and Narcotics Control Dept. | 63.3 | 70.6 | 100 | 22.4 | 82.3 |
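
In the ranking tables above, the top organization in each list scores 100 and the others fall below it. The source outline does not give the scaling formula, so the sketch below shows only one plausible construction: min-max rescaling of raw component scores to a 0 to 100 range. The function name, the raw scores, and the organization labels are hypothetical, and the rescaling rule itself is an assumption rather than the authors' stated method.

```python
def relative_indicator(scores: dict[str, float]) -> dict[str, float]:
    """Rescale raw scores to 0-100 by min-max scaling (assumed rule)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All organizations tie; give each the top value.
        return {name: 100.0 for name in scores}
    return {
        name: round(100.0 * (value - lo) / (hi - lo), 1)
        for name, value in scores.items()
    }


# Hypothetical raw scores for three hypothetical organizations.
raw = {"Org A": 2.0, "Org B": 1.0, "Org C": 0.0}
rci = relative_indicator(raw)

# List organizations from highest to lowest relative capacity.
for name, value in sorted(rci.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value}")
```

A relative indicator of this kind ranks organizations against each other within one sample, so values are not comparable across separately scaled lists.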
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Abbas, S.W.; Hamid, M.; Alkanhel, R.; Abdallah, H.A. Official Statistics and Big Data Processing with Artificial Intelligence: Capacity Indicators for Public Sector Organizations. Systems 2023, 11, 424. https://doi.org/10.3390/systems11080424