Secure Internal Data Markets
Abstract
1. Introduction
- A comparison of global and internal data markets with respect to security-relevant parameters from a general perspective.
- A novel abstract basic model for a secure internal data market, including audit and control.
- Flexible tailoring capabilities in the abstract model through the inclusion of pre- and post-processing capabilities.
- A set of open research questions that need consideration in order to make internal data markets work securely in environments with limited resources.
2. Background and Related Work
3. Security Issues of Internal Data Markets
3.1. Security Personnel
3.2. Integration
3.3. Market Management Capabilities
3.4. Data Volume
3.5. Knowledge on Data
3.6. User Model
3.7. Market Model
3.8. Interfaces
3.9. Data Locality
3.10. GDPR Responsibility
3.11. Institutional Attention
4. A Generalised Model for Secure Internal Data Markets
4.1. Model Overview
- Depending on the data record, prosumers act both on the owner’s side and on the consumer’s side, always according to the rules of the respective company. Therefore, it can also be modelled that a data owner acquires their own data, which can be useful with regard to the pre- and post-processing methods.
- VAS providers (actors who use data from others and develop it further, for example by enriching it with other data or by performing additional calculations) also act in both roles: they purchase the data to be processed as consumers, and act as owners when passing on the data. Any kick-backs to the owners of the original data are simply handled via policies, just as they are when the data is used by a normal consumer.
- The storage entity, a secure, hardened and highly efficient database containing all data held in the data market.
- The brokering module, an engine that maps the trading of the data. This includes not only the provision of information about the data to enable a trade, but also the necessary methods for billing, payment and policy management. In addition, the brokering module has to call on the selected post-processing routines that need to process the data before actual delivery to the respective data recipient.
- The pre- and post-processing modules (marked in yellow in Figure 1) are of special importance, as they act as quasi-internal VAS providers for important tools that a data market should provide. The main difference between the two types of methods is the time of application: the pre-processing methods are executed before the data is stored in the repository, whereas the post-processing methods are executed at the time of data delivery by the brokering module. The pre-processing modules do not need additional intelligence, as they are invoked on the raw data and never run again on any data residing in the storage entity. The controller for the post-processing modules, in contrast, requires far more control over the whole brokering process: in many anonymisation techniques, it is of vital importance to keep track of the data that a specific recipient has already received at the time a new batch of data is provisioned. This is especially important for fingerprinting, as receiving two differently fingerprinted copies of the same data set might enable a recipient to remove parts of the fingerprint. Fingerprinting methods based on anonymisation [33] in particular therefore require tracking capabilities on the side of the post-processing modules.
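The tracking requirement on the post-processing controller can be illustrated with a minimal sketch. It is not the implementation behind the model: the class name `PostProcessingController`, the record keys, and the `toy_fingerprint` function are assumptions introduced purely for illustration, and the toy function merely attaches a random mark rather than implementing a real fingerprinting scheme such as the one in [33]. The point shown is only that a record already delivered to a recipient is served again as the same fingerprinted copy, never as a second, differently marked one.

```python
import secrets
from collections import defaultdict


def toy_fingerprint(record, recipient_id):
    # Placeholder only (assumption): attaches a random mark to the record.
    # This is NOT a real fingerprinting or anonymisation scheme.
    return {**record, "_mark": secrets.token_hex(4)}


class PostProcessingController:
    """Remembers which fingerprinted copies each recipient already holds."""

    def __init__(self, fingerprint):
        # fingerprint: callable (record, recipient_id) -> fingerprinted record
        self._fingerprint = fingerprint
        # recipient_id -> {record_key: fingerprinted copy already delivered}
        self._delivered = defaultdict(dict)

    def deliver(self, recipient_id, records):
        """Build a batch, reusing earlier copies instead of re-fingerprinting."""
        seen = self._delivered[recipient_id]
        batch = []
        for key, record in records.items():
            if key not in seen:
                # First delivery of this record to this recipient.
                seen[key] = self._fingerprint(record, recipient_id)
            batch.append(seen[key])
        return batch


controller = PostProcessingController(toy_fingerprint)
first = controller.deliver("consumer-a", {"r1": {"age": 30}})
second = controller.deliver("consumer-a", {"r1": {"age": 30}, "r2": {"age": 41}})
```

In this sketch, `second[0]` is the identical copy of `r1` that was handed out in `first`, so the recipient never obtains two differently marked versions to compare. A real deployment would additionally need persistent storage for the delivery history, which is omitted here.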
4.2. Audit and Control
- Hashes of the incoming data sets.
- Processing steps executed on the data in the pre-processing stage with the following:
  - Timestamp of the execution in order to be able to track back enrichments based on other data sets and enable reconstructing the respective versions.
  - Version number from version control, if available.
  - Hashes of input and output data sets.
  - Meta data on the execution, such as number of records in the data sets and user names in the case of manual execution.
- Access rights to the data in the data store.
- Post-processing steps executed on data delivery through the brokering module, including the same information as that for the pre-processing steps, as well as the request information from the data consumer.
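The items listed above could be captured in a single audit entry per processing step. The following is a minimal sketch under stated assumptions: the names `ProcessingAuditEntry`, `dataset_hash` and `audited` are hypothetical, data sets are assumed to be JSON-serialisable lists of records, and SHA-256 over a canonical JSON serialisation is one possible choice of hash that the model itself does not prescribe.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


def dataset_hash(records):
    """SHA-256 over a canonical JSON serialisation (assumed data format)."""
    canonical = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class ProcessingAuditEntry:
    step_name: str
    timestamp: str                       # time of execution (UTC, ISO 8601)
    input_hash: str                      # hash of the input data set
    output_hash: str                     # hash of the output data set
    records_in: int                      # meta data: number of input records
    records_out: int                     # meta data: number of output records
    version: Optional[str] = None        # version-control identifier, if available
    user: Optional[str] = None           # user name, only for manual execution
    request_info: Optional[dict] = None  # consumer request (post-processing only)


def audited(step_name, step, records, *, version=None, user=None, request_info=None):
    """Run `step` on `records` and return (output, audit entry)."""
    # Hash the input first, in case the step mutates it in place.
    input_hash = dataset_hash(records)
    records_in = len(records)
    output = step(records)
    entry = ProcessingAuditEntry(
        step_name=step_name,
        timestamp=datetime.now(timezone.utc).isoformat(),
        input_hash=input_hash,
        output_hash=dataset_hash(output),
        records_in=records_in,
        records_out=len(output),
        version=version,
        user=user,
        request_info=request_info,
    )
    return output, entry


rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
out, entry = audited("drop_names", lambda rs: [{"id": r["id"]} for r in rs], rows, version="v1")
```

The same entry type serves both stages in this sketch: pre-processing steps leave `request_info` empty, while post-processing steps triggered by the brokering module fill it with the consumer's request. Access rights to the data store are a separate concern and are not modelled here.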
4.3. Comparison to Other Data Market Approaches
5. Open Research Questions
5.1. Concrete Data Market Models
5.2. Flexible Security Model
5.3. Policies and Basic Market Model
5.4. Automated Deployment
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
A&C | Audit and Control |
GDPR | General Data Protection Regulation |
SOX | Sarbanes–Oxley Act |
VAS | Value Added Services |
References
- Meyer, M.H.; Zack, M.H. The Design and Development of Information Products. Sloan Manag. Rev. 1996, 37, 43–59.
- Wijnhoven, F. Models of information markets: Analysis of markets, identification of services, and design models. Inf. Sci. 2001, 4, 117–128.
- Daily, J.; Peterson, J. Predictive maintenance: How big data analysis can improve maintenance. In Supply Chain Integration Challenges in Commercial Aerospace; Springer: Berlin/Heidelberg, Germany, 2017; pp. 267–278.
- Popper, N.; Endel, F.; Mayer, R.; Bicher, M.; Glock, B. Planning Future Health: Developing Big Data and System Modelling Pipelines for Health System Research. SNE Simul. Notes Eur. 2017, 27, 203–208.
- Niyato, D.; Alsheikh, M.A.; Wang, P.; Kim, D.I.; Han, Z. Market model and optimal pricing scheme of big data and Internet of Things (IoT). In Proceedings of the 2016 IEEE International Conference on Communications (ICC), Kuala Lumpur, Malaysia, 22–27 May 2016; pp. 1–6.
- Ivanschitz, B.P.; Lampoltshammer, T.J.; Mireles, V.; Revenko, A.; Schlarb, S.; Thurnay, L. A Data Market with Decentralized Repositories. DeSemWeb@ISWC. 2018. Available online: https://openreview.net/pdf?id=rkgzBg7yeX (accessed on 11 May 2021).
- The IOTA Marketplace. Available online: https://blog.iota.org/iota-data-marketplace-cb6be463ac7f (accessed on 11 May 2021).
- Food Data Market. Available online: https://fundingbox.com/spaces/ledger-ledger-news-and-updates/5d8c6ed052317832f858fc59 (accessed on 11 May 2021).
- The FeatureCloud Project. Available online: https://featurecloud.eu/ (accessed on 11 May 2021).
- Schlarb, S.; Karl, R.; King, R.; Lampoltshammer, T.J.; Thurnay, L.; Ivanschitz, B.P.; Mireles, V. Using Blockchain Technology to Manage Membership and Legal Contracts in a Distributed Data Market. In Proceedings of the 2019 Sixth International Conference on Software Defined Systems (SDS), Rome, Italy, 10–13 June 2019; pp. 272–277.
- Ivanschitz, B.P.; Lampoltshammer, T.J.; Mireles, V.; Revenko, A.; Schlarb, S.; Thurnay, L. A Semantic Catalogue for the Data Market Austria; SEMANTICS Posters & Demos: Vienna, Austria, 2018.
- Horváth, M.; Buttyán, L. Problem domain analysis of iot-driven secure data markets. In International ISCIS Security Workshop; Springer: Berlin/Heidelberg, Germany, 2018; pp. 57–67.
- Koronios, A.; Redman, T.; Gao, J. Internal Data Markets: The Opportunity and First Steps. In Proceedings of the 2009 Fourth International Conference on Cooperation and Promotion of Information Resources in Science and Technology, Beijing, China, 21–23 November 2009; pp. 127–130.
- Lings, I.N. Internal marketing and supply chain management. J. Serv. Mark. 2000, 14, 27–43.
- Fernandez, R.C.; Subramaniam, P.; Franklin, M.J. Data Market Platforms: Trading Data Assets to Solve Data Problems [Vision Paper]. arXiv 2020, arXiv:2002.01047.
- Liang, F.; Yu, W.; An, D.; Yang, Q.; Fu, X.; Zhao, W. A Survey on Big Data Market: Pricing, Trading and Protection. IEEE Access 2018, 6, 15132–15154.
- Zhao, Y.; Yu, Y.; Li, Y.; Han, G.; Du, X. Machine learning based privacy-preserving fair data trading in big data market. Inf. Sci. 2019, 478, 449–460.
- Agarwal, A.; Dahleh, M.; Sarkar, T. A marketplace for data: An algorithmic solution. In Proceedings of the 2019 ACM Conference on Economics and Computation, Phoenix, AZ, USA, 24–28 June 2019; pp. 701–726.
- Joita, L.; Rana, O.F.; Freitag, F.; Chao, I.; Chacin, P.; Navarro, L.; Ardaiz, O. A catallactic market for data mining services. Future Gener. Comput. Syst. 2007, 23, 146–153.
- Lorenzo, B.; Gómez-Cuba, F.; García-Rois, J.; Gonzalez-Castano, F.J.; Burguillo, J.C. A microeconomic approach to data trading in user provided networks. In Proceedings of the 2015 IEEE Globecom Workshops (GC Wkshps), San Diego, CA, USA, 6–10 December 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 1–7.
- Holt, T.J.; Smirnova, O.; Chua, Y.T. Exploring and Estimating the Revenues and Profits of Participants in Stolen Data Markets. Deviant Behav. 2016, 37, 353–367.
- Nget, R.; Cao, Y.; Yoshikawa, M. How to Balance Privacy and Money through Pricing Mechanism in Personal Data Market. arXiv 2017, arXiv:1705.02982.
- Elvy, S.A. Paying for privacy and the personal data economy. Colum. Law Rev. 2017, 117, 1369.
- Keerthana, K.; Stefie, C.; Priyadharshini, R.; Veeralakshmi, P. Safe and Secure Data Markets using Merkle Hash Algorithm. Int. J. Res. Eng. Sci. Manag. 2019, 2, 94–96.
- Özyilmaz, K.R.; Doğan, M.; Yurdakul, A. IDMoB: IoT data marketplace on blockchain. In Proceedings of the 2018 Crypto Valley Conference on Blockchain Technology (CVCBT), Zug, Switzerland, 20–22 June 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 11–19.
- Fang, H. Managing data lakes in big data era: What’s a data lake and why has it became popular in data management ecosystem. In Proceedings of the 2015 IEEE International Conference on Cyber Technology in Automation, Control, and Intelligent Systems (CYBER), Shenyang, China, 8–12 June 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 820–824.
- Kieseberg, P.; Schantl, J.; Frühwirt, P.; Weippl, E.R.; Holzinger, A. Witnesses for the Doctor in the Loop. In Proceedings of the International Conference on Brain Informatics and Health, London, UK, 30 August–2 September 2015; pp. 369–378.
- Huang, L.; Joseph, A.D.; Nelson, B.; Rubinstein, B.I.; Tygar, J.D. Adversarial machine learning. In Proceedings of the 4th ACM Workshop on Security and Artificial Intelligence, Chicago, IL, USA, 21 October 2011; pp. 43–58.
- Poll, E. LangSec revisited: Input security flaws of the second kind. In Proceedings of the 2018 IEEE Security and Privacy Workshops (SPW), San Francisco, CA, USA, 24 May 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 329–334.
- Sassaman, L.; Patterson, M.L.; Bratus, S.; Locasto, M.E. Security applications of formal language theory. IEEE Syst. J. 2013, 7, 489–500.
- Sarbanes, P. Sarbanes-oxley act of 2002. In The Public Company Accounting Reform and Investor Protection Act; US Congress: Washington, DC, USA, 2002.
- Regulation, G.D.P. Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46. Off. J. Eur. Union OJ 2016, 59, 294.
- Kieseberg, P.; Schrittwieser, S.; Mulazzani, M.; Echizen, I.; Weippl, E. An algorithm for collusion-resistant anonymization and fingerprinting of sensitive microdata. Electron. Mark. 2014, 24, 113–124.
- Zhao, Y.; Wang, H.; Su, H.; Zhang, L.; Zhang, R.; Wang, D.; Xu, K. Understand love of variety in wireless data market under sponsored data plans. IEEE J. Sel. Areas Commun. 2020, 38, 766–781.
- Bruschi, F.; Rana, V.; Pagani, A.; Sciuto, D. Acknowledging Value of Personal Information: A Privacy Aware Data Market for Health and Social Research. DLT@ ITASEC. 2020. Available online: http://ceur-ws.org/Vol-2580/DLT_2020_paper_6.pdf (accessed on 11 May 2021).
- Liang, J.; Jiang, W.; Li, S. OmniLytics: A Blockchain-based Secure Data Market for Decentralized Machine Learning. arXiv 2021, arXiv:2107.05252.
- Khapre, S.P.; Dhasarathan, C.; Puviyarasi, T.; Goundar, S. Blockchain-Based Data Market (BCBDM) Framework for Security and Privacy: An Analysis. In Applications of Big Data in Large-and Small-Scale Systems; IGI Global: Hershey, PA, USA, 2021; pp. 186–205.
- Ehteram, H.; Toghani, M.T.; Maddah-Ali, M.A. BlockMarkchain: A Secure Decentralized Data Market with a Constant Load on the Blockchain. arXiv 2020, arXiv:2003.11424.
- Zheng, X. Data trading with differential privacy in data market. In Proceedings of the 2020 the 6th International Conference on Computing and Data Engineering, Sanya, China, 4–6 January 2020; pp. 112–115.
- Jung, K.; Lee, J.; Park, K.; Park, S. PRIVATA: Differentially Private Data Market Framework using Negotiation-based Pricing Mechanism. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management, Beijing, China, 3–7 November 2019; pp. 2897–2900.
- Gadd, M.; Newman, P. The data market: Policies for decentralised visual localisation. arXiv 2018, arXiv:1801.05607.
Parameter | Global Data Markets | Internal Data Markets |
---|---|---|
Dedicated security personnel | Typically available | Typically not available |
System and Infrastructure | Modelled after DM requirements | Already existing infrastructure |
Management capabilities | Can be patched/fixed/changed | External software |
Volume | High | Diverse |
Data intelligence | Little to none | Can be high |
Users | Little control | High level of control possible |
Market model | Generic | Can be specifically tailored |
Interfaces | Control possible | Typically no control |
Data locality | Typically low | Typically high |
GDPR responsibility | None | Can be fully responsible |
Institutional attention | DM is main focus | DM is a side issue |
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Kieseberg, P.; Schrittwieser, S.; Weippl, E. Secure Internal Data Markets. Future Internet 2021, 13, 208. https://doi.org/10.3390/fi13080208