Mitigating IoT Privacy-Revealing Features by Time Series Data Transformation
Abstract
1. Introduction
- We propose a novel method to protect device-level privacy in IoT data sharing by transforming time series datasets, preventing inference of IoT device types.
- We design an efficient traffic reconstruction method that preserves the value of the original data while protecting sensitive information. We evaluate the utility of the transformed dataset using Euclidean distance and replication studies, showing that our method obfuscates privacy-revealing traffic patterns without sacrificing data utility.
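The Euclidean-distance comparison mentioned in the second contribution can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names `euclidean_distance` and `per_feature_distance` are hypothetical, and computing the distance separately for each feature column is an assumption, since the outline does not say how the distance is aggregated or whether the series are normalized first.

```python
import math


def euclidean_distance(original, transformed):
    """Euclidean (L2) distance between two equal-length numeric sequences."""
    if len(original) != len(transformed):
        raise ValueError("series must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(original, transformed)))


def per_feature_distance(original, transformed):
    """Distance per feature for two multivariate series.

    Each argument is a list of T rows, and each row holds one value per
    feature. Assumption: all rows have the same number of features.
    """
    if len(original) != len(transformed):
        raise ValueError("series must have the same number of observations")
    # Transpose rows into one column (sequence over time) per feature.
    cols_original = list(zip(*original))
    cols_transformed = list(zip(*transformed))
    if len(cols_original) != len(cols_transformed):
        raise ValueError("series must have the same number of features")
    return [
        euclidean_distance(o, t)
        for o, t in zip(cols_original, cols_transformed)
    ]


if __name__ == "__main__":
    source = [[0.0, 0.0], [0.0, 0.0]]
    shared = [[3.0, 1.0], [4.0, 0.0]]
    print(per_feature_distance(source, shared))
```

A smaller distance means the transformed series stays closer to the source series for that feature. Distance alone does not show that device types are hidden, which is why the contribution pairs it with replication studies.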
2. Related Works
3. IoT Device Membership Inference Attack
4. Methodology
4.1. Overview
4.2. LSTM-Based Transformer
4.3. Time Series Decomposition
4.4. Heuristic Method for Decomposition
4.5. Utility Assessment
4.6. Benchmark Datasets
5. Results
5.1. Visualizing Transformed Data
5.2. Similarity Measurement
5.3. Utility Measurement
6. Conclusions and Future Work
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
| Symbol | Description |
|---|---|
| X | A multivariate time series |
| | The i-th feature value of time series X at time t |
| T | The total number of observations in time series X |
| t | The index of the measurement in time |
| | Source time series data |
| | The transformed time series data, with features similar to the source data |
| | The predictive function used to transform the source data into the transformed data |
| | The conditional distribution of time series X from the prediction start time to T for the i-th feature |
| | The past time series data for the i-th feature up to the prediction start time |
| | The prediction start time, from which values are assumed to be unknown at prediction time |
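The last three table entries describe an autoregressive forecasting setup. As a hedged illustration, write the i-th feature at time t as $x_{i,t}$ and the prediction start time as $t_0$. Both symbols are assumptions made for this sketch and are not necessarily the paper's notation. By the chain rule of probability, the conditional distribution of the future values given the past factorizes into one-step-ahead terms:

```latex
P\left(x_{i,t_0:T} \mid x_{i,1:t_0-1}\right)
  = \prod_{t=t_0}^{T} P\left(x_{i,t} \mid x_{i,1:t-1}\right)
```

Each factor conditions on all values before time t, so a sequence model can generate the series one step at a time from $t_0$ to $T$.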
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wang, F.; Tang, Y.; Fang, H. Mitigating IoT Privacy-Revealing Features by Time Series Data Transformation. J. Cybersecur. Priv. 2023, 3, 209-226. https://doi.org/10.3390/jcp3020012