Volume 26, Issue 6

Dataflow Management in the Internet of Things: Sensing, Control, and Security

Dawei Wei, Huansheng Ning, Feifei Shi, Yueliang Wan, Jiabo Xu, Shunkun Yang, Li Zhu
School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China
Beijing Engineering Research Center for Cyberspace Data Analysis and Applications, Beijing 100083, China
Run Technologies Co., Ltd., Beijing 100192, China
School of Information Engineering, Xinjiang Institute of Engineering, Urumqi 830023, China
School of Reliability and Systems Engineering, Beihang University, Beijing 100191, China
Northern Cloud Research Team, Engineering University of PAP, Xi’an 710086, China

Abstract

The pervasiveness of the smart Internet of Things (IoT) connects many electric sensors and devices and generates a large amount of dataflow. Compared with traditional big data, streaming dataflow poses distinctive challenges, such as high speed, strong variability, rough continuity, and demanding timeliness, which make efficient management difficult. In this paper, we provide an overall review of IoT dataflow management. We first analyze the key challenges facing IoT dataflow and give an overview of the related management techniques, spanning dataflow sensing, mining, control, security, and privacy protection. We then illustrate and compare representative tools and platforms for IoT dataflow management. In addition, we elaborate on promising application scenarios, such as smart cities, smart transportation, and smart manufacturing, to provide guidance for further research. The management of IoT dataflow is an important area that merits in-depth discussion and further study.

Keywords: Internet of Things (IoT), security, privacy, dataflow, management

Publication history

Received: 25 March 2021
Accepted: 09 April 2021
Published: 09 June 2021
Issue date: December 2021

Copyright

© The author(s) 2021.

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (No. 61872038).

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
