When things matter: A survey on data-centric internet of things
Introduction
The Internet is a global system of networks that interconnect computers using the standard Internet protocol suite. It has a significant impact on the world, serving billions of users. Millions of private, public, academic, business, and government networks, of local to global scope, all contribute to the formation of the Internet. The traditional Internet focuses on computers and can be called the Internet of Computers. In contrast, the Internet of Things (IoT), which evolved from the Internet of Computers, emphasizes things rather than computers (Ashton, 2009). It aims to connect everyday objects, such as coats, shoes, watches, ovens, washing machines, bikes, cars, and even humans, plants, animals, and changing environments, to the Internet to enable communication and interaction between these objects. The ultimate goal of IoT is to enable computers to see, hear and sense the real world. Ericsson predicts that the number of Internet-connected things will reach 50 billion by 2020. Electronic devices and systems surround us, providing different services to people in different situations: at home, at work, in the office, or while driving a car on the street. IoT also enables a close relationship between humans and the opportunistic connection of smart things (Guo et al., 2013).
There are several definitions or visions of IoT from different perspectives. From the viewpoint of services provided by things, IoT means “a world where things can automatically communicate to computers and each other providing services to the benefit of the human kind” (CASAGRAS, 2000). From the viewpoint of connectivity, IoT means “from anytime, anyplace connectivity for anyone, we will now have connectivity for anything” (ITU, 2005). From the viewpoint of communication, IoT refers to “a world-wide network of interconnected objects uniquely addressable, based on standard communication protocols” (INFSO, 2008). Finally, from the viewpoint of networking, IoT is the Internet evolved “from a network of interconnected computers to a network of interconnected objects” (European Commission, 2009).
We study the Internet of Things from a data perspective. As shown in Fig. 1, data is processed differently in the Internet of Things and in the traditional Internet environment (i.e., the Internet of Computers). In the Internet of Computers, the main data producers and consumers are both human beings. In the Internet of Things, however, the main actors become things, which means that things make up the majority of data producers and consumers. Therefore, we define the Internet of Things as follows:
“In the context of the Internet, addressable and interconnected things, instead of humans, act as the main data producers, as well as the main data consumers. Computers will be able to learn and gain information and knowledge to solve real world problems directly with the data fed from things. As an ultimate goal, computers enabled by the Internet of Things technologies will be able to sense and react to the real world for humans.”
As of 2012, 2.5 quintillion (10^18) bytes of data are created daily. In IoT, connecting all of the things that people care about in the world becomes possible. All these things would be able to produce much more data than is produced today. The volumes of data are vast, the generation speed of data is fast and the data/information space is global (James et al., 2009). Indeed, IoT is one of the major driving forces for big data analytics. Given the scale of IoT, topics such as storage, distributed processing, real-time data stream analytics, and event processing are all critical, and we may need to revisit these areas to improve upon existing technologies for applications of this scale.
In this paper, we systematically investigate the key technologies related to the development of IoT and its applications, particularly from a data-centric perspective. The aim of this work is to provide a better understanding of the current research activities and issues. Fig. 2 shows the roadmap of this paper. As can be seen from the figure, we review and compare technologies including data streams, data storage models, searching, and event processing, which play a vital role in enabling the vision of IoT. We also describe some relevant applications from several representative areas. Although some reviews of IoT have been conducted recently (e.g., Atzori et al., 2010; Zeng et al., 2011; An et al., 2013; Perera et al., 2013; Li et al., 2016; Yan et al., 2014), they focus on high-level general issues and are mostly fragmented. In addition, these articles do not specifically cover techniques for data processing and management, which are fundamentally critical to fully embracing IoT. To the best of our knowledge, this is the first article that studies and discusses state-of-the-art techniques of IoT from the data-centric perspective.
The remainder of the article is organized as follows. Section 2 identifies an IoT data taxonomy. Section 3 reviews the data streaming techniques and Section 4 focuses on the data models and storage technologies for IoT. Search and event processing technologies are discussed in Sections 5 and 6, respectively. Section 7 describes some typical ongoing and/or potential IoT applications where data techniques for IoT can bring significant changes. Finally, Section 8 highlights some open research issues on IoT from the data perspective and Section 9 offers some concluding remarks.
IoT data taxonomy
In this section, we identify the intrinsic characteristics of IoT data and classify them into three categories: Data Generation, Data Quality, and Data Interoperability. We also identify specific characteristics of each category. The overall IoT data taxonomy is shown in Fig. 3.
Data streams
A data stream is a sequence of data objects of which the number is potentially unbounded. A data stream may be continuously generated at a rapid rate. In the data stream, each data object can be described by a multi-dimensional attribute vector within a continuous, categorical, or mixed attribute space (de Andrade Silva et al., 2013). There are some typical characteristics of data streams:
- Continuous arrival of data objects.
- Disordered arrival of data objects.
- Potentially unbounded size of a
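
The characteristics listed above can be illustrated with a minimal Python sketch. It is not taken from the survey: the `(timestamp, value)` record layout, the `max_delay` bound on lateness, and the function names `reorder` and `sliding_mean` are all assumptions made for illustration. The first generator restores timestamp order under a bounded-lateness assumption, and the second computes a running mean with memory bounded by the window size, so neither needs the stream to end.

```python
import heapq
from collections import deque
from typing import Iterable, Iterator, List, Tuple

Reading = Tuple[float, float]  # (timestamp, value); hypothetical record layout


def reorder(stream: Iterable[Reading], max_delay: float) -> Iterator[Reading]:
    """Emit readings in timestamp order.

    Assumes no reading arrives more than `max_delay` time units late.
    """
    heap: List[Reading] = []
    latest = float("-inf")
    for reading in stream:
        heapq.heappush(heap, reading)
        latest = max(latest, reading[0])
        # A reading at or before latest - max_delay can no longer be overtaken.
        while heap and heap[0][0] <= latest - max_delay:
            yield heapq.heappop(heap)
    while heap:  # only reached if the stream actually ends
        yield heapq.heappop(heap)


def sliding_mean(stream: Iterable[Reading], size: int) -> Iterator[Reading]:
    """Yield (timestamp, mean of the last `size` values) using O(size) memory."""
    window: deque = deque()
    total = 0.0
    for ts, value in stream:
        window.append(value)
        total += value
        if len(window) > size:
            total -= window.popleft()  # evict the oldest value
        yield ts, total / len(window)


# The reading stamped 2.0 arrives after the one stamped 3.0.
readings = [(1.0, 20.0), (3.0, 22.0), (2.0, 21.0), (4.0, 23.0), (5.0, 30.0)]
ordered = list(reorder(readings, max_delay=2.0))
means = list(sliding_mean(ordered, size=3))
```

Because both functions are generators, they can be chained directly over a live source (`sliding_mean(reorder(source, 2.0), 3)`) without materializing the stream; the lists above exist only to keep the example finite.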
Data storage models
The nature of data produced by the Internet of Things calls for a revisit of data storage techniques, which will be further discussed in this section.
Search techniques
Searching and finding relevant objects from billions of things is one of the major challenges for the future Internet of Things and can bring about huge potential impact to humans. Supporting technologies for searching things in the IoT are very different from those used in searching Web documents because things are tightly bound to contextual information (e.g., location) and have no easily indexable properties (e.g., human readable text in the case of Web documents). In addition, the state
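
To make the contrast with text search concrete, the following Python sketch looks things up by context rather than by keywords. It is only an illustration under stated assumptions: the `Thing` record, the `GridIndex` class, planar coordinates in metres, and the idea of filtering on a thing's current `state` at query time are hypothetical and are not described in the survey.

```python
import math
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Thing:
    thing_id: str
    x: float      # position on a local plane, in metres (assumed)
    y: float
    state: str    # e.g. "free" or "occupied"; may change at any time


class GridIndex:
    """Buckets things by grid cell so a nearby-search scans only close cells."""

    def __init__(self, cell: float) -> None:
        self.cell = cell
        self.cells: Dict[Tuple[int, int], List[Thing]] = defaultdict(list)

    def _key(self, x: float, y: float) -> Tuple[int, int]:
        return (math.floor(x / self.cell), math.floor(y / self.cell))

    def add(self, thing: Thing) -> None:
        self.cells[self._key(thing.x, thing.y)].append(thing)

    def search(self, x: float, y: float, radius: float, state: str) -> List[str]:
        """Return ids of things within `radius` whose current state matches."""
        cx, cy = self._key(x, y)
        reach = math.ceil(radius / self.cell)  # cells to scan in each direction
        hits = []
        for i in range(cx - reach, cx + reach + 1):
            for j in range(cy - reach, cy + reach + 1):
                for t in self.cells.get((i, j), []):
                    near = math.hypot(t.x - x, t.y - y) <= radius
                    if near and t.state == state:  # state is read at query time
                        hits.append(t.thing_id)
        return sorted(hits)


index = GridIndex(cell=10.0)
spots = [
    Thing("p1", 5.0, 5.0, "free"),
    Thing("p2", 12.0, 5.0, "occupied"),
    Thing("p3", 14.0, 9.0, "free"),
    Thing("p4", 80.0, 80.0, "free"),
]
for spot in spots:
    index.add(spot)

before = index.search(10.0, 5.0, radius=6.0, state="free")
spots[0].state = "occupied"   # the thing's state changes after indexing
after = index.search(10.0, 5.0, radius=6.0, state="free")
```

The index covers only the slowly changing attribute (location), while the state is checked when the query runs, so a state change needs no re-indexing. A thing that moves would have to be re-inserted, which this sketch does not handle.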
Complex event processing
Data streaming techniques typically process incoming data through a sequence of transformations based on common SQL operators, such as selection, aggregation, and join, which are generally defined by relational algebra. By contrast, the complex event processing (CEP) model views the information in the streams as events in the physical world. These events must be filtered, combined and transformed into higher-level events for better understanding by computers and humans. Similar to
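
The filter, combine, and transform steps can be sketched in a few lines of Python. This is a hand-written illustration, not the API of any CEP engine: the event kinds (`temperature`, `smoke`, `fire_suspected`), the 60-degree threshold, the 60-second window, and the function name `detect_fire` are all hypothetical, and events are assumed to arrive in time order.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator


@dataclass(frozen=True)
class Event:
    kind: str          # "temperature", "smoke", ... (hypothetical event types)
    room: str
    time: float        # seconds
    value: float = 0.0


def detect_fire(events: Iterable[Event], window: float = 60.0,
                hot: float = 60.0) -> Iterator[Event]:
    """Emit a higher-level 'fire_suspected' event when smoke follows a hot
    temperature reading in the same room within `window` seconds."""
    last_hot: Dict[str, float] = {}  # room -> time of the most recent hot reading
    for ev in events:
        if ev.kind == "temperature" and ev.value >= hot:
            last_hot[ev.room] = ev.time                      # filter
        elif ev.kind == "smoke":
            t = last_hot.get(ev.room)
            if t is not None and 0 <= ev.time - t <= window:  # combine
                yield Event("fire_suspected", ev.room, ev.time)  # transform


low_level = [
    Event("temperature", "kitchen", 0.0, 25.0),   # not hot: filtered out
    Event("temperature", "kitchen", 10.0, 75.0),  # hot reading
    Event("smoke", "garage", 20.0),               # no hot reading in the garage
    Event("smoke", "kitchen", 30.0),              # 20 s after the hot reading
    Event("smoke", "kitchen", 200.0),             # outside the 60 s window
]
alerts = list(detect_fire(low_level))
```

Unlike a relational operator applied to each tuple, the rule here is a temporal pattern across different event types, and its output is itself an event that further rules could consume.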
Potential IoT applications
As pointed out by Ashton (2009), IoT “has the potential to change the world, just as the Internet did”. The ongoing and/or potential IoT applications show that IoT can bring significant changes in many domains, such as cities and homes, environment monitoring, health, energy, and business. IoT can bring the ability to react to events in the physical world in an automatic, rapid and informed manner. This also opens up new opportunities for dealing with complex or critical situations and
Open issues
The development of IoT technologies and applications is merely beginning. Many new challenges and issues have not been addressed, which require substantial efforts from both academia and industry. In this section, we identify some key directions for future research and development from a data-centric perspective:
- Data quality and uncertainty: In IoT, as data volume increases, inconsistency and redundancy within data would become paramount issues. One of the central problems for data quality is
Summary
It is predicted that the next generation of the Internet will be composed of trillions of connected computing nodes at a global scale. Through these nodes, everyday objects in the world can be identified, connected to the Internet and take decisions independently. In this context, Internet of Things (IoT) is considered a new revolution of the Internet. In IoT, the possibility of seamlessly merging the real and the virtual worlds, through the massive deployment of embedded devices, opens up many
Acknowledgements
Quan Z. Sheng's work has been supported by the Australian Research Council Discovery Grant DP140100104. We express our gratitude to the anonymous reviewers for their comments and suggestions which have greatly helped us to improve this work.
References (91)
- et al. Research on social relations cognitive model of mobile nodes in Internet of Things. J Netw Comput Appl (2013)
- et al. The internet of things: a survey. Comput Netw (2010)
- et al. The sensor internet at work: locating everyday items using mobile phones. Pervas Mob Comput (2008)
- et al. Opportunistic IoT: exploring the harmonious interaction between human and the internet of things. J Netw Comput Appl (2013)
- et al. Research on social relations cognitive model of mobile nodes in internet of things. J Netw Comput Appl (2014)
- Agrawal J, Diao Y, Gyllstrom D, Immerman N. Efficient pattern matching over event streams. In: Proceedings of the ACM...
- Agrawal S, Chakrabarti K, Chaudhuri S, Ganti V, König AC, Xin D. Exploiting web search engines to search structured...
- Anicic D, Fodor P, Rudolph S, Stojanovic N. EP-SPARQL: a unified language for event processing and stream reasoning....
- Anonymous. National Intelligence Council (NIC). Disruptive civil technologies: six technologies with potential impacts...
- Apache. Apache HBase project 〈http://www.hbase.apache.org/〉;...