Abstract
Smart city data come from heterogeneous sources, including many kinds of Internet of Things devices such as traffic, weather, pollution, and noise sensors and portable devices. The data exhibit diverse quality issues and contain different types of sensitive information, which makes processing and publishing them challenging. In this paper, we propose a framework to streamline smart city data management, covering data collection, cleansing, anonymization, and publishing. The paper classifies smart city data into three levels (sensitive, quasi-sensitive, and open/public) and suggests different strategies for processing and publishing the data at each level. The paper evaluates the framework on a real-world smart city data set, and the results verify its effectiveness and efficiency. The framework can serve as a generic solution for managing smart city data.
Acknowledgements
This research was supported by the CITIES Project (No. 1035-00027B) funded by Innovation Fund Denmark. The infrastructure components are partly supported by the Danish Electronic Infrastructure (DeIC) through the project “Science Cloud for Cities.”
Appendix
See Fig. 13.
Cite this article
Liu, X., Heller, A. & Nielsen, P.S. CITIESData: a smart city data management framework. Knowl Inf Syst 53, 699–722 (2017). https://doi.org/10.1007/s10115-017-1051-3
DOI: https://doi.org/10.1007/s10115-017-1051-3