Abstract
Many real systems produce network data or highly interconnected data, which can be called information networks. These information networks form a critical component of modern information infrastructure and constitute a large volume of graph data. The analysis of information network data spans several technological areas, among them OLAP. OLAP enables multi-dimensional and multi-level analysis of large volumes of data, providing aggregated data visualizations from different perspectives. This article presents a literature review of the main applications of OLAP technology to the analysis of information network data. To that end, it conducts a systematic review to identify the works that apply OLAP technologies to graph data. It defines seven comparison criteria (Materialization, Network, Selection, Aggregation, Model, OLAP Operations, Analytics) to classify the works found according to their functionalities. The works are analyzed against each criterion and discussed to identify trends and challenges in applying OLAP to information networks.