Elsevier

Information Sciences

Volumes 367–368, 1 November 2016, Pages 747-765

Towards felicitous decision making: An overview on challenges and trends of Big Data

https://doi.org/10.1016/j.ins.2016.07.007

Abstract

The era of Big Data has arrived, bringing large-volume, complex and growing data generated by many distinct sources. Nowadays, nearly every aspect of modern society is affected by Big Data, including medicine, health care, business, management and government. The topic has been receiving growing attention from researchers in many disciplines, including the natural sciences, life sciences, engineering and even the arts and humanities. It also leads to new research paradigms and ways of thinking. Many tools, both mature and still under development, improve our ability to make more felicitous decisions than ever before. This paper presents an overview of Big Data covering four issues: (i) concepts, characteristics and processing paradigms of Big Data; (ii) the state-of-the-art techniques for decision making in Big Data; (iii) felicitous decision making applications of Big Data in social science; and (iv) the current challenges of Big Data as well as possible future directions.

Introduction

Nowadays, data are growing exponentially and come from every imaginable source, such as sensors, purchase transactions and social media networks. The speed of data growth has already exceeded Moore's law [30]. In 2011, 2.5 quintillion bytes of data were created every day [56]. IBM has reported that 90% of the data created in the world has been produced in the last two years. More than 267 million transactions are produced per day in over 6 thousand Wal-Mart stores. By 2011, the US Library of Congress had collected almost 3 terabytes of data. The Large Synoptic Survey Telescope records over 30 trillion bytes of image data in a single day [30]. In China in particular, Baidu handles dozens of petabytes of data generated by users’ queries every day, and Alibaba generates almost 20 terabytes of data from over 880 million online transactions. Fig. 1 shows the global data volume predicted by the International Data Corporation (IDC) [132]. We can conclude without doubt that the era of Big Data has arrived [101].
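
As a rough check on the claim that data growth outpaces Moore's law, the IBM figure above can be turned into a doubling time. This is only a sketch. It assumes that the 90% share holds steadily, so that the cumulative data volume D(t) (a symbol introduced here, with t in years) grows exponentially. It also takes the commonly quoted doubling period of roughly 18 to 24 months as the Moore's-law benchmark.

```latex
\begin{align*}
D(t-2) &= (1 - 0.9)\,D(t) = 0.1\,D(t) \\
\frac{D(t)}{D(t-2)} &= 10, \qquad g = \sqrt{10} \approx 3.16 \ \text{per year} \\
T_{\text{double}} &= \frac{\ln 2}{\ln g} = \frac{2\ln 2}{\ln 10} \approx 0.60\ \text{years} \approx 7.2\ \text{months}
\end{align*}
```

Under these assumptions, the data volume doubles in roughly seven months, well inside the 18 to 24 month benchmark.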

Besides huge volume, Big Data also refer to the complex structure of data and the complexity of capturing and managing them [26]. Since the term was introduced, Big Data have become one of the most popular topics in both science and engineering. The recent upsurge began with the special issue entitled “Big Data” published by Nature [58]. After the Obama Administration presented its Big Data initiative in 2012, Gartner listed Big Data in both the “Top 10 Strategic Technology Trends for 2013” and the “Top 10 Critical Tech Trends for the Next Five Years”. Many other programs and solicitations, such as those of the US National Science Foundation and the National Science Foundation of China, have been announced to investigate and tackle the challenges of Big Data.

Big Data bring big value. The value takes the form of a value chain and is created through the processes of data discovery, integration and exploitation [102]. Regardless of the specific challenges, technologies and tools have been developed to support decision making in each phase of processing and applying Big Data. To date, Big Data have played a central role in many decision making and forecasting domains, such as business analysis, product development, loyalty, health care, clinical practice, tourism marketing and transportation. For example, as reported by the McKinsey institute [96], over 50% of 560 enterprises hold that Big Data can help them increase operational efficiency, inform strategic direction, provide better customer service and so on. Knowledge extracted from Wal-Mart's large volume of transaction data has significantly benefited its pricing strategies and advertising campaigns [30]. Moreover, the processing of Big Data could create potential annual value of over 300 billion dollars for US health care and 250 billion euros for European public administration [96]. Clearly, Big Data can support intelligent and felicitous decisions for organizations.

Big Data need decision support. Generally, decision science (or the theory of choice) in economics, computer science, statistics and mathematics is concerned with identifying the values, uncertainties, rationalities, resulting optimal decisions and other related issues. Normative decision theory focuses on finding methodologies, technologies and tools (software) to identify the best decision, on the assumption that the decision maker is fully or boundedly rational. From this perspective, decision making exists in every stage of handling Big Data, such as data acquisition/storage, data cleaning/integration, data analysis, data visualization and prediction from derived knowledge. Although perfect solutions are still far off in each stage, several useful techniques and technologies have been applied to decision making in Big Data. For example, decision making techniques that span multiple disciplines include optimization methods, statistics, data mining, machine learning, visualization approaches and social network analysis. In addition, popular Big Data tools fall into three categories: batch processing, stream processing and hybrid processing tools [30]. Based on the paradigm of producing Big Data, the relationship between decision science and Big Data is illustrated in Fig. 2. On the one hand, decision theory supports decisions in each phase of processing Big Data; on the other hand, the solutions for Big Data enrich the content and scope of decision sciences. In this sense, one can make more intelligent and felicitous decisions by utilizing better predictions. For instance, we can analyze consumers’ purchase preferences and the correlation between two classes of products so that more efficient sales promotions can be designed; we can mine users’ social communities so that targeted advertising becomes more accurate; we can analyze users’ mood and sentiment so that public opinion, and even criminal activity, can be predicted; we can forecast epidemic trends so that reasonable emergency plans can be prepared; and we can predict long-term and/or short-term traffic flow to shorten average driving and waiting times [136].
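
To make the first of these examples concrete, the correlation between two classes of products is often summarized with association-rule measures such as support, confidence and lift. The sketch below is illustrative only and is not taken from the paper: the function name `association_measures`, the product names and the toy baskets are all hypothetical, and a real analysis would run over far larger transaction logs.

```python
from typing import Dict, List, Set


def association_measures(baskets: List[Set[str]], a: str, b: str) -> Dict[str, float]:
    """Support, confidence and lift for the rule a -> b over a list of baskets."""
    n = len(baskets)
    if n == 0:
        raise ValueError("need at least one basket")

    # Count baskets containing a, containing b, and containing both.
    count_a = sum(1 for basket in baskets if a in basket)
    count_b = sum(1 for basket in baskets if b in basket)
    count_ab = sum(1 for basket in baskets if a in basket and b in basket)
    if count_a == 0 or count_b == 0:
        raise ValueError("both product classes must occur at least once")

    support = count_ab / n            # share of baskets with both a and b
    confidence = count_ab / count_a   # share of a-baskets that also contain b
    lift = confidence / (count_b / n) # > 1 suggests a and b co-occur more than by chance
    return {"support": support, "confidence": confidence, "lift": lift}


# Hypothetical toy transactions (assumption: each basket is a set of product classes).
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread"},
    {"milk"},
    {"butter", "milk"},
]
measures = association_measures(baskets, "bread", "butter")
print(measures)
```

A lift above 1 for a pair of product classes would be one possible signal for a joint promotion, although the threshold and the choice of measure are design decisions that depend on the application.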

To map the existing developments, trends and challenges of both the decision support technologies for processing Big Data and decision making based on Big Data, this paper presents an overview of both aspects. It is, of course, difficult to separate the processing of Big Data from Big Data applications. We attempt to do so in this paper, in spite of some inevitable overlaps, in order to explore how decision sciences support the development of Big Data handling and how essential felicitous decision making (DM) is in the context of Big Data. Roughly, the first aspect focuses on the emerging techniques and technologies that are elaborately designed for processing Big Data based on decision sciences, whereas the second aspect concentrates on specific applications that process special data sets to support decision making in specific fields such as business and management.

The rest of the paper is organized as follows. Section 2 reviews some basic aspects of Big Data, such as concepts, characteristics, paradigms and related contributions. The existing decision making techniques for processing Big Data are summarized in Section 3. A brief review of Big Data applications in social science is then presented in Section 4. Challenges and possible directions are discussed in Section 5, and some conclusions are drawn in Section 6.

Section snippets

A bird's eye on Big Data

Before embarking on the discussion of decision making in Big Data, we specify in this section the scope, concepts and characteristics of Big Data.

Decision making tools for processing Big Data

Scientists have developed a wide variety of tools to capture the value of Big Data along the value chain. Although they are far from meeting every need, the existing decision making tools, which cross multiple disciplines, have been applied to many data-intensive applications and have shown exciting effectiveness in capturing, curating, analyzing and visualizing Big Data. In this section, we review some developments and current trends of decision making techniques and technologies in this

Intelligent decision making based on Big Data: the evidence from social Big Data

As mentioned above, Big Data can be applied in various disciplines owing to their power to support felicitous decision making based on large, diverse and complex data. In this section, we focus only on some recent applications in social science, such as marketing, e-commerce and social management. In this setting, Big Data come from multiple social media sources and can be referred to as social Big Data. Applications in other areas, such as health care, medicine and bioinformatics, can be

Big challenges and possible directions

Big Data remain a big challenge. It is still too early to say that we have reached a standard theory for handling Big Data. Thus, the challenges are usually related to the application fields, including challenges in Big Data management and analysis, semantic challenges and other non-technical challenges [8], [97]. In addition, more challenges will arise with the continuous development of new technologies and techniques. This section summarizes some general challenges of Big Data and

Conclusions

With the accumulation of ubiquitous and incessantly generated data, Big Data have become a new, popular and booming discipline that builds on techniques and technologies from many other disciplines. More and more initiatives have been launched by different organizations and governments. A large body of literature has been published, which facilitates and accelerates the development of Big Data. The concepts, aims and processing paradigms of Big Data are becoming more and more explicit and

Acknowledgments

The authors would like to thank the Editor-in-Chief, the associate editor and three anonymous reviewers for their insightful and constructive comments, which have led to an improved version of this paper. The work was supported by the National Natural Science Foundation of China (Nos. 61273209, 71571123) and the Scientific Research Foundation of Graduate School of Southeast University (No. YBJJ1528).

References (154)

  • J. Grzymala-Busse

    Discretization based on entropy and multiple scanning

    Entropy

    (2013)
  • A.J. Hey et al.

    The Fourth Paradigm: Data-Intensive Scientific Discovery

    (2009)
  • A. Heydari et al.

    Detection of review spam: a survey

    Expert Syst. Appl.

    (2015)
  • M. Hindman

    Building Better Models: Prediction, Replication, and Machine Learning in the Social Sciences

    Ann. Am. Acad. Polit. Soc. Sci.

    (2015)
  • G. Ingersoll

    Introducing Apache Mahout: Scalable, Commercial-Friendly Machine Learning for Building Intelligent Applications

    (2009)
  • S. Kraft et al.

    Wiq: work-intensive query scheduling for in-memory database systems

  • C.-H. Ku et al.

    A decision support system: Automated crime report analysis and classification for e-government

    Gov. Inf. Q.

    (2014)
  • D. Laney

    3D Data Management: Controlling Data Volume, Velocity and Variety, Research Note 6

    (2001)
  • D. Lazer et al.

    The parable of Google flu: traps in big data analysis

    Science

    (2014)
  • Y. LeCun et al.

    Deep learning

    Nature

    (2015)
  • M.K.K. Leung et al.

    Machine learning in genomic medicine: a review of computational problems and data sets

    Proc. IEEE

    (2016)
  • C.W. Lin et al.

    A survey of fuzzy web mining

    Wires. Data Min. Knowl.

    (2013)
  • R. Lin et al.

    The emotional responses of browsing Facebook: Happiness, envy, and the role of tie strength

    Comput. Hum. Behav.

    (2015)
  • H. Ma et al.

    Mining social networks using heat diffusion processes for marketing candidates selection

  • N. Marz et al.

    Big Data: Principles and Best Practices of Scalable Realtime Data Systems

    (2012)
  • A. McAfee et al.

    Big data: the management revolution

    Harv. Bus. Rev.

    (2012)
  • W. van der Aalst et al.

    Processes meet big data: connecting data science with process science

    IEEE Trans. Serv. Comput.

    (2015)
  • M. Adrian, Big Data, Teradata Magazine. http://www.teradatamagazine.com/v11n01/Features/Big-Data/ (accessed December...
  • J. Ahrens et al.

    Large-scale data visualization using parallel data streaming

    IEEE Comput. Graph.

    (2001)
  • A. Almaatouq et al.

    Twitter: who gets caught? observed trends in social micro-blogging spam

  • I. Arel et al.

    Deep machine learning - a new frontier in artificial intelligence research

    IEEE Comput. Intell. Mag.

    (2010)
  • M.Z. Asghar et al.

    A unified framework for creating domain dependent polarity lexicons from user generated reviews

    PLoS One

    (2015)
  • S. Asur et al.

    Predicting the future with social media

  • A.T. Azar et al.

    Dimensionality reduction of medical big data using neural-fuzzy classifier

    Soft Comput.

    (2014)
  • M. Banko et al.

    Scaling to very very large corpora for natural language disambiguation

  • J. Bao et al.

    Location-based and preference-aware recommendation using sparse geo-social networking data

  • H. Barwick

    The “four Vs” of Big Data. Implementing Information Infrastructure Symposium

    (2012)
  • G. Bell et al.

    Beyond the data deluge

    Science

    (2009)
  • G. Bello-Orgaz et al.

    Social big data: Recent achievements and new challenges

    Inf. Fusion

    (2016)
  • L.M. Bettencourt

    The uses of big data in cities

    Big Data

    (2014)
  • C. Bizer et al.

    The meaningful use of big data: four perspectives–four challenges

    ACM SIGMOD Rec.

    (2012)
  • M. Bohlouli et al.

    Knowledge discovery from social media using big data-provided sentiment analysis (SoMABiT)

    J. Inf. Sci.

    (2015)
  • V. Bolón-Canedo et al.

    Data classification using an ensemble of filters

    Neurocomputing

    (2014)
  • M. Bramer

    Principles of Data Mining

    (2007)
  • F. Bravo-Marquez et al.

    Meta-level sentiment models for big social data analysis

    Knowl. Based Syst.

    (2014)
  • R. Casado et al.

    Emerging trends and technologies in big data processing

    Concurr. Comp-Pract. E.

    (2015)
  • S. Chainey et al.

    The utility of hotspot mapping for predicting spatial patterns of crime

    Secur. J.

    (2008)
  • H.-T. Chang et al.

    IoT big-data centred knowledge granule analytic and cluster framework for BI applications: a case base analysis

    PLoS One

    (2015)
  • R.M. Chang et al.

    Understanding the paradigm shift to computational social science in the presence of big data

    Decis. Support Syst.

    (2014)
  • C.P. Chen et al.

    Data-intensive applications, challenges, techniques and technologies: a survey on Big Data

    Inf. Sci.

    (2014)