Towards felicitous decision making: An overview on challenges and trends of Big Data
Introduction
Data now grow exponentially and come from every imaginable source, such as sensors, purchase transactions and social media networks. The speed of data growth has already exceeded Moore's law [30]. In 2011, 2.5 quintillion bytes of data were created every day [56]. IBM has reported that 90% of the data in the world has been produced in the last two years. More than 267 million transactions are produced per day across 6 thousand Wal-Mart stores. By 2011, almost 3 terabytes of data had been collected by the US Library of Congress. Over 30 trillion bytes of image data are recorded by the Large Synoptic Survey Telescope in a single day [30]. In China in particular, Baidu handles dozens of petabytes of data generated by users’ queries every day, and Alibaba generates almost 20 terabytes of data from over 880 million online transactions. Fig. 1 shows the global data volume predicted by the International Data Corporation (IDC) [132]. We can conclude without doubt that the era of Big Data has arrived [101].
Besides huge volume, Big Data also refer to the complex structure of data and the complexity of capturing and managing them [26]. Since the term was introduced, Big Data have become one of the most popular issues in both science and engineering. The recent upsurge began with the special issue entitled “Big Data” published by Nature [58]. After the Obama Administration presented its Big Data initiative in 2012, Gartner listed Big Data in both the “Top 10 Strategic Technology Trends for 2013” and the “Top 10 Critical Tech Trends for the Next Five Years”. Many funding bodies, such as the US National Science Foundation and the National Natural Science Foundation of China, have announced projects and solicitations to investigate and tackle the challenges of Big Data.
Big Data bring big value. The value takes the form of a value chain and is created through the processes of data discovery, integration and exploitation [102]. Whatever the specific challenges, technologies and tools have been developed to support decision making in each phase of processing and applying Big Data. So far, Big Data have played a central role in many decision making and forecasting domains such as business analysis, product development, loyalty, healthcare, clinical practice, tourism marketing and transportation. For example, as reported by the McKinsey Global Institute [96], over 50% of 560 enterprises hold that Big Data can help them increase operational efficiency, inform strategic direction, provide better customer service and so on. Knowledge extracted from Wal-Mart's large volume of transaction data has significantly benefitted its pricing strategies and advertising campaigns [30]. Moreover, processing Big Data could create potential annual value of over 300 billion dollars for US health care and 250 billion euros for European public administration [96]. Big Data can therefore support intelligent and felicitous decisions for organizations.
Big Data need decision support. Generally, decision science (or the theory of choice) in economics, computer science, statistics and mathematics is concerned with identifying the values, uncertainties, rationalities, resulting optimal decision and other related issues. Normative decision theory focuses on finding methodologies, technologies and tools (software) to identify the best decision, on the assumption that the decision maker is fully or boundedly rational. From this perspective, decision making exists in each procedure of Big Data, such as data acquisition/storage, data cleaning/integration, data analysis, data visualization and prediction from derived knowledge. Although perfect solutions for each procedure are still far off, several useful techniques and technologies have been applied to decision making in Big Data. For example, decision making techniques that draw on multiple disciplines include optimization methods, statistics, data mining, machine learning, visualization approaches and social network analysis. In addition, popular Big Data tools fall into three categories: batch processing, stream processing and hybrid processing tools [30]. Based on the paradigm of producing Big Data, the relationship between decision science and Big Data can be explained by Fig. 2. On the one hand, decision theory supports decisions in each phase of processing Big Data; on the other, the solutions of Big Data enrich the content and scope of decision sciences. In this sense, one can make more intelligent and felicitous decisions by utilizing better prediction.
For instance, we can analyze consumers’ purchase preferences and the correlation between two classes of products so that more efficient sales promotions can be designed; we can mine the social communities of users so that targeted advertising becomes more accurate; we can analyze the mood and sentiment of users so that public opinion, and even criminal activity, can be predicted; we can forecast the trends of epidemics so that reasonable emergency plans can be prepared; and we can predict long-term and/or short-term traffic flow to shorten average driving and waiting times [136].
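As a rough illustration of the product-correlation analysis mentioned above, the sketch below computes support, confidence and lift for a pair of products over a handful of made-up shopping baskets. The basket data, the product names and the `pair_lift` helper are hypothetical and do not come from the paper; lift is used here only as one common measure of co-purchase association, not as the method the authors have in mind.

```python
from typing import Dict, Iterable, Set


def pair_lift(baskets: Iterable[Set[str]], a: str, b: str) -> Dict[str, float]:
    """Return support, confidence (a -> b) and lift for products a and b."""
    baskets = list(baskets)
    n = len(baskets)
    if n == 0:
        raise ValueError("need at least one basket")

    count_a = sum(1 for basket in baskets if a in basket)
    count_b = sum(1 for basket in baskets if b in basket)
    count_ab = sum(1 for basket in baskets if a in basket and b in basket)
    if count_a == 0 or count_b == 0:
        raise ValueError("both products must appear in at least one basket")

    support = count_ab / n            # share of baskets containing both products
    confidence = count_ab / count_a   # share of a-baskets that also contain b
    lift = confidence / (count_b / n) # confidence relative to b's overall share
    return {"support": support, "confidence": confidence, "lift": lift}


# Hypothetical toy transactions (illustrative only, not real retail data).
BASKETS = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread"},
    {"milk"},
]

result = pair_lift(BASKETS, "bread", "butter")
print(result)
```

A lift above 1 suggests the two products are bought together more often than independent purchases would predict, which is the kind of signal a joint promotion could build on; a lift near 1 suggests no particular association.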
To survey the existing developments, trends and challenges of both decision support technologies for processing Big Data and decision making based on Big Data, this paper presents an overview of both aspects. It is, of course, difficult to separate processing Big Data from Big Data applications. We attempt to do so in this paper, despite some inevitable overlaps, in order to examine how decision sciences support the development of handling Big Data and how essential felicitous decision making (DM) is in the context of Big Data. Roughly, the first aspect focuses on the emerging techniques and technologies that are designed for processing Big Data based on decision sciences, whereas the second aspect concentrates on specific applications that process special data sets to support decision making in specific fields such as business and management.
The rest of the paper is organized as follows: Section 2 reviews some basic aspects of Big Data, such as concepts, characteristics, paradigms and related contributions. The existing decision making techniques for processing Big Data are summarized in Section 3. A brief review of Big Data applications in social science is then presented in Section 4. Challenges and possible directions are discussed in Section 5, and some conclusions are drawn in Section 6.
Section snippets
A bird's-eye view of Big Data
Before embarking on the discussion of decision making in Big Data, we specify the scope, concepts and characteristics of Big Data in this section.
Decision making tools for processing Big Data
Scientists have developed a wide variety of tools to capture the value of Big Data along the value chain. Although they are far from meeting every need, the existing decision making tools, which cross multiple disciplines, have been applied to many data-intensive applications and have proved effective for capturing, curating, analyzing and visualizing Big Data. In this section, we review some developments and current trends of decision making techniques and technologies in this
Intelligent decision making based on Big Data: the evidence from social Big Data
As mentioned above, Big Data can be applied in various disciplines because they enable felicitous decision making based on large, diverse and complex data. In this section, we focus only on some recent applications in social science, such as marketing, e-commerce and social management. In this setting, Big Data come from multiple social media sources and can be referred to as social Big Data. Applications in other areas, such as health care, medicine and bioinformatics can be
Big challenges and possible directions
Big Data still pose big challenges. So far, it is too early to say that we have arrived at a standard theory for handling Big Data. Thus, the challenges are usually related to the application fields, including challenges in Big Data management and analysis, semantic challenges and other non-technical challenges [8], [97]. In addition, more challenges will arise with the continuous development of new technologies and techniques. This section summarizes some general challenges of Big Data and
Conclusions
Along with the accumulation of ubiquitous and incessantly generated data, Big Data have become a new, popular and booming discipline built on techniques and technologies from many other disciplines. More and more initiatives have been presented by different organizations and governments. A large amount of literature has been published, which facilitates and accelerates the development of Big Data. The concepts, aims and processing paradigms of Big Data are becoming more and more explicit and
Acknowledgments
The authors would like to thank the Editor-in-Chief, the associate editor and three anonymous reviewers for their insightful and constructive comments that have led to an improved version of this paper. The work was supported by the National Natural Science Foundation of China (Nos. 61273209, 71571123) and the Scientific Research Foundation of Graduate School of Southeast University (No. YBJJ1528).
References (154)
et al. Big data for natural language processing: a streaming approach. Knowl. Based Syst. (2015)
et al. Big Data computing and clouds: trends and future directions. J. Parallel Distrib. Comput. (2015)
et al. Representation learning: a review and new perspectives. IEEE Trans. Pattern Anal. (2013)
et al. Recent advances and emerging challenges of feature selection in the context of big data. Knowl. Based Syst. (2015)
et al. Critical questions for big data: provocations for a cultural, technological, and scholarly phenomenon. Inf. Commun. Soc. (2012)
Review: Talend open studio makes quick ETL work of large data sets (2009)
et al. Business intelligence and analytics: from big data to big impact. MIS Q. (2012)
Collect it all: national security, Big Data and governance. GeoJournal (2015)
et al. A critical review of the first 10 years of candidate gene-by-environment interaction research in psychiatry. Am. J. Psychiat. (2011)
et al. Word-of-mouth understanding: entity-centric multimodal aspect-opinion mining in social media. IEEE Trans. Multimed. (2015)
Discretization based on entropy and multiple scanning. Entropy
The Fourth Paradigm: Data-Intensive Scientific Discovery
Detection of review spam: a survey. Expert Syst. Appl.
Building better models: prediction, replication, and machine learning in the social sciences. Ann. Am. Acad. Polit. Soc. Sci.
Introducing Apache Mahout: scalable, commercial-friendly machine learning for building intelligent applications
Wiq: work-intensive query scheduling for in-memory database systems
A decision support system: automated crime report analysis and classification for e-government. Gov. Inf. Q.
3D Data Management: Controlling Data Volume, Velocity and Variety, Research Note 6
The parable of Google flu: traps in big data analysis. Science
Deep learning. Nature
Machine learning in genomic medicine: a review of computational problems and data sets. Proc. IEEE
A survey of fuzzy web mining. Wires. Data Min. Knowl.
The emotional responses of browsing Facebook: happiness, envy, and the role of tie strength. Comput. Hum. Behav.
Mining social networks using heat diffusion processes for marketing candidates selection
Big Data: Principles and Best Practices of Scalable Realtime Data Systems
Big data: the management revolution. Harv. Bus. Rev.
Processes meet big data: connecting data science with process science. IEEE Trans. Serv. Comput.
Large-scale data visualization using parallel data streaming. IEEE Comput. Graph.
Twitter: who gets caught? Observed trends in social micro-blogging spam
Deep machine learning: a new frontier in artificial intelligence research. IEEE Comput. Intell. Mag.
A unified framework for creating domain dependent polarity lexicons from user generated reviews. PLoS One
Predicting the future with social media
Dimensionality reduction of medical big data using neural-fuzzy classifier. Soft Comput.
Scaling to very very large corpora for natural language disambiguation
Location-based and preference-aware recommendation using sparse geo-social networking data
The “four Vs” of Big Data. Implementing Information Infrastructure Symposium
Beyond the data deluge. Science
Social big data: recent achievements and new challenges. Inf. Fusion
The uses of big data in cities. Big Data
The meaningful use of big data: four perspectives, four challenges. ACM SIGMOD Rec.
Knowledge discovery from social media using big data-provided sentiment analysis (SoMABiT). J. Inf. Sci.
Data classification using an ensemble of filters. Neurocomputing
Principles of Data Mining
Meta-level sentiment models for big social data analysis. Knowl. Based Syst.
Emerging trends and technologies in big data processing. Concurr. Comp-Pract. E.
The utility of hotspot mapping for predicting spatial patterns of crime. Secur. J.
IoT big-data centred knowledge granule analytic and cluster framework for BI applications: a case base analysis. PLoS One
Understanding the paradigm shift to computational social science in the presence of big data. Decis. Support Syst.
Data-intensive applications, challenges, techniques and technologies: a survey on Big Data. Inf. Sci.