Elsevier

Automation in Construction

Volume 96, December 2018, Pages 398-410
Building energy savings: Analysis of research trends based on text mining

https://doi.org/10.1016/j.autcon.2018.10.008

Highlights

  • A text mining-based methodology is proposed for mining massive text data.

  • 1600 research articles and engineering reports in the building field are analyzed.

  • Various text mining techniques have been integrated for knowledge discovery.

  • This study serves as a prototype for mining large text data in the building field.

  • Useful insights on research trends of building energy saving have been identified.

Abstract

Building energy saving has become a top concern in achieving global sustainability. In the past decades, a massive number of academic articles and engineering reports have been published on energy conservation throughout the whole building life-cycle. From a macroscopic perspective, these articles provide a comprehensive description of the development of building energy saving measures and technologies. The knowledge discovered from such text data can be used to facilitate decision-making for researchers and policymakers. Conventional approaches are neither effective nor efficient in analyzing massive text data. As a solution, this study proposes a text mining-based methodology to gain insights from relevant literature on building energy saving. In total, 1600 articles were collected and analyzed at different stages according to the important timestamps identified. Various text mining techniques were adopted to identify and describe the research trends. The results present clear differences in research focus at different stages. An emerging research trend has been identified in the building field, which is related to green buildings, intelligent buildings and low-carbon buildings. The methodology developed in this study can be used as a prototype to enable semi-automated knowledge discovery from massive text data in the building field.

Introduction

Energy conservation has attracted worldwide concern since the oil crisis in 1973 [1], [2], [3]. As one of the major energy consumers, buildings account for approximately 40% of total energy consumption and one third of global greenhouse gas emissions [4]. The building sector has been identified as one of the main contributors to climate change [5]. Meanwhile, the energy saving potential in buildings is considerable [6]. For instance, up to 22% of the energy consumed during building operations can be saved by adopting advanced building automation technologies [7]. In the past decades, many studies have been carried out to develop building energy saving technologies throughout the whole building life-cycle [8,9]. The volume and diversity of research articles are expanding rapidly. An accurate and reliable identification of major research topics and trends can be very useful for future development in the building field [10]. However, conventional analytics are neither efficient nor effective in discovering knowledge from massive text data. Advanced techniques and methodologies are urgently needed to enable a semi-automated or fully automated knowledge discovery process.

As one of the major fields in data mining (DM), text mining aims to discover previously unknown yet potentially useful knowledge from unstructured or semi-structured text data [11]. It can be used to perform various tasks, such as document clustering, text summarization, sentiment analysis and social network analysis [12]. Some studies have investigated the usefulness of text mining in analyzing academic articles and engineering reports. Nie and Sun used text mining to identify research trends in design research [13]. A two-dimensional approach, which included bibliometric and network analysis, was developed to detect major academic branches. Hung and Zhang adopted text mining and clustering analysis to analyze academic papers and theses in the field of mobile learning [14]. The results helped to identify the patterns and main topics in academic research. Oh and Lee analyzed 869 papers related to geospatial information research [15]. Several interdisciplinary research directions were identified based on the use of text mining and network analysis. Jiang and Qiang employed a topic modeling-based bibliometric analysis to explore global trends of hydropower studies [16]. In total, 1726 articles were analyzed to show the research trend in the hydropower industry. Abbas and Zhang used text mining techniques to analyze the hotspots of patent technologies [17]. Rezaeian and Montazeri claimed that text mining can effectively realize science foresight by extracting representative topics from large amounts of text data [18]. Zhang and Liao discovered the relationships between keywords in the existing literature on ground source heat pump systems through a co-occurrence analysis of text data [19]. The results obtained were used to support policy-making. Park and Nagy used the text mining software VOSviewer to download and analyze all relevant abstracts from the Web of Science database [20]. The results showed that there is a close link between building controls and indoor thermal comfort. Moezzi and Goins conducted a survey of 192 office buildings and used text mining techniques to analyze issues related to building operations, such as the relationships between excessive air conditioning, worker stress and frustration, workplace availability and physical work spaces [21]. Gunay and Shen developed a text mining-based method to extract valuable information on fault patterns in HVAC systems [22].

Previous studies have validated the usefulness of text mining in discovering research topics and trends from massive text data. Nevertheless, to the best of the authors' knowledge, few such studies have been performed in the building field. To fill this research gap, this study proposes a text mining-based methodology to discover useful knowledge from a large number of academic papers and engineering reports. The paper is organized as follows: Section 2 presents the research methodology. The research results are shown and discussed in Section 3. Conclusions are drawn in Section 4.

Section snippets

Research outline

Fig. 1 presents the research outline. The first step is to identify the key timestamps in the history of building energy saving. Rather than analyzing all text data as a whole, several key timestamps are identified for data partitioning. The aim is to enhance the sensitivity and quality of the knowledge discovered. The second step is data collection, which describes how the text data are collected. The third step is data preprocessing, which aims to enhance the quality of text data for further

Top-ranked terms in TF-IDF matrices

Identifying terms with high TF-IDF values helps to extract the general themes from text data. In this study, the articles at each stage were treated as a whole and the TF-IDF value of each term was then calculated. The larger the TF-IDF value, the more significant the term is in representing the main theme of that stage. Fig. 4 shows the top 15 terms with the highest TF-IDF values at each stage. The main findings are as follows:

  • Several terms (e.g., “Solar”, “Cooling”, “Heating” and “Chiller”)
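The stage-level TF-IDF ranking described in this section can be illustrated with a minimal sketch. It is not the authors' implementation: the stage boundaries in STAGE_STARTS, the helper names, the simple tokenizer and the toy corpus are all assumptions made for illustration, and the weighting shown (relative term frequency multiplied by log(N/df), with each stage pooled into one document) is only one common TF-IDF variant, since the exact formula is not given in this excerpt.

```python
import math
import re
from collections import Counter

# Hypothetical first year of each stage; the key timestamps used in the
# study are not given in this excerpt.
STAGE_STARTS = [1973, 1990, 2005]

def stage_of(year):
    """Return the index of the stage a publication year falls into."""
    return max(i for i, start in enumerate(STAGE_STARTS) if year >= start)

def tokenize(text):
    """Lowercase and keep letter runs; a stand-in for real preprocessing."""
    return re.findall(r"[a-z]+", text.lower())

def top_terms_per_stage(articles, k=15):
    """articles: iterable of (year, text). Returns {stage: [(term, score), ...]}."""
    # Pool all articles of a stage into one term-count "document".
    counts = {}
    for year, text in articles:
        counts.setdefault(stage_of(year), Counter()).update(tokenize(text))

    n_stages = len(counts)
    # Document frequency: number of stages in which a term occurs.
    df = Counter(term for c in counts.values() for term in c)

    result = {}
    for stage, c in counts.items():
        total = sum(c.values())
        scores = {
            term: (n / total) * math.log(n_stages / df[term])
            for term, n in c.items()
        }
        # Sort by score (descending), then alphabetically to break ties.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        result[stage] = ranked[:k]
    return result

# Toy corpus, invented for illustration only.
articles = [
    (1980, "solar heating and solar cooling of buildings"),
    (1995, "chiller control and building automation"),
    (2010, "green buildings and low carbon buildings"),
]
top = top_terms_per_stage(articles, k=3)
```

With this weighting, a term that occurs in every stage (such as "and" in the toy corpus) receives a score of zero, so only terms that distinguish one stage from the others rise to the top of each list.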

Conclusions

A massive number of academic studies have been performed in the building field to achieve energy conservation. Considering the large number and great diversity of these studies, it is very challenging and time-consuming to accurately identify the major topics and research trends. This study proposes a text mining-based methodology to enhance the efficiency and effectiveness of analyzing large-scale text data. In total, 1600 academic articles, published between 1973 and 2016, were collected

Acknowledgement

This research was conducted with the support of the Shenzhen Government Basic Research Foundation for Free exploration (Grant No. JCYJ20170818141151733) and the Social Science Research Support Grant (No. 17QNFC34), Shenzhen University.
