An intelligent pattern recognition model to automate the categorisation of residential water end-use events
Introduction
Sensor technologies and the ‘big data’ they generate, combined with advanced machine learning techniques, provide numerous opportunities for enhancing outdated approaches across all segments of water resources management (Schimak et al., 2010; Usländer et al., 2010). Reported studies demonstrate that such technologies and techniques are increasingly improving how we monitor and manage large-scale water basins (e.g. White et al., 2006; Quinn et al., 2010; Murlà et al., 2010), river stream flow (Lindim et al., 2010; David et al., 2013), drinking water reservoir quality (Glasgow et al., 2004), water treatment plant operations (Storey et al., 2011), water distribution system networks (Dorini et al., 2006), consumer water end use consumption (Nguyen et al., 2013a; Willis et al., 2011c), and wastewater plant operations (Dürrenmatt and Gujer, 2012). The research focus of this paper is on the application of sensors (i.e. high resolution smart meters) and ‘big data’ analytical techniques at the urban water scale; specifically, the residential water consumer and their end use water consumption. This frontier area of water end use, or micro-component, analysis is beginning to attract research attention. Froehlich et al. (2011) conducted a study using pressure sensing devices to infer water usage events in households in Washington State, USA. CSIRO (2012) recently combined an acoustic sensor with smart water metering systems in order to disaggregate residential water consumption into end use categories. The authors (Nguyen et al., 2013a) utilised machine learning techniques, namely the Hidden Markov Model (HMM) and Dynamic Time Warping (DTW), to disaggregate remotely collected high resolution water flow data received from smart meters into single end use event categories. That work is the precursor to the present paper, which seeks to disaggregate concurrently occurring end use events.
With these technologies becoming commercially viable, the vision of an intelligent expert system, which can perform autonomous water end use analysis and provide feedback and decision support to both water consumers and authorities, is rapidly becoming a reality.
The era when urban water planning focused only on how to build and supply water has been replaced by a new paradigm, where the precise accounting and management of urban water consumption is deemed essential to maintaining a sustainable water future. Lower water yield reliability from traditional water supply sources and the increasing demand for water in urban areas require the development of a more adaptive and innovative water resource management approach, fed by robust real-time information. As a consequence, an increasing number of smart water metering technologies have been introduced to the market. Such metering devices embrace two distinct elements: meters that use new technology to capture water use information; and communication systems that can capture and transmit real-time water use information (Stewart et al., 2010). While current forms of smart metering technology can provide total consumption data to the customer and utility at high levels of resolution, they fail to disaggregate this data into its end-use categories. This study envisions and provides the architecture for an advanced smart metering system that enables customers and utilities to actively monitor, through web-portal interfaces, real-time information about what, when, where and how water was consumed at their meter connection (e.g. 56 litre shower occurring between 06:55–07:15 Tuesday 25 May 2012). The proposed system allows individual consumers to log into their user-defined water consumption web page to view their daily, weekly, and monthly consumption tables, as well as charts on their water end-use patterns across major end use categories (e.g. leaks, clothes washer, dishwasher, tap, toilet, shower and irrigation). It can also alert them rapidly to leak events so that these can be addressed immediately, rather than relying on the slow feedback provided by current metering technology (e.g. a monthly or quarterly alert at best).
The analytical report generated by the new advanced integrated water management system will help utilities identify the water consumption patterns of their various consumer types and assist with a range of urban water planning and management functions (Stewart et al., 2010). However, such a system requires a robust analytical model to automatically and accurately disaggregate the flow trace data into individual water end-use event categories. Current end-use disaggregation processes used by the authors and their aligned research teams require extensive manual data collection and analysis, as summarised in Fig. 1 (Beal et al., 2011a). Automation of this resource-intensive process is essential to developing a commercially viable version of the proposed advanced water management system. The design and verification of an accurate automated flow pattern recognition model is the ultimate aim of this study.
In recent years, a number of residential water end use studies have been completed using a range of single or mixed methods, such as household auditing, diaries, high resolution smart metering and pressure sensors, with a diverse range of per capita end use summaries. Jacobs (2007) and Blokker et al. (2010) provided summaries on a good proportion of the end use models developed from stochastic techniques, contingent valuation approaches (CVA), modelling, and metered methods. The introduction of advanced technology has enabled the direct capture and classification of water end use events. Table 1 provides a summary of reported end use studies that have applied high resolution smart meters, data loggers or pressure sensors completed internationally in the last 15 years.
As displayed in Table 1, two main approaches to direct measurement and water end use recognition, which is undoubtedly the future of this type of problem, are presently reported. The first uses smart water meters in conjunction with a decision-tree based analysis tool such as Trace Wizard or Identiflow. The second, published more recently, adds pressure sensors at individual appliances (i.e. HydroSense) along with an HMM based decision tool. Each approach has its own strengths and weaknesses, which were discussed in detail in Nguyen et al. (2013a).
In summary, the approach most amenable to citywide application is installing smart water meters at the property boundary in conjunction with intelligent end use pattern recognition algorithms, either built into the meter software or running within a processing module at the utility's data centre. This is the lowest cost and non-intrusive approach to water end use disaggregation. However, for such widespread implementation, the following limitations of the existing models (i.e. Trace Wizard and Identiflow) have to be overcome:
- inability to analyse collected data without human interaction and manual reclassification (i.e. main disadvantage);
- inability to accurately distinguish different end use categories which have similar water flow characteristics (e.g. shower, bathtub and irrigation);
- inability to classify an end use category that has various physical parameters depending on appliance models (e.g. dishwasher, clothes washer and toilet); and
- inability to deal with multi-layer combined events (i.e. cannot handle three or more concurrent events).
These shortcomings have motivated the development of an automatic flow trace analysis system which can address all of the above-mentioned issues. Building such an intelligent model requires an in-depth understanding of the existing techniques applied to this type of problem. Nguyen et al. (2013a) presented a detailed review of the available pattern recognition techniques and rated each of them (i.e. 1 star (*) = poor; ** = below average; *** = average; **** = good; and ***** = excellent) for processing time efficiency, classification accuracy, self-learning potential and overall applicability to the water end use pattern recognition process examined herein (Table 2).
Based on this review, Nguyen et al. (2013a) suggested a method using a hybrid combination of the Hidden Markov Model (HMM) and Dynamic Time Warping (DTW) algorithm to help classify single independent events into appropriate end use categories. However, in residential households a small proportion of water end use events occur simultaneously (e.g. shower and toilet flush). Therefore, in order to achieve the vision of an automated and intelligent water management system, this paper presents a robust method which integrates the analytical techniques employed in single event classification with a gradient vector filtering method to perform a comprehensive combined event analysis. A detailed literature review of HMM and DTW techniques, including their theoretical foundations and application to single event analysis, was conducted in Nguyen et al. (2013a). The present paper only summarises the applied HMM and gradient vector filtering methods and outlines their crucial roles in combined event disaggregation.
HMM is a statistical Markov model in which the system being modelled is assumed to be a Markov process with unobserved (hidden) states. In a regular Markov model the state is directly visible to the observer, so the state transition probabilities are the only parameters. In a hidden Markov model, the state is not directly visible, but the output, which is dependent on the state, is visible. Each state has a probability distribution over the possible output tokens. Therefore, the sequence of tokens generated by an HMM gives some information about the sequence of the states (Ephraim and Merhav, 2002). HMM was used as the principal technique for the classification of all single end use events in Nguyen et al. (2013a). In the present study, the existing HMM model, which was previously applied in single event analysis, is extended with some additional physical parameters to help disaggregate a combined event into many samples and assign them to specific end-use categories.
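To make the HMM description above concrete, the following minimal sketch computes the likelihood of an observed token sequence with the standard forward algorithm. It is an illustration only, not the trained model of Nguyen et al. (2013a): the two hidden states, the two output tokens (0 = low flow, 1 = high flow) and all probabilities in `START`, `TRANS` and `EMIT` are hypothetical values chosen for the example.

```python
from typing import List, Sequence

# Hypothetical toy model (assumption, not taken from the paper):
# two hidden states and two output tokens (0 = low flow, 1 = high flow).
START: List[float] = [0.6, 0.4]                 # P(first hidden state)
TRANS: List[List[float]] = [[0.7, 0.3],         # P(next state | current state)
                            [0.4, 0.6]]
EMIT: List[List[float]] = [[0.9, 0.1],          # P(output token | hidden state)
                           [0.2, 0.8]]


def forward_likelihood(obs: Sequence[int],
                       start: List[float],
                       trans: List[List[float]],
                       emit: List[List[float]]) -> float:
    """Return P(obs | model) using the forward algorithm."""
    if not obs:
        raise ValueError("the observation sequence must not be empty")
    n_states = len(start)
    # alpha[i] = P(tokens seen so far, current hidden state = i)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n_states)]
    for token in obs[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n_states)) * emit[j][token]
            for j in range(n_states)
        ]
    # Summing over the final hidden state marginalises the hidden path.
    return sum(alpha)


# Likelihood of observing low, high, high flow under the toy model.
print(forward_likelihood([0, 1, 1], START, TRANS, EMIT))
```

In a classification setting, one such model would typically be held per end-use category, and an event would be scored against each model; how the scores are combined with physical parameters is specific to the study and is not reproduced here.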
Another mathematical tool is utilised for combined event analysis, namely, gradient vector filtering. This technique has been widely applied in many fields such as image enhancement, noise reduction or digital signal enhancement (Boashash, 2003). It relies on the analysis of the multi-dimensional gradient vector of the original signal to extract information contained in the signal, so that unnecessary parts can be filtered or removed (Shapiro and George, 1998). Based on that principle, another version of this technique has been developed to suit this particular study. The proposed method also considers the gradient change of the flow rate data series to determine whether a flow rate fluctuation is actually a new event occurring on top of the base event or expected variation within the base event.
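The idea of using flow rate gradients to separate a new event from variation within the base event can be sketched as follows. This is a simplified one-dimensional stand-in that thresholds first differences of the flow series; it is not the modified gradient vector filtering method developed in this study. The function names, the sample flow values (in L/min) and the threshold of 3.0 are hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical flow series in L/min: a steady base event with one
# additional event switched on at index 3 and off at index 6.
FLOW: List[float] = [8.0, 8.2, 7.9, 14.0, 14.1, 13.9, 8.1, 8.0]


def find_step_changes(flow: Sequence[float],
                      threshold: float) -> List[Tuple[int, float]]:
    """Return (index, gradient) pairs where the change from the previous
    reading is at least `threshold` in magnitude."""
    changes = []
    for i in range(1, len(flow)):
        gradient = flow[i] - flow[i - 1]
        if abs(gradient) >= threshold:
            changes.append((i, gradient))
    return changes


def sub_event_windows(changes: Sequence[Tuple[int, float]]) -> List[Tuple[int, int]]:
    """Pair each large rise with the next large fall.

    Assumption: sub events do not nest; a rise that arrives while a
    window is already open is ignored by this simple sketch.
    """
    windows = []
    start = None
    for index, gradient in changes:
        if gradient > 0 and start is None:
            start = index
        elif gradient < 0 and start is not None:
            windows.append((start, index))
            start = None
    return windows


changes = find_step_changes(FLOW, threshold=3.0)
print(changes)                     # large rises and falls only
print(sub_event_windows(changes))  # candidate sub-event index ranges
```

Small fluctuations such as 8.0 to 8.2 L/min fall below the threshold and are treated as expected variation within the base event, while the larger rise and fall bracket a candidate sub event.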
Section snippets
Research objective
The development of pattern matching algorithms that can automatically categorise the collected flow trace data points, received from the wireless data loggers, into particular water end-use categories requires the resolution of two key research questions: firstly, how to recognise single events from the collected flow trace; and, secondly, how to separate a combined event into its appropriate single event categories. The first research question was successfully addressed using a hybrid
Research regions
Data utilised for the development of the model is sourced from 252 residential households fitted with a smart meter and data logger and located in the urban south east corner of the State of Queensland, Australia. These households are consenting participants in the recently completed South-east Queensland Residential End Use Study (SEQREUS) funded by the Queensland State Government (Beal and Stewart, 2012). A sample of properties is taken from the Sunshine Coast Regional Council (n = 67),
End use classification process overview
Using the available database, a process for disaggregating water end-use events from the raw data was developed, as shown in Fig. 2. As mentioned previously, single events are those which occur in isolation (e.g. toilet flushing only), while combined events involve simultaneous occurrences of water usage (e.g. a shower occurring while someone else is using a tap) and are more challenging to disaggregate. In the first step, the HMM algorithm is used to recognise if an event is a single event
Combined event separation process
The first required step in the combined event classification analysis is to separate the event under analysis into several smaller parts, as displayed in Fig. 7. The flow data record the water usage; therefore, changes in flow rate (gradients) indicate that a device has been switched on or off. A modified gradient vector filtering method is developed to allow the analyst to dissect a combined event to any desired level. This technique plays a fundamental role, in conjunction with the other analytical
Sub-event analysis
Prior to the establishment of an overall methodology for performing the combined event analysis, it is necessary to create a category index representing all eight end-use categories, in the following order: 1 – shower, 2 – faucet, 3 – clothes washer, 4 – dishwasher, 5 – full flush toilet, 6 – half flush toilet, 7 – bathtub, and 8 – irrigation. The main process in Layer 1 is the classification process using HMM with threshold criteria.
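The category index above maps naturally onto a lookup table. The sketch below encodes the eight categories and pairs them with a deliberately simple threshold rule. The rule is an assumption made for illustration: this excerpt does not specify the threshold criteria used in Layer 1, so `classify_sub_event`, the score values and the threshold of 0.5 are hypothetical.

```python
from typing import Dict, Optional

# Category index as listed in the text.
CATEGORY_INDEX: Dict[int, str] = {
    1: "shower",
    2: "faucet",
    3: "clothes washer",
    4: "dishwasher",
    5: "full flush toilet",
    6: "half flush toilet",
    7: "bathtub",
    8: "irrigation",
}


def classify_sub_event(scores: Dict[int, float],
                       threshold: float) -> Optional[int]:
    """Return the category index with the highest score, or None when
    even the best score falls below `threshold`.

    Assumption: `scores` holds one likelihood-style score per candidate
    category (e.g. from per-category HMMs); the actual criteria used in
    the study may differ.
    """
    if not scores:
        return None
    best = max(scores, key=lambda index: scores[index])
    return best if scores[best] >= threshold else None


# Hypothetical scores for one sub event.
label = classify_sub_event({1: 0.7, 2: 0.2, 5: 0.1}, threshold=0.5)
print(CATEGORY_INDEX[label] if label is not None else "unclassified")
```

Returning `None` rather than forcing a label leaves room for a later stage to handle sub events that no category explains well.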
Base-event analysis
Once all the sub events are fully classified, the final step in this combined event study is to analyse the base sample. The base sample is what remains of the original combined event once all spiky samples have been removed by the initial separation process. As explained in the previous section, intensive analysis of the collected data revealed that the majority of the base events are formed by only one, or a combination, of the following end-use categories:
Combined event analysis example
For a more comprehensive understanding of the overall study, the proposed technique is applied to one typical combined event to explain, in detail, how each step works. The original event's details are extracted directly from the user's diary (presented in Table 10).
The following additional information about the household is also provided:
- One clothes wash started at 7:30 am and finished at 8:13 am.
- No dish washing in the morning.
- Toilet cistern volume: 7.0 L.
From the given information,
Combined event classification accuracy
The model is verified using a further 50 independent combined events, which are divided into three categories of increasing complexity.
Type 1 of the independent combined events comprises two events which occur simultaneously. The longer of the two acts as the base event, while the other is considered the sub event. This is the simplest event combination in reality; therefore, in the present study, only five samples of this type are collected to
Conclusions, limitations and future directions
The establishment of an integrated water management system, which employs smart water metering, in conjunction with an intelligent algorithm to automate the flow trace analysis process, is becoming more and more feasible. The first fundamental step to extract the single events from the flow rate series, and assign them to appropriate categories, was achieved using a model containing a hybrid combination of HMM and DTW algorithms. This single event disaggregation model is comprehensively
Model development implications for urban water management
The model developed in this study is the key element in building an integrated water management system which is able to automatically categorise the flow data recorded by the water meter into all end-use categories. One application of this system allows individual consumers to log into their user-defined water consumption web page to view their daily, weekly and monthly consumption tables as well as charts on water consumption patterns for categories of water end use. Average and/or
References (55)
- et al. (2013). Regional-scale river flow modelling using off-the-shelf runoff products, thousands of mapped rivers and hundreds of stream flow gauges. Environmental Modelling & Software.
- et al. (2012). Data-driven modelling approaches to support wastewater treatment plant operation. Environmental Modelling & Software.
- et al. (2004). Real-time remote monitoring of water quality: a review of current applications, and advancements in sensor, telemetry, and computing technologies. Journal of Experimental Marine Biology and Ecology.
- et al. (2010). Use of environmental sensors and sensor networks to develop water and salinity budgets for seasonal wetland real-time water quality management. Environmental Modelling & Software.
- et al. (2011). Advances in on-line drinking water quality monitoring and early warning systems. Water Research.
- et al. (2010). Designing environmental software applications based upon an open sensor service architecture. Environmental Modelling & Software.
- et al. (2006). Identifying opportunities for achieving water savings throughout the Murray–Darling Basin. Environmental Modelling & Software.
- et al. (2010). Alarming visual display monitors affecting shower end-use water and energy conservation in Australian residential households. Resources, Conservation and Recycling.
- et al. (2011). Residential potable and recycled water end-uses in a dual reticulated supply system. Desalination.
- et al. (1995). Induction of fuzzy decision trees. Fuzzy Sets and Systems.