Elsevier

Environmental Modelling & Software

Volume 47, September 2013, Pages 108-127
An intelligent pattern recognition model to automate the categorisation of residential water end-use events

https://doi.org/10.1016/j.envsoft.2013.05.002

Abstract

The rapid dissemination of residential water end-use (e.g. shower, clothes washer, etc.) consumption data to the customer via a web-enabled portal interface is becoming feasible through the advent of high resolution smart metering technologies. However, in order to achieve this paradigm shift in residential customer water use feedback, an automated approach for disaggregating complex water flow trace signatures into a registry of end-use event categories needs to be developed. This outcome is achieved by applying a hybrid combination of gradient vector filtering, Hidden Markov Model (HMM) and Dynamic Time Warping Algorithm (DTW) techniques on an existing residential water end-use database of 252 households located in South-east Queensland, Australia having high resolution water meters (0.0139 L/pulse), remote data transfer loggers (5 s logging) and completed household water appliance audits. The approach enables both single independent events (e.g. shower event) and combined events (i.e. several overlapping single events) to be disaggregated from flow data into a comprehensive end-use event registry. Complex blind source separation of concurrently occurring water end use events (e.g. shower and toilet flush occurring in same time period) is the primary focus of this present study. Validation of the developed model is achieved through an examination of 50 independent combined events.

Introduction

Sensor technologies and the ‘big data’ they generate, combined with advanced machine learning techniques, provide numerous opportunities for enhancing outdated approaches across all segments of water resources management (Schimak et al., 2010; Usländer et al., 2010). Reported studies demonstrate that such technologies and techniques are increasingly influencing how we monitor and manage large-scale water basins (e.g. White et al., 2006; Quinn et al., 2010; Murlà et al., 2010), river stream flow (Lindim et al., 2010; David et al., 2013), drinking water reservoir quality (Glasgow et al., 2004), water treatment plant operations (Storey et al., 2011), water distribution system networks (Dorini et al., 2006), consumer water end use consumption (Nguyen et al., 2013a; Willis et al., 2011c), and wastewater plant operations (Dürrenmatt and Gujer, 2012). The research focus of this paper is on the application of sensors (i.e. high resolution smart meters) and ‘big data’ analytical techniques at the urban water scale; specifically the residential water consumer and their end use water consumption. This frontier area of water end use or micro-component analysis research is beginning to attract research attention. Froehlich et al. (2011) conducted a study using pressure sensing devices to infer water usage events in households in Washington State, USA. CSIRO (2012) have recently combined an acoustic sensor with smart water metering systems in order to disaggregate residential water consumption into end use categories. The authors (Nguyen et al., 2013a) utilised machine learning techniques such as HMM and DTW to disaggregate remotely collected high resolution water flow data received from smart meters into single end use event categories, which is the precursor to this present paper seeking to disaggregate concurrently occurring end use events.
With these technologies becoming commercially viable, the vision of an intelligent expert system, which can perform autonomous water end use analysis and provide feedback and decision support to both water consumers and authorities, is rapidly becoming a reality.

The era when urban water planning focused only on how to build and supply water has been replaced by a new paradigm, where the precise accounting and management of urban water consumption is deemed essential to the maintenance of a sustainable water future. Lower water yield reliability from traditional water supply sources, and the increasing demand for water in urban areas, require the development of a more adaptive and innovative water resource management approach, fed by robust real-time information. As a consequence, an increasing number of smart water metering technologies have been introduced to the market. Such metering devices embrace two distinct elements: meters that use new technology to capture water use information; and communication systems that can capture and transmit real-time water use information (Stewart et al., 2010). While current forms of smart metering technology can provide total consumption data to the customer and utility at high levels of resolution, they fail to disaggregate this data into its end-use categories. This study envisions and provides the architecture for an advanced smart metering system that enables customers and utilities to actively monitor, through web-portal interfaces, real-time information about what, when, where and how water was consumed at their meter connection (e.g. 56 litre shower occurring between 06:55–07:15 Tuesday 25 May 2012). The proposed system allows individual consumers to log into their user-defined water consumption web page to view their daily, weekly, and monthly consumption tables, as well as charts on their water end-use patterns across major end use categories (e.g. leaks, clothes washer, dishwasher, tap, toilet, shower and irrigation). It can also rapidly alert them to leak events as they occur, so that they can address them immediately rather than waiting on the slow feedback of current metering technology (e.g. monthly or quarterly alert at best).

The analytical report generated by the new advanced integrated water management system will help utilities identify the water consumption patterns of their various consumer types and assist with a range of urban water planning and management functions (Stewart et al., 2010). However, such a system requires a robust analytical model to automatically and accurately disaggregate the flow trace data into individual water end-use event categories. Current end-use disaggregation processes used by the authors and their aligned research teams require extensive manual data collection and analysis, as summarised in Fig. 1 (Beal et al., 2011a). Automation of this resource intensive process is essential to developing a commercially viable version of the proposed advanced water management system. The design and verification of an automated flow pattern recognition model that has good accuracy is the ultimate aim of this study.

In recent years, a number of residential water end use studies have been completed using a range of single or mixed methods, such as household auditing, diaries, high resolution smart metering and pressure sensors, with a diverse range of per capita end use summaries. Jacobs (2007) and Blokker et al. (2010) provided summaries on a good proportion of the end use models developed from stochastic techniques, contingent valuation approaches (CVA), modelling, and metered methods. The introduction of advanced technology has enabled the direct capture and classification of water end use events. Table 1 provides a summary of reported end use studies that have applied high resolution smart meters, data loggers or pressure sensors completed internationally in the last 15 years.

As displayed in Table 1, two main approaches to direct measurement and water end use recognition, which is undoubtedly the future of this type of problem, are presently reported: using smart water meters in conjunction with a decision-tree based analysis tool such as Trace Wizard or Identiflow; or, as more recently published, installing pressure sensors at individual appliances (i.e. HydroSense) along with an HMM based decision tool. Each approach has its own strengths and weaknesses, which were discussed in detail in Nguyen et al. (2013a).

In summary, the ideal approach that is most amenable to citywide application is installing smart water meters at the property boundary in conjunction with intelligent end use pattern recognition algorithms, either built into the meter software or within a processing module at the utility's data centre. This is the lowest-cost, non-intrusive approach to water end use disaggregation. However, for such widespread implementation, the following summarised limitations of the existing models (i.e. Trace Wizard and Identiflow) have to be overcome:

  • inability to analyse collected data without human interaction and manual reclassification (i.e. main disadvantage);

  • inability to accurately distinguish different end use categories which have similar water flow characteristics (e.g. shower, bathtub and irrigation);

  • inability to classify an end use category that has various physical parameters depending on appliance models (e.g. dishwasher, clothes washer and toilet); and

  • inability to deal with multi-layer combined events (i.e. cannot handle three or more concurrent events).

These shortcomings have motivated the development of an automatic flow trace analysis system which can address all of the above mentioned issues. For the building of such an intelligent model, an in-depth understanding of the existing techniques applied to this type of problem is required. Nguyen et al. (2013a) presented a detailed review of pattern recognition techniques available and provided a rating for them (i.e. 1 star (*) = poor; ** = below average; *** = average; **** = good; and ***** = excellent) for their processing time efficiency, classification accuracy, self-learning potential and an overall applicability rating for each technique to the herein examined water end use pattern recognition process (Table 2).

Based on this review, Nguyen et al. (2013a) suggested a method using a hybrid combination of the Hidden Markov Model (HMM) and Dynamic Time Warping (DTW) algorithm to help classify single independent events into appropriate end use categories. However, in residential households a small proportion of water end use events occur simultaneously (e.g. shower and toilet flush). Therefore, in order to achieve the vision of an automated and intelligent water management system, this current paper presents a robust method which integrates the analytical techniques employed in single event classification with a gradient vector filtering method to perform a comprehensive combined event analysis. A detailed literature review of HMM and DTW techniques, including their theoretical foundations and application to single event analysis, has been conducted in Nguyen et al. (2013a). This present paper only summarises the applied HMM and gradient vector filtering methods and outlines their crucial roles in combined event disaggregation.

HMM is a statistical Markov model in which the system being modelled is assumed to be a Markov process with unobserved (hidden) states. In a regular Markov model the state is directly visible, so the state transition probabilities are the only parameters. In a hidden Markov model, the state is not directly visible, but the output, which is dependent on the state, is visible. Each state has a probability distribution over the possible output tokens. Therefore, the sequence of tokens generated by an HMM gives some information about the sequence of the states (Ephraim and Merhav, 2002). HMM was used as the principal technique for the classification of all single end use events in Nguyen et al. (2013a). In the present study, the existing HMM model, which was previously applied in single event analysis, is augmented with some additional physical parameters to help disaggregate a combined event into many samples and assign them to specific end-use categories.
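As an illustration of how an HMM can score a sequence of output tokens, the following minimal sketch implements the standard forward algorithm for a discrete HMM and picks whichever of two toy models assigns the higher likelihood. It is not the model of Nguyen et al. (2013a): the two models, their names (`steady`, `uniform`), the two-token alphabet (0 = low flow, 1 = high flow) and all probabilities are assumptions invented for this example.

```python
def forward_likelihood(obs, start, trans, emit):
    """Probability of an observation sequence under a discrete HMM (forward algorithm)."""
    n_states = len(start)
    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    for o in obs[1:]:
        alpha = [
            emit[s][o] * sum(alpha[p] * trans[p][s] for p in range(n_states))
            for s in range(n_states)
        ]
    return sum(alpha)


# Hypothetical models: each is (start, transition, emission) for two hidden states.
# Tokens: 0 = low flow, 1 = high flow. All numbers are invented for illustration.
MODELS = {
    "steady": ([1.0, 0.0],
               [[0.9, 0.1], [0.1, 0.9]],
               [[0.9, 0.1], [0.2, 0.8]]),
    "uniform": ([0.5, 0.5],
                [[0.5, 0.5], [0.5, 0.5]],
                [[0.5, 0.5], [0.5, 0.5]]),
}


def classify(obs):
    """Return the name of the model that gives the observations the highest likelihood."""
    return max(MODELS, key=lambda name: forward_likelihood(obs, *MODELS[name]))


if __name__ == "__main__":
    print(classify([0, 0, 0, 0]))
```

A real classifier would train one model per end-use category (for instance with the Baum-Welch procedure) and would usually work with log-likelihoods to avoid numerical underflow on long sequences.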

Another mathematical tool is utilised for combined event analysis, namely, gradient vector filtering. This technique has been widely applied in many fields such as image enhancement, noise reduction or digital signal enhancement (Boashash, 2003). It relies on the analysis of the multi-dimensional gradient vector of the original signal to extract information contained in the signal, so that unnecessary parts can be filtered or removed (Shapiro and George, 1998). Based on that principle, another version of this technique has been developed to suit this particular study. The proposed method also considers the gradient change of the flow rate data series to determine whether a flow rate fluctuation is actually a new event occurring on top of the base event or expected variation within the base event.
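The idea of using gradient changes to flag a possible new event on top of a base event can be sketched as below. This is a deliberately simplified, one-dimensional version that thresholds the first difference of the flow rate; it is not the authors' modified gradient vector filtering method, and the function name, the threshold value and the flow trace are all hypothetical.

```python
def gradient_change_points(flow, threshold):
    """Return (index, gradient) pairs where the step change in flow is at least `threshold`.

    The gradient at index i is the first difference flow[i] - flow[i - 1].
    Positive values suggest a device switching on, negative values a device switching off.
    """
    changes = []
    for i in range(1, len(flow)):
        gradient = flow[i] - flow[i - 1]
        if abs(gradient) >= threshold:
            changes.append((i, gradient))
    return changes


# Invented flow-rate trace (L/min per logging interval): a base event of about 8 L/min
# with a second event of about 6 L/min briefly stacked on top of it.
flow = [0.0, 8.0, 8.1, 7.9, 14.0, 14.1, 8.0, 8.0, 0.0]

if __name__ == "__main__":
    for index, gradient in gradient_change_points(flow, threshold=2.0):
        kind = "rise" if gradient > 0 else "fall"
        print(index, kind)
```

With a threshold of 2.0 L/min the small fluctuations around 8 L/min are treated as expected variation within the base event, while the larger steps are kept as candidate start and end points of events.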

Section snippets

Research objective

The development of pattern matching algorithms which are able to automatically categorise the collected flow trace data points, received from the wireless data loggers, into particular water end-use categories requires the resolution of two key research questions: first, how can single events be recognised from the collected flow trace; and second, how can a combined event be separated into its appropriate single event categories? The first research question was successfully addressed using a hybrid

Research regions

Data utilised for the development of the model is sourced from 252 residential households fitted with a smart meter and data logger and located in the urban south east corner of the State of Queensland, Australia. These households are consenting participants in the recently completed South-east Queensland Residential End Use Study (SEQREUS) funded by the Queensland State Government (Beal and Stewart, 2012). A sample of properties is taken from the Sunshine Coast Regional Council (n = 67),

End use classification process overview

With the available database, the disaggregation process of the water end-use events from the raw data is developed and shown in Fig. 2. As mentioned previously, single events are those which occur in isolation (e.g. toilet flushing only), while combined events have simultaneous occurrences of water usage (e.g. a shower occurring while someone else is using a tap), which is more challenging to disaggregate. At the very first step, HMM algorithm is used to recognise if an event is a single event

Combined event separation process

The first required step in the combined event classification analysis is to separate the subjected event into several smaller parts as displayed in Fig. 7. The flow data records the water usage; therefore, the flow rate changes (gradients) indicate a device is switched on or off. A modified gradient vector filtering method is developed to allow the analyst to dissect a combined event to any desired level. This technique plays a fundamental role, in conjunction with the other analytical

Sub-event analysis

Prior to the establishment of an overall methodology for performing the combined event analysis, it is necessary to create a category index i = 1, 2, …, 8 representing the eight end-use categories, in the following order: 1 – shower, 2 – faucet, 3 – clothes washer, 4 – dishwasher, 5 – full flush toilet, 6 – half flush toilet, 7 – bathtub, and 8 – irrigation. The main process in Layer 1 is the classification process using HMM with threshold criteria.
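The category index above maps naturally onto an enumeration. The sketch below encodes the eight categories in the stated order and adds a hypothetical threshold rule: a sub-event is assigned to the best-scoring category only if its score reaches a minimum value. The excerpt does not give the actual threshold criteria, so the function `classify_with_threshold`, the scores and the threshold are assumptions made purely for illustration.

```python
from enum import IntEnum


class EndUse(IntEnum):
    """Category index i = 1..8, in the order given in the text."""
    SHOWER = 1
    FAUCET = 2
    CLOTHES_WASHER = 3
    DISHWASHER = 4
    FULL_FLUSH_TOILET = 5
    HALF_FLUSH_TOILET = 6
    BATHTUB = 7
    IRRIGATION = 8


def classify_with_threshold(scores, threshold):
    """Return the best-scoring category, or None if no score reaches `threshold`.

    `scores` maps EndUse members to a model score (e.g. an HMM likelihood).
    This rule is an assumption, not the threshold criteria used in the paper.
    """
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None


if __name__ == "__main__":
    # Invented scores for one sub-event.
    scores = {EndUse.FAUCET: 0.15, EndUse.FULL_FLUSH_TOILET: 0.70, EndUse.HALF_FLUSH_TOILET: 0.10}
    print(classify_with_threshold(scores, threshold=0.5))
```

Returning `None` leaves room for a later processing step to handle samples that no category explains well.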

Base-event analysis

Once all the sub events are fully classified, the final step in this combined event study is to analyse the base sample. The base sample is the product obtained after removing all spiky samples from the original combined event after the initial separation process. As explained in the previous section, via many intensive analysis processes on the collected data, it is revealed that the majority of the base events are formed by only one, or a combination, of the following end-use categories:

Combined event analysis example

For a more comprehensive understanding of the overall study, the proposed technique is applied to one typical combined event to explain, in detail, how each step works. The original event's details are extracted directly from the user's diary (presented in Table 10).

Some additional information in the household is also provided as:

  • One clothes wash started from 7:30 am and finished at 8:13 am.

  • No dish washing in the morning.

  • Toilet cistern volume: 7.0 L.

From the given information,

Combined event classification accuracy

The model is verified using a further 50 independent combined events, which are divided into three categories of increasing complexity.

Type 1 of the independent combined events includes two events which occur simultaneously. The longer of these two events acts as the base event, while the remaining one is considered the sub event. This is the simplest event combination in reality; therefore, in the present study, only five samples of this type are collected to

Conclusions, limitations and future directions

The establishment of an integrated water management system, which employs smart water metering, in conjunction with an intelligent algorithm to automate the flow trace analysis process, is becoming more and more feasible. The first fundamental step to extract the single events from the flow rate series, and assign them to appropriate categories, was achieved using a model containing a hybrid combination of HMM and DTW algorithms. This single event disaggregation model is comprehensively

Model development implications for urban water management

The model developed in this study is the key element for the building of an integrated water management system which is able to automatically categorise the flow data recorded from water meter into all end-use categories. One application of this system allows for individual consumers to log into their user-defined water consumption web page to view their daily, weekly and monthly consumption tables as well as charts on water consumption patterns for categories of water end use. Average and/or

References (55)

  • L.E. Baum et al.

    A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains

    The Annals of Mathematical Statistics

    (1970)
  • C. Beal et al.

    South East Queensland Residential End Use Study: Final Report

    (January 2012)
  • C. Beal et al.

    A novel mixed method smart metering approach to reconciling differences between perceived and actual residential end use water consumption

    Journal of Cleaner Production

    (2011)
  • C. Beal et al.

    SEQ residential end-use study

    Journal of the Australian Water Association

    (2011)
  • H.K.D.H. Bhadeshia

    Neural networks in materials science

    ISIJ International

    (1999)
  • C.M. Bishop

    Neural Networks for Pattern Recognition

    (1995)
  • E. Blokker et al.

    Simulating residential water demand with a stochastic end use model

    Journal of Water Resources Planning and Management

    (2010)
  • B. Boashash

    Time Frequency Signal Analysis and Processing. A Comprehensive Reference

    (2003)
  • S.H. Cha et al.

    A genetic algorithm for constructing compact binary decision trees

    Journal of Pattern Recognition Research

    (2009)
  • CSIRO

    SEQ Residential Water End Use Study: Validation Trial of CSIRO End Use Sensor

    (2012)
  • Y. Da et al.

    An improved PSO-based ANN with simulated annealing technique

  • H. Deng et al.

    Bias of importance measures for multi-valued attributes and solutions

  • G. Dorini et al.

    An efficient algorithm for sensor placement in water distribution systems

  • G. Eggers et al.

    Dynamic vegetation model as a tool for ecological impact assessments of dam operation

    Journal of Hydro-environment Research, Special Issue on Ecohydraulics: Recent Research and Applications

    (2012)
  • Y. Ephraim et al.

    Hidden Markov processes

    IEEE Transactions on Information Theory

    (2002)
  • J. Froehlich et al.

    A Longitudinal Study of Pressure Sensing to Infer Real-world Water Usage Events in the Home

    (2011)
  • J.E. Froehlich et al.

    HydroSense: infrastructure-mediated single-point sensing of whole-home water activity
