Abstract

Short-time heavy rainfall is a sudden, intense form of precipitation that seriously threatens people's lives and property. Accurate precipitation nowcasting is of great significance for governments to make timely disaster prevention and mitigation decisions. In order to make high-resolution forecasts of regional rainfall, this paper proposes a convolutional 3D GRU (Conv3D-GRU) model to predict future rainfall intensity over a relatively short period from a machine learning perspective. First, the spatial features of radar echo maps at different heights are extracted by 3D convolution; then, the radar echo map time series is encoded and decoded with GRUs. Finally, the trained model is used to predict the radar echo maps for the next 1-2 hours. The experimental results show that the algorithm effectively extracts the temporal and spatial features of radar echo maps, reduces the error between predicted and observed rainfall, and improves the accuracy of short-term rainfall prediction.

1. Introduction

Short-term heavy precipitation is a weather process characterized by sudden onset, short duration, and large rainfall amounts. The natural disasters it causes every year seriously threaten people's lives and property. Early warning and nowcasting are therefore of great significance for disaster prevention and mitigation.

Radar echo map extrapolation is the main technical means of precipitation nowcasting [1]. From the observed echo maps, the echo intensity distribution and the moving speed and direction of the echo body (such as a rainfall area) are determined; by linear or nonlinear extrapolation of the echo body, future radar echo maps can be predicted. At present, radar echo extrapolation techniques fall into four categories: the single-body centroid-based method, the cross-correlation-based method, the optical flow-based method, and the machine learning-based method. Through the recognition and analysis of radar echo maps, the single-body centroid-based method obtains the features of thunderstorm cells, such as the storm center, storm volume, and the weighted centroid of the reflectivity factor, and extrapolates the movement of these features to make convective nowcasts [2–6]. It is suitable for tracking isolated, large, strong echo cells or cell clusters, but its tracking success rate is low when echoes scatter, merge, or split.

The cross-correlation-based method calculates the optimal spatial correlation between regions of radar echo at adjacent times to determine the motion vectors of the echo maps and extrapolates the position of the radar echo at future times [7–12]. This method is intuitive, simple, and easy to implement. When the shape, speed, and direction of the radar echo change gently, it achieves good results. However, because it only computes correlation coefficients, it is difficult to guarantee tracking accuracy for strongly convective weather processes whose echoes change rapidly.

The optical flow-based method is a tracking method from the field of computer vision. When there is relative motion between the observed target and the sensor, the observed motion of the brightness pattern is called optical flow. This method has also been applied in the meteorological field [13–18]. It can capture the overall movement trend of heavy convective precipitation well. However, it relies on an invariance assumption, while radar echoes grow and dissipate to some extent; this nonconservation of the reflectivity factor leads to extrapolation errors, which are large for fast-moving echoes.

The machine learning-based method uses its self-learning ability to capture hidden features of echo evolution and shows good memory and association ability [19, 20]. It has been applied as a classification model and as a numerical prediction model in weather forecasting, showing the potential and broad prospects of applying neural network models to radar echo extrapolation [21, 22]. In particular, deep learning has recently been used to process meteorological big data, showing strong technical advantages and performance and receiving great attention from the industry [23–25].

By transforming the reflectivity factor into a gray image, short-term rainfall nowcasting is turned into a video prediction problem [20]. RNN models, 2D CNN models, and 3D CNN models are three common network structures for video prediction. Attali and Montanvert [23] proposed the first RNN-based video prediction model, using a convolutional RNN to encode observation frames; Dey and Zhao [26] proposed an LSTM-based codec network, using one LSTM to encode input frames and another to predict future frames; Tam and Heidrich [24] replaced the LSTM with ConvLSTM to better capture temporal and spatial correlations. Giesen et al. [27] and Faraj et al. [28] extended the model in [20] to predict transformations of the input frame instead of directly predicting raw pixels. Shen et al. [29] used an RNN to capture temporal motion and a CNN to capture spatial features for video sequence prediction. The 2D CNN model proposed in [30] treats video frames as multiple channels. Although the application of deep learning in related fields is in full swing, its application in weather prediction is still in its infancy.

In order to capture spatiotemporal correlations well and address the low accuracy of short-term rainfall prediction, this paper proposes a Conv3D-GRU model for radar echo nowcasting. We extend 2D convolution to 3D convolution, which is introduced to extract the spatial features of radar echo images at different heights. Then, a GRU network is built to extract features along the time dimension. Finally, we build an end-to-end trainable model to realize short-term rainfall forecasting. When evaluated on our radar echo dataset, the Conv3D-GRU model consistently outperforms both Conv2D and Conv2D-GRU, effectively improving the accuracy of short-term rainfall prediction.

2. Preliminaries

Short-term precipitation forecasting takes past radar echo map sequences as input and outputs the predicted radar echo map sequence for the next 1-2 hours; it can thus be cast as a time series prediction problem.

The recurrent neural network (RNN) is designed to process data containing time series. However, its gradients easily vanish over long sequences, which makes training difficult. The long short-term memory (LSTM) network addresses this problem, and the gated recurrent unit (GRU) is an improved gated variant of the LSTM. In order to improve the training effect, we use a GRU recurrent neural network to learn the features of the radar echo sequence data.

LSTM uses a forget gate, an input gate, and an output gate to form a memory unit that filters input information, while the GRU improves the design of the "gate" by merging the LSTM's input gate and forget gate into a single update gate, reducing three gates to two. Like an LSTM memory unit, a GRU consists of many neural units, each of which is itself a gated structure. The operations in a GRU neural unit can be expressed as follows:

z_t = \sigma(W_z \cdot [h_{t-1}, x_t]),
r_t = \sigma(W_r \cdot [h_{t-1}, x_t]),
\tilde{h}_t = \tanh(W \cdot [r_t \odot h_{t-1}, x_t]),
h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t.   (1)

From formula (1), it can be seen that the neural units in the network depend on one another, and each unit participates in the decision of information screening. The weight of the update gate is W_z. The output of the previous neural unit is h_{t-1}, and the input of the current neural unit is x_t. The sigmoid activation function is denoted by \sigma. Concatenating the previous hidden state h_{t-1} with the current input x_t, multiplying by W_z, and applying the sigmoid function yields the update gate z_t. The larger z_t is, the more information from the current neural unit is retained and the more information from the previous unit is discarded.
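To make the gate arithmetic concrete, the update in formula (1) can be sketched in a few lines of NumPy; the weight shapes and function names here are illustrative, not the model's actual configuration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, Wz, Wr, Wh):
    """One GRU step following formula (1).
    Each weight matrix acts on the concatenation [h_prev, x_t]."""
    hx = np.concatenate([h_prev, x_t])
    z = sigmoid(Wz @ hx)                  # update gate z_t
    r = sigmoid(Wr @ hx)                  # reset gate r_t
    h_cand = np.tanh(Wh @ np.concatenate([r * h_prev, x_t]))  # candidate state
    return (1 - z) * h_prev + z * h_cand  # new hidden state h_t
```

With all-zero weights, z = r = 0.5 and the candidate state is zero, so the new state is simply half the previous one, which makes the gating behavior easy to verify by hand.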

3. Materials and Methods

Although the GRU has proven able to handle the long-range dependence problem well, its input is a one-dimensional vector, while the radar echo maps used in this paper are three-dimensional images. Applying a GRU directly would require flattening each image into a one-dimensional vector, which would undoubtedly lose much of the spatial information in the radar maps. To deal with this problem, a network model based on Conv3D-GRU is proposed in this paper. The network accepts 3D images as input, so as to better retain the spatial characteristics of the radar maps. As shown in Figure 1, the model is composed of a 3D feature extraction module and a GRU-based coding-prediction module. The former uses 3D convolution to extract features from radar echo maps at different heights at a given time, while the latter encodes the radar image time series and predicts future radar images.

3.1. Spatial Feature Extraction of 3D Convolutional Neural Network

In order to extract the spatial features of radar echo maps, we stack radar maps of different heights at a given time t into a cube and then use 3D convolution to fuse the spatial features within the cube. A 3D convolutional network is an extension of the 2D convolutional neural network to an additional dimension and includes convolution layers, pooling layers, activation function layers, and fully connected layers. It extends the convolution kernels and pooling filters to three dimensions, so multidimensional images can be fed directly into the network. The 3D max-pooling layer receives cube-shaped data output by the convolution layer. In the fully connected layers, each neuron is connected to all neurons in the adjacent layer; the feature vector obtained from the feature space is processed by matrix multiplication, and the output serves as the input of the next GRU module.
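As a minimal illustration of how a 3D kernel fuses information across the stacked height maps, here is a naive single-channel 3D convolution ("valid" padding, stride 1); real networks use optimized library implementations, and the shapes below are only examples:

```python
import numpy as np

def conv3d_valid(volume, kernel):
    """Naive single-channel 3D convolution with 'valid' padding.
    volume: (D, H, W) cube of radar maps stacked over height;
    kernel: (kd, kh, kw) filter slid over all three dimensions."""
    D, H, W = volume.shape
    kd, kh, kw = kernel.shape
    out = np.zeros((D - kd + 1, H - kh + 1, W - kw + 1))
    for z in range(out.shape[0]):
        for y in range(out.shape[1]):
            for x in range(out.shape[2]):
                out[z, y, x] = np.sum(volume[z:z+kd, y:y+kh, x:x+kw] * kernel)
    return out
```

For example, convolving a 3 × 4 × 4 cube of ones with a 2 × 2 × 2 kernel of ones yields a 2 × 3 × 3 output in which every entry is 8.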

The output of the 3D convolution at position (x, y, z) of the j-th feature map in layer i is denoted v_{ij}^{xyz}, where x and y index the spatial dimensions of the input image and z indexes the third dimension; f represents the activation function; b_{ij} is the bias of the j-th feature map in layer i; P_i, Q_i, and R_i are the three dimensions of the convolution kernel; and w_{ijm}^{pqr} is the weight at kernel position (p, q, r) of the connection to the m-th feature map of the previous layer. The 3D convolution process can be expressed as follows:

v_{ij}^{xyz} = f\Big(b_{ij} + \sum_{m} \sum_{p=0}^{P_i-1} \sum_{q=0}^{Q_i-1} \sum_{r=0}^{R_i-1} w_{ijm}^{pqr} \, v_{(i-1)m}^{(x+p)(y+q)(z+r)}\Big).

The loss function of the 3D convolutional neural network is constructed as follows:

L(\theta) = -\frac{1}{N} \sum_{n=1}^{N} \sum_{c=1}^{C} \mathbf{1}\{y_n = c\} \log p(y_n = c \mid x_n; \theta),

where x_n represents the 3D input vector, y_n the corresponding label, p(y_n = c \mid x_n; \theta) the estimated probability that x_n belongs to class c, \theta all parameters, and \mathbf{1}\{\cdot\} the indicator function.
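The loss above is the standard multi-class cross-entropy; a small NumPy sketch (with illustrative array shapes) shows the computation:

```python
import numpy as np

def cross_entropy(probs, labels):
    """Mean cross-entropy loss.
    probs: (N, C) predicted class probabilities p(y = c | x);
    labels: (N,) integer class labels y_n."""
    n = labels.shape[0]
    # select the predicted probability of each true class, then average -log
    return -np.mean(np.log(probs[np.arange(n), labels]))
```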

Table 1 gives the 3D convolutional network structure used in this paper, which consists of one input layer, four 3D convolution layers, and four fully connected layers.

3.2. GRU Coding-Prediction Module

As shown in Figure 2, the GRU coding-prediction network consists of two modules: (1) the GRU coding module, which extracts the temporal characteristics of the input time series, and (2) the GRU prediction module, which predicts future radar echo maps based on the temporal features obtained by the coding module.

Based on the temporal feature sequence produced by the 3D convolution, the GRU time-series prediction network predicts the output sequence for a future period. First, the output sequence of the 3D convolution serves as the input to the GRU coding module, which encodes the data into a fixed-size state vector, completing the extraction of temporal features; the information of the entire input sequence is then stored in the cell states of the GRU neurons. After that, the GRU prediction module takes these cell states as its initial state and, based on the features obtained by the coding module, generates the predicted output sequence for the future period step by step.

After encoding, the extracted radar map features are retained in the neural units of the GRU model. The initial neural unit state of the prediction module is copied from the final state of the encoder module. Based on the feature information obtained by the encoder, the prediction module forecasts rainfall over the near future. The final output sequence corresponds one-to-one to the input time steps. Parameters are shared within the encoding phase and within the decoding phase, but not between encoding and decoding, so the model learns two sets of parameters.
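The encode-then-predict control flow described above can be sketched abstractly, with the GRU cell reduced to a generic recurrence h' = cell(x, h); the function names and shapes here are illustrative only:

```python
import numpy as np

def encode_decode(inputs, n_future, cell_enc, cell_dec, readout):
    """Seq2seq rollout: the encoder folds the input sequence into a
    hidden state; the decoder starts from a copy of that state and
    unrolls n_future steps, feeding each prediction back in as the
    next input. cell_enc and cell_dec are separate recurrences,
    since encoder and decoder do not share parameters."""
    h = np.zeros(readout.shape[1])     # hidden state, initially zero
    for x_t in inputs:                 # encoding phase
        h = cell_enc(x_t, h)
    outputs, x_t = [], inputs[-1]
    for _ in range(n_future):          # prediction phase
        h = cell_dec(x_t, h)
        x_t = readout @ h              # map hidden state back to feature space
        outputs.append(x_t)
    return outputs
```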

4. Experiment and Result Analysis

4.1. Experimental Environment and Dataset

In order to verify the validity of the Conv3D-GRU model for rainfall prediction on radar echo images, this study uses weather radar intensities collected at the Ningbo Meteorological Station for training and testing. The experiments were run on Windows 10 with 128 GB RAM; the CPU is a 2 GHz Intel(R) Xeon(R) E5-2683 v3, and the deep learning framework is PyTorch-GPU (1.3.1). As rainfall events occur sparsely, we select rainy days based on the rain gauge information associated with the radar echo images to build the final dataset. The resolution of each radar echo image is 480 × 480 pixels. During preprocessing, the original radar echo maps are converted to gray-level images by a linear transformation, and the radar maps are cropped to 101 × 101 pixels. The weather radar data are recorded every 6 minutes, so there are 240 frames per day for predicting the rainfall of radar echo maps in the next two hours. Since the original radar echo maps contain noise from the acquisition process, a bilateral filter is applied to reduce the impact of noise on training and evaluation. Original and filtered radar echo images are shown in Figure 3.
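The gray-level conversion and cropping can be sketched as follows; the 70 dBZ dynamic range assumed here is illustrative, not a value stated in this paper:

```python
import numpy as np

def preprocess(echo_dbz, out_size=101, max_dbz=70.0):
    """Linear gray-level transform plus center crop.
    echo_dbz: (480, 480) radar reflectivity map;
    max_dbz: assumed upper bound of the dynamic range (hypothetical)."""
    gray = np.clip(echo_dbz / max_dbz, 0.0, 1.0) * 255.0   # linear map to [0, 255]
    h, w = gray.shape
    top, left = (h - out_size) // 2, (w - out_size) // 2   # center crop
    return gray[top:top+out_size, left:left+out_size].astype(np.uint8)
```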

4.2. Evaluated Algorithm

In the experimental training, the Conv3D-GRU network was constructed and its weight parameters were initialized. Radar map sequences at three different heights were normalized in size, and the frame data were fed in as a video stream. Each altitude receives 5 frames as input and outputs 20 predicted frames. The input radar echo sequence is processed by the 3D convolutional network to obtain spatial features; the GRU network generates the temporal radar echo features, and a convolutional decoding module produces the output image sequence. Finally, heavy rainfall prediction for the next two hours is realized.

In order to evaluate the performance of the algorithm, the critical success index (CSI), Heidke skill score (HSS), mean squared error (MSE), mean absolute error (MAE), balanced mean squared error (B-MSE), and balanced mean absolute error (B-MAE) were used as evaluation indexes of the model. The scores were calculated at rainfall thresholds of 0.5, 2, 5, 10, and 30 mm/h. The main formulas are as follows:

CSI = \frac{TP}{TP + FN + FP},
HSS = \frac{2(TP \times TN - FN \times FP)}{(TP + FN)(FN + TN) + (TP + FP)(FP + TN)},
MSE = \frac{1}{N}\sum_{i=1}^{N} (\hat{y}_i - y_i)^2,
MAE = \frac{1}{N}\sum_{i=1}^{N} |\hat{y}_i - y_i|.

Among them, TP indicates hits (prediction = 1, truth = 1), FP indicates false alarms (prediction = 1, truth = 0), TN indicates correct negatives (prediction = 0, truth = 0), and FN indicates misses (prediction = 0, truth = 1). Higher CSI and HSS values indicate better heavy-rainfall forecast skill. \hat{y}_i represents the predicted value, y_i the true value, and N the number of samples in the test set. The smaller MSE and MAE are, the smaller the error between the predicted and true rainfall values, and the better the performance of the model.
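The skill scores and error measures above are straightforward to compute; here is a sketch (B-MSE and B-MAE, which additionally weight pixels by rainfall intensity, are omitted):

```python
import numpy as np

def csi_hss(pred, truth, threshold):
    """CSI and HSS from the contingency table: intensity arrays are
    binarized at `threshold` (mm/h) before counting the four cases."""
    p, t = pred >= threshold, truth >= threshold
    tp = np.sum(p & t)     # prediction = 1, truth = 1 (hit)
    fp = np.sum(p & ~t)    # prediction = 1, truth = 0 (false alarm)
    tn = np.sum(~p & ~t)   # prediction = 0, truth = 0 (correct negative)
    fn = np.sum(~p & t)    # prediction = 0, truth = 1 (miss)
    csi = tp / (tp + fn + fp)
    hss = 2 * (tp * tn - fn * fp) / ((tp + fn) * (fn + tn) + (tp + fp) * (fp + tn))
    return csi, hss

def mse_mae(pred, truth):
    """Pixel-wise mean squared and mean absolute error."""
    err = pred - truth
    return np.mean(err ** 2), np.mean(np.abs(err))
```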

4.3. Analysis of Experimental Results

During training, the Adam optimizer was used with a batch size of 8, a learning rate of 0.0001, and a momentum of 0.5. In the experiments, the radar maps of the first five time steps were selected as input to predict the rainfall at the next 20 time steps. Figure 4 shows the rainfall prediction results on radar echo maps: group (a) is the input time series of radar echo maps obtained within half an hour; (b) is the real output image sequence (within the next 2 hours); (c), (d), and (e) show the predicted outputs of the Conv2D, Conv2D-GRU, and Conv3D-GRU algorithms, respectively, on the radar echo map sequence for the next 2 hours. The results show that our algorithm produces clearer short-term rainfall predictions: it better captures the temporal and spatial features of radar maps at different heights, more accurately predicts future rainfall contours, and uses the radar map information to realize rainfall forecasts for the next 2 hours.

4.4. Evaluation Results

To further verify the performance of the proposed algorithm for short-term heavy rainfall prediction, we compared Conv3D-GRU with other algorithms. In Table 2, "r ≥ τ" denotes the skill score at a rainfall threshold of τ mm/h, and each cell reports the average score over the next 20 frames. The comparison shows that the average CSI and HSS of our algorithm are better than those of the Conv2D and Conv2D-GRU algorithms, indicating that the proposed Conv3D-GRU extracts radar image features well. Moreover, it makes full use of the correlations between the current frame and both historical and future frames, which improves the accuracy of short-term heavy rainfall prediction to a certain extent.

Table 3 shows that all four indicators of Conv3D-GRU are better than those of Conv2D and Conv2D-GRU: its MSE, MAE, B-MSE, and B-MAE are smaller, indicating a smaller error between the predicted and true rainfall values and thus more accurate prediction and higher performance.

5. Conclusion

In this paper, we have proposed a Conv3D-GRU model for short-term rainfall prediction of radar echo image to improve the accuracy of regional rainfall prediction.

We construct the Conv3D network to extract the spatial features of radar images at different heights and then build the encoder module to obtain the spatiotemporal feature sequence of the radar images. The GRU neural network is introduced to extract information along the time dimension. Finally, the prediction module realizes radar echo rainfall prediction for the next 2 hours. Our experimental validation shows that the Conv3D-GRU model consistently outperforms both the Conv2D and Conv2D-GRU algorithms and captures spatiotemporal correlations well; the redundancy of spatial data is reduced, and the accuracy of rainfall prediction is improved. Although the Conv3D-GRU model achieves satisfactory results in short-term regional rainfall prediction, further work can improve its performance. Owing to the limited lead time of rainfall nowcasting and the use of a single meteorological factor, some shortcomings are inevitable. In future work, we plan to optimize the structure of the network and learn richer spatiotemporal features by increasing the depth of the convolutional encoder. We will also incorporate other meteorological features, such as temperature and wind field information.

Data Availability

The data used to support the findings of the study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported in part by the Science and Technology Plan Project of Zhejiang Province (nos. LGF19F020008, LGF21F020023, and LGF21F020022), the Ningbo Science and Technology Project (no. 2019C50008), the Ningbo Natural Science Foundation (nos. 202003N4320, 202003N4324, and 202003N4321), and the Zhejiang Education Department Project (no. Y202045246).