Classification of storm events using a fuzzy encoded multilayer perceptron
Nicolino J. Pizzi
National Research Council Canada, Institute for Biodiagnostics
435 Ellice Avenue, Winnipeg, MB, R3B 1Y6, Canada
Volumetric radar data are used to detect severe summer storm events, but discriminating between storm event types is a challenge due to the high dimensionality and amorphous nature of the data, the paucity of data labeled through an external independent reference test, and the imprecision of the class labels. Two multilayer perceptron architectures are used to discriminate between two types of storm events, hail and tornado. The first architecture uses the original meteorological feature vectors, whereas the second transforms some of these features using a method known as fuzzy interquartile encoding. Both architectures are benchmarked against linear discriminant analysis. It is shown that the fuzzy encoded multilayer perceptron significantly outperforms the other two methods.
A radar data processing system conducts a volume scan by stepping a continuously rotating antenna through a series of elevation angles at regular intervals. Subsystems exist that allow operational meteorologists to focus their attention on regions of interest, known as storm cells, within the volumetric radar scan. It is difficult to classify detected storm cells into specific types of storm events due to a number of confounding factors. Storm cells have an amorphous three-dimensional structure as well as a vague evolution. The nature of the rotating antenna causes data acquisition to be incomplete. The external independent reference test that is used to assign class labels to storm cells is a schedule of observed storm events. This schedule contains information such as the geographic location of a storm, its duration and time of origin, and several of its meteorological characteristics. Because this schedule contains events observed by the general public (in addition to meteorologists and climate observers), the descriptions may be inaccurate and, hence, the class labels may be imprecise.
When a storm cell is found, a number of meteorological parameters, known as products, are derived from the volumetric radar data, including the cell’s volume and extent, velocity, maximum wind gust potentials, and gradient profiles. A set of these products and corresponding class labels are used to determine the classification efficacy of the multilayer perceptron. Two architectures are used: one that uses as input the original products, and another that transforms some of the products using fuzzy interquartile encoding, a method that takes features and intervalizes them across a collection of fuzzy sets that overlap at quartile boundaries. A third classifier, linear discriminant analysis, is used as a performance benchmark.
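The encoding is described only qualitatively at this point, so the following Python sketch shows one possible reading of it rather than the exact formulation used in the study. It assumes four piecewise-linear fuzzy sets, one per interquartile interval, with adjacent sets crossing at a membership of 0.5 on the quartile boundaries. The function name, the choice of four sets, and the membership shapes are all assumptions made for illustration.

```python
import numpy as np

def fuzzy_interquartile_encode(x, reference=None):
    """Map a scalar feature to four fuzzy memberships (illustrative sketch).

    Assumption: four piecewise-linear fuzzy sets, each peaking at the
    midpoint of one interquartile interval and crossing its neighbour at
    0.5 exactly on the quartile boundary. Not taken from the paper.
    """
    x = np.asarray(x, dtype=float)
    ref = x if reference is None else np.asarray(reference, dtype=float)

    # Minimum, three quartiles, and maximum of the reference sample.
    lo, q1, q2, q3, hi = np.percentile(ref, [0, 25, 50, 75, 100])

    # Midpoint of each interquartile interval (where a set has membership 1).
    c1, c2, c3, c4 = (lo + q1) / 2, (q1 + q2) / 2, (q2 + q3) / 2, (q3 + hi) / 2

    # Knots alternate between interval midpoints and quartile boundaries.
    knots = np.array([c1, q1, c2, q2, c3, q3, c4])
    if np.any(np.diff(knots) <= 0):
        raise ValueError("quartiles must be strictly increasing for this sketch")

    # Membership of each fuzzy set at the knots; rows sum to 1 column-wise.
    shapes = np.array([
        [1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0],
        [0.0, 0.5, 1.0, 0.5, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.5, 1.0, 0.5, 0.0],
        [0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 1.0],
    ])

    # np.interp clamps outside the knot range, so the outer sets saturate at 1.
    return np.stack([np.interp(x, knots, s) for s in shapes], axis=-1)

# Example: nine evenly spaced values have quartiles at 3, 5, and 7.
encoded = fuzzy_interquartile_encode(np.arange(1, 10))
```

Under these assumptions each original feature is replaced by four values in [0, 1] that sum to 1, so a network fed the encoded features sees a wider input vector than one fed the original products.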
2. Meteorological Products
A prerequisite to weather prediction is the ability to classify severe storm cells at different stages in their life cycles. The Radar Decision Support System (RDSS), developed for Environment Canada by InfoMagnetics Technologies Corporation, maintains a database of storm cells that meet minimum reflectivity thresholds indicative of storm severity. The cells used in this study were gathered from volumetric data acquired from two non-Doppler radar installations in Manitoba over the summers of 1997–1999. The sampling rate is once every five minutes, with the reflectivity values used spread out in time over the five-minute interval as the radar adjusts its azimuth. This is sufficient to capture the dominant features in most heavy rain and wind events; however, hail and tornado events have substantially shorter life cycles and their features may not be clearly depicted. Each cell is composed of two main sets of data. The first set is a bounding cube that contains the three-dimensional storm cell (the bounding cube varies from cell to cell). The voxel values within this volume range from 0–255: 0 indicates that no radar reflection occurred for that voxel location, 255 indicates that no data were acquired for the voxel, and the remaining values correspond to reflectivity values (dBZ). Figure 1 shows a two-dimensional slice of a typical storm cell volume. The second set comprises the meteorological products, which are qualitative values based on heuristic criteria.
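The voxel convention described above (0 for no reflection, 255 for no data, all other values carrying reflectivity) can be illustrated with a short Python sketch. The function and constant names are hypothetical, and because the mapping from the stored codes 1–254 to dBZ is not given here, the sketch leaves those codes unconverted.

```python
import numpy as np

NO_REFLECTION = 0   # no radar reflection occurred at this voxel
NO_DATA = 255       # no data were acquired for this voxel

def split_voxels(volume):
    """Separate a storm cell volume into its three kinds of voxel.

    Returns boolean masks for no-reflection, missing, and valid voxels,
    plus a masked array that exposes only the valid reflectivity codes.
    The codes themselves are left as stored (assumption: the scaling to
    dBZ is applied elsewhere).
    """
    volume = np.asarray(volume, dtype=np.uint8)
    no_reflection = volume == NO_REFLECTION
    missing = volume == NO_DATA
    valid = ~(no_reflection | missing)
    codes = np.ma.masked_array(volume, mask=~valid)
    return no_reflection, missing, valid, codes

# Example: a tiny 2 x 2 x 2 bounding cube with made-up codes.
cube = np.array([[[0, 17], [255, 200]],
                 [[0, 0], [42, 255]]])
no_reflection, missing, valid, codes = split_voxels(cube)
fraction_missing = missing.mean()  # share of voxels with no data acquired
```

Keeping missing voxels distinct from no-reflection voxels matters because the rotating antenna leaves gaps in the acquisition, and treating those gaps as zero reflectivity would bias any statistic computed over the volume.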
For the purposes of classification, the original input vectors are composed of 21 features. The first 19 features are products selected from the storm cell and fall into two categories, derived (15 products) and heuristic (4 products). The first derived feature is the height offset product and the next two comprise the extent product (together they describe the overall size of the storm cell). The spatial characteristics of the storm cell’s core make up the next 8 features: core volume (1 feature); core height (1 feature); core size (2 features), the extent of the largest two-