#### Title

Prediction of urban stormwater quality at unmonitored catchments using artificial neural networks

#### Year

2011

#### Degree Name

Doctor of Philosophy

#### Department

University of Wollongong. School of Civil, Mining and Environmental Engineering

#### Recommended Citation

May, Daniel Brent, Prediction of urban stormwater quality at unmonitored catchments using artificial neural networks, Doctor of Philosophy thesis, University of Wollongong. School of Civil, Mining and Environmental Engineering, University of Wollongong, 2011. http://ro.uow.edu.au/theses/3411

#### Abstract

In this research, the applicability of using back-propagation artificial neural networks (ANNs) to model urban stormwater quality at unmonitored catchments is investigated. The study used data collected by the United States Geological Survey (USGS) and the United States Environmental Protection Agency (USEPA) as a part of the Nationwide Urban Runoff Program (NURP). Two main sets of analyses were undertaken. The first set focused upon modelling total phosphorus, while the second set analysed a number of water quality constituents.

Total phosphorus was analysed first, since it was the most prevalent variable available in the dataset. Data were logarithmically transformed, and regression models were then constructed using both event mean concentration (EMC) and event load as the dependent variable. Using EMC as the dependent variable produced the more accurate results, so EMC was modelled in subsequent analyses. Regression models were then constructed on the entire dataset and on a smaller regional subset defined using mean annual rainfall (MARN). The regression model constructed on the smaller subset of data was the more accurate of the two. This was assumed to result from the increased homogeneity of the smaller subset and the increased number of variables used in the associated regression model. ANN models were then constructed using both datasets. The ANN models were only slightly more accurate than the regression models. However, the ANN models were deemed less appropriate because of their lack of transparency, increased calibration time, and increased potential to overfit the limited amount of data available.
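
As an illustration of the log-transformed regression described above, the following is a minimal sketch rather than the thesis's actual model. The predictors (rainfall depth and percent imperviousness), the function names, and all numbers are hypothetical and synthetic; the abstract does not state which variables or coefficients were used.

```python
import numpy as np

def fit_log_regression(X, emc):
    """Ordinary least squares of log10(EMC) on log10 of each predictor.

    Returns [intercept, slope_1, ..., slope_k] in log10 space.
    """
    A = np.column_stack([np.ones(len(emc)), np.log10(X)])
    coef, *_ = np.linalg.lstsq(A, np.log10(emc), rcond=None)
    return coef

def predict_emc(coef, X):
    """Back-transform the log-space prediction to concentration units."""
    A = np.column_stack([np.ones(X.shape[0]), np.log10(X)])
    return 10.0 ** (A @ coef)

# Synthetic, noise-free data generated from an assumed power law,
# used only to show that the fit recovers the exponents.
rain = np.array([5.0, 10.0, 20.0, 40.0, 80.0])      # hypothetical rainfall depth, mm
imperv = np.array([10.0, 30.0, 20.0, 60.0, 40.0])   # hypothetical imperviousness, %
X = np.column_stack([rain, imperv])
emc = 2.0 * rain ** -0.5 * imperv ** 0.3            # hypothetical EMC, mg/L

coef = fit_log_regression(X, emc)
print(coef)  # intercept and exponents in log10 space
```

With real, noisy data the back-transformed prediction estimates something closer to a median than a mean, so a retransformation bias correction may be needed when predicted EMCs are converted to loads.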

The second set of analyses sought to compare a number of different statistical techniques capable of predicting urban stormwater quality. Additional variables from a variety of sources were combined with the NURP data, and missing values in the NURP dataset were infilled using simple statistical techniques. Analyses were then undertaken upon a total of 14 constituents: ammonia (NH3), cadmium (Cd), chemical oxygen demand (COD), chloride (Cl), copper (Cu), dissolved phosphorus (DP), dissolved solids (DS), lead (Pb), nitrogen oxides (NOx), suspended solids (SS), total Kjeldahl nitrogen (TKN), total nitrogen (TN), total phosphorus (TP), and zinc (Zn). Regression models were constructed and compared to site mean concentrations, mean metropolitan area concentrations, land-use-based mean concentrations, and nationwide mean concentrations.

Constant concentration estimates based upon geometric means were generally found to be more appropriate than those based upon arithmetic means. Geometric mean estimates were less likely to be biased by large, potentially outlying EMCs, and generally predicted the lower EMCs, which were correlated with large runoff volumes, leading to accurate predictions of long-term yields. Constant concentration models were then compared with the regression models. Overall, site mean concentration estimates produced the most accurate results. However, these models could not be applied at unmonitored sites. Various models capable of predicting EMCs at unmonitored metropolitan areas were then compared. Regression models were more accurate than land use averages. Surprisingly, land use averages were less accurate than a broad-scale average of the entire dataset. Models capable of predicting EMCs at unmonitored catchments were then compared. The regression models produced more accurate estimates of EMCs than the mean metropolitan area concentration models, though both produced very similar predictions of long-term yields at single sites. However, the mean metropolitan area concentration models could not be applied at metropolitan areas with limited or no prior sampling. Furthermore, regression models provided valuable insight into the most significant processes influencing urban stormwater quality.
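
The contrast between geometric and arithmetic means can be sketched numerically. The example below is an illustration under stated assumptions, not thesis data: the EMCs and runoff volumes are synthetic and were chosen so that the single high EMC coincides with a small runoff volume, mirroring the correlation described above. The function names are hypothetical.

```python
import numpy as np

def arithmetic_mean(emc):
    return float(np.mean(emc))

def geometric_mean(emc):
    # exp of the mean log; defined for strictly positive concentrations
    return float(np.exp(np.mean(np.log(emc))))

def total_load(emc, volume):
    """Sum of event loads; mg/L times m^3 gives grams."""
    return float(np.sum(np.asarray(emc) * np.asarray(volume)))

# Synthetic events: four low EMCs on large storms, one high EMC on a small storm.
emc = np.array([0.2, 0.2, 0.2, 0.2, 6.4])               # mg/L
volume = np.array([500.0, 400.0, 600.0, 500.0, 10.0])   # m^3

observed = total_load(emc, volume)
# Constant-concentration estimates of the same total load.
arith_est = arithmetic_mean(emc) * volume.sum()
geo_est = geometric_mean(emc) * volume.sum()

print(observed, arith_est, geo_est)
```

In this constructed case the arithmetic mean is pulled upward by the one outlying event and overstates the total load far more than the geometric mean does. Whether that holds for a given catchment depends on how EMC and runoff volume are actually related there.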

ANN models were then constructed to predict five highly sampled constituents: chemical oxygen demand, lead, suspended solids, total Kjeldahl nitrogen, and total phosphorus. Inputs used to construct the ANN models were obtained from the previous regression analyses. ANN network parameters were then optimised using a trial-and-error approach, and the final ANN models were compared to the regression models. In general, the regression models were more accurate than the associated ANN models when validated upon independent data. In addition, ANN models were significantly more time-consuming to construct and less transparent. Consequently, it was concluded that ANN models are currently not a viable technique for predicting urban stormwater quality at unmonitored catchments.
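
A trial-and-error search over network parameters, scored on independent data, might look like the sketch below. It is not the thesis's network: the single tanh hidden layer, the candidate hidden-layer sizes, the learning rate, the epoch count, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def train_ann(X, y, hidden, epochs=2000, lr=0.1, seed=0):
    """Train a one-hidden-layer tanh network by batch back-propagation."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, hidden)
    b2 = 0.0
    n = len(y)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)                  # forward pass, hidden layer
        err = H @ W2 + b2 - y                     # output error
        # Backward pass: gradients of the half mean squared error.
        gW2 = H.T @ err / n
        gb2 = err.mean()
        dZ = np.outer(err, W2) * (1.0 - H ** 2)   # error propagated through tanh
        gW1 = X.T @ dZ / n
        gb1 = dZ.mean(axis=0)
        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2
    return W1, b1, W2, b2

def predict(params, X):
    W1, b1, W2, b2 = params
    return np.tanh(X @ W1 + b1) @ W2 + b2

def rmse(y, yhat):
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

# Synthetic stand-in for scaled inputs and log-transformed EMCs.
rng = np.random.default_rng(42)
X = rng.uniform(-1.0, 1.0, (60, 2))
y = 0.5 * X[:, 0] - 0.3 * X[:, 1] + rng.normal(0.0, 0.05, 60)
X_tr, y_tr, X_va, y_va = X[:40], y[:40], X[40:], y[40:]

candidates = [1, 2, 4, 8]  # hypothetical hidden-layer sizes to try
scores = {h: rmse(y_va, predict(train_ann(X_tr, y_tr, h), X_va)) for h in candidates}
best = min(scores, key=scores.get)
baseline = rmse(y_va, np.full(len(y_va), y_tr.mean()))  # constant-mean benchmark
print(scores, best, baseline)
```

Scoring each candidate on held-out data rather than on the calibration data is what exposes overfitting, and comparing against a simple benchmark keeps the added complexity honest.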