Spatio-temporal modelling for non-stationary point referenced data
Spatial and spatio-temporal phenomena are commonly modelled as Gaussian processes via the geostatistical model (Gelfand & Banerjee, 2017), in which the spatial dependence structure is specified through a covariance function. Most commonly, the covariance function imposes an assumption of spatial stationarity on the process. That is, the covariance between observations at two locations depends only on the separation between the locations (under isotropy, only on the distance between them) and not on where they lie in the study domain (Banerjee et al., 2014). It has been widely recognized, however, that most, if not all, spatial processes exhibit non-stationary covariance structure (Sampson, 2014). If the study domain is small in area or there are too few data to justify more complicated non-stationary approaches, stationarity may be assumed for the sake of mathematical convenience (Fouedjio, 2017). Nevertheless, relationships between variables can vary significantly over space, and a ‘global’ estimate of those relationships may obscure interesting geographical phenomena (Brunsdon et al., 1996; Fouedjio, 2017; Sampson & Guttorp, 1992).
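For concreteness, the stationarity assumption can be written as below. The symbols $Y$, $C$, $\mathbf{s}$, $\mathbf{h}$, $\sigma^2$ and $\phi$ are notation assumed here for illustration only, and the exponential covariance is shown merely as one familiar stationary example.

```latex
\begin{align*}
\operatorname{Cov}\{Y(\mathbf{s}),\, Y(\mathbf{s}+\mathbf{h})\} &= C(\mathbf{h})
  && \text{(stationary: depends on the separation } \mathbf{h} \text{ only)} \\
\operatorname{Cov}\{Y(\mathbf{s}),\, Y(\mathbf{s}+\mathbf{h})\} &= C(\lVert \mathbf{h} \rVert)
  && \text{(stationary and isotropic: depends on distance only)} \\
C(\lVert \mathbf{h} \rVert) &= \sigma^{2} \exp\!\left(-\lVert \mathbf{h} \rVert / \phi\right)
  && \text{(exponential covariance, one example)}
\end{align*}
```

A non-stationary model relaxes this restriction, so that the covariance may depend on both locations, $C(\mathbf{s}, \mathbf{s}')$.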
In this thesis, we considered three non-parametric approaches that flexibly account for non-stationarity in both spatial and spatio-temporal processes. First, we proposed partitioning the spatial domain into sub-regions using the K-means clustering algorithm applied to a set of appropriate geographic features. A separate stationary covariance function could then be fitted within each sub-region, accounting for local differences in covariance across the study region. Second, we extended the concept of covariance network regression to model the covariance matrix of both spatial and spatio-temporal processes. The resulting covariance estimates were found to be more flexible in accounting for spatial autocorrelation than standard stationary approaches. The third approach used geographic random forest methodology, with the neighbourhood of each location constructed through clustering. We found that clustering on geographic measures such as longitude and latitude excluded from each neighbourhood those observations that were too far away to influence the observations near the location where the local random forest was fitted.
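The idea behind the first approach can be illustrated with a minimal sketch. It is not the implementation used in the thesis: the function names (`kmeans`, `exp_cov`, `partitioned_cov`), the simulated sites, the exponential covariance and the parameter values are all assumptions made for illustration, and treating observations in different sub-regions as uncorrelated is a simplification adopted only to keep the example short.

```python
import math
import random


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm on coordinate tuples; returns one label per point."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)  # initial centres drawn from the data
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centre.
        labels = [min(range(k), key=lambda j: math.dist(p, centres[j]))
                  for p in points]
        # Update step: each centre moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster is empty
                centres[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def exp_cov(p, q, sigma2, phi):
    """Stationary, isotropic exponential covariance between two locations."""
    return sigma2 * math.exp(-math.dist(p, q) / phi)


def partitioned_cov(points, labels, params):
    """Covariance matrix with a separate (sigma2, phi) pair per sub-region.

    Assumption for this sketch: pairs in different sub-regions get covariance 0.
    """
    n = len(points)
    cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                sigma2, phi = params[labels[i]]
                cov[i][j] = exp_cov(points[i], points[j], sigma2, phi)
    return cov


# Two hypothetical groups of sites given as (longitude, latitude) pairs.
rng = random.Random(1)
sites = ([(rng.gauss(0, 0.3), rng.gauss(0, 0.3)) for _ in range(10)]
         + [(rng.gauss(5, 0.3), rng.gauss(5, 0.3)) for _ in range(10)])

labels = kmeans(sites, k=2)
# Illustrative (variance, range) values; in practice these would be estimated
# separately from the data in each sub-region.
params = {0: (1.0, 0.5), 1: (2.0, 1.5)}
cov = partitioned_cov(sites, labels, params)
```

Taken as a whole, the resulting matrix is non-stationary even though each sub-region uses a stationary covariance function, because the variance and range differ between sub-regions.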
In addition to developing flexible methods to account for non-stationarity, we developed a pivotal discrepancy measure approach for goodness-of-fit testing of spatio-temporal geostatistical models. We found that partitioning the pivotal discrepancy measures increased the power of the test.