# Divisible Statistics and Their Partial Sum Processes: Asymptotic Properties and Applications

**Divisible statistics are widely used in many areas of statistical analysis. For example, Pearson's Chi-square statistic and the log-likelihood ratio statistic are frequently used in goodness-of-fit (GOF) testing and categorical analysis; the maximum likelihood (ML) estimators of Shannon's and Simpson's diversity indices are often used as measures of diversity; and the spectral statistic plays a key role in the theory of a large number of rare events. In the classical multinomial model, where the number of disjoint events N and their probabilities are all fixed, limit distributions of many divisible statistics have gradually been established. However, most of these results rest on the asymptotic equivalence of the statistics to Pearson's Chi-square statistic and on the known limit distribution of the latter. A deeper analysis shows that the key point is not the asymptotic behavior of the Chi-square statistic but that of the normalized frequencies. Based on the asymptotic normality of the normalized frequencies in the classical model, a unified approach to limit theorems for more general divisible statistics can be established, of which the case of the Chi-square statistic is simply a natural corollary.**
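For orientation, the statistics named above can be written in a common form. The notation in this sketch is an assumption and is not taken from the text: $\nu_i$ denotes the observed frequency of the $i$-th event in a sample of size $n$, $p_i$ its probability, $g$ a generic function, and $0 \ln 0$ is read as $0$.

$$
\begin{aligned}
\text{divisible statistic:}\quad & \sum_{i=1}^{N} g(\nu_i,\, n p_i),\\
\text{normalized frequencies:}\quad & Y_i = \frac{\nu_i - n p_i}{\sqrt{n p_i}},\\
\text{Pearson's Chi-square:}\quad & X_n^2 = \sum_{i=1}^{N} \frac{(\nu_i - n p_i)^2}{n p_i} = \sum_{i=1}^{N} Y_i^2,\\
\text{log-likelihood ratio:}\quad & 2 \sum_{i=1}^{N} \nu_i \ln \frac{\nu_i}{n p_i},\\
\text{Shannon index (ML estimator):}\quad & -\sum_{i=1}^{N} \frac{\nu_i}{n} \ln \frac{\nu_i}{n},\\
\text{Simpson index (ML estimator):}\quad & \sum_{i=1}^{N} \left(\frac{\nu_i}{n}\right)^{2},\\
\text{spectral statistic:}\quad & \mu_n(k) = \sum_{i=1}^{N} \mathbf{1}\{\nu_i = k\}.
\end{aligned}
$$

Each of these is a sum over events of a function of a single frequency, which is what makes it divisible. The Simpson index appears in several conventions (for instance one minus the sum shown); the form above is only one of them.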

In many applications, however, the classical multinomial model is not appropriate, and an extension to new models becomes necessary. These new models, called "non-classical" multinomial models, consider the case where N increases and the probabilities {Pni} change as the sample size n increases. As we will see, in these non-classical models both the asymptotic normality of the normalized frequencies and the asymptotic equivalence of many divisible statistics to the Chi-square statistic are lost, so the limit theorems established in the classical model are no longer valid.
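One way to see why normality can fail is the following sketch. It assumes, purely for illustration and not as a condition stated in the text, that for some event $i$ the expected frequency stays bounded, $n p_{ni} \to \lambda_i \in (0, \infty)$, where $p_{ni}$ stands for the probabilities written {Pni} above and $\nu_{ni}$ for the corresponding frequency. Since $\nu_{ni}$ is Binomial$(n, p_{ni})$, the Poisson limit theorem gives

$$
\nu_{ni} \xrightarrow{\;d\;} \xi_i \sim \mathrm{Poisson}(\lambda_i),
\qquad
\frac{\nu_{ni} - n p_{ni}}{\sqrt{n p_{ni}}} \xrightarrow{\;d\;} \frac{\xi_i - \lambda_i}{\sqrt{\lambda_i}},
$$

and the limit on the right is a centered and scaled Poisson variable, not a normal one. Frequencies with $n p_{ni} \to \infty$ can still behave in a Gaussian way, so in general both kinds of frequencies may be present in the same model.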

The extension to non-classical models not only meets the demands of many real-world applications, but also opens a new research area in statistical analysis that has not yet been thoroughly investigated. Although some results on the limit distributions of divisible statistics in non-classical models have been obtained, e.g., Holst (1972); Morris (1975); Ivchenko and Levin (1976); Ivchenko and Medvedev (1979), they are far from complete. Another, more advanced approach, introduced by Khmaladze (1984), uses modern martingale theory to establish functional limit theorems for the partial sum processes of divisible statistics, although it has not yet attracted much attention from applied statisticians. In the main part of this thesis, we show that this martingale approach can be extended to more general situations in which both Gaussian and Poissonian frequencies exist, and we further discuss the properties and applications of the limiting processes, especially in constructing distribution-free statistics.

The last part of the thesis concerns the statistical analysis of a large number of rare events (LNRE), an important class of non-classical multinomial models that arises in numerous applications. In LNRE models, most of the frequencies are very small, and it is not immediately clear how consistent and reliable inference can be achieved. Based on the definitions and key concepts first introduced by Khmaladze (1988), we discuss a particular model in the context of the diversity of questionnaires. The advanced statistical techniques used in establishing the limit theorems, such as large deviations, contiguity and Edgeworth expansions, underpin the potential of LNRE theory to become a fruitful research area in the future.