This festschrift includes papers authored by many collaborators, colleagues, and students of Professor Thomas P. Hettmansperger, whose research in nonparametric statistics, rank statistics, robustness, and mixture models spanned a career of nearly 40 years. It is a broad sample of peer-reviewed, cutting-edge research related to nonparametrics and mixture models.
https://doi.org/10.1142/9789814340564_fmatter
https://doi.org/10.1142/9789814340564_0001
In this paper we present new estimators of location and scale parameters constructed from kernel estimators of distribution functionals. We investigate the asymptotic behavior of the estimators under different sets of weak moment conditions, and we devise a simple method to obtain the optimal bandwidth for the kernel function. The theoretical results are supported by a simulation study of the estimators' small-sample performance.
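As a rough illustration of the general approach (not the authors' specific estimators), a kernel-smoothed distribution function can be inverted to give plug-in location and scale estimates; the Gaussian kernel, the rule-of-thumb bandwidth, and the quantile-based functionals below are all illustrative choices.

    import numpy as np
    from scipy.stats import norm
    from scipy.optimize import brentq

    def kernel_cdf(t, data, h):
        # Kernel-smoothed empirical distribution function with a Gaussian kernel.
        return norm.cdf((t - data) / h).mean()

    def smoothed_quantile(p, data, h):
        # Invert the smoothed CDF numerically to obtain a quantile.
        lo, hi = data.min() - 10 * h, data.max() + 10 * h
        return brentq(lambda t: kernel_cdf(t, data, h) - p, lo, hi)

    rng = np.random.default_rng(0)
    data = rng.standard_normal(200)
    h = 1.06 * data.std(ddof=1) * len(data) ** (-1 / 5)   # illustrative rule-of-thumb bandwidth

    location = smoothed_quantile(0.5, data, h)            # smoothed median as a location estimate
    scale = (smoothed_quantile(0.75, data, h)
             - smoothed_quantile(0.25, data, h)) / 1.349  # IQR-based scale, calibrated to the normal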
https://doi.org/10.1142/9789814340564_0002
This article extends the nonparametric EM (npEM) algorithm of Benaglia et al. (2009a) and describes a method to select the bandwidths used in this extended algorithm. The extension allows for a distinct bandwidth for each mixture component and each block of conditionally independent and identically distributed repeated measures, and the bandwidth selection method is a generalization of Silverman's rule of thumb.
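For orientation, Silverman's rule of thumb for a single univariate sample is h = 0.9 min(sd, IQR/1.34) n^(-1/5). A hedged sketch of applying such a rule separately within each (block, component) cell, with posterior probabilities standing in for hard component assignments, might look as follows; the weighting scheme is only illustrative and is not the paper's exact generalization.

    import numpy as np

    def weighted_silverman_bw(x, w):
        # Silverman-type rule of thumb with observation weights w (illustrative).
        w = w / w.sum()
        n_eff = 1.0 / np.sum(w ** 2)                    # effective sample size under the weights
        mu = np.sum(w * x)
        sd = np.sqrt(np.sum(w * (x - mu) ** 2))
        q75, q25 = np.percentile(x, [75, 25])           # unweighted IQR, kept simple here
        spread = min(sd, (q75 - q25) / 1.34)
        return 0.9 * spread * n_eff ** (-1 / 5)

    def blockwise_bandwidths(x_blocks, posterior):
        # One bandwidth per (block, component): x_blocks[b] holds one measurement per
        # individual for block b, posterior[:, j] the weight of each individual in component j.
        return [[weighted_silverman_bw(xb, posterior[:, j])
                 for j in range(posterior.shape[1])] for xb in x_blocks]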
https://doi.org/10.1142/9789814340564_0003
In statistical shape analysis, as well as in several other application fields, researchers have to cope with a large number of variables and few available specimens. With reference to parametric inferential methods based on multivariate normality, the most natural way to compare two mean shapes is Hotelling's T² test. Despite its widespread use, this test may not be appropriate unless a large number of specimens is available (Dryden and Mardia, 1998; Blair et al., 1994). For these reasons, we propose a nonparametric permutation counterpart, stressing the case in which the number of variables is larger than the permutation sample space, and find it very powerful. On the basis of these results, we perform a simulation study to evaluate the power of multivariate permutation tests based on Pesarin's combining functions (Pesarin, 2001). In particular, we show that the power of the suggested tests increases as the number of processed variables increases, provided that the induced noncentrality parameter δ increases, and this result holds even when the number of variables is larger than the permutation sample space. Such findings are particularly relevant for analysing multivariate small-sample problems in which shapes are the object of study.
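A minimal sketch of the nonparametric combination idea (Fisher's combining function is one of the combining functions studied by Pesarin): compute a partial permutation p-value for each variable and combine them, re-combining within every permutation to obtain the null distribution of the combined statistic. The partial statistic below (absolute difference in means) is only a placeholder.

    import numpy as np

    def npc_fisher_test(x, y, n_perm=1000, seed=0):
        # Two-sample multivariate permutation test with Fisher's combining function
        # psi(p_1, ..., p_k) = -2 * sum(log p_j) applied to partial p-values.
        rng = np.random.default_rng(seed)
        z, n1 = np.vstack([x, y]), len(x)

        def partial(idx):
            return np.abs(z[idx[:n1]].mean(axis=0) - z[idx[n1:]].mean(axis=0))

        obs = partial(np.arange(len(z)))
        perms = np.array([partial(rng.permutation(len(z))) for _ in range(n_perm)])

        # Partial p-values for the observed data and for every permutation.
        p_obs = ((perms >= obs).mean(axis=0) * n_perm + 1) / (n_perm + 1)
        p_perm = ((perms[:, None, :] >= perms[None, :, :]).mean(axis=0) * n_perm + 1) / (n_perm + 1)

        t_obs = -2 * np.log(p_obs).sum()
        t_null = -2 * np.log(p_perm).sum(axis=1)
        return (t_null >= t_obs).mean()          # combined permutation p-value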
https://doi.org/10.1142/9789814340564_0004
In many forms of penalty function smoothing, the choice of smoothing parameter is critical. Existing procedures for this choice include cross-validation or variations on likelihood methods. An alternative is introduced here, based on an intuitive nonparametric approach. For the linear spline form of penalty function, a Cramér-von Mises test statistic for the amount of smoothing is equal to the ratio of the least-squares and roughness components of the penalty formulation. Thus, constraining this test statistic to equal a central value of the Cramér-von Mises distribution provides a rational method for choosing the smoothing parameter. The necessary computations can be carried out by a convergent fixed-point iteration, along with O(n) equation-solving techniques facilitated by a Cholesky-like decomposition. An example is used to illustrate the method, and simulations show that it compares favorably with cross-validation.
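In generic notation (assumed here, not taken from the chapter), the rule amounts to a one-dimensional equation in the smoothing parameter: fit the penalized criterion for a given λ, then adjust λ until the ratio of its two components equals a central value, such as the median, of the Cramér-von Mises distribution.

    \hat g_\lambda = \arg\min_{g}\;
      \underbrace{\sum_{i=1}^{n}\bigl(y_i - g(x_i)\bigr)^{2}}_{\mathrm{LS}(\lambda)}
      \;+\; \lambda\,\underbrace{\int \bigl(g'(x)\bigr)^{2}\,dx}_{\mathrm{R}(\lambda)},
    \qquad
    \text{choose } \lambda \text{ so that } T(\lambda)=\frac{\mathrm{LS}(\lambda)}{\mathrm{R}(\lambda)} = q^{\mathrm{CvM}}_{0.5}.

The fixed-point iteration described in the chapter solves this equation in λ.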
https://doi.org/10.1142/9789814340564_0005
We consider statistical models that have been proposed for luminosity distributions for the globular clusters in the Milky Way and M31. Although earlier research showed that the cluster luminosity functions in those two galaxies were well fit by Gaussian distributions, subsequent investigations suggested that their luminosities were better fit by t-, rather than Gaussian, distributions. By applying the Bayesian Information Criterion, we do not find overwhelming statistical evidence that the t-distribution is superior to the Gaussian distribution as a model of luminosity distribution for the Milky Way. In the case of M31, we find moderate evidence that the Gaussian distribution is superior to the t-distribution. In neither case do we find strong evidence to support the use of one distribution over the other as a statistical model for the luminosities of the globular clusters in the Milky Way and M31. Consequently, we recommend that the Gaussian be retained as the statistical model for luminosity distribution. Moreover, we urge caution in the use of the Kolmogorov-Smirnov statistic to justify the choice of statistical models for globular cluster luminosity functions.
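A minimal sketch of the kind of comparison involved (simulated magnitudes, not the chapter's data): fit Gaussian and Student-t models by maximum likelihood and compare BIC = k ln n − 2 ln L̂, where smaller values are better and differences of roughly 2-6 count as positive evidence on the usual scale.

    import numpy as np
    from scipy import stats

    def bic(loglik, k, n):
        # BIC = k*ln(n) - 2*ln(L_hat); smaller values indicate the preferred model.
        return k * np.log(n) - 2 * loglik

    rng = np.random.default_rng(1)
    mags = rng.standard_t(df=5, size=300) * 1.2 - 7.5          # stand-in for cluster magnitudes

    mu, sd = stats.norm.fit(mags)                              # Gaussian: 2 free parameters
    bic_norm = bic(stats.norm.logpdf(mags, mu, sd).sum(), 2, len(mags))

    df, loc, scale = stats.t.fit(mags)                         # Student t: 3 free parameters
    bic_t = bic(stats.t.logpdf(mags, df, loc, scale).sum(), 3, len(mags))

    print(bic_norm - bic_t)                                    # positive values favour the t model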
https://doi.org/10.1142/9789814340564_0006
We consider an improved density estimator which arises from treating the kernel density estimator as an element of the model consisting of all mixtures of the kernel, continuous or discrete. One can obtain the kernel density estimator by "likelihood-tuning", using the uniform density as the starting value in an EM algorithm. A second tuning step leads to a fitted density with higher likelihood than the kernel density estimator. The two-step EM estimator can be written explicitly with a Gaussian kernel, and its bias is one order of magnitude smaller than that of the kernel estimator. In addition, its variance stays of the same order, so that the asymptotic mean square error can be reduced significantly. Compared with other modified density estimators, simulation results show that the two-step likelihood-tuned density estimator performs robustly across different types of true density.
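A hedged sketch of the tuning idea: treat the estimator as a discrete mixture over the data points, sum_j w_j K_h(x − x_j). The uniform weights w_j = 1/n reproduce the ordinary kernel density estimator (one EM step from the uniform start), and one further EM step re-tunes the weights and raises the likelihood. The Gaussian kernel and rule-of-thumb bandwidth are illustrative.

    import numpy as np
    from scipy.stats import norm

    def em_tuned_kde(data, h, extra_steps=1):
        # Mixture over the data points: f(x) = sum_j w_j * K_h(x - x_j).
        # w_j = 1/n gives the ordinary KDE; each EM step re-tunes the weights.
        n = len(data)
        w = np.full(n, 1.0 / n)
        K = norm.pdf((data[:, None] - data[None, :]) / h) / h      # K[i, j] = K_h(x_i - x_j)
        for _ in range(extra_steps):
            resp = w[None, :] * K                                  # E-step: responsibilities (unnormalized)
            resp /= resp.sum(axis=1, keepdims=True)
            w = resp.mean(axis=0)                                  # M-step: new mixing weights
        return lambda x: (w * norm.pdf((x[:, None] - data[None, :]) / h) / h).sum(axis=1)

    x = np.random.default_rng(2).standard_normal(300)
    h = 1.06 * x.std(ddof=1) * len(x) ** (-1 / 5)
    f_hat = em_tuned_kde(x, h, extra_steps=1)                      # one extra tuning step beyond the KDE
    density = f_hat(np.linspace(-3, 3, 100))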
https://doi.org/10.1142/9789814340564_0007
We review shock models for defaults, both in the standard and in the urn-based approach. Shock models are motivated by engineering problems where a material can break because of stress, but they can also be used effectively in other fields, such as economics and biology.
Standard shock models are parametric, whereas the urn-based shock models are nonparametric, assuming as little as possible for the prediction of defaults. First, we mention some results for the two models, in particular describing the finite and asymptotic behavior of the time to failure or of the probability of default. Finally, we present an application of the second model to defaults of Italian firms, comparing our results with a standard prediction model from economics, the Z-score. We note that our model predicts the default behavior of these firms better.
https://doi.org/10.1142/9789814340564_0008
This paper explores additional properties of an inverse propensity score weighted kernel density estimator for estimating the density of incomplete data. This estimator is based on the Horvitz-Thompson estimator and requires estimating the propensity score assuming the response variable is missing at random. Nonparametric methods are used to estimate the propensity scores. Implications of misspecifying the missing data mechanism on the performance of the density estimator are discussed and evaluated. In addition, an augmented inverse propensity score weighted kernel density estimator, which is not influenced by this misspecification, is proposed and evaluated.
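A minimal sketch of a Horvitz-Thompson-type weighted kernel density estimate under missing at random, with the propensity estimated nonparametrically by smoothing the response indicator on an always-observed covariate; the variable names, bandwidths, and truncation level are illustrative.

    import numpy as np
    from scipy.stats import norm

    def nw_propensity(x_all, r, x_eval, bw):
        # Nadaraya-Watson estimate of P(R = 1 | X = x): kernel-smoothed response indicator.
        w = norm.pdf((x_eval[:, None] - x_all[None, :]) / bw)
        return (w * r).sum(axis=1) / w.sum(axis=1)

    def ipw_kde(y_obs, x_all, r, grid, h, bw):
        # Inverse-propensity-weighted KDE: each observed Y_i gets weight 1/pi_hat(X_i),
        # and the sum is divided by the full sample size n (Horvitz-Thompson form).
        pi_hat = np.clip(nw_propensity(x_all, r, x_all[r], bw), 0.05, 1.0)
        kern = norm.pdf((grid[:, None] - y_obs[None, :]) / h) / h
        return (kern / pi_hat).sum(axis=1) / len(r)

    rng = np.random.default_rng(3)
    n = 500
    x = rng.standard_normal(n)
    y = x + rng.standard_normal(n)
    r = rng.random(n) < 1 / (1 + np.exp(-x))              # missingness depends on X only (MAR)
    grid = np.linspace(-4, 4, 200)
    f_hat = ipw_kde(y[r], x, r, grid, h=0.4, bw=0.4)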
https://doi.org/10.1142/9789814340564_0009
The likelihood ratio test for m-sample homogeneity of covariance is notoriously sensitive to violations of the Gaussian assumptions. Its asymptotic behavior under non-Gaussian densities has been the subject of an abundant literature. In a recent paper, Yanagihara et al. (2005) show that the asymptotic distribution of the likelihood ratio test statistic, under arbitrary elliptical densities with finite fourth-order moments, is that of a linear combination of two mutually independent chi-square variables. Their proof is based on characteristic function methods, and only allows for convergence in distribution conclusions. Moreover, they require homokurticity among the m populations. Exploiting the findings of Hallin and Paindaveine (2009), we reinforce that convergence-in-distribution result into a convergence-in-probability one—that is, we explicitly decompose the likelihood ratio test statistic into a linear combination of two variables that are asymptotically independent chi-square—and moreover extend it to the heterokurtic case.
https://doi.org/10.1142/9789814340564_0010
Motivated by applications in microwave engineering and diffusion tensor imaging, we study the problem of deconvolution density estimation on the space of positive definite symmetric matrices. We develop a nonparametric estimator for the density function of a random sample of positive definite matrices. Our estimator is based on the Helgason-Fourier transform and its inversion, the natural tools for analysis of compositions of random positive definite matrices. Under several smoothness conditions on the density of the intrinsic error in the random sample, we derive upper bounds on the rates of convergence of our nonparametric estimator to the true density.
https://doi.org/10.1142/9789814340564_0011
We propose a variant of historical functional linear models for cases where the current response is affected by the predictor process only within a window into the past. In contrast to the rectangular support of functional linear models, the triangular support of historical functional linear models, and the pointwise support of varying coefficient models, the current model has a sliding-window support into the past. This idea leads to models that bridge the gap between varying coefficient models and historical functional linear models. By utilizing one-dimensional basis expansions and one-dimensional smoothing procedures, the proposed estimation algorithm is shown to perform better and run faster than the estimation procedures proposed for historical functional linear models.
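In symbols (notation assumed here, not taken from the chapter), with a window of width Δ the model takes the form below, tending toward a varying coefficient model as Δ shrinks and toward the historical functional linear model as the window stretches back to the start of the domain:

    E\bigl[\,Y(t)\mid X\,\bigr] \;=\; \beta_0(t) \;+\; \int_{t-\Delta}^{t} \beta(t,s)\,X(s)\,ds,
    \qquad t\in[\Delta,\,T].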
https://doi.org/10.1142/9789814340564_0012
Rank-based methods for linear models with independent and identically distributed data have been developed over the past 30 years. However, little work has been done in the area of mixed models. In this paper, we discuss a transformation approach to modeling a particular mixed model: one with an arbitrary number of fixed effects and covariates but only one random effect. Discussion of the asymptotic theory is given and the results of a simulation study verify the theory. These models are used to estimate the fixed effects from an experiment that uses a randomized block design.
https://doi.org/10.1142/9789814340564_0013
Analogous to univariate QQ plots, multivariate QQ plots can be used to compare two multivariate distributions (empirical or theoretical) by matching a set of quantiles in one distribution with the corresponding set in the other. We base our plots on Chaudhuri's spatial quantiles, which are vectors of the same dimension as the observations. The QQ plots consist of arrows pointing from one distribution's quantiles to the other's. In two dimensions, the arrows can be plotted directly. In higher dimensions, we look at projections. Principal component-like projections are used to find directions in which the two distributions are most different.
We focus on assessing how well certain symmetry models, including spherical symmetry, fit multivariate samples. The QQ plots compare the spatial quantiles of the data to those of a symmetrized version of the data. A randomization technique aids in choosing the number of dimensions that show deviations from the symmetry of interest.
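A hedged numerical sketch of Chaudhuri's spatial quantile, defined for an index vector u in the open unit ball as the minimizer of sum_i ( ||x_i − q|| + <u, x_i − q> ); u = 0 gives the spatial median. The arrows of a two-dimensional QQ plot then join Q_X(u) to Q_Y(u) over a grid of u values. The optimizer and the grid below are illustrative.

    import numpy as np
    from scipy.optimize import minimize

    def spatial_quantile(x, u):
        # Chaudhuri's spatial quantile: argmin_q sum_i ( ||x_i - q|| + <u, x_i - q> ), ||u|| < 1.
        def objective(q):
            diff = x - q
            return np.linalg.norm(diff, axis=1).sum() + (diff @ u).sum()
        return minimize(objective, x.mean(axis=0), method="Nelder-Mead").x

    rng = np.random.default_rng(4)
    x = rng.standard_normal((200, 2))
    y = rng.standard_normal((200, 2)) @ np.array([[1.0, 0.5], [0.0, 1.0]])

    # Arrow tails and heads of a 2-D spatial QQ plot over a small grid of u vectors.
    us = [r * np.array([np.cos(a), np.sin(a)])
          for r in (0.3, 0.6) for a in np.linspace(0, 2 * np.pi, 8, endpoint=False)]
    arrows = [(spatial_quantile(x, u), spatial_quantile(y, u)) for u in us]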
https://doi.org/10.1142/9789814340564_0014
This paper studies variance estimators of cross-validation estimators of the generalization error. Three estimators are discussed, and their performance is evaluated in a variety of data models and data sizes. It is shown that the standard error associated with the moment approximation estimator is smaller than that associated with the other two. The effect of training and test set size on these estimators is discussed.
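For context, the quantity whose variance is at issue is the usual K-fold cross-validation estimate of generalization error. The sketch below computes that estimate together with the naive standard error that treats fold-level errors as independent, shown only as a baseline rather than as one of the three estimators studied in the chapter.

    import numpy as np

    def cv_error_and_naive_se(fit, err, x, y, k=10, seed=0):
        # K-fold CV estimate of generalization error plus the naive standard error
        # (fold errors treated as independent, which they are not in general).
        idx = np.random.default_rng(seed).permutation(len(y))
        fold_errs = []
        for f in np.array_split(idx, k):
            train = np.setdiff1d(idx, f)
            fold_errs.append(err(fit(x[train], y[train]), x[f], y[f]))
        fold_errs = np.asarray(fold_errs)
        return fold_errs.mean(), fold_errs.std(ddof=1) / np.sqrt(k)

    x = np.linspace(0, 1, 100)
    y = 2 * x + np.random.default_rng(5).normal(0, 0.3, size=100)
    cv_err, naive_se = cv_error_and_naive_se(
        lambda a, b: np.polyfit(a, b, 1),                          # fit: least-squares line
        lambda c, a, b: np.mean((np.polyval(c, a) - b) ** 2),      # err: test mean squared error
        x, y)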
https://doi.org/10.1142/9789814340564_0015
In the estimation of distributions from time-to-event data, it is often natural to impose shape and smoothness constraints on the hazard function. Systems that fail because of wear-out might be assumed to have a monotone, or perhaps monotone-convex, hazard. Organ transplant failures are often assumed to have a convex or bathtub-shaped hazard function. In this paper we present estimates that maximize the likelihood over a set of shape-restricted regression splines. Right censoring is accommodated as a simple extension. The methods are applied to real and simulated data sets to illustrate their properties and to compare them with existing nonparametric estimators.
https://doi.org/10.1142/9789814340564_0016
Several extensions of the multivariate normal model have been shown to be useful in practical data analysis. Therefore, tools to identify which model might be appropriate for the analysis of a real data set are needed. This paper suggests the simultaneous use of two location and two scatter functionals to obtain multivariate descriptive measures for multivariate location, scatter, skewness and kurtosis, and shows how these measures can be used to distinguish among a wide range of models that extend the multivariate normal model. The method is demonstrated with examples on simulated and real data.
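One commonly used pair of scatter functionals (not necessarily the chapter's choice) is the ordinary covariance matrix together with a fourth-moment based scatter matrix; the eigenvalues of S1^{-1} S2 then serve as multivariate kurtosis measures, all equal to one under multivariate normality. A hedged sketch:

    import numpy as np

    def cov4(x):
        # Fourth-moment based scatter: weight each centered observation by its squared
        # Mahalanobis distance; equals the covariance matrix under multivariate normality.
        n, p = x.shape
        xc = x - x.mean(axis=0)
        s_inv = np.linalg.inv(np.cov(x, rowvar=False))
        d2 = np.einsum("ij,jk,ik->i", xc, s_inv, xc)
        return (xc * d2[:, None]).T @ xc / (n * (p + 2))

    def kurtosis_measures(x):
        # Eigenvalues of S1^{-1} S2 with S1 = cov, S2 = cov4; departures from 1 flag
        # heavier or lighter tails, or mixture structure, in particular directions.
        s1 = np.cov(x, rowvar=False)
        return np.linalg.eigvals(np.linalg.solve(s1, cov4(x))).real

    vals = kurtosis_measures(np.random.default_rng(7).standard_normal((500, 3)))  # near 1 under normality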
https://doi.org/10.1142/9789814340564_0017
In this paper we provide insight into the empirical properties of indirect cross-validation (ICV), a new method of bandwidth selection for kernel density estimators. First, we describe the method and report on the theoretical results used to develop a practical-purpose model for certain ICV parameters. Next, we provide a detailed description of a numerical study that shows that the ICV method usually outperforms least squares cross-validation (LSCV) in finite samples. One of the major advantages of ICV is its increased stability compared to LSCV. Two real data examples show the benefit of using both ICV and a local version of ICV.
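For reference, the least squares cross-validation criterion that ICV is compared against can be written in closed form for a Gaussian kernel; the sketch below minimizes LSCV over a bandwidth grid (ICV itself replaces the Gaussian with a special selection kernel and rescales the resulting bandwidth, which is not shown).

    import numpy as np
    from scipy.stats import norm

    def lscv(h, x):
        # LSCV(h) = integral of f_hat^2 minus twice the mean leave-one-out density at the data.
        n = len(x)
        d = x[:, None] - x[None, :]
        int_f2 = norm.pdf(d, scale=h * np.sqrt(2)).sum() / n ** 2       # exact for the Gaussian kernel
        loo = (norm.pdf(d, scale=h).sum(axis=1) - norm.pdf(0, scale=h)) / (n - 1)
        return int_f2 - 2 * loo.mean()

    x = np.random.default_rng(6).standard_normal(400)
    grid = np.geomspace(0.05, 1.0, 60)
    h_lscv = grid[np.argmin([lscv(h, x) for h in grid])]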
https://doi.org/10.1142/9789814340564_0018
Individuals with matching true scores on two testing occasions are defined as stable; individuals are unstable otherwise. Classical reliability theory assumes all individuals are stable. A proposed model assumes there is a mixed population of stable and unstable individuals. The estimated probability that individual i = 1, 2, …, n is stable is used to weight i's test scores, forming a weighted correlation coefficient r_w, which is the reliability under the model.
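A minimal sketch of the weighted correlation: with p_i the estimated probability that individual i is stable, the paired test scores are weighted by p_i and a weighted Pearson correlation is computed (the estimation of the p_i themselves, via the mixture model, is not shown).

    import numpy as np

    def weighted_corr(x, y, p):
        # Weighted Pearson correlation r_w: pair (x_i, y_i) contributes with weight p_i.
        w = p / p.sum()
        mx, my = np.sum(w * x), np.sum(w * y)
        cov = np.sum(w * (x - mx) * (y - my))
        return cov / np.sqrt(np.sum(w * (x - mx) ** 2) * np.sum(w * (y - my) ** 2))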
https://doi.org/10.1142/9789814340564_0019
Rank regression offers a valuable alternative to the classical least squares approach. The use of rank regression not only provides protection against outlier contamination but also leads to substantial efficiency gain in the presence of heavier-tailed errors. This article studies the asymptotic performance of rank regression with Wilcoxon scores when the regression function is possibly mis-specified. We establish that under general conditions, the Wilcoxon rank regression estimator converges in probability to a well-defined limit and has an asymptotic normal distribution. We also derive a formula for the bias of omitted variables. Besides furthering our understanding of the properties of rank regression, these theoretical results have important implications for developing rank-based model selection and model checking procedures.
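For concreteness, the Wilcoxon rank regression slope estimate minimizes Jaeckel's dispersion function with Wilcoxon scores a(i) = sqrt(12) (i/(n+1) − 1/2). A hedged numerical sketch follows, with a general-purpose optimizer standing in for the specialized algorithms used in practice (e.g., in the Rfit package) and the intercept taken as the median residual.

    import numpy as np
    from scipy.optimize import minimize

    def wilcoxon_dispersion(beta, x, y):
        # Jaeckel's dispersion: sum_i a(R(e_i)) * e_i with Wilcoxon scores.
        e = y - x @ beta
        ranks = np.argsort(np.argsort(e)) + 1
        scores = np.sqrt(12) * (ranks / (len(e) + 1) - 0.5)
        return np.sum(scores * e)

    def rank_fit(x, y):
        # x: design matrix without an intercept column.
        # Slopes from minimizing the dispersion; intercept from the median residual.
        beta0 = np.linalg.lstsq(x, y, rcond=None)[0]
        beta = minimize(wilcoxon_dispersion, beta0, args=(x, y), method="Nelder-Mead").x
        return np.median(y - x @ beta), beta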
https://doi.org/10.1142/9789814340564_0020
Variable selection via penalized likelihood has received considerable attention recently. Penalized likelihood estimators with properly chosen penalty functions possess desirable properties. In practice, optimizing the penalized likelihood function is often challenging because the objective function may be nondifferentiable and/or nonconcave. Existing algorithms such as the local quadratic approximation (LQA) algorithm share with backward selection the drawback that once a variable is deleted, it is essentially excluded from the final model. We propose the iterative conditional maximization (ICM) algorithm to address this drawback. It utilizes the characteristics of the nonconcave penalized likelihood and enjoys fast convergence. Three simulation studies, in linear, logistic, and Poisson regression, together with one real data analysis, are conducted to assess the performance of the ICM algorithm.
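For orientation only (this is not the ICM algorithm itself), the SCAD penalty of Fan and Li is a typical nonconcave penalty behind such methods; the sketch below pairs it with a generic coordinate-wise loop that minimizes a SCAD-penalized least-squares criterion one coefficient at a time, holding the others fixed.

    import numpy as np
    from scipy.optimize import minimize_scalar

    def scad(theta, lam, a=3.7):
        # SCAD penalty: linear up to lam, quadratic spline up to a*lam, constant beyond.
        t = np.abs(theta)
        return np.where(t <= lam, lam * t,
               np.where(t <= a * lam,
                        (2 * a * lam * t - t ** 2 - lam ** 2) / (2 * (a - 1)),
                        lam ** 2 * (a + 1) / 2))

    def coordinatewise_penalized_ls(x, y, lam, n_sweeps=50):
        # Update one coefficient at a time against the SCAD-penalized least-squares criterion.
        n, p = x.shape
        beta = np.zeros(p)
        for _ in range(n_sweeps):
            for j in range(p):
                def obj(b):
                    bb = beta.copy()
                    bb[j] = b
                    return 0.5 * np.sum((y - x @ bb) ** 2) / n + scad(b, lam)
                beta[j] = minimize_scalar(obj, bounds=(-10, 10), method="bounded").x
        return beta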
https://doi.org/10.1142/9789814340564_bmatter
AUTHOR INDEX