The permutation distribution of matrix correlation statistics
Many statistics used to test for association between pairs (yi, zi) of multivariate observations, sampled from n individuals in a population, are based on comparing the similarity aij of each pair (i, j) of individuals, as evidenced by the values yi and yj, with their similarity bij based on the values zi and zj. A common strategy is to compute the sample correlation between these two sets of values. The appropriate null distribution is the one obtained by permuting the zi's at random among the individuals while keeping the yi's fixed. In this paper, a Berry–Esseen bound for the normal approximation to this null distribution is derived, which is useful even when the matrices a and b are relatively sparse, as is the case in many applications. The proofs are based on constructing a suitable exchangeable pair, a technique at the heart of Stein's method.
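As an illustration of the permutation null distribution described above, the following is a minimal Monte Carlo sketch, not the method of the paper. It assumes square similarity matrices a and b with non-constant off-diagonal entries. The function names (similarity_matrix, offdiag_correlation, permutation_test) and the similarity choice (negative absolute difference of scalar values) are hypothetical, introduced only for this example. Permuting the zi's among individuals is represented by replacing bij with b[perm[i]][perm[j]].

```python
import random
from math import sqrt


def similarity_matrix(values):
    """Hypothetical similarity: negative absolute difference of scalar values."""
    return [[-abs(u - v) for v in values] for u in values]


def offdiag_correlation(a, b, perm=None):
    """Sample correlation between a[i][j] and b[perm[i]][perm[j]] over all i != j."""
    n = len(a)
    if perm is None:
        perm = list(range(n))
    xs, ys = [], []
    for i in range(n):
        for j in range(n):
            if i != j:
                xs.append(a[i][j])
                ys.append(b[perm[i]][perm[j]])
    m = len(xs)
    mean_x, mean_y = sum(xs) / m, sum(ys) / m
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # Assumes neither matrix is constant off the diagonal (sxx, syy > 0).
    return sxy / sqrt(sxx * syy)


def permutation_test(a, b, n_perm=999, seed=0):
    """Return the observed correlation and a one-sided Monte Carlo p-value."""
    rng = random.Random(seed)
    n = len(a)
    observed = offdiag_correlation(a, b)
    at_least_as_large = 0
    for _ in range(n_perm):
        perm = list(range(n))
        rng.shuffle(perm)  # relabel the individuals carrying the z values
        if offdiag_correlation(a, b, perm) >= observed:
            at_least_as_large += 1
    # Count the observed arrangement itself so the p-value is never zero.
    p_value = (at_least_as_large + 1) / (n_perm + 1)
    return observed, p_value


if __name__ == "__main__":
    y = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    z = [0.1, 0.9, 2.2, 2.8, 4.1, 5.3]
    r, p = permutation_test(similarity_matrix(y), similarity_matrix(z))
    print(f"observed correlation = {r:.3f}, permutation p-value = {p:.3f}")
```

Random sampling of permutations is used here because enumerating all n! of them is feasible only for very small n. A normal approximation to the same null distribution, of the kind whose accuracy a Berry–Esseen bound quantifies, avoids the resampling altogether.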