One-sample U-statistics can be regarded as generalizations of means. They are sums of dependent variables, but we show them to be asymptotically normal by the projection method. Certain interesting test statistics, such as the Wilcoxon statistics and Kendall's $\tau$-statistic, are one-sample U-statistics. The Wilcoxon statistic for testing a difference in location between two samples is an example of a two-sample U-statistic. The Cramér-von Mises statistic is an example of a degenerate U-statistic.
One-Sample U-Statistics
Let $X_1, \ldots, X_n$ be a random sample from an unknown distribution. Given a known function $h$, consider estimation of the "parameter"

$$\theta = \mathrm{E}\, h(X_1, \ldots, X_r).$$
In order to simplify the formulas, it is assumed throughout this section that the function $h$ is permutation symmetric in its $r$ arguments. (A given $h$ could always be replaced by a symmetric one.) The statistic $h(X_1, \ldots, X_r)$ is an unbiased estimator for $\theta$, but it is unnatural, as it uses only the first $r$ observations. A U-statistic with kernel $h$ remedies this; it is defined as

$$U = \binom{n}{r}^{-1} \sum_{\beta} h(X_{\beta_1}, \ldots, X_{\beta_r}),$$
where the sum is taken over the set of all unordered subsets $\beta$ of $r$ different integers chosen from $\{1, \ldots, n\}$. Because the observations are i.i.d., $U$ is an unbiased estimator for $\theta$ also. Moreover, $U$ is permutation symmetric in $X_1, \ldots, X_n$, and has smaller variance than $h(X_1, \ldots, X_r)$. In fact, if $X_{(1)}, \ldots, X_{(n)}$ denote the values $X_1, \ldots, X_n$ stripped from their order (the order statistics in the case of real-valued variables), then

$$U = \mathrm{E}\bigl(h(X_1, \ldots, X_r) \mid X_{(1)}, \ldots, X_{(n)}\bigr).$$
Because a conditional expectation is a projection, and projecting decreases second moments, the variance of the U-statistic $U$ is smaller than the variance of the naive estimator $h(X_1, \ldots, X_r)$.
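The definition and the variance comparison can be illustrated numerically. The following is a minimal sketch, not part of the original text: the helper `u_statistic`, the kernel `var_kernel`, and the simulation settings (standard normal data, $n = 10$, 2000 replications, fixed seed) are all choices made here for illustration. It uses the degree-2 kernel $h(x, y) = \tfrac12 (x - y)^2$, for which the U-statistic is the usual unbiased sample variance, and then compares the Monte Carlo variance of the naive estimator $h(X_1, X_2)$ with that of $U$.

```python
import itertools
import math
import random
import statistics


def u_statistic(h, xs, r):
    """Average the symmetric kernel h over all size-r subsets of xs."""
    n = len(xs)
    total = sum(h(*subset) for subset in itertools.combinations(xs, r))
    return total / math.comb(n, r)


def var_kernel(x, y):
    # Symmetric kernel of degree 2 with E h(X1, X2) = var(X1).
    return 0.5 * (x - y) ** 2


# With this kernel the U-statistic equals the unbiased sample variance.
data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]
u_value = u_statistic(var_kernel, data, 2)
sample_var = statistics.variance(data)

# Monte Carlo comparison (illustrative settings): the naive estimator
# uses only the first two observations, while U averages over all pairs.
rng = random.Random(0)
naive_draws, u_draws = [], []
for _ in range(2000):
    sample = [rng.gauss(0.0, 1.0) for _ in range(10)]
    naive_draws.append(var_kernel(sample[0], sample[1]))
    u_draws.append(u_statistic(var_kernel, sample, 2))

naive_var = statistics.variance(naive_draws)
u_var = statistics.variance(u_draws)

print("U-statistic vs. sample variance:", u_value, sample_var)
print("variance of naive estimator vs. U:", naive_var, u_var)
```

Enumerating all subsets costs on the order of $\binom{n}{r}$ kernel evaluations, so this direct approach is only reasonable for small $n$ and $r$.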
In this section it is shown that the sequence $\sqrt{n}(U - \theta)$ is asymptotically normal under the condition that $\mathrm{E}\, h^2(X_1, \ldots, X_r) < \infty$.
12.1 Example. A U-statistic of degree $r = 1$ is a mean $n^{-1} \sum_{i=1}^{n} h(X_i)$. The asserted asymptotic normality is then just the central limit theorem.