Abstract
We investigate the problem of testing whether d possibly multivariate random variables, which may or may not be continuous, are jointly (or mutually) independent. Our method builds on ideas of the two-variable Hilbert–Schmidt independence criterion but allows for an arbitrary number of variables. We embed the joint distribution and the product of the marginals in a reproducing kernel Hilbert space and define the d-variable Hilbert–Schmidt independence criterion dHSIC as the squared distance between the embeddings. In the population case, the value of dHSIC is 0 if and only if the d variables are jointly independent, as long as the kernel is characteristic. On the basis of an empirical estimate of dHSIC, we investigate three non-parametric hypothesis tests: a permutation test, a bootstrap analogue and a procedure based on a gamma approximation. We apply non-parametric independence testing to a problem in causal discovery and illustrate the new methods on simulated and real data sets.
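For intuition, the sketch below shows one way such a test could look in practice. It is not the authors' implementation: it uses a plug-in V-statistic estimate of dHSIC with Gaussian kernels and a simple Monte Carlo permutation test, while the paper's bootstrap analogue and gamma approximation are omitted. The function names, the median-heuristic bandwidth and the number of permutations are illustrative assumptions.

```python
import numpy as np

def gaussian_gram(x, bandwidth=None):
    """Gram matrix of a Gaussian kernel; bandwidth defaults to a median heuristic."""
    x = np.atleast_2d(x.T).T                      # shape (n, p), also for 1-D input
    sq_dists = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    if bandwidth is None:
        med = np.median(sq_dists[sq_dists > 0]) if np.any(sq_dists > 0) else 1.0
        bandwidth = np.sqrt(med / 2.0)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def dhsic(grams):
    """Plug-in (V-statistic) estimate of dHSIC from a list of d Gram matrices."""
    term1 = np.mean(np.prod(np.stack(grams), axis=0))        # 1/n^2 sum_ij prod_k K^k_ij
    term2 = np.prod([np.mean(K) for K in grams])              # prod_k 1/n^2 sum_ij K^k_ij
    term3 = 2.0 * np.mean(np.prod([K.mean(axis=1) for K in grams], axis=0))
    return term1 + term2 - term3

def dhsic_permutation_test(samples, n_perm=200, seed=0):
    """Permutation test: permuting each variable separately breaks joint dependence
    while preserving the marginals, which mimics the null hypothesis."""
    rng = np.random.default_rng(seed)
    grams = [gaussian_gram(x) for x in samples]
    n = grams[0].shape[0]
    stat = dhsic(grams)
    exceed = 0
    for _ in range(n_perm):
        permuted = [grams[0]]
        for K in grams[1:]:
            p = rng.permutation(n)
            permuted.append(K[np.ix_(p, p)])
        exceed += dhsic(permuted) >= stat
    return stat, (exceed + 1) / (n_perm + 1)   # Monte Carlo p-value

# Example: three jointly dependent variables (reject), vs. independent noise (do not reject)
rng = np.random.default_rng(1)
z = rng.normal(size=100)
print(dhsic_permutation_test([z + rng.normal(size=100) for _ in range(3)]))
print(dhsic_permutation_test([rng.normal(size=100) for _ in range(3)]))
```

The permutation scheme above keeps the first variable fixed and independently permutes the rows and columns of the remaining Gram matrices, which is one standard way to resample under the null of joint independence.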
| Original language | English |
| --- | --- |
| Journal | Journal of the Royal Statistical Society, Series B (Statistical Methodology) |
| Volume | 80 |
| Issue number | 1 |
| Pages (from-to) | 5-31 |
| Number of pages | 27 |
| ISSN | 1369-7412 |
| DOIs | |
| Publication status | Published - 1 Jan 2018 |
Keywords
- Causal inference
- Independence test
- Kernel methods
- V-statistics