A Kernel Test for Three Variable Interations Dino Sejdinovi1 , Arthur Gretton1 , Wiher Bergsma2 2 1 Gatsby Unit, CSML, University College London Department of Statistis, London Shool of Eonomis NIPS, 07 De 2013 D. Sejdinovi (CSML, UCL) Three-variable tests NIPS, 07 De 2013 1 / 17 Deteting a higher order interation How to detet V-strutures with pairwise weak (or nonexistent) dependene? Y X Z D. Sejdinovi (CSML, UCL) Three-variable tests NIPS, 07 De 2013 2 / 17 Deteting a higher order interation How to detet V-strutures with pairwise weak (or nonexistent) dependene? D. Sejdinovi (CSML, UCL) Three-variable tests NIPS, 07 De 2013 2 / 17 Deteting a higher order interation How to detet V-strutures with pairwise weak (or nonexistent) dependene? X ⊥ ⊥ Y, Y ⊥ ⊥ Z, X X vs Y ⊥ ⊥ Z Y X Y vs Z Z X vs Z XY vs Z i .d . N (0, 1), X , Y i .∼ Z | X , Y ∼ (XY )Exp( √12 ) sign D. Sejdinovi (CSML, UCL) Three-variable tests NIPS, 07 De 2013 2 / 17 Deteting pairwise dependene How to detet dependene in a non-Eulidean / strutured domain? Y X 1 : Honourable senators, I have a question for the Leader of the Government in the Senate with regard to the support funding to farmers that has been announed. Most farmers have not reeived any money yet. Y Xprovinial 2 : No doubt there is great pressure on and muniipal governments in relation to the issue of hild are, but the reality is that there have been no uts to hild are funding from the federal government to the provines. In fat, we have inreased federal investments for early hildhood development. ··· 1 : Honorables sénateurs, ma question s'adresse au leader du gouvernement au Sénat et onerne l'aide naniére qu'on a annonée pour les agriulteurs. La plupart des agriulteurs n'ont enore rien reçu de et argent. 2 : Il est évident que les ordres de gouvernements proviniaux et muniipaux subissent de fortes pressions en e qui onerne les servies de garde, mais le gouvernement n'a pas réduit le nanement qu'il verse aux provines pour les servies de garde. Au ontraire, nous avons augmenté le nanement fédéral pour le développement des jeunes enfants. ? ⇐⇒ ··· Are the Frenh text extrats translations of the English ones? D. Sejdinovi (CSML, UCL) Three-variable tests NIPS, 07 De 2013 3 / 17 Deteting pairwise dependene −→ K= −→ L= D. Sejdinovi (CSML, UCL) Three-variable tests NIPS, 07 De 2013 4 / 17 Deteting pairwise dependene −→ K= hH KH , H LH i = (H KH ◦ H LH )++ H = I − n1 11⊤ (entering matrix) A++ =P P n n −→ L= D. Sejdinovi (CSML, UCL) i =1 Three-variable tests j =1 Aij NIPS, 07 De 2013 4 / 17 Kernel Embedding feature map: z 7→ k (·, z ) ∈ Hk instead of z 7→ (ϕ1 (z ), . . . , ϕs (z )) ∈ Rs hk (·, z ), k (·, w )iHk = k (z , w ) inner produts easily omputed D. Sejdinovi (CSML, UCL) Three-variable tests NIPS, 07 De 2013 5 / 17 Kernel Embedding feature map: z 7→ k (·, z ) ∈ Hk instead of z 7→ (ϕ1 (z ), . . . , ϕs (z )) ∈ Rs hk (·, z ), k (·, w )iHk = k (z , w ) inner produts easily omputed embedding: P 7→ µk (P ) = EZ ∼P k (·, Z ) ∈ Hk instead of P 7→ (Eϕ1 (Z ), . . . , Eϕs (Z )) ∈ Rs hµk (P ), µk (Q )iHk = EZ ∼P ,W ∼Q k (Z , W ) inner produts easily estimated D. Sejdinovi (CSML, UCL) Three-variable tests NIPS, 07 De 2013 5 / 17 Independene test via embeddings Maximum Mean Disrepany (MMD) (Borgwardt et al, 2006; Gretton et al, 2007) MMDk (P , Q ) = kµk (P ) − µk (Q )kHk ISPD : kernels: µk injetive on all signed measures and (Sriperumbudur, 2010) MMDk metri Gaussian, Laplaian, inverse multiquadratis, Matérn et. D. Sejdinovi (CSML, UCL) Three-variable tests NIPS, 07 De 2013 6 / 17 Independene test via embeddings Maximum Mean Disrepany (MMD) (Borgwardt et al, 2006; Gretton et al, 2007) MMDk (P , Q ) = kµk (P ) − µk (Q )kHk ISPD : kernels: µk injetive on all signed measures and (Sriperumbudur, 2010) MMDk metri Gaussian, Laplaian, inverse multiquadratis, Matérn et. Hilbert-Shmidt Independene Criterion (HSIC) Gretton et al (2005, 2008); Smola et al (2007) 2 kµκ (PXY ) − µκ (PX PY )kHκ : k( !" κ( !" k( D. Sejdinovi (CSML, UCL) Three-variable tests , #" #" !" , !" , #" l( ) #" , #" ) )= ) × l( NIPS, 07 De 2013 !" !" , #" 6 / 17 ) Independene test via embeddings Maximum Mean Disrepany (MMD) (Borgwardt et al, 2006; Gretton et al, 2007) MMDk (P , Q ) = kµk (P ) − µk (Q )kHk ISPD : kernels: µk injetive on all signed measures and (Sriperumbudur, 2010) MMDk metri Gaussian, Laplaian, inverse multiquadratis, Matérn et. Hilbert-Shmidt Independene Criterion (HSIC) Gretton et al (2005, 2008); Smola et al (2007) 2 kµκ (PXY ) − µκ (PX PY )kHκ : Empirial HSIC= n12 (H KH ◦ H LH )++ Powerful independene tests that generalize dCov of Szekely et al (2007); DS et al (2013) D. Sejdinovi (CSML, UCL) Three-variable tests k( !" κ( !" k( , #" #" !" , !" , #" l( ) #" , #" ) )= ) × l( NIPS, 07 De 2013 !" !" , #" 6 / 17 ) V-struture Disovery Y X Z Assume X ⊥ ⊥ Y has been established. V-struture an then be deteted by: CI test: H0 : X ⊥ ⊥ Y |Z (Zhang et al 2011) or D. Sejdinovi (CSML, UCL) Three-variable tests NIPS, 07 De 2013 7 / 17 V-struture Disovery Y X Z Assume X ⊥ ⊥ Y has been established. V-struture an then be deteted by: CI test: H0 : X ⊥ ⊥ Y |Z (Zhang et al 2011) or Fatorisation test: H0 : (X , Y ) ⊥ ⊥ Z ∨ (X , Z ) ⊥ ⊥ Y ∨ (Y , Z ) ⊥ ⊥X (multiple standard two-variable tests) ompute p-values for eah of the marginal tests for (Y , Z ) ⊥⊥ X , Y , or (X , Y ) ⊥⊥ Z (X , Z ) ⊥ ⊥ apply Holm-Bonferroni (HB) sequentially rejetive orretion (Holm 1979) D. Sejdinovi (CSML, UCL) Three-variable tests NIPS, 07 De 2013 7 / 17 V-struture Disovery (2) How to detet V-strutures with pairwise weak (or nonexistent) dependene? X ⊥ ⊥ Y, Y ⊥ ⊥ Z, X X1 vs Y1 ⊥ ⊥ Z Y X Y1 vs Z1 Z X1 vs Z1 X1*Y1 vs Z1 i .d . X1 , Y1 i .∼ N (0, 1), Z1 | X1 , Y1 ∼ (X1 Y1 )Exp ( √12 ) sign D. Sejdinovi (CSML, UCL) Three-variable tests NIPS, 07 De 2013 8 / 17 V-struture Disovery (2) How to detet V-strutures with pairwise weak (or nonexistent) dependene? X ⊥ ⊥ Y, Y ⊥ ⊥ Z, X X1 vs Y1 ⊥ ⊥ Z Y X Y1 vs Z1 Z X1 vs Z1 X1*Y1 vs Z1 i .d . X1 , Y1 i .∼ N (0, 1), Z1 | X1 , Y1 ∼ (X1 Y1 )Exp ( √12 ) i .d . N (0, Ip −1 ) X2:p , Y2:p , Z2:p i .∼ sign D. Sejdinovi (CSML, UCL) Three-variable tests NIPS, 07 De 2013 8 / 17 Null aeptane rate (Type II error) V-struture Disovery (3) V-struture disovery: Dataset A 1 0.8 0.6 0.4 2var: Fator 0.2 CI: 0 1 3 5 7 9 11 13 15 X ⊥⊥ Y |Z 17 19 Dimension Figure: CI test for X Y |Z from Zhang et al (2011), and a fatorisation test n = 500 ⊥ ⊥ with a HB orretion, D. Sejdinovi (CSML, UCL) Three-variable tests NIPS, 07 De 2013 9 / 17 Lanaster Interation Measure Denition (Bahadur (1961); Lanaster (1969)) of (X1 , . . . , XD ) ∼ P is a signed measure ∆P that vanishes whenever P an be fatorised in a non-trivial way as a produt of its (possibly multivariate) marginal distributions. Interation measure D. Sejdinovi (CSML, UCL) Three-variable tests NIPS, 07 De 2013 10 / 17 Lanaster Interation Measure Denition (Bahadur (1961); Lanaster (1969)) of (X1 , . . . , XD ) ∼ P is a signed measure ∆P that vanishes whenever P an be fatorised in a non-trivial way as a produt of its (possibly multivariate) marginal distributions. Interation measure D=2: ∆L P = PXY − PX PY D. Sejdinovi (CSML, UCL) Three-variable tests NIPS, 07 De 2013 10 / 17 Lanaster Interation Measure Denition (Bahadur (1961); Lanaster (1969)) of (X1 , . . . , XD ) ∼ P is a signed measure ∆P that vanishes whenever P an be fatorised in a non-trivial way as a produt of its (possibly multivariate) marginal distributions. Interation measure D=2: D=3: ∆L P = PXY − PX PY ∆L P = PXYZ − PX PYZ − PY PXZ − PZ PXY + 2PX PY PZ D. Sejdinovi (CSML, UCL) Three-variable tests NIPS, 07 De 2013 10 / 17 Lanaster Interation Measure Denition (Bahadur (1961); Lanaster (1969)) of (X1 , . . . , XD ) ∼ P is a signed measure ∆P that vanishes whenever P an be fatorised in a non-trivial way as a produt of its (possibly multivariate) marginal distributions. Interation measure D=2: D=3: ∆L P = PXY − PX PY ∆L P = PXYZ − PX PYZ − PY PXZ − PZ PXY + 2PX PY PZ ∆L P = PXY Z −PX PY Z Z D. Sejdinovi (CSML, UCL) Y X +2PX PY PZ −PZ PXY −PY PXZ Y X Z Three-variable tests Y X Z Y X Z NIPS, 07 De 2013 10 / 17 Lanaster Interation Measure Denition (Bahadur (1961); Lanaster (1969)) of (X1 , . . . , XD ) ∼ P is a signed measure ∆P that vanishes whenever P an be fatorised in a non-trivial way as a produt of its (possibly multivariate) marginal distributions. Interation measure D=2: D=3: ∆L P = PXY − PX PY ∆L P = PXYZ − PX PYZ − PY PXZ − PZ PXY + 2PX PY PZ ∆L P = 0 PXY Z −PX PY Z Z D. Sejdinovi (CSML, UCL) Y X +2PX PY PZ −PZ PXY −PY PXZ Y X Z Three-variable tests Y X Z Y X Z NIPS, 07 De 2013 10 / 17 A Test using Lanaster Measure Construt a test by estimating kµκ (∆L P )k2Hκ , where κ = k ⊗ l ⊗ m: kµκ (PXYZ − PXY PZ − · · · )k2Hκ = hµκ PXYZ , µκ PXYZ iHκ − 2 hµκ PXYZ , µκ PXY PZ iHκ · · · D. Sejdinovi (CSML, UCL) Three-variable tests NIPS, 07 De 2013 11 / 17 Inner Produt Estimators ν\ν ′ PXYZ ( PXYZ PXY PZ PXZ PY PYZ PX PX PY PZ K ◦ L ◦ M)++ K ◦ L) M)++ (K ◦ L)++ M++ K ◦ M) L)++ (MKL)++ (K ◦ M)++ L++ M ◦ L) K)++ (KLM)++ (KML)++ (L ◦ M)++ K++ tr (K+ ◦ L+ ◦ M+ ) PXY PZ PXZ PY (( (( PYZ PX (( PX PY PZ Table: D. Sejdinovi (CSML, UCL) V -statisti estimators of hµκ ν, µκ ν ′ iH Three-variable tests KL)++ M++ KM)++ L++ (LM)++ K++ K++ L++ M++ ( ( κ NIPS, 07 De 2013 12 / 17 Inner Produt Estimators ν\ν ′ PXYZ ( PXYZ PXY PZ PXZ PY PYZ PX PX PY PZ K ◦ L ◦ M)++ K ◦ L) M)++ (K ◦ L)++ M++ K ◦ M) L)++ (MKL)++ (K ◦ M)++ L++ M ◦ L) K)++ (KLM)++ (KML)++ (L ◦ M)++ K++ tr (K+ ◦ L+ ◦ M+ ) PXY PZ (( PXZ PY (( PYZ PX (( PX PY PZ Table: V -statisti estimators of hµκ ν, µκ ν ′ iH KL)++ M++ KM)++ L++ (LM)++ K++ K++ L++ M++ ( ( κ Proposition (Lanaster interation statisti) kµκ (∆L P )k2Hκ = 1 n2 (H KH ◦ H LH ◦ H MH )++ . Empirial joint entral moment in the feature spae D. Sejdinovi (CSML, UCL) Three-variable tests NIPS, 07 De 2013 12 / 17 Null aeptane rate (Type II error) Example A: fatorisation tests V-struture disovery: Dataset A 1 0.8 0.6 0.4 2var: Fator ∆L : Fator 0.2 CI: 0 1 3 5 7 9 11 13 15 X ⊥⊥ Y |Z 17 19 Dimension Figure: Fatorisation hypothesis: Lanaster statisti vs. a two-variable based test (both with HB orretion); Test for D. Sejdinovi (CSML, UCL) X ⊥ ⊥ Y |Z Three-variable tests from Zhang et al (2011), n = 500 NIPS, 07 De 2013 13 / 17 Example B: Joint dependene an be easier to detet i .d . N (0, 1) X1 , Y1 i .∼ 2 w .p. 1/3, X1 + ǫ, 2 Z1 = Y1 + ǫ, w .p. 1/3, where ǫ ∼ N (0, 0.12 ). X1 Y1 + ǫ, w .p. 1/3, i .d . X2:p , Y2:p , Z2:p i .∼ N (0, Ip −1 ) dependene of Z on pair (X , Y ) is stronger than on X and Y individually D. Sejdinovi (CSML, UCL) Three-variable tests NIPS, 07 De 2013 14 / 17 Null aeptane rate (Type II error) Example B: fatorisation tests V-struture disovery: Dataset B 1 0.8 0.6 0.4 2var: Fator ∆L : Fator 0.2 CI: 0 1 3 5 7 9 11 13 15 X ⊥⊥ Y |Z 17 19 Dimension Figure: Fatorisation hypothesis: Lanaster statisti vs. a two-variable based test (both with HB orretion); Test for D. Sejdinovi (CSML, UCL) X ⊥ ⊥ Y |Z Three-variable tests from Zhang et al (2011), n = 500 NIPS, 07 De 2013 15 / 17 Interation for D≥4 Interation measure valid for all (Streitberg, 1990): ∆S P = D X (−1)|π|−1 (|π| − 1)!Jπ P π For a partition π, Jπ assoiates to the joint the orresponding fatorisation, e.g., J13|2|4 P = PX1 X3 PX2 PX4 . D. Sejdinovi (CSML, UCL) Three-variable tests NIPS, 07 De 2013 16 / 17 Interation for D≥4 Interation measure valid for all (Streitberg, 1990): ∆S P = D X (−1)|π|−1 (|π| − 1)!Jπ P π For a partition π, Jπ assoiates to the joint the orresponding fatorisation, e.g., J13|2|4 P = PX1 X3 PX2 PX4 . D. Sejdinovi (CSML, UCL) Three-variable tests NIPS, 07 De 2013 16 / 17 D≥4 Interation measure valid for all (Streitberg, 1990): ∆S P D X = (−1)|π|−1 (|π| − 1)!Jπ P π For a partition π, Jπ J13|2|4 P = PX1 X3 PX2 PX4 . D. Sejdinovi (CSML, UCL) Bell numbers growth 1e+19 1e+14 1e+09 1e+04 assoiates to the joint the orresponding fatorisation, e.g., Number of partitions of {1,...,D} Interation for Three-variable tests 1 3 5 7 9 11 13 15 17 19 21 23 25 D NIPS, 07 De 2013 16 / 17 D≥4 Interation measure valid for all (Streitberg, 1990): ∆S P D X = (−1)|π|−1 (|π| − 1)!Jπ P π For a partition π, Jπ Bell numbers growth 1e+19 1e+14 1e+09 1e+04 assoiates to the joint the orresponding fatorisation, e.g., Number of partitions of {1,...,D} Interation for J13|2|4 P = PX1 X3 PX2 PX4 . 1 3 5 7 9 11 13 15 17 19 21 23 25 D (Lanaster interation) vs. joint umulants (Streitberg interation) joint entral moments D. Sejdinovi (CSML, UCL) Three-variable tests NIPS, 07 De 2013 16 / 17 Summary A nonparametri test for three-variable interation and for total independene, using embeddings of signed measures into RKHSs D. Sejdinovi (CSML, UCL) Three-variable tests NIPS, 07 De 2013 17 / 17 Summary A nonparametri test for three-variable interation and for total independene, using embeddings of signed measures into RKHSs Test statistis are simple and easy to ompute - orresponding permutation tests signiantly outperform standard two-variable-based tests on V-strutures with weak pairwise interations D. Sejdinovi (CSML, UCL) Three-variable tests NIPS, 07 De 2013 17 / 17 Summary A nonparametri test for three-variable interation and for total independene, using embeddings of signed measures into RKHSs Test statistis are simple and easy to ompute - orresponding permutation tests signiantly outperform standard two-variable-based tests on V-strutures with weak pairwise interations All forms of Lanaster three-variable interation an be deteted for a large family of reproduing kernels (ISPD) D. Sejdinovi (CSML, UCL) Three-variable tests NIPS, 07 De 2013 17 / 17 Summary A nonparametri test for three-variable interation and for total independene, using embeddings of signed measures into RKHSs Test statistis are simple and easy to ompute - orresponding permutation tests signiantly outperform standard two-variable-based tests on V-strutures with weak pairwise interations All forms of Lanaster three-variable interation an be deteted for a large family of reproduing kernels (ISPD) Thank You! Poster S6 D. Sejdinovi (CSML, UCL) Three-variable tests NIPS, 07 De 2013 17 / 17 Referenes B. Streitberg, Lanaster 18781885, 1990. A. Kankainen. interations revisited. Annals of Statistis 18(4): Consistent Testing of Total Independene Based on the Empirial Charateristi Funtion . PhD thesis, University of Jyväskylä, 1995. A. Gretton, K. Fukumizu, C.-H. Teo, L. Song, B. Shölkopf and A. Smola. kernel statistial test of independene. in Advanes in Neural Information Proessing Systems 20: 585592, MIT Press, 2008. A B. Sriperumbudur, A. Gretton, K. Fukumizu, G. Lankriet and B. Shölkopf. Hilbert spae embeddings and metris on probability measures. J. Mah. Learn. Res. 11: 15171561, 2010. D. Sejdinovi, B. Sriperumbudur, A. Gretton and K. Fukumizu, Equivalene of distane-based and RKHS-based statistis in hypothesis testing. Annals of Statistis 41(5): 2263-2291, 2013. D. Sejdinovi (CSML, UCL) Three-variable tests NIPS, 07 De 2013 18 / 17