Brigitte Le Roux

Combinatorial Inference in Geometric Data Analysis

Overview

Combinatorial Inference in Geometric Data Analysis presents a combinatorial approach to statistical inference adapted to the Euclidean clouds of points that arise in Geometric Data Analysis (GDA). The book develops a unified framework for typicality tests (comparing a group to a reference population or to a reference point) and homogeneity tests (comparing several subclouds). Both rely on permutation-based, combinatorial procedures rather than on parametric distributional assumptions.

Authors and Publisher

Brigitte Le Roux, Solène Bienaise, Jean-Luc Durand — Chapman & Hall/CRC, Computer Science & Data Analysis Series, 2019.

Table of Contents

Chapter                                                   Page
Preface                                                   vii
Symbols                                                   xi
1 Introduction                                            1
  1.1 On combinatorial inference                          1
  1.2 On Geometric Data Analysis                          4
  1.3 On Inductive Data Analysis                          5
  1.4 Computational aspects                               6
2 Cloud of Points in a Geometric Space                    9
  2.1 Basic statistics                                    10
  2.2 Covariance structure of a cloud                     14
  2.3 Mahalanobis distance and principal ellipsoids       20
  2.4 Partition of a cloud                                25
3 Combinatorial Typicality Tests                          29
  3.1 The typicality problem                              29
  3.2 Combinatorial typicality test for mean point        32
  3.3 One-dimensional case: typicality test for mean      45
  3.4 Combinatorial typicality test for variance          49
  3.5 Combinatorial inference in GDA                      51
  3.6 Computations with R and Coheris SPAD software       55
4 Geometric Typicality Test                               65
  4.1 Principle of the test                               65
  4.2 Geometric typicality test for mean point            69
  4.3 One-dimensional case: typicality for mean           86
  4.4 The case of a design with two repeated measures     90
  4.5 Other methods                                       92
  4.6 Computations with R and Coheris SPAD software       97
5 Homogeneity Permutation Tests                           107
  5.1 The homogeneity problem                             107
  5.2 Principle of combinatorial homogeneity tests        108
  5.3 Homogeneity of independent groups: general case     109
  5.4 Homogeneity of two independent groups               116
  5.5 The case of a repeated measures design              133
  5.6 Other methods                                       140
  5.7 Computations with R and Coheris SPAD software       141
6 Research Case Studies                                   153
  6.1 The Parkinson study                                 156
  6.2 The Members of French Parliament and Globalisation  170
  6.3 The European Central Bankers study                  188
  6.4 Cognitive Tests and Education                       200
Bibliography                                              245
Author Index                                              250
Subject Index                                             252

Companion Materials

Data and Simplified R Scripts

The simplified R scripts below compute observed significance levels (p-values) and compatibility regions for Chapters 3, 4 and 5.

Chapter 3 — Combinatorial Typicality Tests

A typicality test compares the observations of a group with those of a reference population, of which the group may or may not be a subset. Two test statistics are studied: (1) the Mahalanobis distance between points, taken with respect to the covariance structure of the reference cloud (test for the mean point, Section 3.2); (2) the variance of the cloud (test for the variance, Section 3.4).
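As a rough illustration of the first statistic, the sketch below enumerates every subset of the reference cloud that has the same size as the group, and reports the proportion of subsets whose mean point lies at least as far from the reference mean point as the observed group mean point does. It is written in Python, not R, and it is not one of the book's scripts. It assumes a two-dimensional cloud, a covariance matrix computed with divisor n, and a "greater than or equal" convention for the p-value. All function names and the toy data are hypothetical.

```python
from itertools import combinations

def mean_point(points):
    """Mean point of a cloud of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def covariance_2d(points):
    """Covariance of a 2-D cloud as (vxx, vxy, vyy); divisor n is an assumption."""
    gx, gy = mean_point(points)
    n = len(points)
    vxx = sum((x - gx) ** 2 for x, _ in points) / n
    vyy = sum((y - gy) ** 2 for _, y in points) / n
    vxy = sum((x - gx) * (y - gy) for x, y in points) / n
    return vxx, vxy, vyy

def squared_m_distance(p, q, cov):
    """Squared Mahalanobis distance between p and q for a 2x2 covariance."""
    vxx, vxy, vyy = cov
    det = vxx * vyy - vxy ** 2
    dx, dy = p[0] - q[0], p[1] - q[1]
    # d' V^{-1} d, with the 2x2 inverse written out explicitly.
    return (vyy * dx * dx - 2 * vxy * dx * dy + vxx * dy * dy) / det

def typicality_p_value(reference, group):
    """Exhaustive combinatorial typicality test for the mean point (sketch)."""
    g_ref = mean_point(reference)
    cov = covariance_2d(reference)
    observed = squared_m_distance(mean_point(group), g_ref, cov)
    count = total = 0
    for subset in combinations(reference, len(group)):
        total += 1
        stat = squared_m_distance(mean_point(subset), g_ref, cov)
        if stat >= observed - 1e-12:  # small tolerance for rounding
            count += 1
    return observed, count / total

# Hypothetical toy data: 8 reference points, group of 3.
reference = [(0, 0), (1, 2), (2, 1), (3, 3), (4, 1), (5, 4), (6, 2), (7, 5)]
group = [(5, 4), (6, 2), (7, 5)]
observed, p_value = typicality_p_value(reference, group)
print(observed, p_value)
```

Exhaustive enumeration is only practical for small clouds, since the number of subsets grows very quickly. For larger data sets, a random sample of subsets is a common substitute.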

Chapter 4 — Geometric Typicality Test

The geometric typicality test compares the mean point of a Euclidean cloud to a reference point, taking the squared Mahalanobis distance between points as the test statistic. The test can also be applied to a design with two repeated measures, in which case the basic data set consists of the individual differences.
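For the repeated-measures case, a minimal one-dimensional sketch follows. It assumes that the reference set is generated by replacing each point with its mirror image about the reference point (all sign patterns of the deviations), and that the squared distance is scaled by the mean squared deviation from the reference point, which does not change under such reflections. Both choices are assumptions made for illustration, not a transcription of the book's scripts, and the function name is hypothetical.

```python
from itertools import product

def geometric_typicality_p_value(values, reference=0.0):
    """Sign-flip test of a mean against a reference point (1-D sketch)."""
    deviations = [v - reference for v in values]
    n = len(deviations)
    # Mean squared deviation from the reference point; invariant under flips.
    scale = sum(d * d for d in deviations) / n
    if scale == 0:
        raise ValueError("all values coincide with the reference point")

    def statistic(devs):
        m = sum(devs) / n
        return m * m / scale  # squared scaled distance of the mean to the reference

    observed = statistic(deviations)
    count = total = 0
    # Enumerate all 2**n clouds obtained by reflecting points about the reference.
    for signs in product((1, -1), repeat=n):
        total += 1
        flipped = [s * d for s, d in zip(signs, deviations)]
        if statistic(flipped) >= observed - 1e-12:
            count += 1
    return observed, count / total

# Hypothetical individual differences (after minus before) for 5 subjects.
differences = [1.0, 2.0, 3.0, 4.0, 5.0]
observed, p_value = geometric_typicality_p_value(differences, reference=0.0)
print(observed, p_value)
```

With n observations there are 2**n reflected clouds, so full enumeration suits small designs only.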

Chapter 5 — Homogeneity Tests

The homogeneity tests presented in this chapter compare several subclouds, taking as the test statistic the M-variance between the mean points of the subclouds, that is, the variance calculated from the Mahalanobis distance between points. The book treats both independent groups and repeated measures. For independent groups, different permutation schemes are studied depending on whether the comparison is global, partial, or specific (see pp. 109–110).
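The idea can be sketched for two independent groups in one dimension, where the between-groups M-variance reduces to the weighted variance of the group means divided by the variance of the pooled cloud. The Python sketch below assumes that scaling, weights proportional to group sizes, and exhaustive reallocation of the pooled observations to two groups of the original sizes. These are illustrative assumptions, and the function names are hypothetical.

```python
from itertools import combinations

def between_m_variance(groups):
    """Between-groups variance scaled by the pooled variance (1-D sketch)."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    mean = sum(pooled) / n
    total_var = sum((x - mean) ** 2 for x in pooled) / n
    # Group means weighted by relative group size.
    between = sum(len(g) / n * (sum(g) / len(g) - mean) ** 2 for g in groups)
    return between / total_var

def homogeneity_p_value(group_a, group_b):
    """Exhaustive permutation test of homogeneity for two groups (sketch)."""
    pooled = list(group_a) + list(group_b)
    n, n_a = len(pooled), len(group_a)
    observed = between_m_variance([group_a, group_b])
    count = total = 0
    # Every way of choosing which pooled observations form the first group.
    for idx in combinations(range(n), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(n) if i not in chosen]
        total += 1
        if between_m_variance([a, b]) >= observed - 1e-12:
            count += 1
    return observed, count / total

# Hypothetical toy data: two groups of three observations.
observed, p_value = homogeneity_p_value([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(observed, p_value)
```

The pooled variance is the same for every reallocation, so dividing by it does not change the ordering of the permuted statistics. It only puts the statistic on a scale-free footing.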

Full and SPAD-Interfaced R Scripts

The full R scripts implement the methods described in the book. Each ZIP archive contains three scripts (“main”, “parameters”, “core”), data files and a user’s guide.

Research Case Studies (Chapter 6)

For each case study, data are provided in Excel format together with a SPAD project that reproduces the analyses presented in Chapter 6.

  1. The Parkinson Study — data: Parkinson.xls; SPAD project: The Parkinson Study
  2. Members of French Parliament and Globalisation — data: MPs&Globalisation.xls; SPAD project: MPs-Globalisation
  3. The European Central Bankers Study — data and SPAD project available on request from Frédéric Lebaron.
  4. Cognitive Tests and Education — data: CognitiveTests.xls; SPAD project: Cognitive Study