Abstract (EN):
There has been increasing interest in pattern classification methods and neuroimaging studies that use permutation tests to estimate the statistical significance of a classifier (the p-value). Permutation tests usually use the test error as the dataset statistic to estimate p-values by measuring the dissimilarity between two or more populations. Using the test error as the dataset statistic, however, may camouflage the least recognizable classes, and the resulting p-value will be biased toward better (usually lower) values because of the highly recognizable classes; thus, lower p-values can sometimes be the result of undercoverage. In this study, we investigate this problem and propose implementing permutation tests with a per-class test error as the dataset statistic. We also propose a model based on partially scrambling the testing samples (in this model, the training samples are not scrambled) when computing the non-permuted statistic, in order to judge the p-value's tolerance and to draw conclusions about which permutation test procedures are more reliable. For the same purpose, we propose another model based on chance-level shifting of the permuted statistic. We tested these two proposed models on functional magnetic resonance imaging data collected while human subjects responded to visual stimulation paradigms, and our results showed that these models can aid in determining which permutation test procedure is superior. We also found that permutation tests that use a per-class test error as the dataset statistic are more reliable in addressing the null hypothesis that all classes in the problem domain are drawn from the same distribution.
Language:
English
Type (Professor's evaluation):
Scientific
No. of pages:
10