Statistical Tests Guide

MARVELous supports various statistical tests for associating genetic exposures with phenotypic outcomes. This guide covers available tests, when to use them, and how to define custom tests.

Available Tests

Tests for Continuous Outcomes

Code	Name	Description
OLS	Ordinary Least Squares	Linear regression with optional covariates.
KW	Kruskal-Wallis	Non-parametric test for differences in distributions across groups.
AOV	One-way ANOVA	Parametric test for differences in group means.
MWU	Mann-Whitney U	Non-parametric rank-sum test comparing two groups.
T	Independent-samples T-test	Parametric test comparing means of two groups.

Tests for Binary Outcomes

Code	Name	Description
GLM-Binom	Logistic Regression	Generalized linear model with binomial family and logit link with optional covariates.
CHISQ	Chi-square	Tests independence between a binary outcome and an exposure in a contingency table.
FISHER	Fisher’s Exact	Exact test for 2×2 contingency tables. Preferred over chi-square when expected cell counts are small.
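The choice between CHISQ and FISHER for a 2×2 table can be illustrated with a short sketch that computes expected cell counts under independence. The helper names `expected_counts` and `suggest_test` are hypothetical and not part of MARVELous, and the threshold of 5 is a common rule of thumb assumed here, not a value taken from the package.

```python
def expected_counts(table):
    """Return expected cell counts under independence for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def suggest_test(table, min_expected=5.0):
    """Suggest 'FISHER' for a 2x2 table with a small expected count, else 'CHISQ'."""
    expected = expected_counts(table)
    is_2x2 = len(table) == 2 and all(len(row) == 2 for row in table)
    has_small_cell = any(e < min_expected for row in expected for e in row)
    return "FISHER" if is_2x2 and has_small_cell else "CHISQ"


# Rows: carriers / non-carriers; columns: cases / controls (made-up counts).
sparse = [[3, 7], [12, 178]]
dense = [[40, 60], [55, 45]]
print(suggest_test(sparse))
print(suggest_test(dense))
```

With few carriers, as in `sparse`, at least one expected count falls below the assumed threshold, which is the situation where the exact test is usually preferred.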

Tests for Categorical Outcomes

Code	Name	Description
CHISQ	Chi-square	Tests independence between a multi-category outcome and an exposure in a contingency table.

Survival Analysis

Code	Name	Description
Cox-PH	Cox Proportional Hazards	Semi-parametric survival model. Requires both an event indicator (0/1) and a time-to-event column. Supports covariates.

Covariate Models

Define multiple covariate models to compare adjusted and unadjusted results:

[Covs]
Unadjusted   None
Model_1      age;sex
Model_2      age;sex;bmi
Full         age;sex;bmi;smoking;PC1;PC2;PC3;PC4

Each covariate model is run separately for the regression tests (OLS, GLM-Binom, FIRTH).

Note

Not all tests support covariates. Tests that do not support them are run only with the “Unadjusted” model.
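To make the [Covs] layout concrete, here is a minimal sketch of how such a section could be read into a mapping from model name to covariate list. It assumes the format shown in the example (a model name, whitespace, then semicolon-separated covariates, with `None` meaning no covariates); `parse_covs` is a hypothetical helper, and the parser MARVELous actually uses may behave differently.

```python
def parse_covs(lines):
    """Map each model name to its list of covariates ('None' gives an empty list)."""
    models = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("["):
            continue  # skip blank lines and the section header
        parts = line.split(None, 1)  # model name, then the covariate spec
        name = parts[0]
        spec = parts[1].strip() if len(parts) > 1 else "None"
        if spec == "None":
            models[name] = []
        else:
            models[name] = [c.strip() for c in spec.split(";") if c.strip()]
    return models


section = """
[Covs]
Unadjusted   None
Model_1      age;sex
Model_2      age;sex;bmi
Full         age;sex;bmi;smoking;PC1;PC2;PC3;PC4
""".splitlines()

models = parse_covs(section)
for name, covariates in models.items():
    print(name, covariates)
```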

Custom Tests

The tests are based on the clean-data package, which includes the tests described above. The package is designed to work with statistical tests and regression callables from scipy, statsmodels, and lifelines. You can define custom statistical tests by modifying marvel/association/tests.py.

Test Structure

Tests are defined as dictionaries with three keys:

{
    'Test method': callable,     # The test function
    'P-value': int or str,       # Index or attribute for p-value
    'kwargs': dict,              # Optional keyword arguments
}
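The sketch below shows one way a definition of this shape could be consumed: call the test with its keyword arguments, then read the p-value either by position (int) or by attribute name (str). The dispatcher `run_test` and the toy `z_test` callable are hypothetical illustrations, not MARVELous code; only the three dictionary keys come from the structure above.

```python
import math
from collections import namedtuple

# Hypothetical result type, mimicking libraries that return named results.
ZResult = namedtuple("ZResult", ["statistic", "pvalue"])


def z_test(x, y, sigma=1.0):
    """Two-sample z-test assuming a known common standard deviation `sigma`."""
    mean_diff = sum(x) / len(x) - sum(y) / len(y)
    z = mean_diff / (sigma * math.sqrt(1 / len(x) + 1 / len(y)))
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return ZResult(statistic=z, pvalue=p)


def run_test(test_def, *args):
    """Call the test and extract the p-value by index (int) or attribute (str)."""
    result = test_def["Test method"](*args, **test_def.get("kwargs", {}))
    key = test_def["P-value"]
    return result[key] if isinstance(key, int) else getattr(result, key)


# The same test, with the p-value located by position and by name.
by_index = {"Test method": z_test, "P-value": 1, "kwargs": {"sigma": 2.0}}
by_name = {"Test method": z_test, "P-value": "pvalue", "kwargs": {"sigma": 2.0}}

exposed = [5.1, 6.3, 5.8, 6.0]
unexposed = [4.2, 4.9, 5.0, 4.4]
print(run_test(by_index, exposed, unexposed))
print(run_test(by_name, exposed, unexposed))
```

Supporting both an index and an attribute name lets one structure describe callables that return plain tuples as well as ones that return result objects.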

Adding a Custom Test

Edit marvel/association/tests.py.
Add your test to the STATS class and register it in the AllTests class:

from scipy import stats

class STATS:
    def __post_init__(self):
        self.__tests = {
            # ... existing tests ...

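            'MY_TEST': {
                # my_test_function is a placeholder: define or import your callable in this module
                BNames.TEST_METHOD: my_test_function,
                BNames.TEST_PVALUE: 0,    # index (or attribute name) of the p-value in the result
                BNames.TEST_KWARGS: {},   # optional keyword arguments for the callable
            },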
        }

    @property
    def my_test(self):
        '''My custom test.'''
        return self.__tests['MY_TEST']

class AllTests:
    def __post_init__(self):
        self.__tests = {
            # ... existing tests ...

            'MY_TEST': {
                TEST_DICT_NAMES.NAME: 'My Custom Test',
                TEST_DICT_NAMES.TEST: BTests().my_test,
                TEST_DICT_NAMES.STAT: STATS().my_test,
            },
        }

Use in configuration:

[ConTests]
outcome      MY_TEST

Environment Variable Override

You can specify a custom tests module via environment variable:

export MARVEL_TEST_DEFS=/path/to/custom_tests.py
marvelous config.cnf -v
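How MARVELous resolves this variable internally is not described here. As a rough illustration only, a path-based override like this is commonly implemented with `importlib`; the function `load_test_defs` and the module name `custom_tests` below are assumptions, not part of the package.

```python
import importlib.util
import os


def load_test_defs(env_var="MARVEL_TEST_DEFS"):
    """Import the module named by the environment variable, or return None if unset."""
    path = os.environ.get(env_var)
    if not path:
        return None  # no override requested; a caller would fall back to defaults
    spec = importlib.util.spec_from_file_location("custom_tests", path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load test definitions from {path!r}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file and populates the module
    return module
```

Under this pattern, the custom file would need to define the same names the default tests module exposes, since it is used in its place.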

Output Interpretation

Results File Columns

Column	Description
Model	Covariate model name (from [Covs] section)
Model name	Full name of the statistical test
Variable	Outcome variable name
Exposure	Exposure/gene name
N (Cases)	Number of cases (binary outcomes)
N (Samples)	Total sample size for this test
Exposed	Number of exposed (carrier) samples
Non-exposed	Number of non-exposed (non-carrier) samples
Estimate	Effect estimate (beta for OLS, log-OR for logistic)
Std. Error	Standard error of estimate
Test statistic	Test statistic value (t, chi-square, etc.)
P-value	P-value from the test
Estimate (95% CI)	Formatted estimate with confidence interval
OR (95% CI)	Odds ratio with CI (binary outcomes only)
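Because the Estimate for logistic regression is on the log-odds scale, the odds ratio and its interval follow by exponentiation. The sketch below assumes a Wald interval (estimate ± 1.96 standard errors on the log scale); whether MARVELous computes its reported interval exactly this way is an assumption, and `odds_ratio_ci` is a hypothetical helper.

```python
import math


def odds_ratio_ci(log_or, std_error, z=1.96):
    """Return (OR, lower, upper) from a log-odds estimate and its standard error."""
    lower = math.exp(log_or - z * std_error)
    upper = math.exp(log_or + z * std_error)
    return math.exp(log_or), lower, upper


# Made-up values standing in for the Estimate and Std. Error columns.
odds_ratio, ci_low, ci_high = odds_ratio_ci(0.405, 0.15)
print(f"{odds_ratio:.2f} ({ci_low:.2f}, {ci_high:.2f})")
```

An interval that excludes 1 on the odds-ratio scale corresponds to an interval that excludes 0 on the log-odds scale.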
