marvel.association package

marvel.association.class_test module

Association testing code

class marvel.association.class_test.AssociationResult(baseline_table: DataFrame, coefficient_table: DataFrame)

Bases: object

Container for association test results

baseline_table: DataFrame

coefficient_table: DataFrame

class marvel.association.class_test.BaselineTestWrapper(test_module=None)

Bases: object

Wrapper around baseline testing functionality

run_single_test(outcome: str, exposure: str, test: dict, strata_max: int | None = None, strata_sort: bool = True, data_manager=None, data: DataFrame | None = None, cov: DataFrame | None = None, time_column: str | None = None, **kwargs) AssociationResult

Run a single baseline test.

Parameters:
  • outcome (str) – Outcome name

  • exposure (str) – Exposure name

  • test (dict) – Dictionary mapping test type to test IDs

  • strata_max (int or None) – Maximum number of unique strata values allowed; a ValueError is raised if the data contains more. Set to None to skip the check.

  • strata_sort (bool, default True) – Whether to sort the strata.

  • data_manager (DataManager, optional) – Data manager for loading data (preferred method)

  • data (pd.DataFrame, optional) – Pre-loaded data (legacy method, used if data_manager is not provided)

  • cov (pd.DataFrame, optional) – Covariates dataframe (legacy method)

  • time_column (str or None) – For survival tests, the name of the time-to-event column

  • **kwargs – Additional arguments for baseline table

Returns:

Baseline and coefficient tables

Return type:

AssociationResult
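The interaction between strata_max and strata_sort can be sketched as follows. This is an illustration only, not the marvel implementation: the helper name enforce_strata_max is hypothetical, and a plain list stands in for the strata column.

```python
from typing import Hashable, Iterable, Optional


def enforce_strata_max(
    strata: Iterable[Hashable],
    strata_max: Optional[int] = None,
    strata_sort: bool = True,
) -> list:
    """Return the unique strata, raising if there are more than strata_max.

    Hypothetical helper: it mirrors the documented behaviour of the
    strata_max and strata_sort parameters, not marvel's own code.
    """
    # dict.fromkeys keeps first-seen order, so the unsorted result is stable.
    unique = list(dict.fromkeys(strata))
    if strata_max is not None and len(unique) > strata_max:
        raise ValueError(
            f"Found {len(unique)} unique strata values; strata_max is {strata_max}."
        )
    return sorted(unique) if strata_sort else unique


levels = enforce_strata_max(["b", "a", "b", "c"], strata_max=3)
```

Passing strata_max=None skips the check entirely, which matches the "Set to None" note in the parameter description.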

class marvel.association.class_test.DataValidator(min_group_size: int = 1)

Bases: object

Validates data for testing

check_sample_size(data: DataFrame, exposure: str, outcome: str) bool

Check if stratified groups have sufficient sample sizes.

Parameters:
  • data (pd.DataFrame) – Data containing the exposure and outcome columns

  • exposure (str) – Exposure column name

  • outcome (str) – Outcome column name

Returns:

True if all groups meet minimum size requirement

Return type:

bool
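A minimal sketch of what a group-size check of this kind might look like. Several things here are assumptions rather than documented behaviour: that the groups are the outcome strata, that rows with a missing exposure or outcome are ignored, and that an empty dataset fails the check. The function name is hypothetical, and plain dictionaries stand in for DataFrame rows so the sketch has no dependencies.

```python
from collections import Counter


def groups_meet_min_size(rows, exposure, outcome, min_group_size=1):
    """True if every outcome stratum has at least min_group_size usable rows.

    Assumption: a row is usable when both exposure and outcome are present.
    """
    counts = Counter(
        row[outcome]
        for row in rows
        if row.get(exposure) is not None and row.get(outcome) is not None
    )
    # No usable rows at all is treated as a failed check (an assumption).
    return bool(counts) and all(n >= min_group_size for n in counts.values())


# Hypothetical data: two rows with event == 0, one row with event == 1.
rows = [
    {"dose": 1.0, "event": 0},
    {"dose": 2.0, "event": 0},
    {"dose": 1.5, "event": 1},
]
ok_with_one = groups_meet_min_size(rows, "dose", "event", min_group_size=1)
ok_with_two = groups_meet_min_size(rows, "dose", "event", min_group_size=2)
```

With min_group_size=2 the check fails because the event == 1 stratum has a single row.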

check_stratified_variation(data: DataFrame, exposure: str, outcome: str) bool

Check if exposure has variation within each outcome stratum.

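Stratified analyses require the exposure variable to vary within each outcome category; a stratum in which the exposure takes a single value cannot contribute to the comparison.

Parameters:
  • data (pd.DataFrame) – Data containing the exposure and outcome columns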

  • exposure (str) – Exposure column name

  • outcome (str) – Outcome column name (stratification variable)

Returns:

True if exposure has variation in all outcome strata

Return type:

bool
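The variation check described above can be sketched as follows, assuming "variation" means more than one distinct non-missing exposure value per outcome stratum. The function name and the sample data are hypothetical, and plain dictionaries stand in for DataFrame rows.

```python
from collections import defaultdict


def exposure_varies_in_all_strata(rows, exposure, outcome):
    """True if every outcome stratum contains at least two distinct exposures."""
    seen = defaultdict(set)
    for row in rows:
        # Skip rows with a missing exposure or outcome (an assumption).
        if row.get(exposure) is not None and row.get(outcome) is not None:
            seen[row[outcome]].add(row[exposure])
    return bool(seen) and all(len(values) > 1 for values in seen.values())


# Both exposure levels occur in both outcome strata.
varied = [
    {"treated": 0, "event": 0},
    {"treated": 1, "event": 0},
    {"treated": 0, "event": 1},
    {"treated": 1, "event": 1},
]
# In the event == 1 stratum everyone is treated, so there is no variation.
constant = [
    {"treated": 0, "event": 0},
    {"treated": 1, "event": 0},
    {"treated": 1, "event": 1},
    {"treated": 1, "event": 1},
]
varied_ok = exposure_varies_in_all_strata(varied, "treated", "event")
constant_ok = exposure_varies_in_all_strata(constant, "treated", "event")
```

On a DataFrame, a comparable check could be written as data.groupby(outcome)[exposure].nunique().gt(1).all(); whether marvel does it this way is not stated here.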

check_string_columns(data: DataFrame, check_columns: str | list)

Check if columns contain strings

class marvel.association.class_test.DifferencesTester(test_module=None, min_group_size: int = 1)

Bases: object

Main tester for differences across exposures, models, and outcomes

run(exposure: str | list, models: dict, test_dict: dict, data_manager=None, data: DataFrame | None = None, cov: DataFrame | None = None, id_column: str = 'id', key_sep: str = ';', survival_time_map: dict | None = None, **kwargs) tuple[DataFrame, DataFrame]

Test differences across exposures, models, and outcomes.

Parameters:
  • exposure (str or list) – Exposure name or list of exposure names

  • models (dict) – Dictionary mapping model names to covariates

  • test_dict (dict) – Dictionary mapping outcome types to outcomes and tests, for example: {'continuous': {'outcome': 'OLS;T'}, 'binary': {'event': 'GLM-Binom'}}

  • data_manager (DataManager, optional) – Data manager for loading data (preferred method)

  • data (pd.DataFrame, optional) – Pre-loaded data (legacy method)

  • cov (pd.DataFrame, optional) – Covariates dataframe (legacy method)

  • id_column (str) – ID column name

  • key_sep (str) – Separator for tests and covariates

  • survival_time_map (dict or None) – Maps each event column to its time-to-event column for survival outcomes

  • **kwargs – Additional arguments for baseline table

Returns:

Baseline and coefficient tables across all exposures, models, and outcomes

Return type:

tuple[pd.DataFrame, pd.DataFrame]
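To make the test_dict layout concrete, the sketch below flattens the documented example into one entry per outcome type, outcome, and test ID, splitting the test string on key_sep. The helper expand_test_dict is hypothetical and only illustrates the structure; it is not part of marvel.

```python
def expand_test_dict(test_dict, key_sep=";"):
    """Flatten test_dict into (outcome_type, outcome, test_id) triples."""
    triples = []
    for outcome_type, outcomes in test_dict.items():
        for outcome, tests in outcomes.items():
            # Several tests for one outcome are joined by key_sep.
            for test_id in tests.split(key_sep):
                test_id = test_id.strip()
                if test_id:
                    triples.append((outcome_type, outcome, test_id))
    return triples


# The example from the parameter description above.
test_dict = {"continuous": {"outcome": "OLS;T"}, "binary": {"event": "GLM-Binom"}}
triples = expand_test_dict(test_dict)
for outcome_type, outcome, test_id in triples:
    print(outcome_type, outcome, test_id)
```

The continuous outcome yields two entries (OLS and T), and the binary outcome yields one (GLM-Binom).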

exception marvel.association.class_test.InsufficientVariationError

Bases: Exception

Raised when there is insufficient data variation for a statistical test

class marvel.association.class_test.ModelComparator(test_module=None, min_group_size: int = 1)

Bases: object

Compares different statistical models

run(exposure: str, models: dict, test_dict: dict, data_manager=None, data: DataFrame | None = None, cov: DataFrame | None = None, id_column: str = 'id', key_sep: str = ';', survival_time_map: dict | None = None, **kwargs) tuple[DataFrame, DataFrame]

Test differences across multiple models with different covariates.

Parameters:
  • exposure (str) – Exposure name

  • models (dict) – Dictionary mapping model names to semicolon-separated covariates

  • test_dict (dict) – Dictionary mapping outcome types to outcomes and tests

  • data_manager (DataManager, optional) – Data manager for loading data (preferred method)

  • data (pd.DataFrame, optional) – Pre-loaded data (legacy method)

  • cov (pd.DataFrame, optional) – Covariates dataframe (legacy method)

  • id_column (str) – ID column name

  • key_sep (str) – Separator for tests and covariates

  • survival_time_map (dict or None) – Maps each event column to its time-to-event column for survival outcomes

  • **kwargs – Additional arguments for baseline table

Returns:

Baseline and coefficient tables across all models

Return type:

tuple[pd.DataFrame, pd.DataFrame]
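The models argument maps each model name to a covariate string joined by key_sep. A small sketch of how such a dictionary could be split into covariate lists is shown below. The helper, the model names, and the covariate names are hypothetical, and treating an empty string as an unadjusted model is an assumption.

```python
def split_model_covariates(models, key_sep=";"):
    """Turn {'model': 'a;b'} into {'model': ['a', 'b']}."""
    return {
        name: [c.strip() for c in covariates.split(key_sep) if c.strip()]
        for name, covariates in models.items()
    }


# Hypothetical models with increasing levels of adjustment.
models = {"crude": "", "adjusted": "age;sex", "full": "age;sex;bmi"}
covariates_by_model = split_model_covariates(models)
```

Each model would then be fitted with the same exposure and outcomes but its own covariate list.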

class marvel.association.class_test.OutcomeCategoryTester(test_module=None, min_group_size: int = 1)

Bases: object

Tests differences per outcome category

run(exposure: str, test_dict: dict, data_manager=None, data: DataFrame | None = None, cov: DataFrame | None = None, key_sep: str = ';', survival_time_map: dict | None = None, **kwargs) tuple[DataFrame, DataFrame]

Test differences per outcome category.

Parameters:
  • exposure (str) – Exposure name

  • test_dict (dict) – Dictionary mapping outcome types to outcomes and their tests

  • data_manager (DataManager, optional) – Data manager for loading data (preferred method)

  • data (pd.DataFrame, optional) – Pre-loaded data (legacy method)

  • cov (pd.DataFrame, optional) – Covariates dataframe (legacy method)

  • key_sep (str) – Separator for test IDs

  • survival_time_map (dict or None) – Maps each event column to its time-to-event column for survival outcomes

  • **kwargs – Additional arguments for baseline table

Returns:

Baseline and coefficient tables

Return type:

tuple[pd.DataFrame, pd.DataFrame]
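As an illustration of survival_time_map, the sketch below looks up the time-to-event column for a given event column. The helper name and the column names are hypothetical; how marvel treats a survival outcome that is missing from the map is not documented here, so the sketch simply returns None.

```python
from typing import Optional


def time_column_for(
    outcome: str, survival_time_map: Optional[dict] = None
) -> Optional[str]:
    """Return the time-to-event column paired with an event column, if any."""
    if not survival_time_map:
        return None
    return survival_time_map.get(outcome)


# Hypothetical event columns and their time-to-event columns.
survival_time_map = {"death": "time_to_death", "mi": "time_to_mi"}
death_time = time_column_for("death", survival_time_map)
unmapped = time_column_for("stroke", survival_time_map)
```

The value returned for a survival outcome corresponds to the time_column argument documented for run_single_test.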

class marvel.association.class_test.TestTypeResolver(test_module=None)

Bases: object

Resolves test type and extracts test configurations

resolve(outcome: str, test: dict, time_column: str | None = None) tuple

Determine test type and extract test configurations.

Parameters:
  • outcome (str) – Outcome name

  • test (dict) – Dictionary with one key (continuous/categorical/binary/survival) mapping to a list of test names

  • time_column (str or None) – For survival tests, the name of the time-to-event column

Returns:

(variable_lists, test_configs) where variable_lists is a tuple of (continuous, categorical, binary, survival) variable lists

Return type:

tuple
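The shape of the returned tuple can be illustrated with a small stand-in. This is a sketch under assumptions, not the marvel implementation: the function name is hypothetical, test_configs is reduced to a plain list of test names because its real structure is not documented here, and requiring time_column for survival tests is an assumption.

```python
TEST_TYPES = ("continuous", "categorical", "binary", "survival")


def resolve_sketch(outcome, test, time_column=None):
    """Return (variable_lists, test_configs) for a single outcome."""
    if len(test) != 1:
        raise ValueError("test must contain exactly one test type")
    test_type, test_names = next(iter(test.items()))
    if test_type not in TEST_TYPES:
        raise ValueError(f"Unknown test type: {test_type!r}")
    if test_type == "survival" and time_column is None:
        # Assumption: survival tests cannot run without a time column.
        raise ValueError("survival tests need a time_column")
    # One list per test type, in the documented order; only the list
    # matching the resolved type holds the outcome.
    variable_lists = tuple(
        [outcome] if t == test_type else [] for t in TEST_TYPES
    )
    test_configs = list(test_names)
    return variable_lists, test_configs


variable_lists, test_configs = resolve_sketch("event", {"binary": ["GLM-Binom"]})
continuous, categorical, binary, survival = variable_lists
```

Unpacking variable_lists in the documented order leaves the outcome in binary and the other three lists empty.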