marvel.pipeline module
The pipeline module contains the main orchestrator classes for running the MARVELous analysis pipeline.
Main Pipeline
- class marvel.pipeline.MARVELousPipeline(config: PipelineConfig, verbose: bool = True)
Bases: object
Main pipeline orchestrator for MARVELous
The main orchestrator class that coordinates extraction and association steps.
Example usage:
from marvel.pipeline import MARVELousPipeline
# Create from config file
pipeline = MARVELousPipeline.from_config_file("config.cnf")
# Run pipeline
results = pipeline.run()
# Get summary
print(pipeline.get_summary())
- classmethod from_config_file(config_path: str, verbose: bool = True) MARVELousPipeline
Create pipeline from configuration file
- Parameters:
config_path (str), verbose (bool, default True)
- Returns:
Initialized pipeline
- Return type:
MARVELousPipeline
Pipeline Steps
Abstract Base Class
- class marvel.pipeline.PipelineStep(config: PipelineConfig, verbose=False, logger=None)
Bases: ABC
Abstract base class for pipeline steps
Abstract base class that defines the interface for pipeline steps.
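The exact abstract methods of PipelineStep are not listed here. The sketch below assumes the interface is a single execute() method that returns a results dictionary, which is how the concrete steps in this module are used in their examples. StepSketch and CountSamplesStep are hypothetical stand-ins written for illustration, not classes from marvel, and a plain dict stands in for PipelineConfig.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional
import logging


class StepSketch(ABC):
    """Hypothetical stand-in for marvel.pipeline.PipelineStep."""

    def __init__(self, config: Any, verbose: bool = False,
                 logger: Optional[logging.Logger] = None):
        self.config = config
        self.verbose = verbose
        # Fall back to a module-level logger when none is supplied.
        self.logger = logger or logging.getLogger(__name__)

    @abstractmethod
    def execute(self) -> Dict[str, Any]:
        """Run the step and return a results dictionary (assumed interface)."""


class CountSamplesStep(StepSketch):
    """Hypothetical concrete step: counts sample IDs listed in the config."""

    def execute(self) -> Dict[str, Any]:
        samples = self.config.get("samples", [])
        if self.verbose:
            self.logger.info("Counting %d samples", len(samples))
        return {"n_samples": len(samples)}


# A plain dict is used as the config here purely for illustration.
step = CountSamplesStep({"samples": ["S1", "S2", "S3"]}, verbose=True)
results = step.execute()
```

Because execute() is declared abstract, instantiating the base class directly raises TypeError, so every concrete step has to provide its own implementation.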
Variant Extraction Step
- class marvel.pipeline.VariantExtractionStep(config: PipelineConfig, verbose=False, logger=None)
Bases: PipelineStep
Pipeline step for variant extraction
Handles variant extraction from genetic files (VCF, BGEN, PLINK).
Example usage:
from marvel.pipeline import VariantExtractionStep
from marvel.utils.config_tools import PipelineConfig
config = PipelineConfig(
    extract_variants=True,
    geno_files={"chr22": "/data/chr22.vcf.gz"},
    var_files={"variants": "/data/variants.tsv"},
    output_path="./results"
)
step = VariantExtractionStep(config, verbose=True)
results = step.execute()
# Results contain:
# - output_files: List of created carrier files
# - summary: DataFrame with extraction summary
# - carriers: DataFrame with carrier matrix
# - results: Dict with successful/failed task info
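When the steps are run by hand rather than through MARVELousPipeline, the output_files entry of the extraction results is a natural candidate for the variant_files argument of AssociationTestingStep. The helper below is a minimal sketch of that hand-off. It assumes output_files holds path-like values; carrier_files_from_extraction is a hypothetical name, not part of marvel.

```python
from typing import Any, Dict, List


def carrier_files_from_extraction(results: Dict[str, Any]) -> List[str]:
    """Return carrier file paths from an extraction results dict (hypothetical helper)."""
    # Assumption: "output_files" is a list of path-like values.
    files = [str(path) for path in results.get("output_files", [])]
    if not files:
        raise ValueError("extraction produced no carrier files")
    return files


# Illustrative results dict; a real one would come from step.execute().
example_results = {"output_files": ["./results/carriers.tsv.gz"]}
variant_files = carrier_files_from_extraction(example_results)
```

Failing early on an empty list avoids starting an association run with nothing to test.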
Association Testing Step
- class marvel.pipeline.AssociationTestingStep(config: PipelineConfig, variant_files: List[str], outcome_cache_size: int = 20, covariate_cache_size: int = 20, verbose=False, logger=None)
Bases: PipelineStep
Pipeline step for association testing
Handles association testing between genetic exposures and phenotypic outcomes.
Example usage:
from marvel.pipeline import AssociationTestingStep
from marvel.utils.config_tools import PipelineConfig
config = PipelineConfig(
    association_analysis=True,
    pheno_file="/data/phenotypes.tsv",
    cov_file="/data/covariates.tsv",
    id_column="IID",
    output_path="./results",
    outcomes={
        "blood_pressure": {
            "outcome_coltype": "continuous",
            "tests": ["OLS", "KW"]
        }
    },
    covariate_models={"Unadjusted": None}
)
step = AssociationTestingStep(
    config=config,
    variant_files=["carriers.tsv.gz"],
    verbose=True
)
results = step.execute()
# Results contain:
# - completed: List of successfully tested exposures
# - failed: List of (exposure, error) tuples
# - skipped: List of skipped exposures
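The results dictionary described above can be condensed into a short report. The following is a sketch only: it assumes the three keys hold plain lists as listed, and summarize_association_results, along with the exposure names in the example, is hypothetical.

```python
from typing import Any, Dict, List, Tuple


def summarize_association_results(results: Dict[str, Any]) -> str:
    """Build a short text report from an association results dict (hypothetical helper)."""
    completed: List[str] = list(results.get("completed", []))
    failed: List[Tuple[str, str]] = list(results.get("failed", []))
    skipped: List[str] = list(results.get("skipped", []))

    lines = [
        f"completed: {len(completed)}, failed: {len(failed)}, skipped: {len(skipped)}"
    ]
    # Each failed entry is assumed to be an (exposure, error) tuple.
    for exposure, error in failed:
        lines.append(f"  FAILED {exposure}: {error}")
    return "\n".join(lines)


# Illustrative data; a real dict would come from step.execute().
example = {
    "completed": ["variant_A", "variant_B"],
    "failed": [("variant_C", "too few carriers")],
    "skipped": [],
}
report = summarize_association_results(example)
print(report)
```

Listing each failure next to its error message makes it easier to decide whether a rerun with a different configuration is worthwhile.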
Entry Point
MARVELous Entry Point
Command-line entry point for MARVELous.
Example:
from marvel.marvelous_entry import marvelous
# Run programmatically
exit_code = marvelous(
config_file="config.cnf",
outpath="./custom_output",
dry_run=False,
verbose=True
)
- marvel.marvelous_entry.main()
Main entry point for MARVELous