Usage Guide

This page covers how to run MARVELous from the command line.

Command-Line Interface

MARVELous provides a command-line interface through the marvelous command (installed with the package).

Basic Syntax

marvelous <config_file> [options]

Arguments

Argument	Description
`config_file`	Path to the configuration file (required)

Options

Option	Description
`-v`, `--verbose`	Enable verbose output with detailed logging
`--outpath PATH`	Override output directory from config file
`--dry-run`	Validate configuration without running pipeline
`--log-file PATH`	Write log output to a file (warnings are then not printed to the console)
`--version`	Show version number and exit
`-h`, `--help`	Show help message and exit

Examples

Run full pipeline:

marvelous /path/to/config.cnf -v

Validate configuration (dry run):

marvelous /path/to/config.cnf --dry-run

Override output directory:

marvelous /path/to/config.cnf --outpath /custom/output/dir

Write log output, including warnings, to a file:

marvelous /path/to/config.cnf -v --log-file analysis.log
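When the command is launched from a script, it can help to assemble the argument list in one place. The following is a minimal sketch that uses only the options listed in the table above; the function name build_marvelous_command and its parameters are hypothetical and are not part of MARVELous.

```python
from typing import List, Optional


def build_marvelous_command(
    config_file: str,
    verbose: bool = False,
    outpath: Optional[str] = None,
    dry_run: bool = False,
    log_file: Optional[str] = None,
) -> List[str]:
    """Assemble an argument list for the marvelous CLI (hypothetical helper)."""
    cmd = ["marvelous", str(config_file)]
    if verbose:
        cmd.append("--verbose")
    if outpath is not None:
        cmd += ["--outpath", str(outpath)]
    if dry_run:
        cmd.append("--dry-run")
    if log_file is not None:
        cmd += ["--log-file", str(log_file)]
    return cmd
```

The returned list can be passed to subprocess.run, which avoids shell quoting problems with paths that contain spaces.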

Programmatic Usage

MARVELous can also be used as a Python library for integration into scripts or notebooks. See the marvel.pipeline module documentation for details.

Output Files

Extraction Output

When variant extraction is enabled, MARVELous creates:

File	Description
`{VarOutput}_carriers.tsv.gz`	Carrier matrix (samples × variants/genes)
`{VarOutput}_summary.tsv.gz`	Extraction summary with variant counts

Carrier file format:

id      GENE1   GENE2   variant_1       variant_2
sample1 1       0       1               0
sample2 0       1       0               1
sample3 2       0       1               1

Each value is an allele count: 0, 1, or 2 for a diploid genotype. When variants are combined with the cat_column option, the value is the sum of the allele counts across the combined variants and can therefore exceed 2. For more information on the values, see Advanced Features.
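The carrier matrix can be inspected with the Python standard library. The sketch below assumes the file is gzip-compressed, tab-delimited, has a first column named id as in the listing above, and contains only integer values; read_carriers and count_carriers are hypothetical helper names, not MARVELous functions.

```python
import csv
import gzip
from typing import Dict


def read_carriers(path: str) -> Dict[str, Dict[str, int]]:
    """Load a carrier matrix into {sample_id: {column: allele_count}}."""
    carriers: Dict[str, Dict[str, int]] = {}
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        for row in reader:
            sample_id = row.pop("id")
            carriers[sample_id] = {col: int(val) for col, val in row.items()}
    return carriers


def count_carriers(carriers: Dict[str, Dict[str, int]]) -> Dict[str, int]:
    """Count, per gene or variant column, the samples with a value above 0."""
    counts: Dict[str, int] = {}
    for values in carriers.values():
        for col, allele_count in values.items():
            counts[col] = counts.get(col, 0) + (1 if allele_count > 0 else 0)
    return counts
```

Counting values above 0 rather than summing them treats a sample with two alleles, or with a combined value above 2, as a single carrier.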

Association Output

When association testing is enabled, MARVELous creates the following files for each exposure:

File	Description
`{exposure}_results.tsv.gz`	Association test results
`{exposure}_baseline.tsv`	Baseline characteristics table

Results file columns:

Column	Description
Model	Covariate model name
Model name	Statistical test name
Variable	Outcome variable name
Exposure	Exposure variable name
N (Cases)	Number of cases (binary outcomes)
N (Samples)	Total sample size
Exposed	Number of exposed samples
Non-exposed	Number of non-exposed samples
Estimate	Effect estimate (beta or OR)
Std. Error	Standard error
Test statistic	Test statistic value
P-value	P-value
Estimate (95% CI)	Formatted estimate with confidence interval
OR (95% CI)	Odds ratio with confidence interval (binary outcomes)
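A common follow-up is to pull out the rows below a significance threshold. This sketch assumes the results file is gzip-compressed and tab-delimited with a header row that matches the column names above; load_results and significant_rows are hypothetical helpers, and the default threshold of 0.05 is only an illustration.

```python
import csv
import gzip
from typing import Dict, List


def load_results(path: str) -> List[Dict[str, str]]:
    """Read a results file into a list of row dictionaries."""
    with gzip.open(path, "rt", newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def significant_rows(rows: List[Dict[str, str]], alpha: float = 0.05) -> List[Dict[str, str]]:
    """Return rows with P-value below alpha, smallest P-value first."""
    hits = []
    for row in rows:
        try:
            p_value = float(row["P-value"])
        except (KeyError, ValueError):
            # Skip rows where the P-value is absent or not numeric.
            continue
        if p_value < alpha:
            hits.append(row)
    return sorted(hits, key=lambda r: float(r["P-value"]))
```

No multiple-testing correction is applied here; the threshold should be adjusted to the number of tests in the analysis.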

Dry Run Mode

Use --dry-run to validate your configuration without running the analysis:

marvelous config.cnf --dry-run

This will:

Parse and validate the configuration file
Check that all input files exist
Verify column names in input files
Check that specified tests are defined
Print a configuration summary

Workflow Examples

Full Analysis

A typical full analysis workflow:

# 1. Validate configuration
marvelous analysis.cnf --dry-run

# 2. Run full pipeline
marvelous analysis.cnf -v --log-file analysis.log

# 3. Check results
ls ./results/
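The same sequence can be scripted so that the full run starts only after validation succeeds. The sketch below assumes that marvelous exits with a non-zero status when the dry run fails, which this guide does not state; run_pipeline and its runner parameter are hypothetical and exist only to make the example testable.

```python
import subprocess
from typing import Callable, Optional


def run_pipeline(
    config_file: str,
    log_file: Optional[str] = None,
    runner: Callable = subprocess.run,
) -> bool:
    """Dry-run first, then run the full pipeline only if validation passed."""
    # Step 1: validate the configuration.
    dry = runner(["marvelous", config_file, "--dry-run"])
    if dry.returncode != 0:  # assumption: non-zero exit status signals failure
        return False
    # Step 2: run the full pipeline with verbose output.
    cmd = ["marvelous", config_file, "-v"]
    if log_file:
        cmd += ["--log-file", log_file]
    full = runner(cmd)
    return full.returncode == 0
```

Calling run_pipeline("analysis.cnf", "analysis.log") mirrors steps 1 and 2 above; checking the results directory is left to the caller.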

The configuration file can be created manually or with a helper function included in the package. For more information, see Configuration Reference.

For a complete example, see Command-line interface example.

Two-Stage Workflow

For large analyses, or when you need more control over each step, run extraction and association separately:

Stage 1: Extraction

Create extraction.cnf:

[GenoInput]
chr22        /data/chr22.vcf.gz

[VarInput]
variants     /data/variants.tsv

[Output]
VarOutput    /results/carriers

[Options]
extract_variants     True
association_analysis False

Run:

marvelous extraction.cnf -v

Stage 2: Association

Create association.cnf:

[ExpInput]
carriers     /results/carriers_carriers.tsv.gz

[PhenoInput]
phenotypes   /data/outcomes.tsv
covariates   /data/covariates.tsv

[BinTests]
disease      GLM-Binom;FISHER

[Covs]
Adjusted     age;sex

[Options]
extract_variants     False
association_analysis True
output_path  /results

Run:

marvelous association.cnf -v
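The link between the two stages is the carrier file: its path in [ExpInput] is the VarOutput prefix from stage 1 plus the _carriers.tsv.gz suffix. The sketch below derives that path and renders a configuration in the section layout shown above. Both function names are hypothetical (this is not the helper function shipped with the package), and separating key and value with a tab is an assumption, since this guide does not state the exact delimiter.

```python
from typing import Dict


def carrier_file(var_output: str) -> str:
    """Path of the carrier matrix written by the extraction stage."""
    return f"{var_output}_carriers.tsv.gz"


def render_config(sections: Dict[str, Dict[str, str]], sep: str = "\t") -> str:
    """Render {section: {key: value}} as bracketed sections of key/value lines."""
    lines = []
    for header, entries in sections.items():
        lines.append(f"[{header}]")
        for key, value in entries.items():
            lines.append(f"{key}{sep}{value}")  # assumption: sep is accepted as delimiter
        lines.append("")  # blank line between sections
    return "\n".join(lines)


# Stage 2 settings taken from the listing above.
association = {
    "ExpInput": {"carriers": carrier_file("/results/carriers")},
    "Options": {"extract_variants": "False", "association_analysis": "True"},
}
```

Writing render_config(association) to association.cnf and validating it with --dry-run is a cheap way to confirm that the generated file is accepted.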

Error Handling

Common errors and solutions:

Configuration file not found:

FileNotFoundError: Configuration file not found: config.cnf

Solution: Check the path to your configuration file.

Missing required headers:

ConfigHeaderMissingError: The following headers are missing: ['PhenoInput']

Solution: Add the required section to your configuration file.

Input file not found:

FileNotFoundError: The following input files are missing: ['/path/to/file.vcf.gz']

Solution: Verify paths in your configuration file.

Column not found:

InputValidationError: The columns ['outcome1'] were not present in the input files

Solution: Check column names in your phenotype file.
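To see which names are absent, compare the required columns with the header of the phenotype file. This is a minimal sketch that assumes a tab-delimited, uncompressed file with a header row; missing_columns is a hypothetical helper and not part of MARVELous.

```python
import csv
from typing import List


def missing_columns(path: str, required: List[str], delimiter: str = "\t") -> List[str]:
    """Return the required column names that are absent from the file header."""
    with open(path, newline="") as handle:
        # Read only the first row; an empty file yields an empty header.
        header = next(csv.reader(handle, delimiter=delimiter), [])
    present = set(header)
    return [name for name in required if name not in present]
```

Comparison is exact, so differences in case or stray whitespace in the header will show up as missing columns.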

Unsupported genetic file type:

TypeError: Unsupported genetic file extension

Solution: Use VCF (.vcf, .vcf.gz), BGEN (.bgen), or PLINK (.bed) files.
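Missing files and unsupported extensions can both be caught before a run with a small pre-check. The sketch below uses the extensions listed in the solution above; check_genetic_inputs is a hypothetical helper, the case-sensitive match is an assumption, and the check does not reproduce MARVELous's own validation.

```python
import os
from typing import Dict, List

# Extensions named in this guide as supported genetic file types.
SUPPORTED_SUFFIXES = (".vcf", ".vcf.gz", ".bgen", ".bed")


def check_genetic_inputs(paths: List[str]) -> Dict[str, List[str]]:
    """Report paths that do not exist and paths with an unsupported extension."""
    missing = [p for p in paths if not os.path.isfile(p)]
    unsupported = [p for p in paths if not p.endswith(SUPPORTED_SUFFIXES)]
    return {"missing": missing, "unsupported": unsupported}
```

An empty list under both keys means the paths passed this pre-check; running marvelous with --dry-run remains the authoritative validation.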