Usage Guide

This page covers how to run MARVELous from the command line.

Command-Line Interface

MARVELous provides a command-line interface through the marvelous command (installed with the package).

Basic Syntax

marvelous <config_file> [options]

Arguments

Argument       Description
config_file    Path to the configuration file (required)

Options

Option             Description
-v, --verbose      Enable verbose output with detailed logging
--outpath PATH     Override output directory from config file
--dry-run          Validate configuration without running pipeline
--log-file PATH    Write log output to file (warnings will not print to console)
--version          Show version number and exit
-h, --help         Show help message and exit

Examples

Run the full pipeline with verbose output:

marvelous /path/to/config.cnf -v

Validate configuration (dry run):

marvelous /path/to/config.cnf --dry-run

Override output directory:

marvelous /path/to/config.cnf --outpath /custom/output/dir

Write log output (including warnings) to a file:

marvelous /path/to/config.cnf -v --log-file analysis.log
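The same invocations can be scripted from Python. The sketch below is a minimal, hypothetical wrapper: build_command and run_marvelous are illustrative helpers, not part of MARVELous. It assembles only the flags documented above and passes them to subprocess.run. It assumes that the marvelous command is on the PATH and that it exits with a non-zero status on failure, which this page does not state.

```python
import subprocess
from typing import List, Optional


def build_command(
    config_file: str,
    verbose: bool = False,
    dry_run: bool = False,
    outpath: Optional[str] = None,
    log_file: Optional[str] = None,
) -> List[str]:
    """Assemble a marvelous command line from the documented options."""
    cmd = ["marvelous", config_file]
    if verbose:
        cmd.append("-v")
    if dry_run:
        cmd.append("--dry-run")
    if outpath is not None:
        cmd += ["--outpath", outpath]
    if log_file is not None:
        cmd += ["--log-file", log_file]
    return cmd


def run_marvelous(config_file: str, **options) -> int:
    """Run the CLI and return its exit status (hypothetical helper)."""
    completed = subprocess.run(build_command(config_file, **options))
    return completed.returncode
```

For example, build_command("/path/to/config.cnf", verbose=True, log_file="analysis.log") produces the argument list for the last command above.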

Programmatic Usage

MARVELous can also be used as a Python library for integration into scripts or notebooks. See the marvel.pipeline module documentation for details.

Output Files

Extraction Output

When variant extraction is enabled, MARVELous creates:

File                           Description
{VarOutput}_carriers.tsv.gz    Carrier matrix (samples × variants/genes)
{VarOutput}_summary.tsv.gz     Extraction summary with variant counts

Carrier file format:

id      GENE1   GENE2   variant_1       variant_2
sample1 1       0       1               0
sample2 0       1       0               1
sample3 2       0       1               1

Values are allele counts (0, 1, or 2 for a diploid genotype). If variants are combined using the cat_column option, a value can exceed 2 because it is the sum over the combined variants. For more information on the values, see Advanced Features.
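A carrier matrix in this layout can be loaded with the Python standard library alone. The sketch below is illustrative and rests on assumptions taken from the example above: the file is tab-delimited, the first column is named id, and every cell holds an integer (missing values are not handled). The helper names read_carriers and count_carriers are hypothetical.

```python
import csv
import gzip
from typing import Dict


def read_carriers(path: str) -> Dict[str, Dict[str, int]]:
    """Return {sample_id: {column: value}} from a gzipped carriers file."""
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return {
            row["id"]: {col: int(val) for col, val in row.items() if col != "id"}
            for row in reader
        }


def count_carriers(carriers: Dict[str, Dict[str, int]]) -> Dict[str, int]:
    """Count, per column, the samples with a non-zero value."""
    counts: Dict[str, int] = {}
    for values in carriers.values():
        for column, value in values.items():
            counts[column] = counts.get(column, 0) + (1 if value > 0 else 0)
    return counts
```

Counting non-zero cells rather than summing them keeps the carrier count meaningful when cat_column sums push values above 2.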

Association Output

When association testing is enabled, MARVELous creates the following files for each exposure:

File                         Description
{exposure}_results.tsv.gz    Association test results
{exposure}_baseline.tsv      Baseline characteristics table

Results file columns:

Column               Description
Model                Covariate model name
Model name           Statistical test name
Variable             Outcome variable name
Exposure             Exposure variable name
N (Cases)            Number of cases (binary outcomes)
N (Samples)          Total sample size
Exposed              Number of exposed samples
Non-exposed          Number of non-exposed samples
Estimate             Effect estimate (beta or OR)
Std. Error           Standard error
Test statistic       Test statistic value
P-value              P-value
Estimate (95% CI)    Formatted estimate with confidence interval
OR (95% CI)          Odds ratio with confidence interval (binary outcomes)
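As an illustration of working with these columns, the sketch below reads a results file and keeps the rows whose P-value falls below a threshold. It assumes the file is tab-delimited with a header row that uses the column names listed above. The significant_rows helper and the 0.05 default are hypothetical choices, not part of MARVELous.

```python
import csv
import gzip
from typing import Dict, List


def significant_rows(path: str, alpha: float = 0.05) -> List[Dict[str, str]]:
    """Return result rows with P-value below alpha, smallest P-value first."""
    with gzip.open(path, "rt", newline="") as handle:
        rows = list(csv.DictReader(handle, delimiter="\t"))
    hits = []
    for row in rows:
        try:
            p_value = float(row["P-value"])
        except (ValueError, TypeError):
            continue  # skip rows without a parseable P-value
        if p_value < alpha:
            hits.append((p_value, row))
    hits.sort(key=lambda item: item[0])
    return [row for _, row in hits]
```

A missing P-value column raises KeyError rather than silently returning an empty list, which makes a mismatched header easier to spot.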

Dry Run Mode

Use --dry-run to validate your configuration without running the analysis:

marvelous config.cnf --dry-run

This will:

  1. Parse and validate the configuration file

  2. Check that all input files exist

  3. Verify column names in input files

  4. Check that specified tests are defined

  5. Print a configuration summary
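To make the first two steps concrete, here is a rough, independent sketch of a pre-flight check. It is not MARVELous's own validator. It assumes the configuration layout shown in the examples on this page ([Section] headers followed by whitespace-separated key and value), that the sections named in PATH_SECTIONS hold input file paths, and that lines starting with # are comments. The last two points, and both helper names, are assumptions.

```python
import os
from typing import Dict, List

# Sections whose values are input paths in this page's examples (assumption:
# the real validator may check a different set of sections).
PATH_SECTIONS = ("GenoInput", "VarInput", "ExpInput", "PhenoInput")


def parse_config(text: str) -> Dict[str, Dict[str, str]]:
    """Parse '[Section]' headers and whitespace-separated key/value lines."""
    sections: Dict[str, Dict[str, str]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or (assumed) comment
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections[current] = {}
        elif current is not None:
            parts = line.split(None, 1)
            sections[current][parts[0]] = parts[1] if len(parts) > 1 else ""
    return sections


def missing_inputs(sections: Dict[str, Dict[str, str]]) -> List[str]:
    """List configured input paths that do not exist on disk."""
    missing = []
    for name in PATH_SECTIONS:
        for path in sections.get(name, {}).values():
            if not os.path.exists(path):
                missing.append(path)
    return missing
```

Running --dry-run remains the authoritative check. A sketch like this is only useful for catching path typos before a job is submitted.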

Workflow Examples

Full Analysis

A typical full analysis workflow:

# 1. Validate configuration
marvelous analysis.cnf --dry-run

# 2. Run full pipeline
marvelous analysis.cnf -v --log-file analysis.log

# 3. Check results
ls ./results/

The configuration file can be written by hand or generated with a helper function included in the package. For more information, see Configuration Reference.

For a complete example, see Command-line interface example.

Two-Stage Workflow

For large analyses, or when you want more control, run extraction and association separately:

Stage 1: Extraction

Create extraction.cnf:

[GenoInput]
chr22        /data/chr22.vcf.gz

[VarInput]
variants     /data/variants.tsv

[Output]
VarOutput    /results/carriers

[Options]
extract_variants     True
association_analysis False

Run:

marvelous extraction.cnf -v

Stage 2: Association

Create association.cnf:

[ExpInput]
carriers     /results/carriers_carriers.tsv.gz

[PhenoInput]
phenotypes   /data/outcomes.tsv
covariates   /data/covariates.tsv

[BinTests]
disease      GLM-Binom;FISHER

[Covs]
Adjusted     age;sex

[Options]
extract_variants     False
association_analysis True
output_path  /results

Run:

marvelous association.cnf -v
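The Stage 2 input path follows from the naming pattern under Extraction Output: the VarOutput prefix /results/carriers plus the _carriers.tsv.gz suffix gives /results/carriers_carriers.tsv.gz. The sketch below derives that path and renders a matching [ExpInput] section. Both helper functions are hypothetical, and the column spacing simply mirrors the example above.

```python
from typing import Dict


def extraction_outputs(var_output: str) -> Dict[str, str]:
    """Map a VarOutput prefix to the two files listed under Extraction Output."""
    return {
        "carriers": f"{var_output}_carriers.tsv.gz",
        "summary": f"{var_output}_summary.tsv.gz",
    }


def exp_input_section(var_output: str, label: str = "carriers") -> str:
    """Render an [ExpInput] section pointing at the Stage 1 carrier matrix."""
    path = extraction_outputs(var_output)["carriers"]
    return f"[ExpInput]\n{label:<12} {path}\n"
```

Deriving the path from the prefix keeps the two configuration files consistent if VarOutput is later changed.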

Error Handling

Common errors and solutions:

Configuration file not found:

FileNotFoundError: Configuration file not found: config.cnf

Solution: Check the path to your configuration file.

Missing required headers:

ConfigHeaderMissingError: The following headers are missing: ['PhenoInput']

Solution: Add the required section to your configuration file.

Input file not found:

FileNotFoundError: The following input files are missing: ['/path/to/file.vcf.gz']

Solution: Verify paths in your configuration file.

Column not found:

InputValidationError: The columns ['outcome1'] were not present in the input files

Solution: Check column names in your phenotype file.

Unsupported genetic file type:

TypeError: Unsupported genetic file extension

Solution: Use VCF (.vcf, .vcf.gz), BGEN (.bgen), or PLINK (.bed) files.
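If genotype paths are assembled programmatically, the extension can be checked up front. The sketch below tests a path against the extensions listed in the solution above. It is an independent illustration, not the check MARVELous performs, and the case-sensitive match is an assumption chosen to be conservative.

```python
# Extensions named in the solution above; matching is case-sensitive here
# (assumption: MARVELous's own check may behave differently).
SUPPORTED_EXTENSIONS = (".vcf", ".vcf.gz", ".bgen", ".bed")


def is_supported_genetic_file(path: str) -> bool:
    """Return True if the path ends with one of the listed extensions."""
    return path.endswith(SUPPORTED_EXTENSIONS)
```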

See Also