Usage Guide
This page covers how to run MARVELous from the command line.
Command-Line Interface
MARVELous provides a command-line interface through the marvelous command
(installed with the package).
Basic Syntax
marvelous <config_file> [options]
Arguments
| Argument | Description |
|---|---|
| config_file | Path to the configuration file (required) |
Options
| Option | Description |
|---|---|
| -v | Enable verbose output with detailed logging |
| --outpath | Override output directory from config file |
| --dry-run | Validate configuration without running pipeline |
| --log-file | Write log output to file (warnings will not print to console) |
|  | Show version number and exit |
|  | Show help message and exit |
Examples
Run full pipeline:
marvelous /path/to/config.cnf -v
Validate configuration (dry run):
marvelous /path/to/config.cnf --dry-run
Override output directory:
marvelous /path/to/config.cnf --outpath /custom/output/dir
Write warnings to file:
marvelous /path/to/config.cnf -v --log-file analysis.log
Programmatic Usage
MARVELous can also be used as a Python library for integration into scripts or notebooks. See the marvel.pipeline module documentation for details.
Output Files
Extraction Output
When variant extraction is enabled, MARVELous creates:
| File | Description |
|---|---|
|  | Carrier matrix (samples × variants/genes) |
|  | Extraction summary with variant counts |
Carrier file format:
id GENE1 GENE2 variant_1 variant_2
sample1 1 0 1 0
sample2 0 1 0 1
sample3 2 0 1 1
Values indicate the allele count (0, 1, or 2 for a diploid genotype). If variants are combined using the cat_column option, a value can exceed 2 because it is the sum of the allele counts of the combined variants. For more information on the values, see Advanced Features.
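As a rough illustration of how a carrier matrix might be consumed downstream, the sketch below parses the example table above and counts, for each column, the samples with a value greater than 0. It assumes the file is tab-delimited with the id column first (suggested by the .tsv.gz extension used later on this page, but not confirmed here). The count_carriers helper is hypothetical and not part of MARVELous.

```python
import csv
import io

# Example carrier matrix from this page, assumed to be tab-delimited.
CARRIER_TEXT = (
    "id\tGENE1\tGENE2\tvariant_1\tvariant_2\n"
    "sample1\t1\t0\t1\t0\n"
    "sample2\t0\t1\t0\t1\n"
    "sample3\t2\t0\t1\t1\n"
)


def count_carriers(text, id_column="id"):
    """Return {column: number of samples with a value greater than 0}."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    counts = {name: 0 for name in reader.fieldnames if name != id_column}
    for row in reader:
        for name in counts:
            if int(row[name]) > 0:
                counts[name] += 1
    return counts


carrier_counts = count_carriers(CARRIER_TEXT)
print(carrier_counts)
```

For a compressed file on disk, the text could be read with gzip.open(path, "rt") before being passed to the helper. Missing or non-integer values are not handled in this sketch.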
Association Output
When association testing is enabled, MARVELous creates the following files for each exposure:
| File | Description |
|---|---|
|  | Association test results |
|  | Baseline characteristics table |
Results file columns:
| Column | Description |
|---|---|
| Model | Covariate model name |
| Model name | Statistical test name |
| Variable | Outcome variable name |
| Exposure | Exposure variable name |
| N (Cases) | Number of cases (binary outcomes) |
| N (Samples) | Total sample size |
| Exposed | Number of exposed samples |
| Non-exposed | Number of non-exposed samples |
| Estimate | Effect estimate (beta or OR) |
| Std. Error | Standard error |
| Test statistic | Test statistic value |
| P-value | P-value |
| Estimate (95% CI) | Formatted estimate with confidence interval |
| OR (95% CI) | Odds ratio with confidence interval (binary outcomes) |
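The sketch below shows one way the results columns listed above might be used to keep only rows below a significance threshold. The rows are synthetic and exist only to make the example runnable; the tab delimiter, the reduced column set, and the filter_by_p_value helper are all assumptions rather than documented MARVELous behavior.

```python
import csv
import io

# Synthetic rows for illustration only; these are NOT real MARVELous output.
# Only a subset of the documented columns is shown, tab-delimited by assumption.
RESULTS_TEXT = (
    "Variable\tExposure\tEstimate\tP-value\n"
    "disease\tGENE1\t0.50\t0.001\n"
    "disease\tGENE2\t0.10\t0.40\n"
)


def filter_by_p_value(text, alpha=0.05):
    """Return the rows whose P-value column is below alpha."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [row for row in reader if float(row["P-value"]) < alpha]


hits = filter_by_p_value(RESULTS_TEXT)
for row in hits:
    print(row["Variable"], row["Exposure"], row["P-value"])
```

When many exposures or outcomes are tested, a fixed threshold such as 0.05 is usually too lenient, so alpha would normally be set according to the chosen multiple-testing correction.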
Dry Run Mode
Use --dry-run to validate your configuration without running the analysis:
marvelous config.cnf --dry-run
This will:
Parse and validate the configuration file
Check that all input files exist
Verify column names in input files
Check that specified tests are defined
Print a configuration summary
Workflow Examples
Full Analysis
A typical full analysis workflow:
# 1. Validate configuration
marvelous analysis.cnf --dry-run
# 2. Run full pipeline
marvelous analysis.cnf -v --log-file analysis.log
# 3. Check results
ls ./results/
The configuration file can be created manually or with a helper function included in the package. For more information, see Configuration Reference.
For a full example, see Command-line interface example.
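The validate-then-run sequence of the full analysis workflow can also be scripted. The sketch below builds the command lines from the options shown on this page and runs them with subprocess. It assumes that marvelous exits with a non-zero status when validation fails, which this page does not state; build_command and validate_then_run are hypothetical helper names.

```python
import subprocess


def build_command(config, verbose=False, dry_run=False, outpath=None, log_file=None):
    """Build a marvelous argument list from the options shown on this page."""
    cmd = ["marvelous", str(config)]
    if verbose:
        cmd.append("-v")
    if dry_run:
        cmd.append("--dry-run")
    if outpath is not None:
        cmd += ["--outpath", str(outpath)]
    if log_file is not None:
        cmd += ["--log-file", str(log_file)]
    return cmd


def validate_then_run(config, log_file=None):
    """Run a dry run first, then the full pipeline.

    Assumes marvelous returns a non-zero exit status on a failed validation,
    so check=True raises CalledProcessError and the full run is skipped.
    """
    subprocess.run(build_command(config, dry_run=True), check=True)
    subprocess.run(build_command(config, verbose=True, log_file=log_file), check=True)
```

Calling validate_then_run("analysis.cnf", log_file="analysis.log") would mirror steps 1 and 2 of the workflow above.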
Two-Stage Workflow
For large analyses or more control, run extraction and association separately:
Stage 1: Extraction
Create extraction.cnf:
[GenoInput]
chr22 /data/chr22.vcf.gz
[VarInput]
variants /data/variants.tsv
[Output]
VarOutput /results/carriers
[Options]
extract_variants True
association_analysis False
Run:
marvelous extraction.cnf -v
Stage 2: Association
Create association.cnf:
[ExpInput]
carriers /results/carriers_carriers.tsv.gz
[PhenoInput]
phenotypes /data/outcomes.tsv
covariates /data/covariates.tsv
[BinTests]
disease GLM-Binom;FISHER
[Covs]
Adjusted age;sex
[Options]
extract_variants False
association_analysis True
output_path /results
Run:
marvelous association.cnf -v
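To inspect a configuration file before running a stage, the layout shown above can be read with a few lines of Python. This is a minimal sketch, not the parser MARVELous uses: it assumes each entry is a key followed by whitespace and a value, and that a semicolon separates list items (as GLM-Binom;FISHER and age;sex suggest). The parse_cnf helper is hypothetical.

```python
def parse_cnf(text):
    """Parse '[Section]' headers and 'key value' lines into nested dicts."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = sections.setdefault(line[1:-1], {})
            continue
        if current is None:
            raise ValueError(f"entry outside of a section: {line!r}")
        # Assumption: the key ends at the first run of whitespace.
        parts = line.split(None, 1)
        current[parts[0]] = parts[1] if len(parts) > 1 else ""
    return sections


# The association.cnf example from this page.
ASSOCIATION_CNF = """\
[ExpInput]
carriers /results/carriers_carriers.tsv.gz
[PhenoInput]
phenotypes /data/outcomes.tsv
covariates /data/covariates.tsv
[BinTests]
disease GLM-Binom;FISHER
[Covs]
Adjusted age;sex
[Options]
extract_variants False
association_analysis True
output_path /results
"""

config = parse_cnf(ASSOCIATION_CNF)
tests = config["BinTests"]["disease"].split(";")  # assumed list separator
print(config["Options"])
print(tests)
```

A check like this makes it easy to confirm, for example, that extract_variants is False and association_analysis is True before starting Stage 2.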
Error Handling
Common errors and solutions:
Configuration file not found:
FileNotFoundError: Configuration file not found: config.cnf
Solution: Check the path to your configuration file.
Missing required headers:
ConfigHeaderMissingError: The following headers are missing: ['PhenoInput']
Solution: Add the required section to your configuration file.
Input file not found:
FileNotFoundError: The following input files are missing: ['/path/to/file.vcf.gz']
Solution: Verify paths in your configuration file.
Column not found:
InputValidationError: The columns ['outcome1'] were not present in the input files
Solution: Check column names in your phenotype file.
Unsupported genetic file type:
TypeError: Unsupported genetic file extension
Solution: Use VCF (.vcf, .vcf.gz), BGEN (.bgen), or PLINK (.bed) files.
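Two of the errors above (missing input files and unsupported extensions) can be caught before invoking the tool. The sketch below is a hypothetical pre-flight check, not part of MARVELous; it uses only the extensions listed above and assumes the match is case-sensitive.

```python
from pathlib import Path

# Extensions listed in the error-handling notes on this page.
SUPPORTED_EXTENSIONS = (".vcf", ".vcf.gz", ".bgen", ".bed")


def has_supported_extension(path):
    """Return True if the file name ends with one of the listed extensions."""
    return Path(path).name.endswith(SUPPORTED_EXTENSIONS)


def preflight(paths):
    """Return (missing, unsupported) lists for the given genetic input paths."""
    missing = [p for p in paths if not Path(p).exists()]
    unsupported = [p for p in paths if not has_supported_extension(p)]
    return missing, unsupported
```

Running marvelous with --dry-run remains the authoritative check, since it also validates column names and test definitions.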
See Also
Configuration Reference - Configuration file reference
Getting Started - Quick start guide
Advanced Features - Advanced features