SeqMonk: A Practical Introduction for Genomic Data Exploration

Visualizing Differential Expression with SeqMonk — Step-by-Step Guide

This guide walks through preparing data, running differential expression (DE) analysis, and visualizing the results in SeqMonk. It assumes basic familiarity with mapped read files and gene annotations. The steps use reasonable defaults so you can follow along without extra setup.

1. Prepare input files

Mapped reads: Use BAM files (sorted, indexed). One BAM per sample.
Annotation: GTF/GFF or a gene list compatible with SeqMonk.
Experimental design: Decide sample groups (e.g., control vs treated) and ensure consistent naming.

2. Create a new project and import data

Open SeqMonk → File → New Project.
Import your annotation: Data → Import annotation → load GTF/GFF.
Import BAM files: Data → Import mapped data → select all BAMs. SeqMonk reads each file into the project as a separate data set.

3. Define features (probes)

SeqMonk quantifies reads over probes, which are genomic regions you define. For gene-level DE, build probes over annotated genes (or their exons), or generate custom probes.

Option A — Use annotation probes:
- Data → Features → Build features from annotation → choose “Genes” (or exons) → OK.
Option B — Create probes from genome windows:
- Data → Features → Create probes → set window size (e.g., 1 kb) → OK.
Option C — Import a custom probe list (if you have a specific gene list).

4. Quantify reads across probes

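Data → Quantitation → Annotated probes → choose “Read counts”. Raw counts are what count-based DE tests such as DESeq expect; RPKM/TPM are length- and depth-normalized values better suited to visualization.
Keep the default strand settings unless your library is strand-specific, in which case choose the matching strand option.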
SeqMonk calculates counts per probe per sample and stores them in the project.
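
To make the difference between raw counts and length-normalized values concrete, here is a minimal Python sketch of the standard RPKM and TPM formulas. It is not SeqMonk's code: the function names and toy numbers are illustrative, and the library size is approximated by the sum of counts over the listed probes.

```python
def rpkm(counts, lengths_bp):
    """Reads per kilobase of probe per million reads, for one sample."""
    total = sum(counts)  # assumption: library size = sum over these probes
    return [c * 1e9 / (total * length) for c, length in zip(counts, lengths_bp)]


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then rescale to 1e6."""
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    scale = sum(rates)
    return [r * 1e6 / scale for r in rates]


# Toy data: three probes whose counts are proportional to their lengths.
counts = [100, 200, 700]
lengths_bp = [1000, 2000, 7000]

rpkm_values = rpkm(counts, lengths_bp)
tpm_values = tpm(counts, lengths_bp)
print(rpkm_values)
print(tpm_values)
```

Because the counts here scale exactly with probe length, all three probes end up with the same RPKM and the same TPM, which is the point of length normalization.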

5. Normalize counts

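Normalization makes samples with different sequencing depths comparable, which matters for plots and for tests that do not normalize internally.

Data → Transformations → Read counts → choose “Depth correction” (or “Counts per million”).
For between-sample normalization suitable for DE, use:
- Data → Normalization → DESeq/DESeq2 or TMM (if available in your SeqMonk version); pick DESeq if unsure.
Apply the normalization; SeqMonk will store normalized values. Keep the raw read counts as well: count-based tests such as DESeq and DESeq2 expect unnormalized counts and estimate their own size factors.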
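
The idea behind DESeq-style normalization can be sketched in a few lines of Python. This is an illustrative version of the median-of-ratios size factor estimate, not SeqMonk's or DESeq2's actual implementation; the sample names and counts are made up.

```python
import math
from statistics import median


def size_factors(counts_by_sample):
    """Median-of-ratios size factors.

    counts_by_sample maps a sample name to a list of raw counts,
    with probes in the same order for every sample.
    """
    samples = list(counts_by_sample)
    n_probes = len(counts_by_sample[samples[0]])
    log_ratios = {s: [] for s in samples}
    for i in range(n_probes):
        row = [counts_by_sample[s][i] for s in samples]
        if min(row) <= 0:
            continue  # probes with a zero in any sample are skipped
        log_geo_mean = sum(math.log(v) for v in row) / len(row)
        for s, v in zip(samples, row):
            log_ratios[s].append(math.log(v) - log_geo_mean)
    # median() raises StatisticsError if every probe was skipped
    return {s: math.exp(median(log_ratios[s])) for s in samples}


# Toy data: sample_B was sequenced twice as deeply as sample_A.
raw = {
    "sample_A": [10, 20, 30, 0],
    "sample_B": [20, 40, 60, 5],
}
factors = size_factors(raw)
normalized = {s: [c / factors[s] for c in raw[s]] for s in raw}
print(factors)
```

Dividing each sample's counts by its size factor puts the samples on a common scale; in the toy data the factor for sample_B comes out twice that of sample_A.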

6. Set up groups and replicates

Data → Sample grouping → Create group from sample name pattern OR manually assign samples to groups (e.g., Control, Treated).
Ensure replicates are correctly assigned and balanced where possible.
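
Consistent sample names make pattern-based grouping reliable. As a rough illustration of what grouping by name pattern does, the Python sketch below assigns samples to groups with a regular expression. The `Group_repN` naming scheme and the function name are assumptions for this example, not a SeqMonk convention.

```python
import re
from collections import defaultdict


def group_samples(names, pattern=r"^(?P<group>[A-Za-z]+)_rep\d+$"):
    """Assign each sample name to a group based on a naming pattern."""
    groups = defaultdict(list)
    for name in names:
        match = re.match(pattern, name)
        if match is None:
            # Fail loudly so a misnamed sample is not silently dropped.
            raise ValueError(f"Sample name does not match pattern: {name}")
        groups[match.group("group")].append(name)
    return dict(groups)


sample_names = ["Control_rep1", "Control_rep2", "Treated_rep1", "Treated_rep2"]
groups = group_samples(sample_names)
for group, members in groups.items():
    print(group, len(members), members)
```

Printing the replicate count per group is a quick way to spot unbalanced designs before testing.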

7. Run differential expression analysis

Data → Statistical analysis → Differential expression.
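Select the quantitation to test. For DESeq/DESeq2 this should be raw read counts, because the method estimates its own size factors.
Choose statistical test:
- Use DESeq (or DESeq2) for count data with replicates; this is the default recommendation.
- Use a t-test only on values that are roughly normally distributed (e.g., log-transformed normalized counts). Mann–Whitney is nonparametric and makes no such assumption, but it has little power with only a few replicates per group.
Set the significance threshold (typically 0.05) and apply it to p-values adjusted for multiple testing with the Benjamini–Hochberg FDR procedure.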
Run analysis. SeqMonk produces a results table with log fold changes, p-values, and adjusted p-values.
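
For readers who want to see what the Benjamini–Hochberg adjustment does to a list of p-values, here is a small self-contained Python sketch of the standard procedure. It is meant as an illustration of the arithmetic, not as SeqMonk's implementation, and the p-values are invented.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


pvals = [0.01, 0.04, 0.03, 0.20]
padj = benjamini_hochberg(pvals)
print(padj)
```

Note that the two middle p-values receive the same adjusted value: the step-down minimum prevents a smaller raw p-value from ending up with a larger adjusted one.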

8. Inspect and filter results

Open the DE results table: sort by adjusted p-value or log fold change.
Filter to genes meeting thresholds, e.g.:
- Adjusted p-value < 0.05 and |log2 fold change| ≥ 1.
Export filtered lists: Data → Export → export selections as CSV/TSV for downstream use.
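
If you prefer to filter an exported table outside SeqMonk, the same thresholds are easy to apply in Python. The sketch below assumes a results table with `gene`, `log2fc`, and `padj` columns; those column names and the gene names are hypothetical, so adapt them to whatever your export actually contains.

```python
import csv
import io


def filter_hits(rows, padj_max=0.05, min_abs_log2fc=1.0):
    """Keep rows with adjusted p-value below the cutoff and a large enough effect."""
    return [
        r for r in rows
        if r["padj"] < padj_max and abs(r["log2fc"]) >= min_abs_log2fc
    ]


def to_tsv(rows):
    """Serialize rows to tab-separated text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["gene", "log2fc", "padj"],
        delimiter="\t", lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical results rows, for illustration only.
results = [
    {"gene": "geneA", "log2fc": 2.1, "padj": 0.001},
    {"gene": "geneB", "log2fc": 0.4, "padj": 0.010},
    {"gene": "geneC", "log2fc": -1.0, "padj": 0.049},
    {"gene": "geneD", "log2fc": -3.0, "padj": 0.200},
]
hits = filter_hits(results)
tsv_text = to_tsv(hits)
print(tsv_text)
```

Using the absolute value of the fold change keeps both up- and down-regulated genes in the filtered list.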

9. Visualize differential expression

SeqMonk offers several visualization modes:

Scatter plots (MA plots):
- View → Plot → MA plot or Scatter plot.
- X-axis: mean expression; Y-axis: log fold change. Color significant points for clarity.
Volcano plots:
- View → Plot → Volcano plot (if available) or produce a scatter of log2FC vs -log10(p-value).
- Highlight significant genes with a different color or label top hits.
Heatmaps:
- Data → Heatmap → choose normalized quantitation and select genes of interest (e.g., top 50 DE genes).
- Configure clustering (hierarchical) and scaling (row z-score) to reveal patterns.
Genome browser tracks:
- Double-click a gene to open the probe view and inspect per-sample coverage tracks.
- Useful to validate DE candidates visually for consistent coverage differences.
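
The quantities behind these plots are simple transformations, sketched here in plain Python. The function names are illustrative and the pseudocount of 1 is an assumption made to avoid taking the log of zero; plotting itself is left to your tool of choice.

```python
import math
from statistics import mean, stdev


def ma_point(mean_control, mean_treated, pseudocount=1.0):
    """Return (A, M): average log2 expression and log2 fold change."""
    log_c = math.log2(mean_control + pseudocount)
    log_t = math.log2(mean_treated + pseudocount)
    return 0.5 * (log_c + log_t), log_t - log_c


def volcano_point(log2fc, pvalue, floor=1e-300):
    """Return (x, y) for a volcano plot; floor guards against log10(0)."""
    return log2fc, -math.log10(max(pvalue, floor))


def row_zscores(values):
    """Scale one gene's values across samples to mean 0 and sample SD 1."""
    mu = mean(values)
    sd = stdev(values)  # needs at least two samples
    if sd == 0:
        return [0.0] * len(values)  # flat row: no variation to show
    return [(v - mu) / sd for v in values]


a_value, m_value = ma_point(3, 15)
volcano_x, volcano_y = volcano_point(2.0, 0.001)
z = row_zscores([1.0, 2.0, 3.0])
print(a_value, m_value, volcano_x, volcano_y, z)
```

Row z-scores make genes with very different absolute expression comparable in a heatmap, at the cost of hiding the absolute level.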

10. Annotate and export figures

Add gene labels to plots for top hits.
Export plots/images: File → Export view or use the PNG/PDF export options for high-resolution figures.
Export the full results table or selected gene lists for pathway analysis.

11. Quick troubleshooting

Low replicate number: DE tests may lack power; report effect sizes and avoid overinterpreting marginal p-values.
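Batch effects: If detected, include batch as a covariate in the design if supported. Otherwise handle it externally, e.g., by modeling batch in limma/voom or DESeq2, or by using limma's removeBatchEffect for visualization only.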
Zero inflation / low counts: Filter out probes with very low counts across all samples before running DE.
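
A low-count filter of this kind amounts to a single comparison per probe. The Python sketch below uses a total-count cutoff of 10 across all samples; the probe names and counts are invented for illustration.

```python
def filter_low_counts(count_table, min_total=10):
    """Drop probes whose summed raw count across samples is below min_total."""
    return {
        probe: counts
        for probe, counts in count_table.items()
        if sum(counts) >= min_total
    }


# Hypothetical raw counts: probe -> counts in four samples.
count_table = {
    "probe1": [0, 1, 0, 2],      # total 3: removed
    "probe2": [3, 2, 4, 1],      # total 10: kept
    "probe3": [120, 98, 15, 22], # total 255: kept
}
kept = filter_low_counts(count_table)
print(sorted(kept))
```

Filtering on the total across all samples, rather than per group, avoids biasing the test toward probes that differ between conditions.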

12. Example recommended workflow (default assumptions)

Import BAMs + GTF.
Build gene probes.
Quantify read counts.
Filter probes with sum counts < 10.
Normalize with DESeq for plotting; keep the raw counts for testing.
Group samples (Control vs Treated).
Run DESeq differential expression on the raw counts.
Filter by adj. p-value < 0.05 and |log2FC| ≥ 1.
Generate volcano plot and heatmap of top 50 genes.
Export figures and gene list for enrichment analysis.
