Route: rna-star
Alignment and quantification of RNA-seq data.
Segments:
- Trim adapters and low quality bases (Trimmomatic).
- Align to the reference genome (STAR).
- Align to other species and common contaminants (fastq_screen).
- Generate normalized genome browser tracks.
- Determine the distribution of the bases within the transcripts and 5’/3’ biases (Picard).
- Determine if the library is stranded and the strand orientation.
- Generate a gene-by-sample counts matrix (featureCounts).
For differential expression analysis, follow with rna-star-groups-dge.
Usage
Set up a new analysis (common across all routes). If running for the first time, check the detailed usage instructions for an explanation of every step.
cd <project dir>
git clone --depth 1 https://github.com/igordot/sns
sns/generate-settings <genome>
sns/gather-fastqs <fastq dir>
Run the rna-star route.
sns/run rna-star
Check for potential problems.
grep "ERROR:" logs-sbatch/*
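When there are many log files, a per-file count can be easier to scan than the raw matches. The following is a minimal sketch, not part of sns: `count_log_errors` is a hypothetical helper name, and it assumes the logs are in `logs-sbatch/` as in the command above.

```shell
# Hypothetical helper (not part of sns): report how many "ERROR:" lines
# each log file contains, skipping files without any.
count_log_errors() {
  log_dir="${1:-logs-sbatch}"
  # -c counts matching lines per file; -H forces the "file:count" form
  # even when the directory holds a single file.
  # awk keeps only the entries whose count (last ":"-separated field) is above zero.
  grep -cH "ERROR:" "$log_dir"/* 2>/dev/null | awk -F: '$NF > 0'
}
```

Calling `count_log_errors` from the project directory prints one `file:count` line per affected log and prints nothing if no errors were logged.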
Output
Primary results:
BAM-STAR
: BAM files. Can be used for visual inspection of individual reads or additional analysis.
BIGWIG
: BigWig files normalized to the total number of reads. Can be used for visual inspection of relative expression levels.
quant.featurecounts.counts.txt
: Matrix of raw counts for all genes and samples.
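As a quick sanity check on the counts matrix, the raw counts can be summed per sample. This is a rough sketch under assumptions not stated above: the file is taken to be tab-delimited, with sample names in the header row and gene identifiers in the first column. `sample_totals` is a hypothetical helper name.

```shell
# Hypothetical helper: total raw counts per sample column.
# Assumes a tab-delimited matrix with a header row of sample names
# and gene identifiers in column 1 (check the file before relying on this).
sample_totals() {
  awk -F'\t' '
    # Header row: remember the sample name for each column.
    NR == 1 { for (i = 2; i <= NF; i++) name[i] = $i; n = NF; next }
    # Data rows: add each count to its column total.
    { for (i = 2; i <= n; i++) total[i] += $i }
    # Print one "sample<TAB>total" line per sample, in column order.
    END { for (i = 2; i <= n; i++) printf "%s\t%d\n", name[i], total[i] }
  ' "$1"
}
```

For example, `sample_totals quant.featurecounts.counts.txt` prints one line per sample. A sample with a much lower total than the others may be worth checking against the run metrics.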
Run metrics:
summary-combined.rna-star.csv
: Summary table that includes the number of reads, unique and multi-mapping alignment rates, number of counts assigned to genes, and fraction of coding/UTR/intronic/intergenic bases.
summary.fastqscreen.png
: Alignment rates for common species and contaminants.
summary.qc-picard-rnaseqmetrics.png
: Distribution of the bases within the transcripts to determine potential 5’/3’ biases.
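The combined summary table has many columns, so it can be awkward to read in a terminal. The sketch below prints each sample as a block of `column: value` lines. It assumes a plain comma-separated file with a single header row and no quoted fields containing commas. `show_summary` is a hypothetical helper name.

```shell
# Hypothetical helper: print a wide CSV as one "column: value" block per row.
# Assumes a simple CSV: one header row, no quoted fields with embedded commas.
show_summary() {
  awk -F',' '
    # Header row: remember the column names.
    NR == 1 { for (i = 1; i <= NF; i++) header[i] = $i; next }
    # Data rows: print each field next to its column name, then a blank line.
    { for (i = 1; i <= NF; i++) printf "%s: %s\n", header[i], $i; print "" }
  ' "$1"
}
```

For example, `show_summary summary-combined.rna-star.csv | less` pages through the samples one block at a time.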
Additional output (can usually be deleted or used for troubleshooting):
genes.featurecounts.txt
: Table of genes based on the reference GTF.
quant-*
: Raw counts for all genes for individual samples.