Sequence Characterisation And nanopoRe methyLation Evaluation Tool

SCARLET

SCARLET provides a Nextflow version of the R.O.B.I.N. 'live' tumour classification tool.

Data Input

To generate sequence data suitable for SCARLET analysis, we recommend either running the R.O.B.I.N. 'live' tool or using Readfish with the target file at bin/NPHD_panel_hg38_clean.bed. Either option will produce a data set suitable for analysis.

You need to provide:

  1. a sorted BAM file, aligned to GRCh38, that contains methylation probabilities.
  2. the associated .bai index.
  3. the GRCh38 genome reference sequence and annotation set (GTF).
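A missing input is easier to catch before the pipeline starts than partway through a run. The following is a minimal sketch of a pre-flight check, not part of SCARLET itself: `check_inputs` is a hypothetical helper name, and accepting both `sample.bam.bai` and `sample.bai` as index names is an assumption about what the workflow will pick up. It only tests that the files exist and are non-empty; it does not verify sorting, the reference build, or the presence of methylation tags.

```shell
# Hypothetical pre-flight helper (not shipped with SCARLET).
# Usage: check_inputs BAM REFERENCE ANNOTATIONS
check_inputs() {
    bam=$1
    reference=$2
    annotations=$3
    status=0

    # Each input must exist and be non-empty
    for f in "$bam" "$reference" "$annotations"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            status=1
        fi
    done

    # Accept either common index name: sample.bam.bai or sample.bai (assumption)
    if [ ! -s "$bam.bai" ] && [ ! -s "${bam%.bam}.bai" ]; then
        echo "no .bai index found for $bam (try: samtools index $bam)" >&2
        status=1
    fi

    return "$status"
}
```

For example, `check_inputs my_data.bam my_reference.fa.gz my_annotation_set.gtf && echo "inputs look complete"` prints the confirmation only when all four files are present.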

Software Requirements:

git
docker
nextflow
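
To confirm that the three requirements above are installed, something like the following sketch can be used. `missing_tools` is a hypothetical helper name; it only checks that each command is on your PATH and says nothing about versions.

```shell
# Hypothetical helper: print the names of any tools that are not on PATH.
missing_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Print the collected names without the leading space
    echo "${missing# }"
}

missing=$(missing_tools git docker nextflow)
if [ -n "$missing" ]; then
    echo "Please install: $missing" >&2
fi
```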

Clone repo and download required models

git clone https://github.com/graemefox/SCARLET.git
wget "https://gitlab.com/euskirchen-lab/crossNN/-/raw/master/models/Capper_et_al_NN.pkl?inline=false" -O SCARLET/bin/Capper_et_al_NN.pkl

Pull the latest SCARLET docker image:

docker pull graefox/scarlet:latest

Pull the latest version of the required wf-human-variation workflow:

nextflow pull epi2me-labs/wf-human-variation

Example command:

## define the sample name, output directory, input BAM, reference genome and annotation set:

SAMPLE=sample_01
OUTDIR=${SAMPLE}_output
BAM=my_data.bam
REFERENCE=my_reference.fa.gz
ANNOTATIONS=my_annotation_set.gtf

## run the pipeline
nextflow run SCARLET/main.nf \
        -with-docker graefox/scarlet:latest \
        --sample $SAMPLE \
        --bam $BAM \
        --outdir $OUTDIR \
        --reference $REFERENCE \
        --annotations $ANNOTATIONS \
        --nanoplot \
        --sturgeon --rapidcns2 --nanodx

Optional extra parameters (with their default values)

These have default values specified in the nextflow.config file, but you may override them on the CLI.

--threads 16 (CPUs to use [default: 64])
--bam_min_coverage (minimum coverage required to run the epi2me-labs/wf-human-variation stages [default: 5])
--minimum_mgmt_cov (minimum average coverage at the MGMT promoter; coverage must be greater than this to run the MGMT methylation analysis)
--rapidcns2 (runs the rapidCNS2 (https://github.com/areebapatel/Rapid-CNS2) classifier [default behaviour is to NOT run rapidCNS2])
--sturgeon (runs the sturgeon (https://github.com/marcpaga/sturgeon) classifier [default behaviour is to NOT run sturgeon])
--nanodx (runs the nanoDx (https://gitlab.com/pesk/nanoDx) classifier [default behaviour is to NOT run nanoDx])
--nanoplot (also runs NanoPlot to generate a QC report [default behaviour is to NOT run NanoPlot])
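
As a rough illustration of overriding these on the CLI, the sketch below assembles the example command and appends an option only when you have set a value for it, so anything left unset falls back to nextflow.config. The wrapper `run_scarlet` and the variables `THREADS`, `BAM_MIN_COVERAGE`, `MINIMUM_MGMT_COV`, `OPTIONAL_STAGES` and `DRY_RUN` are hypothetical names chosen for this example; the flags themselves are the ones listed above. By default it only prints the command.

```shell
# Hypothetical wrapper: build the SCARLET command, adding overrides only when set.
run_scarlet() {
    set -- nextflow run SCARLET/main.nf \
        -with-docker graefox/scarlet:latest \
        --sample "$SAMPLE" \
        --bam "$BAM" \
        --outdir "$OUTDIR" \
        --reference "$REFERENCE" \
        --annotations "$ANNOTATIONS"

    # Numeric overrides: omitted entirely when the variable is unset or empty
    if [ -n "${THREADS:-}" ]; then set -- "$@" --threads "$THREADS"; fi
    if [ -n "${BAM_MIN_COVERAGE:-}" ]; then set -- "$@" --bam_min_coverage "$BAM_MIN_COVERAGE"; fi
    if [ -n "${MINIMUM_MGMT_COV:-}" ]; then set -- "$@" --minimum_mgmt_cov "$MINIMUM_MGMT_COV"; fi

    # Classifiers and NanoPlot are off unless their flag is passed
    for stage in ${OPTIONAL_STAGES:-}; do
        set -- "$@" "--$stage"
    done

    # Print the command unless DRY_RUN=0 is set, in which case execute it
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$@"
    else
        "$@"
    fi
}

SAMPLE=sample_01
OUTDIR=${SAMPLE}_output
BAM=my_data.bam
REFERENCE=my_reference.fa.gz
ANNOTATIONS=my_annotation_set.gtf
THREADS=16
OPTIONAL_STAGES="sturgeon nanodx nanoplot"

run_scarlet
```

Once the printed command looks right, setting `DRY_RUN=0` before calling `run_scarlet` launches it.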

To run with slurm

Add -process.executor='slurm' to your nextflow command and run it as normal. You do not need to wrap the command in a script submitted with sbatch; Nextflow submits each process to SLURM itself.
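
Put together, a SLURM run might look like the sketch below, which is the example command with the executor option added. `scarlet_slurm` and the `LAUNCH` variable are hypothetical names for this illustration; the function prints the command unless `LAUNCH=1` is set, so it can be tried on a machine without a cluster.

```shell
# Hypothetical wrapper: the example command plus the SLURM executor option.
# Usage: scarlet_slurm SAMPLE BAM OUTDIR REFERENCE ANNOTATIONS
scarlet_slurm() {
    set -- nextflow run SCARLET/main.nf \
        -process.executor='slurm' \
        -with-docker graefox/scarlet:latest \
        --sample "$1" --bam "$2" --outdir "$3" \
        --reference "$4" --annotations "$5"

    # Only launch when explicitly asked to; otherwise show what would run
    if [ "${LAUNCH:-0}" = 1 ]; then
        "$@"
    else
        echo "$@"
    fi
}

# Call this directly from your shell; no sbatch script is involved
scarlet_slurm sample_01 my_data.bam sample_01_output my_reference.fa.gz my_annotation_set.gtf
```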

Troubleshooting tips

If the run seems to hang forever at the cnvpytor step, you may not have indexed your input BAM. Note that this step also takes a long time even when everything is working.

If you get the Docker error "docker: permission denied while trying to connect to the docker daemon socket" on Ubuntu-based systems, you need to add your user to the docker group. Follow the instructions here: https://www.digitalocean.com/community/questions/how-to-fix-docker-got-permission-denied-while-trying-to-connect-to-the-docker-daemon-socket
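
A quick way to see whether the docker group is the problem is to inspect your current groups, as in this sketch. `has_docker_group` is a hypothetical helper name; the `usermod` command in the message is the usual way to add a user to a group on Linux and needs administrator rights.

```shell
# Hypothetical helper: succeeds if "docker" appears in a space-separated group list.
has_docker_group() {
    case " $1 " in
        *" docker "*) return 0 ;;
        *) return 1 ;;
    esac
}

# id -nG prints the names of all groups the current user belongs to
if has_docker_group "$(id -nG)"; then
    echo "current user is in the docker group"
else
    echo "not in the docker group; an administrator can run: sudo usermod -aG docker \$USER" >&2
    echo "then log out and back in so the new group takes effect" >&2
fi
```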

About

This workflow uses many third-party tools to function and relies on the hard work and expertise of their respective authors. This list includes (but may not be limited to...):

rapidCNS2

wf-human-variation

modkit

samtools

NanoPlot

mosdepth

methylartist

clairS-TO

CNVpytor

VCFtools

ANNOVAR

Sturgeon

nanoDx

Licence

SCARLET is distributed under a CC BY-NC 4.0 license. See LICENSE for more information. This license does not override any licenses that may be present in the third party tools used by SCARLET.
