DeRR (Detection of Dual T-cell Receptors in single-cell sequencing) is a toolkit for:
DeRR requires the following Python packages:
Users can install these packages with pip or another package manager, for example:
pip install tqdm pandas biopython pysam networkx editdistance
The following tools are also required and should be accessible in PATH:
We recommend using a conda virtual environment to install the above packages and software:
# Create the environment and install the requirements
conda create -c conda-forge -c bioconda -n deer tqdm pandas biopython pysam networkx bwa samtools fastp editdistance -y
# Conda can sometimes be very slow; users can use mamba instead for faster installation
conda install -n base -c conda-forge mamba # install mamba
mamba create -c conda-forge -c bioconda -n deer tqdm pandas biopython pysam networkx bwa samtools fastp editdistance -y # install requirements
# Activate the environment
conda activate deer
# Run the analysis
python DeRR.py --inf XXX --out XXX --threads number_of_threads
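Before running the analysis, it can help to confirm that the external tools are reachable in PATH. The snippet below is a minimal sketch and not part of DeRR. The tool names are taken from the conda command above, and `find_missing_tools` is a hypothetical helper name.

```python
import shutil

# Tool names taken from the conda command above; adjust if your setup differs.
REQUIRED_TOOLS = ["bwa", "samtools", "fastp"]


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on PATH (hypothetical helper)."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All required tools were found in PATH.")
```

Run it inside the activated environment so that the PATH it checks is the one DeRR will use.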
A typical DeRR command for extracting dual TCRs looks like:
python DeRR.py --inf /path/to/manifest.tsv --out /path/to/result.tsv --threads X
Users should list all FASTQ files and cell IDs (barcodes) in the manifest file. The manifest file should contain three tab-separated columns, like:
#For paired-end reads
Read1-file-name \t Read2-file-name \t Cell-id
#For single-end reads
Read1-file-name \t None \t Cell-id
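For many cells, writing the manifest by hand is tedious. The sketch below builds one from a directory of per-cell FASTQ files. It rests on assumptions that are not from DeRR: files are named `<cell-id>_R1.fastq.gz` and `<cell-id>_R2.fastq.gz`, the manifest has no header row, and `build_manifest_rows` and `write_manifest` are hypothetical helper names.

```python
import csv
from pathlib import Path


def build_manifest_rows(fastq_dir, r1_suffix="_R1.fastq.gz", r2_suffix="_R2.fastq.gz"):
    """Collect (read1, read2, cell_id) rows from a directory of FASTQ files.

    Assumes a hypothetical naming scheme: <cell-id><r1_suffix> / <cell-id><r2_suffix>.
    """
    rows = []
    for r1 in sorted(Path(fastq_dir).glob("*" + r1_suffix)):
        cell_id = r1.name[: -len(r1_suffix)]
        r2 = r1.with_name(cell_id + r2_suffix)
        # Single-end data: the second column is the literal string "None".
        rows.append((str(r1), str(r2) if r2.exists() else "None", cell_id))
    return rows


def write_manifest(rows, manifest_path):
    """Write rows as three tab-separated columns, one cell per line."""
    with open(manifest_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerows(rows)


if __name__ == "__main__":
    write_manifest(build_manifest_rows("fastq"), "manifest.tsv")
```

Adjust the suffixes to match your own file names; a cell without a matching Read 2 file is written as single-end.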
A manifest file looks like:
The result.tsv looks like:
Vgene | Jgene | CDR3 | Counts | Chain | CellId |
---|---|---|---|---|---|
TRAV3 | TRAJ27 | CAHNTNAGKSTF | 13 | TRA | AAACCTGAGATCCTGT-1 |
TRBV3-1 | TRBJ2-7 | CASSQGGALTYEQYF | 198 | TRB | AAACCTGAGCGATAGC-1 |
TRBV11-2 | TRBJ22-4 | CASSFDGLAKNIQYF | 68 | TRB | AAACCTGAGGAGTCTG-1 |
TRAV9-2 | TRAJ49 | CALFAGNQFYF | 139 | TRA | AAACCTGCATCTGGTA-1 |
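A table like the one above can be scanned for cells that carry more than one clonotype for the same chain. This is a minimal sketch, not part of DeRR. It assumes that result.tsv is tab-separated with a header row using the column names shown above, and that "dual" means two or more distinct V/J/CDR3 combinations for one chain in one cell. `find_dual_chain_cells` is a hypothetical helper name.

```python
import csv
from collections import defaultdict


def find_dual_chain_cells(result_path):
    """Return {(CellId, Chain): [(Vgene, Jgene, CDR3), ...]} for every cell
    that has more than one distinct clonotype for the same chain."""
    clonotypes = defaultdict(set)
    with open(result_path, newline="") as handle:
        # Assumption: header row with Vgene, Jgene, CDR3, Counts, Chain, CellId.
        for row in csv.DictReader(handle, delimiter="\t"):
            key = (row["CellId"], row["Chain"])
            clonotypes[key].add((row["Vgene"], row["Jgene"], row["CDR3"]))
    return {key: sorted(found) for key, found in clonotypes.items() if len(found) > 1}


if __name__ == "__main__":
    for (cell_id, chain), found in find_dual_chain_cells("result.tsv").items():
        print(cell_id, chain, found)
```

The `Counts` column is ignored here; filtering out low-count clonotypes before calling a cell dual may be worthwhile, but the threshold is a choice left to the user.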
For 10X V(D)J sequencing data, which do not provide FASTQ files for each cell, we provide a script to help demultiplex the data:
python SplitVDJbam.py --bam all_contig.bam --list cell_barcodes.json --out /path/to/fastq_output --file /path/to/Manifest.tsv
where all_contig.bam and cell_barcodes.json are outputs from cellranger, usually located in ProjectName/outs