Use STAR (https://github.com/alexdobin/STAR?tab=readme-ov-file) to align RNA-seq reads to a reference genome. The input is the set of read files that have already been filtered.
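I. Preparation
(1) Install the software
1) Binary installation
Download the release archive STAR_2.7.11b.zip and unpack it.
Set up the environment variable: add the directory that contains the STAR executable to PATH.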
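The two steps above can be sketched in shell roughly as follows. The download location, the install directory, and the sub-directory name inside the archive (`Linux_x86_64_static`) are assumptions, not taken from this post; check the real layout with `unzip -l STAR_2.7.11b.zip` before relying on it.

```shell
# Assumed locations (hypothetical); adjust to where the archive was downloaded.
STAR_ZIP="$HOME/Downloads/STAR_2.7.11b.zip"
INSTALL_DIR="$HOME/tools"

# Unpack only if the archive is actually present.
if [ -f "$STAR_ZIP" ]; then
    mkdir -p "$INSTALL_DIR"
    unzip -q -o "$STAR_ZIP" -d "$INSTALL_DIR"
fi

# Assumed layout of the unpacked archive (statically linked Linux build).
STAR_BIN_DIR="$INSTALL_DIR/STAR_2.7.11b/Linux_x86_64_static"

# Prepend the directory to PATH for the current shell session.
export PATH="$STAR_BIN_DIR:$PATH"

# Report which STAR executable is found, if any.
command -v STAR || echo "STAR not found on PATH yet"
```

To make the change permanent, the same `export PATH=...` line can be appended to `~/.bashrc`.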
2) Installation with conda

    conda create -n myenv
    conda activate myenv
    conda install -c bioconda star

II. Procedure
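(1) Generate the index files (very time-consuming)
Building the index requires running STAR with --runMode genomeGenerate on the genome FASTA, writing into the directory later passed as --genomeDir (here /data1/caoronglin/data/human). The command recorded for this step is an alignment run against that index, for sample ENCFF481BWJ:

    # --outSAMattrRGline takes space-separated fields, so the fields are not wrapped in one quoted string.
    # --sjdbOverhang should match the value used when the index was built.
    STAR \
        --runThreadN 12 \
        --genomeDir /data1/caoronglin/data/human \
        --readFilesIn /data1/caoronglin/data/human/data/GM12878/results/trimmed/ENCFF481BWJ_trimmed.fq.gz \
        --outFileNamePrefix /data1/caoronglin/data/human/output/ENCFF481BWJ_ \
        --outSAMtype BAM SortedByCoordinate \
        --outFilterType BySJout \
        --outFilterMultimapNmax 20 \
        --alignSJoverhangMin 8 \
        --alignSJDBoverhangMin 1 \
        --outSAMattrRGline ID:GM12878 SM:GM12878 LB:library1 PU:unit1 PL:ILLUMINA \
        --outSAMmapqUnique 60 \
        --limitBAMsortRAM 20000000000 \
        --readFilesCommand zcat \
        --outReadsUnmapped Fastx \
        --quantMode GeneCounts \
        --sjdbOverhang 99
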
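For the index build itself, a minimal sketch might look like the following. The FASTA and GTF paths are placeholders, since the post does not say which genome and annotation files were used. Passing a GTF with `--sjdbGTFfile` and `--sjdbOverhang 99` is an assumption, chosen to be consistent with the `--quantMode GeneCounts` and `--sjdbOverhang 99` options used during alignment. The script prints the command and only runs it when STAR and both input files are available.

```shell
# Index directory used elsewhere in this post.
GENOME_DIR=/data1/caoronglin/data/human
# Placeholder inputs (hypothetical paths, not given in the post).
GENOME_FASTA=/path/to/genome.fa
ANNOTATION_GTF=/path/to/annotation.gtf

# Assemble the genomeGenerate command as positional parameters.
set -- STAR \
    --runMode genomeGenerate \
    --runThreadN 12 \
    --genomeDir "$GENOME_DIR" \
    --genomeFastaFiles "$GENOME_FASTA" \
    --sjdbGTFfile "$ANNOTATION_GTF" \
    --sjdbOverhang 99

# Keep a printable copy and show it (dry run).
INDEX_CMD="$*"
echo "$INDEX_CMD"

# Execute only when STAR and both input files exist.
if command -v STAR >/dev/null 2>&1 && [ -f "$GENOME_FASTA" ] && [ -f "$ANNOTATION_GTF" ]; then
    "$@"
fi
```

For a human genome this step needs a large amount of memory (on the order of 32 GB) in addition to the long run time, so it is worth doing once and reusing the resulting directory for every sample.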
(2) Run the alignment

    # --quantMode GeneCounts needs gene annotations (a GTF) in the index or supplied at mapping time.
    STAR \
        --runThreadN 12 \
        --genomeDir /data1/caoronglin/data/human \
        --readFilesIn /data1/caoronglin/data/human/data/GM12878/results/trimmed/ENCFF974EKR_trimmed.fq.gz \
        --outFileNamePrefix /data1/caoronglin/data/human/output/ENCFF974EKR_ \
        --outSAMtype BAM SortedByCoordinate \
        --outFilterType BySJout \
        --outFilterMultimapNmax 20 \
        --alignSJoverhangMin 8 \
        --alignSJDBoverhangMin 1 \
        --outSAMattrRGline ID:GM12878 SM:GM12878 LB:library1 PU:unit1 PL:ILLUMINA \
        --outSAMmapqUnique 60 \
        --limitBAMsortRAM 20000000000 \
        --readFilesCommand zcat \
        --outReadsUnmapped Fastx \
        --quantMode GeneCounts \
        --sjdbOverhang 99

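Since the same command is run once per sample with only the file name changing, it can be wrapped in a loop. This is a sketch only: the helper `out_prefix` is a hypothetical name introduced here, and the loop assumes every input follows the `<sample>_trimmed.fq.gz` naming seen above.

```shell
GENOME_DIR=/data1/caoronglin/data/human
TRIMMED_DIR=/data1/caoronglin/data/human/data/GM12878/results/trimmed
OUT_DIR=/data1/caoronglin/data/human/output

# Hypothetical helper: derive the output prefix from a trimmed FASTQ path by
# stripping the directory and the _trimmed.fq.gz suffix.
out_prefix() {
    printf '%s/%s_\n' "$OUT_DIR" "$(basename "$1" _trimmed.fq.gz)"
}

for fq in "$TRIMMED_DIR"/*_trimmed.fq.gz; do
    # Skip the literal pattern when the glob matches nothing.
    [ -f "$fq" ] || continue
    # Stop early if STAR is not installed.
    command -v STAR >/dev/null 2>&1 || { echo "STAR not on PATH" >&2; break; }

    STAR \
        --runThreadN 12 \
        --genomeDir "$GENOME_DIR" \
        --readFilesIn "$fq" \
        --outFileNamePrefix "$(out_prefix "$fq")" \
        --outSAMtype BAM SortedByCoordinate \
        --outFilterType BySJout \
        --outFilterMultimapNmax 20 \
        --alignSJoverhangMin 8 \
        --alignSJDBoverhangMin 1 \
        --outSAMattrRGline ID:GM12878 SM:GM12878 LB:library1 PU:unit1 PL:ILLUMINA \
        --outSAMmapqUnique 60 \
        --limitBAMsortRAM 20000000000 \
        --readFilesCommand zcat \
        --outReadsUnmapped Fastx \
        --quantMode GeneCounts \
        --sjdbOverhang 99
done
```

Every sample gets the same read group line here, as in the original commands; if the files need to be told apart downstream, the `ID:` and `PU:` fields could be made sample-specific.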
(3) Alignment results
ENCFF974EKR_Log.final.out.txt
ENCFF824LLV_Log.final.out.txt
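STAR writes these summaries as `<prefix>Log.final.out`, with one `label | value` pair per line. To compare runs quickly, a single metric can be pulled out of each file. The function name `star_metric` is made up for this sketch, and the output directory is assumed to be the one used in the commands above.

```shell
# Hypothetical helper: print the value of one metric from a Log.final.out file.
# $1 = path to the file, $2 = exact metric label (text left of the "|").
star_metric() {
    awk -F'|' -v key="$2" '
        {
            label = $1
            gsub(/^[ \t]+|[ \t]+$/, "", label)   # trim the label
        }
        label == key {
            value = $2
            gsub(/^[ \t]+|[ \t]+$/, "", value)   # trim the value
            print value
        }
    ' "$1"
}

# Print the unique mapping rate for every run in the output directory.
for log in /data1/caoronglin/data/human/output/*_Log.final.out; do
    [ -f "$log" ] || continue   # skip when no log files are present
    printf '%s\t%s\n' "$(basename "$log")" "$(star_metric "$log" 'Uniquely mapped reads %')"
done
```

Other labels from the file, such as `Number of input reads`, can be passed as the second argument in the same way.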