Saturday, January 23, 2021

STAR index generation - 'std::bad_alloc' error

I was trying to generate a genome index with STAR for mutant library 99, 50 hours post fertilization (99H50), using the annotation from the Lawson lab. The command I used is as follows.

module load STAR; STAR --runThreadN 10 --runMode genomeGenerate --genomeDir /gpfs/ysm/scratch60/polimanti/ag2646/99H50_new_annotation/z10starindex75/ --genomeFastaFiles /gpfs/ysm/scratch60/polimanti/ag2646/Lawsonreference/genome.fa --sjdbGTFfile /gpfs/ysm/scratch60/polimanti/ag2646/Lawsonreference/genes.gtf --sjdbOverhang 75

The batch script used to submit the job for creation of such indices is:

dsq --job-file z10starindex75.txt --job-name z10starindex75 -c 10 --mem=100G -t 10:00:00 --mail-type=ALL --mail-user=aranyak.goswami@yale.edu

I ran this command on my HPC cluster and it failed with the following error:

Jan 22 22:41:39 ..... started STAR run
Jan 22 22:41:39 ... starting to generate Genome files
Jan 22 22:42:04 ... starting to sort Suffix Array. This may take a long time...
Jan 22 22:42:09 ... sorting Suffix Array chunks and saving them to disk...
Jan 22 22:47:18 ... loading chunks from disk, packing SA...
Jan 22 22:47:42 ... finished generating suffix array
Jan 22 22:47:42 ... generating Suffix Array index
Jan 22 22:49:38 ... completed Suffix Array index
Jan 22 22:49:38 ..... processing annotations GTF
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
/bin/sh: line 1: 186783 Aborted                 STAR --runThreadN 10 --runMode genomeGenerate --genomeDir /gpfs/ysm/scratch60/polimanti/ag2646/99H50_new_annotation/z10starindex75/ --genomeFastaFiles /gpfs/ysm/scratch60/polimanti/ag2646/Lawsonreference/genome.fa --sjdbGTFfile /gpfs/ysm/scratch60/polimanti/ag2646/Lawsonreference/genes.gtf --sjdbOverhang 75

I googled the error and found that it can originate from memory allocation, so I ran the job from a location on the cluster where I have enough space. The memory usage reported for the job is:

Job ID: 47861791
Array Job ID: 47861791_0
Cluster: farnam
User/Group: ag2646/nicoli
State: FAILED (exit code 134)
Nodes: 1
Cores per node: 10
CPU Utilized: 00:36:34
CPU Efficiency: 45.14% of 01:21:00 core-walltime
Job Wall-clock time: 00:08:06
Memory Utilized: 25.64 GB
Memory Efficiency: 25.64% of 100.00 GB

I browsed the internet for solutions and tried the following:

(1) I reduced the number of threads from 10 to 1 to lower the memory demand.
(2) I set a specific memory limit with the flag --limitGenomeGenerateRAM 48000000000
(3) I added --genomeChrBinNbits 16

The error still occurs. The first few lines of my GTF file are:

chr12   UMMS    gene    6160446 6177944 .       -       .       gene_id "LL0000000001"; gene_name "a1cf";
chr12 UMMS exon 6160446 6161260 . - . gene_id "LL0000000001"; gene_name "a1cf"; transcript_id "ENSDART00000152292";
chr12 UMMS exon 6163727 6163869 . - . gene_id "LL0000000001"; gene_name "a1cf"; transcript_id "ENSDART00000152292";
chr12 UMMS exon 6165086 6165222 . - . gene_id "LL0000000001"; gene_name "a1cf"; transcript_id "ENSDART00000152292";
chr12 UMMS exon 6165305 6165498 . - . gene_id "LL0000000001"; gene_name "a1cf"; transcript_id "ENSDART00000152292";
chr12 UMMS exon 6167117 6167396 . - . gene_id "LL0000000001"; gene_name "a1cf"; transcript_id "ENSDART00000152292";
chr12 UMMS exon 6168940 6169037 . - . gene_id "LL0000000001"; gene_name "a1cf"; transcript_id "ENSDART00000152292";
chr12 UMMS exon 6169982 6170146 . - . gene_id "LL0000000001"; gene_name "a1cf"; transcript_id "ENSDART00000152292";
chr12 UMMS exon 6170412 6170650 . - . gene_id "LL0000000001"; gene_name "a1cf"; transcript_id "ENSDART00000152292";
chr12 UMMS exon 6170731 6170861 . - . gene_id "LL0000000001"; gene_name "a1cf"; transcript_id "ENSDART00000152292";

Some lines of the genome FASTA file are as follows:

chr1
gatcttaaacatttattccccctgcaaacattttcaatcattacattgtc
atttcccctccaaattaaatttagccagaggcgcacaacatacgacctct
aaaaaaggtgctgtaacatgtacctatatgcagcaccactatatgagagc
ggcatagcagtgtttagtcacttggttgctttgtttatattaacttgaaa
gtgtgttttagctattgagtttaaacaaagggagcggtttacattgaatt
aaaggcaactactgatgggttgtgtaatgtttcaaagagctgttgcagca
tgagtggaaaataaaaccgtattagtgctgcctggcccagtttggcacaa
aatggagcgattccattaagagaacgattcagcataagtggaacagcTAA
AGtttatgaaaatttttaatctggatgtagagaatctcataacacagaaa
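Since the failure happens at the "processing annotations GTF" step, one sanity check that could be run on the two files excerpted above is whether the annotation and the genome agree with each other. The following is only a minimal sketch, not a known fix for this error. The function names are hypothetical helpers and are not part of STAR. It assumes the GTF is tab-delimited with nine columns and that FASTA header lines begin with ">".

```shell
# Sketch only: hypothetical helper functions, not part of STAR.
# Assumes a tab-delimited, 9-column GTF and FASTA headers starting with ">".

# Print each sequence name from GTF column 1 that has no FASTA header.
# Usage: gtf_names_missing_from_fasta genes.gtf genome.fa
gtf_names_missing_from_fasta() {
    awk -F'\t' '
        # First file operand (the FASTA): record header names up to the first whitespace.
        FILENAME == ARGV[1] {
            if ($0 ~ /^>/) {
                name = substr($0, 2)
                sub(/[ \t].*/, "", name)
                seen[name] = 1
            }
            next
        }
        # Second file operand (the GTF): report each unmatched name once.
        !/^#/ && NF >= 9 && !($1 in seen) && !($1 in reported) {
            reported[$1] = 1
            print $1
        }
    ' "$2" "$1"
}

# Count exon lines whose attribute column lacks a transcript_id.
# Usage: exons_without_transcript_id genes.gtf
exons_without_transcript_id() {
    awk -F'\t' '
        !/^#/ && $3 == "exon" && $9 !~ /transcript_id "/ { n++ }
        END { print n + 0 }
    ' "$1"
}
```

Calling gtf_names_missing_from_fasta with the GTF and FASTA paths would print nothing if every sequence name in the annotation has a matching header in the genome, and exons_without_transcript_id would print 0 if every exon line carries a transcript_id.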

I have tried to provide as much detail as possible, and any help would be appreciated.

https://stackoverflow.com/questions/65866214/star-index-generation-stdbad-alloc-error January 24, 2021 at 09:06AM
