Thursday, May 10, 2018

Downloading sequences from NCBI:

This downloads the GenBank record and saves it as CP011547.gbk (change the accession number on the first line to download any other sequence):
i=CP011547
curl -s "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=${i}&rettype=gb&retmode=txt" > "${i}.gbk"

The sequence as nucleotide FASTA (this and the commands below reuse $i from above):
curl -s "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=${i}&rettype=fasta&retmode=txt" > "${i}.fna"

The coding sequences (CDS) as protein FASTA:
curl -s "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=${i}&rettype=fasta_cds_aa&retmode=txt" > "${i}.cds.faa"

The coding sequences (CDS) as nucleotide FASTA:
curl -s "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=${i}&rettype=fasta_cds_na&retmode=txt" > "${i}.cds.fna"
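
If you need all four files for several accessions, the commands above can be wrapped in a small loop. This is only a sketch: the function names efetch_url, ext_for and fetch_all are made up for illustration, and the one-second pause is my own assumption to stay well under NCBI's request limit (3 requests per second without an API key). The -f flag makes curl return an error on an HTTP failure instead of saving the error page.

```shell
#!/usr/bin/env bash

# Build the efetch URL for one accession and one rettype
# (same URL pattern as the one-liners above).
efetch_url() {
  local acc=$1 rettype=$2
  printf 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=%s&rettype=%s&retmode=txt\n' "$acc" "$rettype"
}

# Map each rettype to the file extension used in this post.
ext_for() {
  case $1 in
    gb)           echo gbk ;;
    fasta)        echo fna ;;
    fasta_cds_aa) echo cds.faa ;;
    fasta_cds_na) echo cds.fna ;;
    *)            return 1 ;;   # unknown rettype
  esac
}

# Download all four formats for every accession given as an argument.
fetch_all() {
  local acc rettype
  for acc in "$@"; do
    for rettype in gb fasta fasta_cds_aa fasta_cds_na; do
      curl -sf "$(efetch_url "$acc" "$rettype")" > "${acc}.$(ext_for "$rettype")" \
        || echo "download failed: ${acc} (${rettype})" >&2
      sleep 1   # assumption: conservative pause between requests
    done
  done
}
```

After sourcing the script, run it as: fetch_all CP011547 (add further accession numbers separated by spaces).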