BLAST provides the following databases; here is what each one contains:
https://ncisf.org/index.php?q=software-databases/blast-databases
A list of the databases available on the cluster, including information about each database: its type, source, update method, and description.
All databases are located in /sw/db.
| name | type | update method | source | description |
|---|---|---|---|---|
| nt | nucleic | automatic - NCBI formatted | ftp://ftp.ncbi.nih.gov/blast/db/nt.* | Nucleotide sequence database, with entries from all traditional divisions of GenBank, EMBL, and DDBJ, excluding bulk divisions (GSS, STS, PAT, EST, and HTG divisions). WGS entries are also excluded. Not non-redundant. |
| nr | protein | automatic - NCBI formatted | ftp://ftp.ncbi.nih.gov/blast/db/nr.* | Non-redundant protein sequence database with entries from GenPept, Swiss-Prot, PIR, PDF, PDB, and NCBI RefSeq. |
| swissprot | protein | automatic - NCBI formatted | ftp://ftp.ncbi.nih.gov/blast/db/swissprot.tar.gz | Swiss-Prot sequence database (last major update). Its parent database is nr. |
| human_genomic | nucleic | automatic - NCBI formatted | ftp://ftp.ncbi.nih.gov/blast/db/human_genomic.* | Human RefSeq (NC_######) chromosome records with gap-adjusted concatenated NT_ contigs. |
| est_human | nucleic | automatic - NCBI formatted | ftp://ftp.ncbi.nih.gov/blast/db/est_human.* | Alias and mask files for the human subset of the EST database. These alias and mask files need all volumes of est to function properly. |
| pataa | protein | automatic - NCBI formatted | ftp://ftp.ncbi.nih.gov/blast/db/pataa.* | Patent protein sequence database. Entries come directly from the USPTO, or from the EU/Japan patent agencies via EMBL/DDBJ. |
| patnt | nucleic | automatic - NCBI formatted | ftp://ftp.ncbi.nih.gov/blast/db/patnt.* | Patent nucleotide sequence database. Entries come directly from the USPTO, or from the EU/Japan patent agencies via EMBL/DDBJ. |
| pdbaa | protein | automatic - NCBI formatted | ftp://ftp.ncbi.nih.gov/blast/db/pdbaa.* | Protein sequences from PDB protein structures. Its parent database is nr. |
| pdbnt | nucleic | automatic - NCBI formatted | ftp://ftp.ncbi.nih.gov/blast/db/pdbnt/* | Nucleotide sequences from PDB nucleic acid structures. Its parent database is nt. They are not the protein-coding sequences for the corresponding pdbaa entries. |
| sts | nucleic | automatic - NCBI formatted | ftp://ftp.ncbi.nih.gov/blast/db/sts.* | Sequences from the STS division of GenBank, EMBL, and DDBJ. |
| vector | nucleic | automatic - NCBI formatted | ftp://ftp.ncbi.nih.gov/blast/db/vector.* | Vector sequence database. (Note that for vector screening, NCBI recommends using the UniVec database; please contact support@qfab.org should you require this database.) |
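
Since the type column (nucleic or protein) decides which BLAST program can search a given database, here is a minimal sketch of how one might pick the program and build the command line. It assumes the BLAST+ command-line programs (blastn, blastp, blastx, tblastn) are installed and that the databases live in /sw/db as stated above. The names `DB_TYPES` and `blast_command`, the query file names, and the default output file are illustrative only and do not come from the cluster documentation.

```python
from pathlib import PurePosixPath

# Database directory as stated in the notes above; adjust for another cluster.
DB_DIR = PurePosixPath("/sw/db")

# Database name -> sequence type, transcribed from the table above.
DB_TYPES = {
    "nt": "nucleic",
    "nr": "protein",
    "swissprot": "protein",
    "human_genomic": "nucleic",
    "est_human": "nucleic",
    "pataa": "protein",
    "patnt": "nucleic",
    "pdbaa": "protein",
    "pdbnt": "nucleic",
    "sts": "nucleic",
    "vector": "nucleic",
}


def blast_command(db_name, query_fasta, query_is_protein=False, out_path="hits.tsv"):
    """Build a BLAST+ argument list for one of the databases listed above.

    Hypothetical helper: it only assembles the command and does not run it.
    """
    if db_name not in DB_TYPES:
        raise ValueError(f"unknown database: {db_name!r}")
    db_is_protein = DB_TYPES[db_name] == "protein"

    # Choose the program from the (query type, database type) pair.
    program = {
        (False, False): "blastn",   # nucleotide query, nucleotide database
        (False, True): "blastx",    # translated nucleotide query, protein database
        (True, True): "blastp",     # protein query, protein database
        (True, False): "tblastn",   # protein query, translated nucleotide database
    }[(query_is_protein, db_is_protein)]

    return [
        program,
        "-query", str(query_fasta),
        "-db", str(DB_DIR / db_name),   # BLAST takes the database base name, no extension
        "-outfmt", "6",                 # tabular output
        "-out", str(out_path),
    ]
```

For example, `blast_command("swissprot", "query.faa", query_is_protein=True)` yields a blastp invocation against /sw/db/swissprot, and the returned list can be passed to `subprocess.run` on a machine where BLAST+ is available.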
