Tag: Basic Local Alignment Search Tool (BLAST)

Now Available! NCBI Hidden Markov Models (HMM) Release 19.0

Download release 19.0 of the NCBI protein profile hidden Markov models (HMMs) used by the Prokaryotic Genome Annotation Pipeline (PGAP). Using the HMMER sequence analysis package, you can search this collection against your favorite prokaryotic proteins to identify their function.
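To make the HMMER search above concrete, here is a minimal Python sketch that builds an hmmsearch command line and reads back the tabular output written by HMMER's --tblout option. The function names, the file names, and the E-value threshold of 1e-5 are illustrative assumptions, not part of the NCBI release or a PGAP recommendation. Choosing the highest-scoring HMM per protein is likewise a simplification for illustration, not the way PGAP assigns function.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    protein: str   # target sequence name (column 1 of --tblout)
    hmm_name: str  # query HMM name (column 3)
    hmm_acc: str   # query HMM accession (column 4)
    evalue: float  # full-sequence E-value (column 5)
    score: float   # full-sequence bit score (column 6)


def hmmsearch_command(hmm_file, fasta_file, tbl_file, evalue=1e-5):
    """Build an argument list: hmmsearch [options] <hmmfile> <seqdb>.

    The file names and the E-value threshold are hypothetical choices.
    """
    return ["hmmsearch", "--tblout", tbl_file, "-E", str(evalue),
            hmm_file, fasta_file]


def parse_tblout(lines):
    """Parse the lines of an hmmsearch --tblout file into Hit records."""
    hits = []
    for line in lines:
        # Skip blank lines and the '#' comment lines HMMER writes.
        if not line.strip() or line.startswith("#"):
            continue
        # 18 whitespace-separated columns, then a free-text description.
        fields = line.split(maxsplit=18)
        hits.append(Hit(fields[0], fields[2], fields[3],
                        float(fields[4]), float(fields[5])))
    return hits


def best_hit_per_protein(hits):
    """Keep the highest-scoring HMM hit for each protein (a simplification)."""
    best = {}
    for hit in hits:
        if hit.protein not in best or hit.score > best[hit.protein].score:
            best[hit.protein] = hit
    return best
```

To run the search, pass the list from hmmsearch_command to subprocess.run with check=True, then open the table file and hand its lines to parse_tblout. This requires HMMER to be installed and on your PATH.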

What’s new?  

Release 19.0 contains:  

  • 18,513 HMMs maintained by NCBI  
  • 465 new HMMs since release 18.0  

Continue reading “Now Available! NCBI Hidden Markov Models (HMM) Release 19.0”

An Updated Bacterial and Archaeal Reference Genome Collection is Available!

Download the updated bacterial and archaeal reference genome collection! We built this collection of 22,420 genomes by selecting the “best” genome assembly for each species among the 450,000+ prokaryotic genomes in RefSeq. 

What’s new? 
  • One species is represented in this collection for the first time 
  • 323 species are represented by a better assembly
  • Six species were removed because of changes in NCBI Taxonomy or uncertainty in their species assignment 

Continue reading “An Updated Bacterial and Archaeal Reference Genome Collection is Available!”

Top Posts of 2025: A Look at the NCBI Insights Blog

As we begin a new year, let’s look back at the most-viewed NCBI Insights Blog posts of 2025!

In case you missed any of these, check them out:

Continue reading “Top Posts of 2025: A Look at the NCBI Insights Blog”

An Updated Bacterial and Archaeal Reference Genome Collection is Available!

Download the updated bacterial and archaeal reference genome collection! We built this collection of 22,082 genomes by selecting the “best” genome assembly for each species among the 440,000+ prokaryotic genomes in RefSeq. 

What’s new? 
  • 28 species are represented in this collection for the first time 
  • 228 species are represented by a better assembly 
  • Six species were removed because of changes in NCBI Taxonomy or uncertainty in their species assignment 

Continue reading “An Updated Bacterial and Archaeal Reference Genome Collection is Available!”

Now Available: Updated Bacterial and Archaeal Reference Genome Collection

Download the updated bacterial and archaeal reference genome collection! We built this collection of 21,794 genomes, 536 more than in the January release, by selecting the “best” genome assembly for each species among the 400,000+ prokaryotic genomes in RefSeq.

Continue reading “Now Available: Updated Bacterial and Archaeal Reference Genome Collection”

Faster, Better Results for Protein BLAST Searches

Effective August 2025, ClusteredNR will become the protein BLAST default database 

We are excited to announce that the default database for protein BLAST searches will soon be the NCBI ClusteredNR database! Introduced in 2022, ClusteredNR is a collection of protein sequence clusters built from the current default database, nr. A representative sequence is chosen for each cluster; it is generally well annotated and indicates the function of the proteins in the cluster, helping you focus on meaningful biological insights and reducing redundant results.

What’s better about ClusteredNR?
  • Faster searches 
  • Decreased redundancy in results 
  • Broader taxonomic coverage in results 

Continue reading “Faster, Better Results for Protein BLAST Searches”

An updated bacterial and archaeal reference genome collection is available!

Download the updated bacterial and archaeal reference genome collection! We built this collection of 21,258 genomes by selecting the “best” genome assembly for each species among the 400,000+ prokaryotic genomes in RefSeq.

What’s new?

As previously announced, we updated our release process:

  1. The release process is now incremental. In addition to quarterly releases, weekly updates will create references for new species that do not have a reference genome and correct any inconsistencies in the set of references caused by taxonomic merges. As a result, the reference set may be updated more frequently.
  2. There is now a history tracking file available under the ASSEMBLY_REPORTS path on FTP that lists the history of reference genome selection, including both prokaryotes and eukaryotes. 

Continue reading “An updated bacterial and archaeal reference genome collection is available!”

NCBI Taxonomy: Upcoming Changes to Viruses

To reflect changes to the International Code of Virus Classification and Nomenclature (ICVCN) made by the International Committee on Taxonomy of Viruses (ICTV), NCBI will add binomial species names to about 3,000 viruses. These updates to NCBI Taxonomy are planned for spring 2025, but you can view the changes now in the ICTV’s Virus Metadata Resource.

We recognize that the former species names like Human immunodeficiency virus 1 (HIV-1) are broadly used in public health, educational institutions, and research. To minimize the impact of this change on those who use NCBI resources, we will add the new binomial species names (e.g. Lentivirus humimdef1) while keeping the former names available in the lineage for each species. The former names will move below the new binomial species name in the taxonomy hierarchy, ensuring continuity. Examples are provided below.

Continue reading “NCBI Taxonomy: Upcoming Changes to Viruses”

Updated Bacterial and Archaeal Reference Genome Collection now Available!

Download the updated bacterial and archaeal reference genome collection! We built this collection of 20,403 genomes by selecting the “best” genome assembly for each species among the 350,000+ prokaryotic genomes in RefSeq (except for E. coli, for which two assemblies were selected as references). We changed the selection criteria, including upgrades for type and complete assemblies, which resulted in a much larger set of changes than in previous updates.

What’s New?
  • 2,298 species have an updated reference       
  • 1,123 species are represented in this collection for the first time
  • 1,125 species have a better reference assembly than in the April 2024 set
  • 50 species were removed because of changes in NCBI Taxonomy or uncertainty in their species assignment 

Continue reading “Updated Bacterial and Archaeal Reference Genome Collection now Available!”