Effective August 2025, ClusteredNR will become the protein BLAST default database
We are excited to announce that the default database for protein BLAST searches will soon be the NCBI ClusteredNR database! Introduced in 2022, ClusteredNR is a collection of protein sequence clusters built from the current default database, nr. Each cluster has a representative sequence, which is generally well annotated and indicates the function of the proteins in the cluster, helping you focus on meaningful biological insights and reducing redundant results.
Reference genome selection now follows an incremental process. In addition to quarterly releases, weekly updates will create references for new species that do not have a reference genome and correct any inconsistencies in the set of references caused by taxonomic merges. As a result, the reference set may be updated more frequently.
A history tracking file is now available under the ASSEMBLY_REPORTS path on FTP. It lists the history of reference genome selection for both prokaryotes and eukaryotes.
We recognize that the former species names like Human immunodeficiency virus 1 (HIV-1) are broadly used in public health, educational institutions, and research. To minimize the impact of this change on those who use NCBI resources, we will add the new binomial species names (e.g. Lentivirus humimdef1) while keeping the former names available in the lineage for each species. The former names will move below the new binomial species name in the taxonomy hierarchy, ensuring continuity. Examples are provided below.
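To illustrate what "moving below the new binomial species name" could mean for a script that works with lineages, here is a minimal toy sketch. It is not NCBI's data model or API: the `PARENT` table, the `BINOMIAL_SPECIES` set, and the `binomial_species` helper are all hypothetical, and the ranks assigned to each node are assumptions made for illustration only. Only the two names themselves come from the announcement.

```python
# Toy hierarchy (assumption): each node points to its parent.
# The former name sits directly below the new binomial species name.
PARENT = {
    "Human immunodeficiency virus 1": "Lentivirus humimdef1",
    "Lentivirus humimdef1": "Lentivirus",
}

# Hypothetical marker for which nodes are binomial species names.
BINOMIAL_SPECIES = {"Lentivirus humimdef1"}


def binomial_species(name):
    """Walk up the toy hierarchy until a binomial species node is found."""
    current = name
    while current is not None:
        if current in BINOMIAL_SPECIES:
            return current
        current = PARENT.get(current)
    return None  # no binomial species name above this node


print(binomial_species("Human immunodeficiency virus 1"))
```

In this sketch, a lookup that starts from the former name still reaches the new binomial name by walking up the lineage, which is the kind of continuity the change is meant to preserve.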
Download the updated bacterial and archaeal reference genome collection! We built this collection of 20,403 genomes by selecting the “best” genome assembly for each species among the 350,000+ prokaryotic genomes in RefSeq (except for E. coli, for which two assemblies were selected as references). The selection criteria have changed, including upgrades for type and complete assemblies, so this update contains many more changes than previous updates.
What’s New?
2,298 species have an updated reference
1,123 species are represented in this collection for the first time
1,125 species have a better reference assembly than in the April 2024 set
50 species were removed because of changes in NCBI Taxonomy or uncertainty in their species assignment
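The figures above appear to fit together as a total followed by its breakdown. A quick arithmetic check, assuming the last three counts are the components of the first (the announcement does not state this explicitly, and the dictionary keys are labels chosen here for illustration):

```python
# Counts taken from the list above.
total_updated = 2298

# Assumption: these three categories partition the total.
breakdown = {
    "represented for the first time": 1123,
    "better reference assembly": 1125,
    "removed": 50,
}

breakdown_sum = sum(breakdown.values())
print(breakdown_sum, breakdown_sum == total_updated)
```

Under that assumption, the three categories account for every species counted in the first line.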