Schema change release: May 19, 2025

MusicBrainz is announcing a new schema change release set for May 19, 2025. Erratum: Search upgrades will shortly follow it. Like most of our recent schema changes, it should have little or no impact on downstream users.

There is one change to a major replicated table worth mentioning upfront: the medium table will have a new gid column added. If you’re running custom SQL queries against the database that join the medium table at all, there is a small chance you could run into errors like ERROR: column reference "gid" is ambiguous if you’re not properly qualifying the columns being selected.
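
To make the ambiguity concrete, here is a minimal sketch of the failure and the fix. It uses an in-memory SQLite database as a stand-in for PostgreSQL, and the two tables are heavily simplified assumptions rather than the real MusicBrainz schema. SQLite words the error differently, but the cause is the same: once both joined tables have a gid column, an unqualified gid no longer identifies one of them.

```python
import sqlite3

# Simplified stand-ins for the real tables (assumption: only the columns
# needed to show the problem are included).
conn = sqlite3.connect(":memory:")
conn.executescript('''
    CREATE TABLE "release" (id INTEGER PRIMARY KEY, gid TEXT, name TEXT);
    CREATE TABLE medium (id INTEGER PRIMARY KEY, gid TEXT, "release" INTEGER, position INTEGER);
''')
conn.execute('INSERT INTO "release" VALUES (?, ?, ?)', (1, "release-mbid", "Example Release"))
conn.execute("INSERT INTO medium VALUES (?, ?, ?, ?)", (10, "medium-mbid", 1, 1))


def run(sql):
    """Run a query and return its rows, or the error message if it fails."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.OperationalError as exc:
        return f"error: {exc}"


# Unqualified: "gid" could refer to either table once medium has its own gid.
UNQUALIFIED = 'SELECT gid, position FROM "release" JOIN medium ON medium."release" = "release".id'

# Qualified: every column names the table (alias) it comes from.
QUALIFIED = 'SELECT r.gid, m.gid, m.position FROM "release" AS r JOIN medium AS m ON m."release" = r.id'

unqualified_result = run(UNQUALIFIED)
qualified_result = run(QUALIFIED)

print(unqualified_result)
print(qualified_result)
```

Qualifying every selected column with its table name or alias keeps a query working regardless of which columns are added to the joined tables later.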

We’re also altering some columns on the artist_release and artist_release_group tables (see below for more details). These are materialized tables used by our website on the back-end to speed up certain pages; you should normally not be accessing them directly, but it’s worth mentioning just in case. These tables do exist on mirrors, but are only populated with data if you’ve run admin/BuildMaterializedTables before.

Besides replacing some functions/triggers, you generally shouldn’t have to worry about any other breaking database changes in this release.

Finally, here is the complete list of scheduled tickets:

Database schema

The following tickets change the database schema in some way.

  • MBS-9253: List EP release groups above singles on artist pages. A small change to the get_artist_release_group_rows function is required in order to be able to change the sorting of release groups to prioritize EPs over singles. The function will be changed to depend on the type’s child_order (which can be safely changed at any time) rather than its id for sorting. While this function exists on mirrors, the function change shouldn’t have any impact on them directly (but a change of the child_order of the types will affect the sorting for display on mirrors as well). We’ll be adding new triggers to the release_group_primary_type and release_group_secondary_type tables to run the function when the tables change – these triggers will also exist on mirrors.
  • MBS-13322: Race condition when removing unused URLs. A rare internal error can occur in one of our trigger functions that cleans up unused URLs. We’ll replace that function, delete_unused_url, updating it to avoid a “race condition” whereby a URL can become used again the moment before it’s deleted. This will have no impact on mirrors, as delete_unused_url is only invoked by triggers that don’t exist on mirrors.
  • MBS-13464: Inconsistent sorting of artist release/release group titles. In the May 2021 schema change, we added some new materialized tables to significantly speed up the loading of artists’ release and release group listings: the not-so-surprisingly named artist_release and artist_release_group tables. These work by efficiently indexing an artist’s releases and release groups by date and other attributes, and then finally by their titles. Except that, for efficiency reasons, we originally decided to store only the first character of each title for sorting. That predictably leads to incorrect sorting in certain cases, like with undated live bootlegs, as shown in MBS-13464. After measuring the actual size impact, we’ve decided to update the artist_release and artist_release_group tables to replace their sort_character columns with name columns that store the complete titles.
  • MBS-13768: Add MBIDs to mediums. Adds a gid column to the medium table, and a new medium_gid_redirect table. The upgrade generates MBIDs for existing mediums, and these will be replicated to mirrors.
  • MBS-13832: Also support PDF files in CAA / EAA index_listing (for is_front purposes). PDF files are never treated as front for cover art archive purposes, probably because they originally did not have PNG thumbnails generated by the Internet Archive. That changed quite a while ago though, and there seems to be no reason to single them out anymore. We will just replace the index_listing views for cover_art_archive and event_art_archive with ones amended to not filter out PDF files.

    Note: Databases created before schema 25 (2017) may be missing filesize columns from their cover_art_archive.index_listing view (MBS-14014). This upgrade will add those columns. Since this view is for internal use, and these columns already exist on databases created in the past 8 years, we believe this shouldn’t pose any real compatibility issue.
  • MBS-13964: Some recordings are missing a first release date. A bug was discovered that causes recordings to sometimes have incorrect first-release-date values if any of the releases they’re attached to are merged with the “append” strategy. We’ll be adding a new trigger to the medium table that updates recording_first_release_date properly when such merges occur. Note that since recording_first_release_date is a materialized table, this trigger will also run on mirrors; that way it’s kept up-to-date even after running admin/BuildMaterializedTables initially.
  • MBS-13966: Release group first release dates need to be recalculated. Another (unrelated) issue with “first release date” information, but this time with release groups rather than recordings. We’ve found that a very small percentage of release groups’ first release dates (as stored in the release_group_meta table and returned in the web service) are wrong; it’s uncommon, but one way this can occur is when all releases in a release group are moved out of it. To address this, we’ll update the set_release_group_first_release_date function, which will have the exact same signature as before. We’ll also run a script to rebuild the incorrect data.
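
As a rough illustration of the sorting problem described under MBS-13464 above, the sketch below compares ordering by the first character only with ordering by the complete title. The titles are made up, and plain Python string comparison stands in for the database’s collation, so treat it as a toy model of the idea rather than of the actual tables or indexes.

```python
# Hypothetical undated releases: with no dates to sort on, the title is
# the only thing left to decide the order.
titles = [
    "Live in Tokyo",
    "Live in Berlin",
    "Live at the Apollo",
]


def sort_by_first_character(items):
    """Mimic a sort_character-style key: only the first character is compared."""
    return sorted(items, key=lambda title: title[:1])


def sort_by_full_title(items):
    """Mimic a name-style key: the complete title is compared."""
    return sorted(items)


# Every title starts with "L", so the first-character key cannot tell them
# apart and the input order is kept as-is (Python's sort is stable).
print(sort_by_first_character(titles))

# Comparing full titles gives the expected alphabetical order.
print(sort_by_full_title(titles))
```

With a one-character key, any group of titles sharing a first letter ends up in whatever order the rows happened to arrive in, which is why storing the full title fixes the listings.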

Erratum: This list originally included MBS-13965: Extend entity attribute schema to mediums – until we realized this was actually already done eight years ago. Isn’t time weird.

Update 2025-05-02: MBS-13966 was moved to this section, because we found out that a function needs to be updated to fix the underlying issue.

Update 2025-05-08: MBS-14014 has been fixed and mentioned as part of MBS-13832 above.

Search indexes

Erratum: Solr 9 will be made available for mirrors in a separate release.

Data corrections to the recording_first_release_date and release_group_meta tables do affect indexed recording and release group data respectively. If you have live search indexing enabled, those changes should be propagated to the search indexes automatically. Otherwise, you will have to perform a full reindex of those entities’ search indexes.

We’ll post upgrade instructions for standalone/mirror servers on the day of the release. If you have any questions, feel free to comment below or on the relevant above-linked tickets.

2 thoughts on “Schema change release: May 19, 2025”

  1. The first complete set of data dump files (MB & Search) will be available on Wednesday, 21.05.2025?

  2. Yes, but the search will still be powered by Solr 7.

    The Solr 9 upgrade experienced a few more issues that forced us to delay it to a separate release. Hopefully, later on this week if possible.
