Database Schema

Ideally, we should be using the database schema worked out for the hullworks (Writers' Website Toolkit) project.

The following are specified for a MySQL database. The initial starting point here is the Christgau website database, although there are changes: dropping some of the idiosyncrasies of matching Christgau's legacy data, and adding some additional information that Christgau had no interest in, and/or that provide greater flexibility in manipulating the data.

The biggest difference between this and most music metadata databases is that we do not specify a fundamental role for songs. This is intended as a major simplification, since it eliminates the need for per-song data. On the other hand, it precludes album lookup by song/songwriter, and makes it hard to explore how songs are packaged into albums.

Also see the New Release tables.

Artist Information

Each album is attributed to one artist. Each artist has a name. Because the names alone are not necessarily unique, it may be necessary to add a qualifier to help distinguish artists who share the same name. (Figuring out a systematic way to use this field will be a problem: possible approaches include dates, genre, instrument.)

An artist may be a person or a group of people; an artist may be an alias. In general, there should be a distinct artist record for each distinct album artist attribution. However, it is useful to group some artist names together.

Table: tz_artist
idint unsigned auto_increment primary key: ID, used by other tables to refer to an artist.
namevarchar(250) not null: Artist name.
qualvarchar(128) null: Optional qualifier for name, used to uniquely identify artists with shared names.
sortvarchar(250) not null: Sort key for artist name.
ranksmallint: A ranking number assigned to each artist. Although assignment of such a number is arbitrary, the purpose of this field is only to support more selective artist lists. In particular, we typically wish to suppress artist names that are wholly subordinate to other artist names. Suggested values: 0: artist should not be indexed, since only appears as a secondary name for other artists; 1: default value, artist belongs in comprehensive indexes; 2: artist should appear as prominent in genre or other division; 3: artist should appear as prominent overall. It's possible that these values could be generated heuristically based on other database data.
Table: tz_artist_map
major_idinteger references tz_artist.id: Major artist identifier: this is a primary artist identity, and any minor artist identities mapped to it are included under the artist.
minor_idinteger references tz_artist.id: Minor artist identifier: this is a secondary artist identity, which is usually accessed under one or more major artist identities.

There is a many-to-many relationship between people and artists, so we break per-person information out into a separate table. The person table can be used for purposes other than mapping to the artist table, but that is its main purpose here.

Table: tz_person
idint unsigned auto_increment primary key: ID, used by other tables to refer to a person.
namevarchar(250) not null: Person name.
sortvarchar(250) not null: Sort key for person name.
bdaydate null: Birth date.
ddaydate null: Death date.
countryvarchar(60): Country/location, encoded with two-letter nation code, optionally followed by colon, state/province, colon, city. Multiple locations can be separated by semicolon. Each location can specify a set of years in curly braces.
Table: tz_group_map
person_idint unsigned references tz_person.id: Person.
group_idint unsigned references tz_artist.id: Group (artist) person maps to.

Label Information

Table: tz_label
idint unsigned auto_increment primary key: ID, used by other tables to refer to a label.
namevarchar(250) not null: Label name.
qualvarchar(128) null: Optional qualifier for name, used to uniquely identify labels with shared names.
sortvarchar(250) not null: Sort key for label name.
Table: tz_label_map
major_idinteger references tz_label.id: Major label identifier: this is the primary label identity, and any minor label identities to it are included under the label.
minor_idinteger references tz_label.id: Minor label identifier: this is a secondary label identity, which is usually accessed under one or more major label identities.

Album Information

Table: tz_album
idint unsigned auto_increment primary key: ID, used by other tables to refer to an artist.
artist_idint unsigned references tz_artist.id: Artist album is attributed to.
namevarchar(250) not null: Album name (title).
Table: tz_release
idint unsigned auto_increment primary key: ID, used by other tables to refer to an album release.
album_idint unsigned references tz_album.id: Album common information.
label_idint unsigned references tz_label.id: Label.
label_novarchar(80): Label identification number.
rel_datedate: Release date.