GBIF Backbone → Catalogue of Life (COL 26.5 XR) identifier mapping

The TSV file gbif-col265.tsv.gz maps taxon identifiers from the legacy GBIF Backbone Taxonomy (integer usageKeys) to the corresponding taxa in the Catalogue of Life 26.5 Extended Release (COL 26.5 XR), which uses alphanumeric identifiers. It lets you translate any record, link, or analysis keyed on old GBIF backbone IDs to the taxonomy that GBIF now uses by default.

Why this mapping exists

GBIF has discontinued the periodic build of its own backbone taxonomy and switched its primary taxonomy to the Catalogue of Life Extended Release. The old GBIF Backbone is frozen but kept available for backwards compatibility, and every occurrence is now matched against both taxonomies in parallel. Anyone with existing data, code, or persistent links built on the integer backbone keys needs a crosswalk to the new COL identifiers — that is what this file provides.

Identifier formats

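The two taxonomies use different, non-overlapping identifier schemes, so the source of any key is unambiguous: GBIF Backbone usage keys are integers (for example 797), whereas COL 26.5 XR identifiers are alphanumeric strings (for example B6L67).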

File structure

Tab-separated values (TSV), one row per GBIF backbone usage, with a header row. Each row carries the source taxon from the GBIF Backbone (gbif: columns) and the taxon it was matched to in COL 26.5 XR (col: columns).
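As a minimal sketch of how the file could be consumed, the Python snippet below streams the gzipped TSV with the standard library and builds a dictionary from gbif:ID to col:ID. The function name load_mapping is hypothetical, and two things are assumptions not confirmed by this description: that the file is UTF-8 encoded, and that rows without a COL match leave col:ID empty.

```python
import csv
import gzip


def load_mapping(path):
    """Return {GBIF usage key (int): COL identifier (str)} for matched rows."""
    mapping = {}
    # The file is gzip-compressed TSV with a header row (UTF-8 is assumed).
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            col_id = (row.get("col:ID") or "").strip()
            if not col_id:
                # Assumption: rows without a COL match leave col:ID empty.
                continue
            mapping[int(row["gbif:ID"])] = col_id
    return mapping
```

Calling load_mapping("gbif-col265.tsv.gz") would then give a lookup table keyed on the integer backbone keys. Streaming row by row keeps memory use proportional to the two identifier columns rather than to the full file.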

GBIF Backbone (source)

Column                        Description
gbif:ID                       Integer usage key in the GBIF Backbone Taxonomy.
gbif:status                   Taxonomic status in the backbone (e.g. accepted, provisionally accepted).
gbif:rank                     Taxonomic rank (e.g. kingdom, genus, species).
gbif:scientificName           Scientific name (canonical, without authorship).
gbif:authorship               Name authorship.
gbif:kingdom ... gbif:genus   Higher classification (kingdom, phylum, class, order, family, genus).
gbif:species                  Parent species name for infraspecific records.

COL 26.5 XR (target)

Column                        Description
col:ID                        Alphanumeric identifier of the matched name in COL 26.5 XR.
col:rank                      Rank of the matched COL name.
col:scientificName            Matched COL scientific name (canonical).
col:authorship                Authorship of the matched COL name.
col:status                    Status of the matched name in COL (e.g. accepted, provisionally accepted, synonym).
col:acceptedID                If the match is a synonym, the COL ID of the accepted taxon it points to.
col:acceptedScientificName    Accepted COL name (when col:status is synonym).
col:acceptedAuthorship        Authorship of the accepted COL name.
col:kingdom ... col:genus     Higher classification of the accepted COL taxon.
col:classification            Full COL classification as a |-separated list of RANK:name authorship pairs, from genus up to domain.
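The col:classification field packs several values into one cell, so a small parser helps. The sketch below splits a value on | and then on the first colon of each part. The helper name parse_classification and the sample string are made up for illustration; real rows may differ in rank casing or spacing, and the name and authorship are kept together because the description gives no delimiter between them.

```python
def parse_classification(value):
    """Split a col:classification string into (rank, name_with_authorship) pairs."""
    pairs = []
    for part in value.split("|"):
        part = part.strip()
        if not part:
            continue
        # Split on the first colon only, so colons later in the text are kept.
        rank, _, name = part.partition(":")
        pairs.append((rank.strip(), name.strip()))
    return pairs


# Hypothetical value for illustration only; not copied from the file.
example = "GENUS:Papilio Linnaeus, 1758|FAMILY:Papilionidae|ORDER:Lepidoptera"
for rank, name in parse_classification(example):
    print(rank, name, sep="\t")
```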

Notes on interpretation

We generated the mapping primarily with the ChecklistBank matching services, and a basic analysis of the matching gaps has been done. Notably, none of the UNITE and GTDB molecular OTU names have a match in COL any longer; COL uses a newer release of UNITE (version 10).

Looking up COL identifiers via the GBIF matching service

You don't need this file for one-off lookups. The GBIF species-match API (v2) can resolve a GBIF Backbone usage key to its COL XR equivalent on demand. Point the match service at the COL Extended Release with checklistKey=xcol and pass the backbone key as taxonID=gbif:<usageKey>:

GET https://api.gbif.org/v2/species/match?checklistKey=xcol&taxonID=gbif:797

Here gbif:797 is the backbone key for the order Lepidoptera and xcol is the alias for the Catalogue of Life Extended Release (equivalently, checklistKey=7ddf754f-d193-4cc9-b351-99906754a03b). The response carries the matched COL usage; its key is the COL identifier — the same value as the col:ID column in this file:

{
  "usage": {
    "key": "B6L67",
    "name": "Lepidoptera",
    "rank": "ORDER",
    "status": "ACCEPTED"
  },
  "diagnostics": { "matchType": "VARIANT" }
  // the full response also includes the "classification" array
}
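The same lookup can be scripted. The following Python sketch builds the request URL shown above and reads usage.key from the response, using only the standard library. The function names are hypothetical, and returning None when the response has no "usage" object is an assumption about unmatched keys, not documented behaviour.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

MATCH_ENDPOINT = "https://api.gbif.org/v2/species/match"


def match_url(usage_key, checklist_key="xcol"):
    """Build the v2 match URL for a GBIF Backbone usage key."""
    params = {"checklistKey": checklist_key, "taxonID": f"gbif:{usage_key}"}
    # safe=":" keeps the "gbif:" prefix readable instead of percent-encoding it.
    return f"{MATCH_ENDPOINT}?{urlencode(params, safe=':')}"


def col_id_from_response(payload):
    """Return usage.key from a parsed response, or None if it is absent."""
    # Assumption: a response without a match carries no "usage" object.
    usage = payload.get("usage") or {}
    return usage.get("key")


def lookup_col_id(usage_key, timeout=30):
    """Query the live API (requires network access) and return the COL ID."""
    with urlopen(match_url(usage_key), timeout=timeout) as response:
        return col_id_from_response(json.load(response))
```

For example, match_url(797) reproduces the request above, and lookup_col_id(797) would perform it. For bulk translation the TSV file remains the better fit, since it avoids one HTTP request per key.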


Sources & downloads