DNA barcoding is a powerful tool for species identification from any type of biological tissue. DNA barcodes are linked to voucher specimens in scientific collections (museums) and deposited in a public, open-access database (BOLD).
NorBOL is a network of Norwegian biodiversity institutions and individual scientists engaged in DNA barcoding of the fauna and flora of Norway. The Norwegian University Museums provide biological tissue samples from museum specimens. The Norwegian GBIF-node has now linked the first museum specimens in GBIF to their BOLD sequence data.
Linking barcode data from BOLD to specimens in GBIF has a high priority in the current GBIF work plan. In December 2016 the GBIF Science Committee (represented by chair Rod Page) published a snapshot of the iBOL dataset (doi:10.15468/inygc6) with a total of 2,789,906 occurrences, of which 39,956 are from Norway. However, the link to the museum specimens themselves was not maintained.
Examples from the GBIF iBOL 2016 dataset:
- GBIF iBOL occurrence record: https://www.gbif.org/occurrence/1415958347
- BOLD data record: http://bins.boldsystems.org/index.php/Public_RecordView?processid=LON2542-15
Linking Norwegian university museum specimens to BOLD
Together with the NorBOL coordinator at the UiO Natural History Museum in Oslo, Gunnhild Marthinsen and Lars Erik Johannessen at the NHMO DNA Bank, we have now started to link the specimens at NHMO to the corresponding BOLD DNA barcode data records. The most reliable specimen identifier in GBIF is the dwc:occurrenceID. There is also the traditional and more human-readable dwc:catalogNumber identifying a museum specimen. The BOLD Process ID is the most important identifier for material samples corresponding to the museum specimens. BOLD also provides a "Museum ID" and a "Sample ID"; however, neither exactly matches the occurrenceID or the catalogNumber in GBIF.
| GBIF | BOLD |
| --- | --- |
| occurrenceKey = 1426521030 | Process ID = NOBAS010-14 |
| occurrenceID = urn:catalog:O:F:75130 | Museum ID = O-F-75130 |
| catalogNumber = 75130 | Sample ID = O-F-75130 |
| eventID/fieldNumber = [blank] | Field ID = MY1-0568 |
| GBIF API: 1426521030 | BOLD API: NOBAS010-14 |
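
In the example record above, the BOLD Museum ID and the GBIF occurrenceID appear to differ only in their prefix and separator. The following Python sketch shows that conversion under a loud assumption: the pattern `collection-subcollection-number` is inferred from this single record (O-F-75130) and is not a documented rule for either BOLD or GBIF. The function names are hypothetical.

```python
def museum_id_to_occurrence_id(museum_id: str) -> str:
    """Turn a BOLD Museum ID such as 'O-F-75130' into a GBIF-style
    occurrenceID such as 'urn:catalog:O:F:75130'.

    Assumption: the ID has exactly three hyphen-separated parts, as in
    the one example record; other collections may follow other patterns.
    """
    parts = museum_id.split("-", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"unexpected Museum ID format: {museum_id!r}")
    return "urn:catalog:" + ":".join(parts)


def museum_id_to_catalog_number(museum_id: str) -> str:
    """Return the trailing part of the Museum ID, which matches the
    catalogNumber in the example record (same assumption as above)."""
    parts = museum_id.split("-", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"unexpected Museum ID format: {museum_id!r}")
    return parts[2]
```

Any identifier that does not fit the assumed pattern raises an error rather than producing a guessed match, which seems the safer choice when linking specimen records.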
![BOLD](/news/2017/images/bold-nobas010-14.png)
API: http://www.boldsystems.org/index.php/API_Public/sequence?ids=NOBAS010-14
To establish the link, Gunnhild provided a two-column table with the BOLD Process ID and the catalogNumber. Based on this list we used the BOLD API to extract data from BOLD. Information from BOLD was mapped to the Darwin Core MeasurementOrFact extension and added to the museum specimen records published in GBIF. The BOLD Process ID identifiers will be registered in the national Norwegian museum collection management system (MUSIT) to allow for automatic mapping of all the Norwegian museum collections to BOLD in a similar manner. With this pilot we linked 390 fungi specimens to BOLD, and we plan to implement automatic routines for mapping all Norwegian museum collections in GBIF to BOLD.
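
The workflow described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the routine actually used in the pilot: the column names `processid` and `catalogNumber`, the occurrenceID prefix, the `measurementType` label, and the use of `measurementRemarks` for the BOLD URL are all hypothetical choices. The BOLD endpoint is the one quoted in this post, and the output keys are standard Darwin Core MeasurementOrFact terms.

```python
import csv
import io
from urllib.parse import urlencode

# Endpoint as quoted in this post.
BOLD_SEQUENCE_ENDPOINT = "http://www.boldsystems.org/index.php/API_Public/sequence"


def bold_sequence_url(process_id: str) -> str:
    """Build the BOLD API URL for one Process ID."""
    return f"{BOLD_SEQUENCE_ENDPOINT}?{urlencode({'ids': process_id})}"


def measurement_rows(link_table_csv: str, occurrence_prefix: str = "urn:catalog:O:F:"):
    """Map a two-column link table to MeasurementOrFact-style rows.

    Assumptions (hypothetical): the table has the headers 'processid'
    and 'catalogNumber', and the occurrenceID is the prefix plus the
    catalogNumber, as in the example record in this post.
    """
    rows = []
    for record in csv.DictReader(io.StringIO(link_table_csv)):
        process_id = record["processid"].strip()
        catalog_number = record["catalogNumber"].strip()
        rows.append({
            "occurrenceID": occurrence_prefix + catalog_number,
            "measurementType": "BOLD Process ID",  # illustrative label
            "measurementValue": process_id,
            "measurementRemarks": bold_sequence_url(process_id),
        })
    return rows
```

Calling `measurement_rows("processid,catalogNumber\nNOBAS010-14,75130\n")` yields one row keyed to `urn:catalog:O:F:75130`. The sketch only builds URLs and rows; fetching and parsing the BOLD response is left out because the response format is not described here.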
![GBIF Portal](/news/2017/images/gbif-urn-catalog-o-f-75130.png)
API: http://api.gbif.org/v1/occurrence/1426521030/verbatim
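
To check what was added to a published record, the verbatim endpoint above can be queried from a script. The Python sketch below assumes that the verbatim JSON carries extension rows under an `extensions` key, indexed by the row type URI and with full Darwin Core term URIs as field names. That layout is an assumption here and should be verified against an actual response. The function names are hypothetical.

```python
import json
from urllib.request import urlopen

GBIF_API = "http://api.gbif.org/v1"
DWC = "http://rs.tdwg.org/dwc/terms/"
MOF_ROW_TYPE = DWC + "MeasurementOrFact"


def verbatim_url(occurrence_key) -> str:
    """Build the verbatim-record URL for a GBIF occurrenceKey."""
    return f"{GBIF_API}/occurrence/{occurrence_key}/verbatim"


def fetch_verbatim(occurrence_key) -> dict:
    """Download and parse the verbatim record (needs network access)."""
    with urlopen(verbatim_url(occurrence_key)) as response:
        return json.load(response)


def measurements(verbatim_record: dict):
    """Return (measurementType, measurementValue) pairs from the record.

    Assumption: extension rows live under 'extensions', keyed by row
    type URI; a record without them yields an empty list.
    """
    rows = verbatim_record.get("extensions", {}).get(MOF_ROW_TYPE, [])
    return [
        (row.get(DWC + "measurementType"), row.get(DWC + "measurementValue"))
        for row in rows
    ]
```

For example, `measurements(fetch_verbatim(1426521030))` would list whatever MeasurementOrFact rows are attached to the example specimen, provided the response has the assumed layout.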
Some of the other examples:
- https://www.gbif.org/occurrence/1141981356
- https://www.gbif.org/occurrence/1229492086
- https://www.gbif.org/occurrence/1229542859
- https://www.gbif.org/occurrence/1229539073
- https://www.gbif.org/occurrence/1229507876
- https://www.gbif.org/occurrence/1229507961
- https://www.gbif.org/occurrence/1229507901
- https://www.gbif.org/occurrence/1095063188
- https://www.gbif.org/occurrence/1229507959
- ...
Join us on GitHub to discuss the mapping of terms from the BOLD API!
https://github.com/GBIF-Europe/bold_sequence
Barcode data in GBIF:
- Consortium for the Barcode of Life, https://www.gbif.org/participant/287 (GBIF member since 2005)
- International Barcode of Life Consortium, https://www.gbif.org/participant/353 (GBIF member since 2011)
- Centre for Biodiversity Genomics (BIOUG), University of Guelph, https://doi.org/10.5886/qzxxd2pa (dataset published July 2014)
- iPhylo blog post, http://iphylo.blogspot.no/2016/12/dna-barcoding-taxonomy-now-in-gbif.html (blog December 2016)
- iBOL dataset in GBIF, https://doi.org/10.15468/inygc6 (snapshot dataset December 2016, 2,789,906 occurrences)
Further reading:
- DNA Barcoding: https://en.wikipedia.org/wiki/DNA_barcoding
- BOLD - Barcode of Life Data Systems at the University of Guelph in Canada
- BOLD API, http://v4.boldsystems.org/index.php/api_home
- NorBOL - Norwegian Barcode of Life | twitter/norwbol
- CBOL - Consortium for the Barcode of Life, at Smithsonian
- iBOL - International Barcode of Life Project