Collections and specimens


Accelerating the rate at which historic specimen-based data are made discoverable and accessible.

Progress: significant (significant progress made, further investment needed to complete)


The past 250 years of biodiversity research have resulted in a treasure trove of preserved specimens held in the world’s natural history collections, and they are still being added to today. These collections form the irreplaceable foundation of our knowledge of biodiversity, as well as a source of DNA samples for future analysis. Digitizing the data embedded within these specimens dramatically improves our understanding of species distributions, morphology and population variation, including changes over time. As with published materials, the scale of the task and its importance to the framework mean that this component requires continued, long-term investment. Widespread development and adoption of the most efficient techniques could dramatically accelerate current digitization efforts, making continued improvement of methodologies an urgent task.

In recent years, museums and herbaria have increasingly begun to capture the data contained in these specimens and their labels, and in accompanying field notes, making them available through aggregators such as GBIF, its network of national nodes and data publishers, and through thematic networks such as the Ocean Biogeographic Information System (OBIS). But the work is labour-intensive, and these efforts are dwarfed by the scale of the task. Some institutions have begun to accelerate digitization by exploring automation, adopting highly efficient workflows, and engaging volunteers. Others have pioneered crowdsourcing by making specimen images available online, for example through the Biodiversity Volunteer portal of the Atlas of Living Australia (ALA) and the ‘Herbonautes’ initiative of the Muséum National d’Histoire Naturelle in Paris.

The next step will be to document and share current best practices to help institutions choose the optimum approach to accelerate digitization. Institutions will need to prioritize digitization, while funding bodies and governments will have to make the resources available to develop the skills and deploy enough people to accelerate the task. Improvements in data quality are also fundamental as digitization rates accelerate, whether achieved through automated tools or through feedback mechanisms provided by the biodiversity knowledge network and fitness-for-use and annotation components.

In the short term, natural history collections should continue to develop and document accelerated digitization techniques, which organizations such as GBIF and its nodes can use to develop training materials and programmes for smaller institutions.

In the medium term, GBIF and digitization projects should develop global infrastructure to support accelerated workflows, including services for generating identifiers and clearing houses for crowdsourcing.

In the long term, fully automated mass digitization should eliminate most bottlenecks, clearing the way to complete the digitization effort.