Environmental Barcoding through Massively Parallelized Sequencing
Canada faces an ongoing challenge in assessing and reporting on the status of biodiversity at regional and national scales. This is driven by two linked factors: (1) the lack of a consistent methodology to measure biodiversity, particularly in remote habitats, and (2) a consequent lack of data observed and recorded in a consistent manner. As a result, Canadians remain uninformed about a key component of their natural resources and, equally important, sustainable economic development in Canada is constrained by a lack of data with which to balance economic benefits against environmental costs in an informed way.
Biodiversity monitoring programs in Canada are focused on the delivery of sound science to regulators for the purpose of timely and accurate reporting on the state of the environment, and on providing assurance to the Canadian public that their natural environment is being protected. Such information is of equal importance and value to industries involved in natural resource extraction (mining, oil & gas), natural resources management (forestry), and agriculture. All of these industries benefit, in both home and overseas markets, from the ability to demonstrate that they are following best practices in environmental management and the stewardship of natural resources.
The exploitation of Canada’s North is expanding as technological improvements and rising commodity prices make northern reserves economically attractive. For example, oil sands alone are estimated to be worth ca. $350 billion (5% of Canada’s national capital), yet the loss of environmental goods and services (e.g. carbon sinks, water purification, country foods) associated with their extraction remains unquantified. In this regard, DNA-based biomonitoring offers a rapid, cost-effective and information-rich method of data collection, which can be easily integrated into long-term ecosystem monitoring programs and site-specific inventory and assessment.
Integration of DNA-based taxonomic analysis into Canada’s National Parks system is already underway. There, the need to manage change without compromising visitor experience presents a unique challenge for Parks management. Parks managers are keen to extend their management tools to embrace this new technology, provided it can be made more accessible and can deliver timely feedback to support management needs.
Environmental barcoding is a democratising technology, in that it makes available to Indigenous peoples information that was previously obtainable only from a small group of experts. The rapid assessment of hunting and fishing areas, and of habitats for key species at risk such as fish, waterfowl and other game, and for culturally critical species such as caribou, would be a significant contribution to the sound and sustainable management of these resources.
The need to provide rapid and accurate advice regarding the onset and impacts of a changing climate remains uppermost in the minds of Canadians, as our country is on the front line of climate impacts. The northward spread of southern species is predicted to occur, with unforeseen consequences for ecosystem management. DNA-based taxonomy, linked to next-generation sequencing, is a key technology that promises to deliver data coverage on a scale unprecedented in previous biosurvey approaches. This will allow better alignment with other earth observing systems, including satellite observation and geographic information systems, two areas with strong potential synergy in relation to sustainable management of natural resources and two frontier areas where Canada is an international leader.
In this project we developed an approach for the optimal use of next-generation sequencing (NGS) in environmental analysis. In other words, our project sets the stage for moving powerful NGS tools from basic academic research to real-world biomonitoring applications. The environmental barcoding technology is now available to Canadian agencies and academic users, as well as to the wider scientific community and international organizations. Our software pipeline for the analysis of environmental barcoding data has been developed as an open-source online system, as originally planned in our proposal, and is freely accessible to Canadian and international users.
DNA barcoding represents a novel genomics exercise: the collection of sequence information from a standard gene region across eukaryotic life. A 648 base pair segment of the mitochondrial gene cytochrome c oxidase subunit 1 (CO1) has now been selected as the core barcode region for eukaryotes. The horizontal survey of sequence diversity in this gene region is valuable in many contexts: it enables species identification and discovery, it reveals factors influencing rates of molecular evolution and species age, and it allows detailed study of evolutionary pathways in the CO1 protein. Motivated by these factors, planning is now underway for a massive international research program to rapidly expand the reference library of CO1 sequences. The International Barcode of Life Project (IBOL), a $150M, 5-year program driven by Canada, will involve researchers from 25 nations. Over this interval, IBOL will deliver barcode records for 500K species, and subsequent efforts will produce a barcode reference library for all eukaryotes.
Although completion of a library for all eukaryotes may require 20 years, DNA barcoding is already gaining application as barcode libraries reach closure in varied groups. It is now clear that one particularly important area of application for DNA barcoding will lie in species identifications. It is also apparent that such applications will involve two different technology streams: point-of-contact analysis of single specimens and massive barcode screens. Point-of-contact devices allowing immediate analysis will be critical for species identifications in contexts such as port inspections and pest control.
By contrast, the second technology stream, the focus of this application, will enable the analysis of mixed biotic samples, albeit less rapidly. We emphasize that many eukaryotes are too small, too numerous or too conjoined to be analyzed using conventional barcode protocols. Our current application will break this barrier by developing the protocols required for the analysis of any collection of eukaryotes. We term this approach ‘environmental barcoding’ and are sure that its implementation lies in coupling massively parallelized sequencing technologies with new informatics tools. Such analysis will certainly advance biodiversity monitoring.
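As a rough illustration of the informatics step described above, the following Python sketch assigns sequence reads from a mixed sample to reference barcodes and tallies the resulting community profile. It is not the project's pipeline: the function names, the k-mer containment score, the thresholds and the toy sequences (which are not real CO1 barcodes) are all assumptions made for illustration. A production system would more likely rely on alignment-based searches against a curated reference library, with quality filtering beforehand.

```python
from collections import Counter

# Hypothetical toy reference library (made-up sequences, not real CO1 barcodes).
TOY_REFERENCES = {
    "taxon_A": "ACGTACGGTCAGTTCAGGCATCGATCGGATTACA",
    "taxon_B": "TTGACCGGTAAGCTTAGGCCTAGGATCCAAGTCT",
}


def kmers(seq, k=8):
    """Return the set of all k-length substrings of a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def containment(read_kmers, ref_kmers):
    """Fraction of the read's k-mers that also occur in the reference."""
    if not read_kmers:
        return 0.0
    return len(read_kmers & ref_kmers) / len(read_kmers)


def assign_read(read, reference_kmers, k=8, min_similarity=0.5):
    """Return the best-matching taxon, or None if no reference is close enough."""
    read_kmers = kmers(read, k)
    best_taxon, best_score = None, 0.0
    for taxon, ref_kmers in reference_kmers.items():
        score = containment(read_kmers, ref_kmers)
        if score > best_score:
            best_taxon, best_score = taxon, score
    return best_taxon if best_score >= min_similarity else None


def community_profile(reads, references, k=8, min_similarity=0.5):
    """Count reads per taxon for a mixed sample; unmatched reads are 'unassigned'."""
    reference_kmers = {taxon: kmers(seq, k) for taxon, seq in references.items()}
    counts = Counter()
    for read in reads:
        taxon = assign_read(read, reference_kmers, k, min_similarity)
        counts[taxon if taxon is not None else "unassigned"] += 1
    return counts


# Toy mixed sample: fragments cut from the references plus one unrelated read.
sample_reads = [
    TOY_REFERENCES["taxon_A"][4:20],
    TOY_REFERENCES["taxon_A"][10:30],
    TOY_REFERENCES["taxon_B"][5:21],
    "GGGGGGGGCCCCCCCC",
]
profile = community_profile(sample_reads, TOY_REFERENCES)
```

Containment is used here rather than a symmetric similarity measure because reads are typically shorter than the full barcode; the choice of k and of the minimum similarity would need tuning against real data.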
Imagine the newly sophisticated capacity to monitor environmental quality that would result if we could rapidly gain information on the species composition of any environmental sample. However, the implications of our work are broader – environmental barcoding represents the metagenomics tool for eukaryotic life. This application seeks funding to develop environmental barcoding as a technology stream that can be adopted by sequencing platforms. Our work will make use of new parallelized sequencers, but considerable technological innovation will be required to enable them to support environmental barcoding. We point particularly to the need for new informatics tools to analyze sequence data and to the need for new protocols to enable barcode recovery from large, admixed samples of life. We not only expect to conquer these challenges, but we also plan work to show how environmental barcoding can support a real-world need, the biomonitoring of Canada’s inland waters.
We will carry out the latter work in close collaboration with researchers at Environment Canada, who not only bring deep expertise in environmental sampling but will also be important end users of this technology. We will also collaborate with colleagues at the Stanford Genome Technology Center, who are world leaders in massively parallelized sequencing technologies. The environmental barcoding technology developed in this project will reinforce Canada’s leadership position in DNA barcoding by introducing the first application of this approach to biodiversity monitoring of eukaryote communities.