- Standardizes data formatting and data entry.
- Provides validation and automates data transformations.
- Represents a vital tool for data curation.
A crucial next step in data preparation is checking the contextual data from different provinces and labs before it is shared with the National Microbiology Laboratory’s access-controlled database and the public-access Canadian VirusSeq Data Portal, which launched in April 2021.
What does data curation look like for CanCOGeN?
The curation process involves:
- Checking the data for consistency and completeness, and verifying that it makes sense.
- Troubleshooting, developing and updating standards to align with public health needs.
- Converting data to ensure it meets the database requirements of different organizations.
- Post-submission corrections and updates.
See a more detailed breakdown of the curation process.
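As a rough illustration of the consistency and completeness checks described above, the sketch below flags missing fields and implausible dates in a single contextual-data record. The field names (`sample_id`, `province`, `collection_date`) and the function `check_record` are hypothetical and chosen only for this example; they are not taken from the CanCOGeN specification or from any tool the network uses.

```python
from datetime import date

# Hypothetical required fields; the real CanCOGeN specification may differ.
REQUIRED_FIELDS = ("sample_id", "province", "collection_date")


def check_record(record):
    """Return a list of human-readable problems found in one record."""
    problems = []

    # Completeness: every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            problems.append(f"missing {field}")

    # Plausibility: the collection date must parse and must not be in the future.
    raw_date = record.get("collection_date")
    if isinstance(raw_date, str) and raw_date.strip():
        try:
            collected = date.fromisoformat(raw_date.strip())
        except ValueError:
            problems.append("collection_date is not a valid ISO 8601 date")
        else:
            if collected > date.today():
                problems.append("collection_date is in the future")

    return problems


if __name__ == "__main__":
    example = {"sample_id": "S2", "province": "", "collection_date": "2021-13-01"}
    for problem in check_record(example):
        print(problem)
```

A curator could run a check like this over every incoming record and return the list of problems to the submitting lab. In practice the rules would come from the agreed data standard rather than a hard-coded list.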
Privacy and legal concerns
- Data sharing permissions vary across public health jurisdictions. Data curators must therefore be aware of the ethical, legal and privacy issues attached to different datasets and resources, for example those from access-controlled databases versus those from public-access databases.
- A CanCOGeN data curator coordinates with the National Microbiology Laboratory and provincial partners to ensure these considerations are addressed.
Next steps for curated data and remaining challenges
“The curated data goes beyond simply reporting by providing a framework for communicating data about how viral infections are being transmitted through a diverse population.”
– Nithu John, Research Assistant at Simon Fraser University and Curator for the Canadian VirusSeq Data Portal (CanCOGeN).
The Canadian COVID-19 Genomics Network (CanCOGeN) is on a mission to respond to COVID-19 by generating accessible and usable data from viral and host genomes to inform public health and policy decisions, and to guide treatment and vaccine development. This pan-Canadian consortium is led by Genome Canada, in partnership with six regional Genome Centres, the National Microbiology Laboratory and provincial public health labs, genome sequencing centres (through CGEn), hospitals, academia and industry across the country.