Adding content from the Nordic Pandemic Infrastructure (NeIC PaRI) project #37
Great work @wna-se
Caught up with Wolmar and wanted to summarise our talk (with Wolmar's permission): The PR is still in progress (more content will be added as time goes by). However, at this stage, it might be good to give Wolmar some feedback on whether things are where they are 'supposed to be' and where other pieces might go. It would therefore be beneficial for the editors to comment at this stage (I will commit some more time to all of this in general tomorrow).
I started reading this through (looks really good!), but didn't continue reading thoroughly because I started to wonder where this should go. Right now, it's added as a 'showcase'. I see the logic in placing it as a showcase because it's a project that brought together multiple technologies. However, when we defined 'showcases', I thought that they would be more like the COVID-19 Data Portal/Platform, i.e. tools. Under the strictest definition of 'tool', this would not really fit as a showcase. However, I think that it would be useful to keep it all together, and maybe a showcase is the best way to do that. I think it might also be good to have this information at least linked in the 'pathogen characterisation' sections though. Having it linked from various sections (rather than actually split between sections) would perhaps make maintenance of this content easier than breaking it all up. Where it fits best depends on our definition of 'what is a showcase', though.
Produced by the Nordic Pandemic Research Infrastructure project (NeIC PaRI)
[https://neic.no/pari/](https://neic.no/pari/)

This document aims to 1) outline what constitutes a good and ultimately (re)usable viral genome data record in a data repository with a Nordic perspective; and 2) provide guidance to projects, labs and other organisations producing or commissioning viral sequencing data.
This document aims to 1) outline what constitutes a good and ultimately (re)usable viral genome data record in a data repository with a Nordic perspective; and 2) provide guidance to projects, labs and other organisations producing or commissioning viral sequencing data.
This document aims to: (1) outline what constitutes a good and ultimately (re)usable viral genome data record in a data repository with a Nordic perspective, and (2) provide guidance to projects, labs and other organisations producing or commissioning viral sequence data.

## Study design and documentation

National and international recommendations from public health authorities, epidemic surveillance programs and research data communities should be considered when planning a new study or surveillance programme. In particular, you could consult relevant guidance issued by national and international surveillance programs while considering widely adopted guidelines for research documentation, and recommendations provided by data sharing platforms and communities such as INSDC[^1], EMBL-EBI[^2], and RDA[^3].
National and international recommendations from public health authorities, epidemic surveillance programs and research data communities should be considered when planning a new study or surveillance programme. In particular, you could consult relevant guidance issued by national and international surveillance programs while considering widely adopted guidelines for research documentation, and recommendations provided by data sharing platforms and communities such as INSDC[^1], EMBL-EBI[^2], and RDA[^3].
National and international recommendations from public health authorities, epidemic surveillance programs and research data communities should be considered when planning a new study or surveillance programme. In particular, you could consult relevant guidance issued by national and international surveillance programs while considering widely adopted guidelines for research documentation and recommendations provided by data sharing platforms and communities such as INSDC[^1], EMBL-EBI[^2], and RDA[^3].
Thank you for the input! I haven’t had time to work on this since the face-to-face workshop. I should have some time in the upcoming weeks to experiment with dissecting the text to populate the other parts of the IDTkit and/or aligning it with the emerging template for the showcases #63
Thank you @wna-se for the contribution and all the time you have already spent on this! I am torn between keeping all the content together in a single showcase, to represent the entire product, and splitting the general/universal advice and guidelines from the showcase-specific application. I fear that keeping all the info together will lead to a lot of content duplication, and one of the nice products of the IDTkit will be a distilled set of best practices drawn from all the showcases.
- [Swedish COVID-19 Data Portal Support Services](https://www.covid19dataportal.se/support_services/)
- [SciLifeLab Data Guidelines](https://scilifelab-data-guidelines.readthedocs.io/en/latest/docs/covid-19/index.html)

### NeIC PaRI case study
Would the three cases make for three independent showcase pages?
They do seem to fit as combinations of tools, services, and standards to gather, process, and submit data.
I believe that the three below could be showcases too, and perhaps be linked here once they're up? Maybe the national pages could also be linked for the above examples?
Combining information on virus characteristics with clinical and epidemiological data is desirable when studying various aspects of a pandemic. Viral genomic data provide insight into some important virus characteristics and should be reported to and shared with the global research and clinical communities as early and as openly as possible. The following text will provide guidance on sharing data from viral genome sequencing. The focus will be on the processing and documentation required for sharing the data files containing reads produced by sequencing instruments and consensus sequences produced by analytic workflows for genome assembly, both of which constitute valuable resources for different audiences.

## Study design and documentation
- SPLIT SUGGESTION. Could be restructured as considerations within the concrete topic 'good practices in data sharing'. We originally mentioned making this repository-specific, but perhaps it could be more 'general considerations, regardless of repository'
- Quality assessment
- Output files
- SPLIT SUGGESTION END - though perhaps Nordic considerations could be generalised and included somehow in the National resources pages of any relevant countries, almost as 'additional information'?
Aspects related to common workflows for processing SARS-CoV-2 genome data includes keeping references to which protocols and versions were used in preparing and sequencing each individual sample and recording which samples and sequencing libraries were prepared together. This information can be used to identify and address issues related to both workflow/library specific artifacts and sample contamination and it would also be used to choose appropriate configurations and versions in analytic workflows. Some of this information can also be reflected in naming conventions for samples and libraries.

### Selected references on study design and documentation
- SPLIT SUGGESTION Integrate disease specific sources to split suggestion 2? RDM link could go in page metadata.
- [ECDC’s Surveillance and study protocols](https://www.ecdc.europa.eu/en/covid-19/surveillance/study-protocols)

- [ECDC’s TESSy reporting protocol for COVID-19](https://www.ecdc.europa.eu/sites/default/files/documents/COVID-19-Reporting-Protocol-v5.1.pdf)
- END SPLIT SUGGESTION
- [ECDC’s TESSy reporting protocol for COVID-19](https://www.ecdc.europa.eu/sites/default/files/documents/COVID-19-Reporting-Protocol-v5.1.pdf)

## Data sharing platforms and data standards
- SPLIT SUGGESTION - integrate into Pathogen characterisation - data sources page as appropriate
[^10]: Some kits also include targeted amplification of human host control sequences

### Data standards for SARS-CoV-2 genome data sharing
- SPLIT SUGGESTION - goes as a concrete topic specifically about genome data sharing within the pathogen characterisation, data description page. Can have breakdown as needed for different 'considerations'. Could also be split by specific repositories, if appropriate.
: Extend this table with additional rows corresponding to fields from the [NeIC PaRI Data dictionary for SARS-CoV-2 genome data](https://docs.google.com/spreadsheets/d/1gNpdZKOUKPemMUHR107JRSeaWjPczlMIUp5fP5-kR9g/edit#gid=1524315810)

Custom fields
: Extend this table with additional rows corresponding to additional factors documented for this assembly
- SPLIT SUGGESTION END - recommend significant breakage into different 'considerations' within the concrete topic though. The below references should be integrated as appropriate though, remembering to stick to references specific to infectious disease
- [SARS-CoV-2 ENA submission workflow + guidance for structuring and releasing metadata](https://dx.doi.org/10.17504/protocols.io.buqnnvve).
- [SARS-CoV2 GISAID submission protocol](https://dx.doi.org/10.17504/protocols.io.bumknu4w).

## National support / infrastructures
- SPLIT SUGGESTION - These should be integrated into the respective National pages/ created as different showcases
### Sweden

- [Swedish COVID-19 Data Portal Support Services](https://www.covid19dataportal.se/support_services/)
- [SciLifeLab Data Guidelines](https://scilifelab-data-guidelines.readthedocs.io/en/latest/docs/covid-19/index.html)
- SPLIT SUGGESTION END
- [Swedish COVID-19 Data Portal Support Services](https://www.covid19dataportal.se/support_services/)
- [SciLifeLab Data Guidelines](https://scilifelab-data-guidelines.readthedocs.io/en/latest/docs/covid-19/index.html)

### NeIC PaRI case study
6 cont. SPLIT SUGGESTION Perhaps the 'case study' could also integrate into national pages?
This document aims to 1) outline what constitutes a good and ultimately (re)usable viral genome data record in a data repository with a Nordic perspective; and 2) provide guidance to projects, labs and other organisations producing or commissioning viral sequencing data.

## Introduction
- SPLIT SUGGESTION: Could generalise this to be more of an introduction about data description in the pathogen characterisation section
## Introduction

Combining information on virus characteristics with clinical and epidemiological data is desirable when studying various aspects of a pandemic. Viral genomic data provide insight into some important virus characteristics and should be reported to and shared with the global research and clinical communities as early and as openly as possible. The following text will provide guidance on sharing data from viral genome sequencing. The focus will be on the processing and documentation required for sharing the data files containing reads produced by sequencing instruments and consensus sequences produced by analytic workflows for genome assembly, both of which constitute valuable resources for different audiences.
- END SPLIT SUGGESTION
This pull request aims to contribute content from the Nordic Pandemic Infrastructure (NeIC PaRI) under the relevant parts of the IDTkit (#32). So far I’ve only added a markdown version of the guide under Showcase and I open this pull request to keep track of the effort and to open up a discussion on where different parts of the content could go.