Adding content from the Nordic Pandemic Infrastructure (NeIC PaRI) project #37

Closed
wants to merge 2 commits into from

wna-se
Contributor

@wna-se wna-se commented Sep 12, 2022

This pull request aims to contribute content from the Nordic Pandemic Infrastructure (NeIC PaRI) under the relevant parts of the IDTkit (#32). So far I’ve only added a Markdown version of the guide under Showcase. I am opening this pull request to keep track of the effort and to start a discussion on where different parts of the content could go.

@bgruening
Contributor

Great work @wna-se

@LianeHughes
Contributor

Caught up with Wolmar and wanted to summarise our talk (with Wolmar's permission):

The PR is still in progress (more content will be added as time goes by). However, at this stage, it might be good to give Wolmar some feedback on whether things are where they are 'supposed to be' and where other pieces might go. Therefore, it would be beneficial for the editors to comment at this stage (I will commit some more time to all of this in general tomorrow).

@LianeHughes LianeHughes left a comment

I started reading this through (looks really good!), but didn't continue to read through thoroughly as I started to wonder a bit more about where this should go. Right now, it's added as a 'showcase'. I see the logic in placing it as a showcase because it's a project that brought together multiple technologies. However, when we defined 'showcases', I thought that they would be more like the COVID-19 Data Portal/Platform, i.e. tools. Under the strictest definition of 'tool', this would not really fit as a showcase. However, I think that it would be useful to keep it all together, and maybe a showcase is the best way to do that. I think it might also be good to have this information at least linked in the 'pathogen characterisation' sections though. Having it linked at various sections (rather than actually split between sections) would perhaps make maintenance of this content easier than breaking it all up. Where it fits best depends on our definition of 'what is a showcase', though.

Produced by the Nordic Pandemic Research Infrastructure project (NeIC PaRI)
[https://neic.no/pari/](https://neic.no/pari/)

This document aims to 1) outline what constitutes a good and ultimately (re)usable viral genome data record in a data repository with a Nordic perspective; and 2) provide guidance to projects, labs and other organisations producing or commissioning viral sequencing data.
Contributor

Suggested change
This document aims to 1) outline what constitutes a good and ultimately (re)usable viral genome data record in a data repository with a Nordic perspective; and 2) provide guidance to projects, labs and other organisations producing or commissioning viral sequencing data.
This document aims to: (1) outline what constitutes a good and ultimately (re)usable viral genome data record in a data repository with a Nordic perspective, and (2) provide guidance to projects, labs and other organisations producing or commissioning viral sequence data.


## Study design and documentation

National and international recommendations from public health authorities, epidemic surveillance programs and research data communities should be considered when planning a new study or surveillance programme. In particular, you could consult relevant guidance issued by national and international surveillance programs while considering widely adopted guidelines for research documentation, and recommendations provided by data sharing platforms and communities such as INSDC[^1], EMBL-EBI[^2], and RDA[^3].
Contributor

Suggested change
National and international recommendations from public health authorities, epidemic surveillance programs and research data communities should be considered when planning a new study or surveillance programme. In particular, you could consult relevant guidance issued by national and international surveillance programs while considering widely adopted guidelines for research documentation, and recommendations provided by data sharing platforms and communities such as INSDC[^1], EMBL-EBI[^2], and RDA[^3].
National and international recommendations from public health authorities, epidemic surveillance programs and research data communities should be considered when planning a new study or surveillance programme. In particular, you could consult relevant guidance issued by national and international surveillance programs while considering widely adopted guidelines for research documentation and recommendations provided by data sharing platforms and communities such as INSDC[^1], EMBL-EBI[^2], and RDA[^3].

@wna-se
Contributor Author

wna-se commented Sep 19, 2022

Thank you for the input! I haven’t had time to work on this since the face-to-face workshop. I should have some time in the upcoming weeks and should be able to experiment with dissecting the text to populate the other parts of the IDTkit and/or aligning it with the emerging template for the showcases (#63).

@rabuono
Collaborator

rabuono commented Sep 20, 2022

Thank you @wna-se for the contribution and all the time you already spent on this!

I am torn between keeping all the content together in a single showcase, so that it represents the entire product, and splitting the general/universal advice and guidelines from the showcase-specific application.

I do fear that keeping all the info together will lead to a lot of content duplication. One of the nice products of the IDTkit will be a distilled set of best practices drawn from all the showcases.

- [Swedish COVID-19 Data Portal Support Services](https://www.covid19dataportal.se/support_services/)
- [SciLifeLab Data Guidelines](https://scilifelab-data-guidelines.readthedocs.io/en/latest/docs/covid-19/index.html)

### NeIC PaRI case study
Collaborator

Would the three cases make for three independent showcase pages?
They do seem to fit as combinations of tools, services, and standards to gather, process, and submit data.

Contributor

I believe that the three below could be showcases too, and perhaps linked here when they're up? Maybe the national pages could also be linked for the above examples?


Combining information on virus characteristics with clinical and epidemiological data is desirable when studying various aspects of a pandemic. Viral genomic data provide insight into some important virus characteristics and should be reported to and shared with the global research and clinical communities as early and as openly as possible. The following text will provide guidance on sharing data from viral genome sequencing. The focus will be on the processing and documentation required for sharing the data files containing reads produced by sequencing instruments and consensus sequences produced by analytic workflows for genome assembly—both of which constitute valuable resources for different audiences.

## Study design and documentation
Contributor

  1. SPLIT SUGGESTION. Could be restructured as considerations within the concrete topic 'good practices in data sharing'. We originally mentioned making this repository-specific, but perhaps this could be more 'general considerations, regardless of repository'.

- Quality assessment
- Output files


Contributor

  1. SPLIT SUGGESTION END - though perhaps Nordic considerations could be generalised, or included somehow in the National resources pages of any relevant countries as 'additional information'?


Aspects related to common workflows for processing SARS-CoV-2 genome data include keeping references to which protocols and versions were used in preparing and sequencing each individual sample and recording which samples and sequencing libraries were prepared together. This information can be used to identify and address issues related to both workflow/library-specific artifacts and sample contamination, and it can also be used to choose appropriate configurations and versions in analytic workflows. Some of this information can also be reflected in naming conventions for samples and libraries.
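
A minimal sketch of how such provenance might be recorded, assuming hypothetical field names and a made-up naming pattern (neither comes from the guide). Each library record keeps the protocol, its version, and a preparation-batch identifier, so that libraries prepared together can be grouped when looking into batch-specific artifacts or contamination.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LibraryRecord:
    """Provenance for one sequencing library (all field names are hypothetical)."""
    sample_id: str         # identifier of the sample
    prep_batch: str        # libraries prepared together share this value
    protocol: str          # name of the preparation/sequencing protocol
    protocol_version: str  # version of that protocol


def library_name(rec: LibraryRecord) -> str:
    """Encode batch and protocol version in the library name.

    The '<sample>_<batch>_<protocol>-v<version>' pattern is illustrative
    only; it is an assumption, not a convention prescribed by the guide.
    """
    return f"{rec.sample_id}_{rec.prep_batch}_{rec.protocol}-v{rec.protocol_version}"


def group_by_batch(records):
    """Map each preparation batch to the sample IDs prepared in it."""
    batches = defaultdict(list)
    for rec in records:
        batches[rec.prep_batch].append(rec.sample_id)
    return dict(batches)


# Made-up example records.
records = [
    LibraryRecord("S001", "B01", "amplicon", "3"),
    LibraryRecord("S002", "B01", "amplicon", "3"),
    LibraryRecord("S003", "B02", "amplicon", "4"),
]

print(library_name(records[0]))
print(group_by_batch(records))
```

Grouping by batch makes it easier to check whether a suspected artifact is confined to libraries that were prepared together, and the recorded protocol version can guide the choice of configuration in the analytic workflow.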

### Selected references on study design and documentation
Contributor

  1. SPLIT SUGGESTION Integrate disease specific sources to split suggestion 2? RDM link could go in page metadata.


- [ECDC’s Surveillance and study protocols](https://www.ecdc.europa.eu/en/covid-19/surveillance/study-protocols)

- [ECDC’s TESSy reporting protocol for COVID-19](https://www.ecdc.europa.eu/sites/default/files/documents/COVID-19-Reporting-Protocol-v5.1.pdf)
Contributor

  1. END SPLIT SUGGESTION


## Data sharing platforms and data standards
Contributor

  1. SPLIT SUGGESTION - integrate into Pathogen characterisation - data sources page as appropriate


[^10]: Some kits also include targeted amplification of human host control sequences

### Data standards for SARS-CoV-2 genome data sharing
Contributor

  1. SPLIT SUGGESTION - goes as a concrete topic specifically about genome data sharing within the pathogen characterisation, data description page. Can have breakdown as needed for different 'considerations'. Could also be split by specific repositories, if appropriate.

: Extend this table with additional rows corresponding to fields from the [NeIC PaRI Data dictionary for SARS-CoV-2 genome data](https://docs.google.com/spreadsheets/d/1gNpdZKOUKPemMUHR107JRSeaWjPczlMIUp5fP5-kR9g/edit#gid=1524315810)

Custom fields
: Extend this table with additional rows corresponding to additional factors documented for this assembly
Contributor

  1. SPLIT SUGGESTION END - recommend breaking this up significantly into different 'considerations' within the concrete topic, though. The references below should be integrated as appropriate, remembering to stick to references specific to infectious disease.

- [SARS-CoV-2 ENA submission workflow + guidance for structuring and releasing metadata](https://dx.doi.org/10.17504/protocols.io.buqnnvve).
- [SARS-CoV2 GISAID submission protocol](https://dx.doi.org/10.17504/protocols.io.bumknu4w).

## National support / infrastructures
Contributor

  1. SPLIT SUGGESTION - These should be integrated into the respective National pages/ created as different showcases

### Sweden

- [Swedish COVID-19 Data Portal Support Services](https://www.covid19dataportal.se/support_services/)
- [SciLifeLab Data Guidelines](https://scilifelab-data-guidelines.readthedocs.io/en/latest/docs/covid-19/index.html)
Contributor

  1. SPLIT SUGGESTION END

### NeIC PaRI case study
Contributor

6 cont. SPLIT SUGGESTION Perhaps the 'case study' could also integrate into national pages?


This document aims to 1) outline what constitutes a good and ultimately (re)usable viral genome data record in a data repository with a Nordic perspective; and 2) provide guidance to projects, labs and other organisations producing or commissioning viral sequencing data.

## Introduction
Contributor

  1. SPLIT SUGGESTION: Could generalise this to be more of an introduction about data description in the pathogen characterisation section

Combining information on virus characteristics with clinical and epidemiological data is desirable when studying various aspects of a pandemic. Viral genomic data provide insight into some important virus characteristics and should be reported to and shared with the global research and clinical communities as early and as openly as possible. The following text will provide guidance on sharing data from viral genome sequencing. The focus will be on the processing and documentation required for sharing the data files containing reads produced by sequencing instruments and consensus sequences produced by analytic workflows for genome assembly—both of which constitute valuable resources for different audiences.

Contributor

  1. END SPLIT SUGGESTION

@wna-se
Contributor Author

wna-se commented Jan 9, 2024

This pull request was superseded by other contributions, such as issue #151 and PR #194.

@wna-se wna-se closed this Jan 9, 2024
Development

Successfully merging this pull request may close these issues.

Test the structure with content from the Nordic Pandemic Infrastructure (NeIC PaRI) project’s guide
4 participants