Sharing data on COVID-19
Make your COVID-19 data accessible to the scientific community by releasing them in public databases along with respective metadata.
Metadata
Metadata are “data about data”, i.e. the information that must accompany a dataset so that it can be correctly interpreted, managed and stored over time. Metadata generally include information on the methodology used to collect the data, instrumental procedures, definitions of variables, units of measurement, file formats, the software used to collect and/or process the data, and more. Metadata can be collected in simple text files and archived together with the dataset.
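As an illustration of keeping metadata in a simple text file next to the dataset, the following is a minimal Python sketch. It assumes a JSON sidecar file; the function name `write_metadata_sidecar`, the `.metadata.json` suffix and every field name are hypothetical choices made for this example, not part of any standard. Where a community standard exists, its fields and vocabulary should be used instead.

```python
import json
from pathlib import Path


def write_metadata_sidecar(dataset_path, metadata):
    """Write `metadata` as a JSON text file next to the dataset.

    Returns the path of the sidecar file. The naming scheme
    (<dataset name>.metadata.json) is an assumption of this sketch.
    """
    dataset_path = Path(dataset_path)
    # e.g. results.csv -> results.csv.metadata.json, in the same folder
    sidecar = dataset_path.with_name(dataset_path.name + ".metadata.json")
    sidecar.write_text(
        json.dumps(metadata, indent=2, ensure_ascii=False),
        encoding="utf-8",
    )
    return sidecar


# Illustrative placeholder values only; replace with your own study details.
example_metadata = {
    "title": "Example SARS-CoV-2 proteomics dataset",
    "collection_method": "describe how the data were collected",
    "instrument_procedure": "describe instrument settings and procedures",
    "variables": {
        "protein_id": {"definition": "protein accession", "unit": None},
        "intensity": {"definition": "measured signal", "unit": "arbitrary units"},
    },
    "file_format": "CSV, UTF-8, comma-separated",
    "software": ["name and version of each tool used"],
}
```

Calling `write_metadata_sidecar("results.csv", example_metadata)` would create `results.csv.metadata.json` in the same folder as the dataset, so that the two files can be archived together.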
Researchers are strongly encouraged to use standard metadata, where they exist. It is strongly recommended to start defining and collecting metadata from the very beginning of the research project.
In proteomics it is suggested to use the Minimum Information About a Proteomics Experiment (MIAPE) standard, using the controlled vocabulary defined by the Proteomics Standards Initiative: PSI CVs.
More information on standards and formats for metadata is collected in the FAIRsharing.org resource.
Repositories
To identify the most suitable resource for sharing your data, we suggest using tools such as FAIRsharing and searching with the ‘proteomics’ keyword.
Protein-protein interaction data
For protein-protein interactions, at binary level or at network level, it is highly recommended to use the MINT database. Guidelines for data submission can be found on the dedicated page.
Mass spectrometry
For mass spectrometry experiments it is strongly advised to use the PRIDE repository, a member of the ProteomeXchange Consortium. Data can be submitted using the PX Submission Tool.
Protein structures
This class includes structural biology data on proteins and structural data on other biological macromolecules.
For X-ray crystallography and NMR data it is suggested to use the Protein Data Bank in Europe (PDBe) database.
For electron microscopy and tomography data the Electron Microscopy Public Image Archive (EMPIAR) database is recommended.
Sensitive molecular biology data
For molecular biology data generated from human samples that can potentially be used to identify a specific subject (and must therefore be protected by controlled access) it is recommended to use the European Genome-phenome Archive (EGA).
Your institution is likely to provide a local repository for this type of data. Contact your Data Management or IT service for support.
Related human and viral data
For SARS-CoV-2 molecular biology data generated in combination with host data (e.g. combined sequencing studies of host transcriptome and viral genotype), storage in a local repository is recommended, together with registration of the datasets in the BioSamples database.
Local Repositories
Some local repositories are available at the department, university or institute level.
We suggest contacting your Data Stewardship or IT service for more information and support.