metadata - OpenMD.com Journal Search

medna-metadata: an open-source data management system for tracking environmental DNA samples and metadata.

Bioinformatics (Oxford, England) Sep 2022

Authors: M Kimble, S Allers, K Campbell...

MOTIVATION

Environmental DNA (eDNA), as a rapidly expanding research field, stands to benefit from shared resources including sampling protocols, study designs, discovered sequences, and taxonomic assignments to sequences. High-quality community shareable eDNA resources rely heavily on comprehensive metadata documentation that captures the complex workflows covering field sampling, molecular biology lab work, and bioinformatic analyses. There are limited sources that provide documentation of database development on comprehensive metadata for eDNA and these workflows and no open-source software.

RESULTS

We present medna-metadata, an open-source, modular system that aligns with Findable, Accessible, Interoperable, and Reusable guiding principles that support scholarly data reuse and the database and application development of a standardized metadata collection structure that encapsulates critical aspects of field data collection, wet lab processing, and bioinformatic analysis. Medna-metadata is showcased with metabarcoding data from the Gulf of Maine (Polinski et al., 2019).

AVAILABILITY AND IMPLEMENTATION

The source code of the medna-metadata web application is hosted on GitHub (https://github.com/Maine-eDNA/medna-metadata). Medna-metadata is a docker-compose installable package. Documentation can be found at https://medna-metadata.readthedocs.io/en/latest/?badge=latest. The application is implemented in Python, PostgreSQL and PostGIS, RabbitMQ, and NGINX, with all major browsers supported. A demo can be found at https://demo.metadata.maine-edna.org/.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Topics: Metadata; DNA, Environmental; Data Management; Software; Databases, Factual

PubMed: 35960154
DOI: 10.1093/bioinformatics/btac556

Whole genomes from bacteria collected at diagnostic units around the world 2020.

Scientific Data Sep 2023

Authors: Sidsel Nag, Gunhild Larsen, Judit Szarvas...

The Two Weeks in the World research project has resulted in a dataset of 3087 clinically relevant bacterial genomes with pertaining metadata, collected from 59 diagnostic units in 35 countries around the world during 2020. A relational database is available with metadata and summary data from selected bioinformatic analysis, such as species prediction and identification of acquired resistance genes.

Topics: Bacteria; Computational Biology; Databases, Factual; Genome, Bacterial; Metadata

PubMed: 37717051
DOI: 10.1038/s41597-023-02502-7

Community-curated and standardised metadata of published ancient metagenomic samples with AncientMetagenomeDir.

Scientific Data Jan 2021

Authors: James A Fellows Yates, Aida Andrades Valtueña, Åshild J Vågene...

Ancient DNA and RNA are valuable data sources for a wide range of disciplines. Within the field of ancient metagenomics, the number of published genetic datasets has risen dramatically in recent years, and tracking this data for reuse is particularly important for large-scale ecological and evolutionary studies of individual taxa and communities of both microbes and eukaryotes. AncientMetagenomeDir (archived at https://doi.org/10.5281/zenodo.3980833) is a collection of annotated metagenomic sample lists derived from published studies that provide basic, standardised metadata and accession numbers to allow rapid data retrieval from online repositories. These tables are community-curated and span multiple sub-disciplines to ensure adequate breadth and consensus in metadata definitions, as well as longevity of the database. Internal guidelines and automated checks facilitate compatibility with established sequence-read archives and term-ontologies, and ensure consistency and interoperability for future meta-analyses. This collection will also assist in standardising metadata reporting for future ancient metagenomic studies.

Topics: Databases, Genetic; Humans; Metadata; Metagenome; Metagenomics; Publications

PubMed: 33500403
DOI: 10.1038/s41597-021-00816-y

Harvesting metadata in clinical care: a crosswalk between FHIR, OMOP, CDISC and openEHR metadata.

Scientific Data Oct 2022

Authors: Caroline Bönisch, Dorothea Kesztyüs, Tibor Kesztyüs...

Metadata describe information about data source, type of creation, structure, status and semantics and are prerequisite for preservation and reuse of medical data. To overcome the hurdle of disparate data sources and repositories with heterogeneous data formats a metadata crosswalk was initiated, based on existing standards. FAIR Principles were included, as well as data format specifications. The metadata crosswalk is the foundation of data provision between a Medical Data Integration Center (MeDIC) and researchers, providing a selection of metadata information for research design and requests. Based on the crosswalk, metadata items were prioritized and categorized to demonstrate that not one single predefined standard meets all requirements of a MeDIC and only a maximum data set of metadata is suitable for use. The development of a convergence format including the maximum data set is the anticipated solution for an automated transformation of metadata in a MeDIC.

Topics: Metadata; Information Storage and Retrieval; Semantics; Reference Standards

PubMed: 36307424
DOI: 10.1038/s41597-022-01792-7

MetaboLights: open data repository for metabolomics.

Nucleic Acids Research Jan 2024

Authors: Ozgur Yurekten, Thomas Payne, Noemi Tejera...

MetaboLights is a global database for metabolomics studies including the raw experimental data and the associated metadata. The database is cross-species and cross-technique and covers metabolite structures and their reference spectra as well as their biological roles and locations where available. MetaboLights is the recommended metabolomics repository for a number of leading journals and ELIXIR, the European infrastructure for life science information. In this article, we describe the continued growth and diversity of submissions and the significant developments in recent years. In particular, we highlight MetaboLights Labs, our new Galaxy Project instance with repository-scale standardized workflows, and how data public on MetaboLights are being reused by the community. Metabolomics resources and data are available under the EMBL-EBI's Terms of Use at https://www.ebi.ac.uk/metabolights and under Apache 2.0 at https://github.com/EBI-Metabolights.

Topics: Metabolomics; Metadata; Databases, Genetic; Internet

PubMed: 37971328
DOI: 10.1093/nar/gkad1045

Management of Metadata Types in Basic Cardiological Research.

Studies in Health Technology and... Sep 2021

Authors: Harald Kusch, Robert Kossen, Markus Suhr...

INTRODUCTION

Ensuring scientific reproducibility and compliance with documentation guidelines of funding bodies and journals is a topic of greatly increasing importance in biomedical research. Failure to comply, or unawareness of documentation standards can have adverse effects on the translation of research into patient treatments, as well as economic implications. In the context of the German Research Foundation-funded collaborative research center (CRC) 1002, an IT-infrastructure sub-project was designed. Its goal has been to establish standardized metadata documentation and information exchange benefitting the participating research groups with minimal additional documentation efforts.

METHODS

Implementation of the self-developed menoci-based research data platform (RDP) was driven by close communication and collaboration with researchers as early adopters and experts. Requirements analysis and concept development involved in person observation of experimental procedures, interviews and collaboration with researchers and experts, as well as the investigation of available and applicable metadata standards and tools. The Drupal-based RDP features distinct modules for the different documented data and workflow types, and both the development and the types of collected metadata were continuously reviewed and evaluated with the early adopters.

RESULTS

The menoci-based RDP allows for standardized documentation, sharing and cross-referencing of different data types, workflows, and scientific publications. Different modules have been implemented for specific data types and workflows, allowing for the enrichment of entries with specific metadata and linking to further relevant entries in different modules.

DISCUSSION

Taking the workflows and datasets of the frequently involved experimental service projects as a starting point for (meta-)data types to overcome irreproducibility of research data, results in increased benefits for researchers with minimized efforts. While the menoci-based RDP with its data models and metadata schema was originally developed in a cardiological context, it has been implemented and extended to other consortia at Göttingen Campus and beyond in different life science research areas.

Topics: Biomedical Research; Documentation; Humans; Metadata; Reproducibility of Results; Workflow

PubMed: 34545820
DOI: 10.3233/SHTI210542

Fast and Accurate Metadata Authoring Using Ontology-Based Recommendations.

AMIA ... Annual Symposium Proceedings.... 2017

Authors: Marcos Martínez-Romero, Martin J O'Connor, Ravi D Shankar...

In biomedicine, high-quality metadata are crucial for finding experimental datasets, for understanding how experiments were performed, and for reproducing those experiments. Despite the recent focus on metadata, the quality of metadata available in public repositories continues to be extremely poor. A key difficulty is that the typical metadata acquisition process is time-consuming and error prone, with weak or nonexistent support for linking metadata to ontologies. There is a pressing need for methods and tools to speed up the metadata acquisition process and to increase the quality of metadata that are entered. In this paper, we describe a methodology and set of associated tools that we developed to address this challenge. A core component of this approach is a value recommendation framework that uses analysis of previously entered metadata and ontology-based metadata specifications to help users rapidly and accurately enter their metadata. We performed an initial evaluation of this approach using metadata from a public metadata repository.

Topics: Biological Ontologies; Biomedical Research; Data Accuracy; Data Analysis; Metadata; Methods

PubMed: 29854196
DOI: No ID Found
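
Illustrative note on the entry above: the abstract describes recommending values from previously entered metadata. The following is a minimal sketch of that general idea only, assuming a simple co-occurrence count. It is not the authors' framework and it omits the ontology-based part entirely. The function name `recommend_values`, the field names, and the sample records are all hypothetical.

```python
from collections import Counter


def recommend_values(records, target_field, context, top_n=3):
    """Rank candidate values for target_field by how often they appear
    in earlier records that match the fields already filled in (context).

    Hypothetical helper for illustration; not part of any published tool.
    """
    counts = Counter()
    for record in records:
        if target_field not in record:
            continue
        # Only count records that agree with every value entered so far.
        if all(record.get(key) == value for key, value in context.items()):
            counts[record[target_field]] += 1
    total = sum(counts.values())
    if total == 0:
        return []
    # Return (value, relative frequency) pairs, most frequent first.
    return [(value, count / total) for value, count in counts.most_common(top_n)]


# Hypothetical previously entered metadata records.
records = [
    {"organism": "Homo sapiens", "tissue": "liver"},
    {"organism": "Homo sapiens", "tissue": "liver"},
    {"organism": "Homo sapiens", "tissue": "blood"},
    {"organism": "Mus musculus", "tissue": "brain"},
]

suggestions = recommend_values(records, "tissue", {"organism": "Homo sapiens"})
```

With these sample records, "liver" ranks ahead of "blood" for a user who has already entered "Homo sapiens" as the organism.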

Musical Instrument Identification Using Deep Learning Approach.

Sensors (Basel, Switzerland) Apr 2022

Review

Authors: Maciej Blaszke, Bożena Kostek

The work aims to propose a novel approach for automatically identifying all instruments present in an audio excerpt using sets of individual convolutional neural networks (CNNs) per tested instrument. The paper starts with a review of tasks related to musical instrument identification. It focuses on tasks performed, input type, algorithms employed, and metrics used. The paper starts with the background presentation, i.e., metadata description and a review of related works. This is followed by showing the dataset prepared for the experiment and its division into subsets: training, validation, and evaluation. Then, the analyzed architecture of the neural network model is presented. Based on the described model, training is performed, and several quality metrics are determined for the training and validation sets. The results of the evaluation of the trained network on a separate set are shown. Detailed values for precision, recall, and the number of true and false positive and negative detections are presented. The model efficiency is high, with the metric values ranging from 0.86 for the guitar to 0.99 for drums. Finally, a discussion and a summary of the results obtained follows.

Topics: Algorithms; Benchmarking; Deep Learning; Metadata; Neural Networks, Computer

PubMed: 35459018
DOI: 10.3390/s22083033
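
Illustrative note on the entry above: the abstract reports precision, recall, and counts of true and false positive and negative detections per instrument. As a reminder of how these quantities relate, here is a minimal sketch using the standard definitions. The counts are hypothetical and are not taken from the paper.

```python
def precision_recall(tp, fp, fn):
    """Compute precision and recall from detection counts.

    precision = TP / (TP + FP), recall = TP / (TP + FN).
    Returns 0.0 for a metric whose denominator is zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts for one instrument's detector.
precision, recall = precision_recall(tp=90, fp=10, fn=30)
```

For these hypothetical counts the precision is 90 / 100 and the recall is 90 / 120.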

Interpretative Labor and the Bane of Nonstandardized Metadata in Public Health Surveillance and Food Safety.

Clinical Infectious Diseases : An... Oct 2021

Authors: James B Pettengill, Jennifer Beal, Maria Balkey...

Open-source DNA sequence databases have long been touted as beneficial to public health, including the facilitation of earlier detection and response to infectious disease outbreaks. Of critical importance to harnessing these benefits is the metadata that describe general and other domain-specific attributes (eg, collection location, isolate type) of a sample. Unlike the sequence data, metadata are often incomplete and lack adherence to an international standard. Here, we describe the problem posed by such variable and incomplete metadata in terms of interpretative labor costs (the time and energy necessary to make sense of the signal in the genetic data) and the impact such metadata have on foodborne outbreak detection and response. Improving the quality of sequence-associated metadata would allow for earlier detection of emerging food safety hazards and allow faster response to foodborne outbreaks.

Topics: Disease Outbreaks; Food Safety; Foodborne Diseases; Humans; Metadata; Public Health; Public Health Surveillance

PubMed: 34240118
DOI: 10.1093/cid/ciab615

Mettertron - Bridging Metadata Repositories and Terminology Servers.

Studies in Health Technology and... Sep 2023

Authors: Jan Schladetzky, Ann-Kristin Kock-Schoppenhauer, Cora Drenkhahn...

To provide clinical data in distributed research architectures, a fundamental challenge involves defining and distributing suitable metadata within Metadata Repositories. Especially for structured data, data elements need to be bound against suitable terminologies; otherwise, other systems will only be able to interpret the data with complex and error-prone manual involvement. As current Metadata Repository implementations lack support for querying externally defined terminologies in FHIR terminology servers, we propose an intermediate solution that uses appropriate annotations on metadata elements to allow run-time Terminology Services mediated queries of that metadata. This allows a very clear separation of concerns between the two related systems, greatly simplifying terminological maintenance. The system performed well in a prototypical deployment.

Topics: Metadata

PubMed: 37697859
DOI: 10.3233/SHTI230721