metadata - OpenMD.com Journal Search

Open reproducible scientometric research with Alexandria3k.

PLoS One 2023

Authors: Diomidis Spinellis

Considerable scientific work involves locating, analyzing, systematizing, and synthesizing other publications, often with the help of online scientific publication databases and search engines. However, use of online sources suffers from a lack of repeatability and transparency, as well as from technical restrictions. Alexandria3k is a Python software package and an associated command-line tool that can populate embedded relational databases with slices from the complete set of several open publication metadata sets. These can then be employed for reproducible processing and analysis through versatile and performant queries. We demonstrate the software's utility by visualizing the evolution of publications in diverse scientific fields and relationships among them, by outlining scientometric facts associated with COVID-19 research, and by replicating commonly-used bibliometric measures and findings regarding scientific productivity, impact, and disruption.

Topics: Databases, Factual; Search Engine; Bibliometrics; Metadata; Research Design

PubMed: 38032908
DOI: 10.1371/journal.pone.0294946
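
The Alexandria3k entry above describes populating an embedded relational database with slices of publication metadata and then querying it reproducibly. The following is a minimal sketch of that idea only, assuming SQLite as the embedded database. The `works` table, its columns, and both function names are hypothetical and are not Alexandria3k's actual schema or API; the rows reuse DOIs and years from this listing.

```python
import sqlite3

# Hypothetical schema for illustration; NOT Alexandria3k's actual tables.
SCHEMA = """
CREATE TABLE works (
    doi TEXT PRIMARY KEY,
    title TEXT,
    published_year INTEGER
);
"""


def build_demo_db():
    """Create an in-memory database holding a small slice of metadata."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    rows = [
        ("10.1371/journal.pone.0294946",
         "Open reproducible scientometric research with Alexandria3k.", 2023),
        ("10.1038/s41597-023-02502-7",
         "Whole genomes from bacteria collected at diagnostic units "
         "around the world 2020.", 2023),
        ("10.1093/nar/gkz827",
         "Thera-SAbDab: the Therapeutic Structural Antibody Database.", 2020),
    ]
    conn.executemany("INSERT INTO works VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def publications_per_year(conn):
    """Return (year, count) rows ordered by year."""
    cur = conn.execute(
        "SELECT published_year, COUNT(*) FROM works "
        "GROUP BY published_year ORDER BY published_year"
    )
    return cur.fetchall()


if __name__ == "__main__":
    for year, count in publications_per_year(build_demo_db()):
        print(year, count)
```

Because the query runs against a local snapshot rather than a live online service, rerunning it on the same database file yields the same answer, which is the repeatability property the abstract emphasizes.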

DICODerma: A Practical Approach for Metadata Management of Images in Dermatology.

Journal of Digital Imaging Oct 2022

Authors: Bell Raj Eapen, Feroze Kaliyadan, Karalikkattil T Ashique...

Clinical images are vital for diagnosing and monitoring skin diseases, and their importance has increased with the growing popularity of machine learning. Lack of standards has stifled innovation in dermatological imaging, unlike other image-intensive specialties such as radiology. We investigate the meta-requirements for utilizing the popular DICOM standard for metadata management of images in dermatology. We propose practical design solutions and provide open-source tools to integrate dermatologists' workflow with enterprise imaging systems. Using the tool, dermatologists can tag, search, organize and convert clinical images to the DICOM format. We believe that our less disruptive approach will improve the adoption of standards in the specialty.

Topics: Humans; Dermatology; Diagnostic Imaging; Metadata; Radiology Information Systems; Workflow

PubMed: 35488074
DOI: 10.1007/s10278-022-00636-5

medna-metadata: an open-source data management system for tracking environmental DNA samples and metadata.

Bioinformatics (Oxford, England) Sep 2022

Authors: M Kimble, S Allers, K Campbell...

MOTIVATION

Environmental DNA (eDNA), as a rapidly expanding research field, stands to benefit from shared resources including sampling protocols, study designs, discovered sequences, and taxonomic assignments to sequences. High-quality, community-shareable eDNA resources rely heavily on comprehensive metadata documentation that captures the complex workflows covering field sampling, molecular biology lab work, and bioinformatic analyses. Few sources document database development for comprehensive metadata covering eDNA and these workflows, and no open-source software is available.

RESULTS

We present medna-metadata, an open-source, modular system that aligns with the Findable, Accessible, Interoperable, and Reusable (FAIR) guiding principles that support scholarly data reuse and the database and application development of a standardized metadata collection structure that encapsulates critical aspects of field data collection, wet lab processing, and bioinformatic analysis. Medna-metadata is showcased with metabarcoding data from the Gulf of Maine (Polinski et al., 2019).

AVAILABILITY AND IMPLEMENTATION

The source code of the medna-metadata web application is hosted on GitHub (https://github.com/Maine-eDNA/medna-metadata). Medna-metadata is a docker-compose installable package. Documentation can be found at https://medna-metadata.readthedocs.io/en/latest/?badge=latest. The application is implemented in Python, PostgreSQL and PostGIS, RabbitMQ, and NGINX, with all major browsers supported. A demo can be found at https://demo.metadata.maine-edna.org/.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Topics: Metadata; DNA, Environmental; Data Management; Software; Databases, Factual

PubMed: 35960154
DOI: 10.1093/bioinformatics/btac556

Clinical implications of molecular subtyping in bladder cancer.

Current Opinion in Urology Jul 2019

Review

Authors: Uttam Satyal, Rahmat K Sikder, David McConkey...

PURPOSE OF REVIEW

The purpose of this review is to examine and evaluate similarities and differences in bladder cancer expression subtypes and to understand the clinical implications of the molecular subtyping.

RECENT FINDINGS

Four independent classification systems have been described, and there are broad similarities among the subtyping callers. Two major subtypes have been identified, that is, luminal and basal, with underlying subcategories based on various distinct characteristics. Luminal tumors generally carry a better prognosis and longer survival than basal tumors, although there is subtle variation in prognosis among the different subtypes within the luminal and basal classifications. Clinical subtyping is now commercially available, although there are limitations to its generalizability and application.

SUMMARY

Expression subtyping is a new method to personalize bladder cancer management. However, there is probably not sufficient evidence to incorporate its use into current standards of care. Validation cohorts with clinically meaningful outcomes may further establish the clinical relevance of molecular subtyping of bladder cancer. Additionally, genetic alterations in bladder cancer may 'color' the interpretation of individual tumors beyond the expression subtype to truly personalize care for bladder cancer.

Topics: Biomarkers, Tumor; Gene Expression Profiling; Humans; Immunophenotyping; Metadata; Mutation; Neoplasm Invasiveness; Prognosis; Urinary Bladder Neoplasms

PubMed: 31158107
DOI: 10.1097/MOU.0000000000000641

Whole genomes from bacteria collected at diagnostic units around the world 2020.

Scientific Data Sep 2023

Authors: Sidsel Nag, Gunhild Larsen, Judit Szarvas...

The Two Weeks in the World research project has resulted in a dataset of 3087 clinically relevant bacterial genomes with associated metadata, collected from 59 diagnostic units in 35 countries around the world during 2020. A relational database is available with metadata and summary data from selected bioinformatic analyses, such as species prediction and identification of acquired resistance genes.

Topics: Bacteria; Computational Biology; Databases, Factual; Genome, Bacterial; Metadata

PubMed: 37717051
DOI: 10.1038/s41597-023-02502-7

Evaluation of repositories for sharing individual-participant data from clinical studies.

Trials Mar 2019

Review

Authors: Rita Banzi, Steve Canham, Wolfgang Kuchinke...

BACKGROUND

Data repositories have the potential to play an important role in the effective and safe sharing of individual-participant data (IPD) from clinical studies. We analysed the current landscape of data repositories to create a detailed description of available repositories and assess their suitability for hosting data from clinical studies, from the perspective of the clinical researcher.

METHODS

We assessed repositories that enable storage, sharing, discoverability, and re-use of the IPD and associated documents from clinical studies using a pre-defined set of 34 items and publicly available information from April to June 2018. For this purpose, we developed an indicator set to capture the maturity of the repositories' procedures and their suitability for the hosting of IPD. The indicators cover guidelines for data upload and data de-identification, data quality controls, contracts for upload and storage, flexibility of access, application of identifiers, availability of metadata, and long-term preservation.

RESULTS

We analysed 25 repositories, from an initial set of 55 identified as possibly relevant. Half of the included repositories were generic, i.e. not limited to a specific disease or clinical area, and 13 were launched in the last 8 years. The sample was extremely heterogeneous and included repositories developed by research funders, infrastructures, universities, and editors. All but three repositories charge no fee for uploading, storing, or accessing data. None of the repositories completely demonstrated all the items included in the indicator set, but three repositories (Dryad, Drum, EASY) met, fully or partially, all items. Flexibility of data-access modalities appears to be limited, being lacking in half of the repositories.

CONCLUSIONS

Our evaluation, though often hampered by the lack of sufficient information, can help researchers to find a suitable repository for their datasets. Some repositories are more mature because of their support for clinical dataset preparation, contractual agreements, metadata and identifiers, different modalities of access, and long-term preservation of data. Further work is now required to achieve a more robust and accurate system for evaluation, which in turn may encourage the sharing of clinical study data.

TRIAL REGISTRATION

Study protocol available at https://zenodo.org/record/1438261#.W64kW9Egrcs .

Topics: Access to Information; Big Data; Clinical Studies as Topic; Data Collection; Data Mining; Databases, Factual; Humans; Information Dissemination; Metadata

PubMed: 30876434
DOI: 10.1186/s13063-019-3253-3

The OpenScience Slovenia metadata dataset.

Data in Brief Feb 2020

Authors: Mladen Borovič, Marko Ferme, Janez Brezovnik...

The OpenScience Slovenia metadata dataset contains metadata entries for Slovenian public domain academic documents which include undergraduate and postgraduate theses, research and professional articles, along with other academic document types. The data within the dataset was collected as a part of the establishment of the Slovenian Open-Access Infrastructure which defined a unified document collection process and cataloguing for universities in Slovenia within the infrastructure repositories. The data was collected from several already established but separate library systems in Slovenia and merged into a single metadata scheme using metadata deduplication and merging techniques. It consists of text and numerical fields, representing attributes that describe documents. These attributes include document titles, keywords, abstracts, typologies, authors, issue years and other identifiers such as URL and UDC. The dataset is especially suited to text mining and text classification tasks, and it can also be used to develop or benchmark content-based recommender systems on real-world data.

PubMed: 31890793
DOI: 10.1016/j.dib.2019.104942
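
The OpenScience Slovenia entry above mentions merging records from separate library systems through metadata deduplication and merging, without giving details. One simple way such a step could look is sketched below, under the assumption that duplicates share a normalized title and issue year and that the first non-empty value of each field is kept. The field names and both functions are hypothetical and are not the authors' method.

```python
import re
import unicodedata


def normalize_title(title):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return text.strip()


def merge_records(records):
    """Merge records that share a normalized title and year.

    Assumed rule: for each field, the first non-empty value wins.
    """
    merged = {}
    for rec in records:
        key = (normalize_title(rec.get("title", "")), rec.get("year"))
        target = merged.setdefault(key, {})
        for field, value in rec.items():
            if value not in (None, "") and field not in target:
                target[field] = value
    return list(merged.values())


if __name__ == "__main__":
    # Made-up records standing in for two library systems.
    records = [
        {"title": "Odprta znanost", "year": 2019, "url": ""},
        {"title": "odprta  ZNANOST!", "year": 2019,
         "url": "https://example.org/1", "keywords": "open science"},
        {"title": "Other thesis", "year": 2018},
    ]
    for rec in merge_records(records):
        print(rec)
```

Exact-key matching of this kind misses near-duplicates such as titles with typos; a real pipeline would likely add fuzzy matching or identifier-based matching on top.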

Ten simple rules for annotating sequencing experiments.

PLoS Computational Biology Oct 2020

Authors: Irene Stevens, Abdul Kadir Mukarram, Matthias Hörtenhuber...

Topics: Computational Biology; Gene Ontology; Genomics; Metadata; Molecular Sequence Annotation; Sequence Analysis, DNA

PubMed: 33017400
DOI: 10.1371/journal.pcbi.1008260

Thera-SAbDab: the Therapeutic Structural Antibody Database.

Nucleic Acids Research Jan 2020

Authors: Matthew I J Raybould, Claire Marks, Alan P Lewis...

The Therapeutic Structural Antibody Database (Thera-SAbDab; http://opig.stats.ox.ac.uk/webapps/therasabdab) tracks all antibody- and nanobody-related therapeutics recognized by the World Health Organisation (WHO), and identifies any corresponding structures in the Structural Antibody Database (SAbDab) with near-exact or exact variable domain sequence matches. Thera-SAbDab is synchronized with SAbDab to update weekly, reflecting new Protein Data Bank entries and the availability of new sequence data published by the WHO. Each therapeutic summary page lists structural coverage (with links to the appropriate SAbDab entries), alignments showing where any near-matches deviate in sequence, and accompanying metadata, such as intended target and investigated conditions. Thera-SAbDab can be queried by therapeutic name, by a combination of metadata, or by variable domain sequence - returning all therapeutics that are within a specified sequence identity over a specified region of the query. The sequences of all therapeutics listed in Thera-SAbDab (461 unique molecules, as of 5 August 2019) are downloadable as a single file with accompanying metadata.

Topics: Antibodies; Clinical Trials as Topic; Databases, Protein; Humans; Internet; Metadata; Sequence Alignment; User-Computer Interface

PubMed: 31555805
DOI: 10.1093/nar/gkz827
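
The Thera-SAbDab entry above describes a search that returns all therapeutics within a specified sequence identity over a specified region of the query. A minimal sketch of that kind of filter follows, assuming the sequences are already aligned to equal length (a real service would align them first). The function names, the `region` parameter, and the toy sequences are hypothetical and are not taken from Thera-SAbDab.

```python
def percent_identity(a, b):
    """Percent of matching positions between two equal-length, pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def search_by_identity(query, entries, min_identity, region=None):
    """Return (name, identity) pairs at or above min_identity, best first.

    region is an optional (start, end) slice of the aligned positions;
    by default the whole query is compared.
    """
    start, end = region if region else (0, len(query))
    hits = []
    for name, seq in entries.items():
        identity = percent_identity(query[start:end], seq[start:end])
        if identity >= min_identity:
            hits.append((name, identity))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    # Made-up toy fragments, not real therapeutic sequences.
    entries = {
        "toy_a": "EVQLVESGGG",
        "toy_b": "EVQLVQSGAG",
        "toy_c": "QVKLQESGPE",
    }
    print(search_by_identity("EVQLVESGGG", entries, min_identity=80.0))
    print(search_by_identity("EVQLVESGGG", entries, 100.0, region=(0, 5)))
```

Restricting the comparison to a region lets two sequences that differ elsewhere still count as exact matches over the part of interest.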

Community-curated and standardised metadata of published ancient metagenomic samples with AncientMetagenomeDir.

Scientific Data Jan 2021

Authors: James A Fellows Yates, Aida Andrades Valtueña, Åshild J Vågene...

Ancient DNA and RNA are valuable data sources for a wide range of disciplines. Within the field of ancient metagenomics, the number of published genetic datasets has risen dramatically in recent years, and tracking this data for reuse is particularly important for large-scale ecological and evolutionary studies of individual taxa and communities of both microbes and eukaryotes. AncientMetagenomeDir (archived at https://doi.org/10.5281/zenodo.3980833 ) is a collection of annotated metagenomic sample lists derived from published studies that provide basic, standardised metadata and accession numbers to allow rapid data retrieval from online repositories. These tables are community-curated and span multiple sub-disciplines to ensure adequate breadth and consensus in metadata definitions, as well as longevity of the database. Internal guidelines and automated checks facilitate compatibility with established sequence-read archives and term-ontologies, and ensure consistency and interoperability for future meta-analyses. This collection will also assist in standardising metadata reporting for future ancient metagenomic studies.

Topics: Databases, Genetic; Humans; Metadata; Metagenome; Metagenomics; Publications

PubMed: 33500403
DOI: 10.1038/s41597-021-00816-y
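
The AncientMetagenomeDir entry above refers to automated checks that keep community-curated sample tables consistent. What such a check might look like is sketched below, assuming a tab-separated table. The column names, the accession pattern, and the `check_table` function are assumptions for illustration and are not the project's actual validation rules.

```python
import csv
import io
import re

# Hypothetical required columns; the real tables may differ.
REQUIRED = ("sample_name", "publication_year", "archive_accession")
# Assumed pattern for sample/run accessions (e.g. ERS..., SRR...).
ACCESSION = re.compile(r"^[EDS]R[SR]\d+$")


def check_table(tsv_text):
    """Return a list of (line_number, message) problems found in a TSV table."""
    problems = []
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        for col in REQUIRED:
            if not (row.get(col) or "").strip():
                problems.append((line_no, f"missing {col}"))
        year = (row.get("publication_year") or "").strip()
        if year and not year.isdigit():
            problems.append((line_no, "publication_year is not an integer"))
        acc = (row.get("archive_accession") or "").strip()
        if acc and not ACCESSION.match(acc):
            problems.append((line_no, f"unexpected accession format: {acc}"))
    return problems


if __name__ == "__main__":
    # Made-up rows for demonstration.
    table = (
        "sample_name\tpublication_year\tarchive_accession\n"
        "S1\t2019\tERS0000001\n"
        "S2\ttwenty\tXYZ\n"
        "\t2020\tSRR123\n"
    )
    for line_no, message in check_table(table):
        print(f"line {line_no}: {message}")
```

Running a check like this on every contribution is one way a curated collection can reject malformed rows before they reach the shared tables.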