-
Nucleic Acids Research Jan 2024
Europe PMC (https://europepmc.org/) is an open access database of life science journal articles and preprints, which contains over 42 million abstracts and over 9 million full text articles accessible via the website, APIs and bulk download. This publication outlines new developments to the Europe PMC platform since the last database update in 2020 (1) and focuses on five main areas. (i) Improving discoverability, reproducibility and trust in preprints by indexing new preprint content, enriching preprint metadata and identifying withdrawn and removed preprints. (ii) Enhancing support for text and data mining by expanding the types of annotations provided and developing the Europe PMC Annotations Corpus, which can be used to train machine learning models to increase their accuracy and precision. (iii) Developing the Article Status Monitor tool and email alerts, to notify users about new articles and updates to existing records. (iv) Positioning Europe PMC as an open scholarly infrastructure through increasing the portion of open source core software, improving sustainability and accessibility of the service.
Topics: Biological Science Disciplines; Data Mining; Europe; Software; Databases, Bibliographic; Internet
PubMed: 37994696
DOI: 10.1093/nar/gkad1085
-
Tomography (Ann Arbor, Mich.) Oct 2023
Review
Digital Imaging and Communications in Medicine (DICOM) is an international standard that defines a format for storing medical images and a protocol to enable and facilitate data communication among medical imaging systems. The DICOM standard has been instrumental in transforming the medical imaging world over the last three decades. Its adoption has been a significant experience for manufacturers, healthcare users, and research scientists. In this review, thirty years after introducing the standard, we discuss the innovation, advantages, and limitations of adopting the DICOM and its possible future directions.
Topics: Software; Radiology Information Systems; Diagnostic Imaging
PubMed: 37888737
DOI: 10.3390/tomography9050145
-
Nucleic Acids Research Jan 2024
GenBank® (https://www.ncbi.nlm.nih.gov/genbank/) is a comprehensive, public database that contains 25 trillion base pairs from over 3.7 billion nucleotide sequences for 557 000 formally described species. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. Recent updates include policies for including spatio-temporal metadata, clarified documentation for GenBank data processing, enhanced foreign contamination screening tools, new processes in the Submission Portal, migration of Entrez Genome and Assembly displays into NCBI Datasets, and the impending retirement of tbl2asn, replaced by table2asn.
Topics: Base Sequence; Databases, Nucleic Acid; Genomics; Internet; Humans
PubMed: 37889039
DOI: 10.1093/nar/gkad903
-
Brain Informatics Aug 2023
Review
Brain-computer interface (BCI), an emerging technology that facilitates communication between brain and computer, has attracted a great deal of research in recent years. Researchers provide experimental results demonstrating that BCI can restore the capabilities of physically challenged people, hence improving the quality of their lives. BCI has revolutionized and positively impacted several industries, including entertainment and gaming, automation and control, education, neuromarketing, and neuroergonomics. Notwithstanding its broad range of applications, the global trend of BCI remains lightly discussed in the literature. Understanding the trend may inform researchers and practitioners on the direction of the field, and on where they should invest their efforts more. Noting this significance, we have analyzed 25,336 metadata of BCI publications from Scopus to determine advancement of the field. The analysis shows an exponential growth of BCI publications in China from 2019 onwards, exceeding those from the United States that started to decline during the same period. Implications and reasons for this trend are discussed. Furthermore, we have extensively discussed challenges and threats limiting exploitation of BCI capabilities. A typical BCI architecture is hypothesized to address two prominent BCI threats, privacy and security, as an attempt to make the technology commercially viable to the society.
PubMed: 37540385
DOI: 10.1186/s40708-023-00199-3
-
JAMIA Open Jul 2023
With the burgeoning development of computational phenotypes, it is increasingly difficult to identify the right phenotype for the right tasks. This study uses a mixed-methods approach to develop and evaluate a novel metadata framework for retrieval of and reusing computational phenotypes. Twenty active phenotyping researchers from 2 large research networks, Electronic Medical Records and Genomics and Observational Health Data Sciences and Informatics, were recruited to suggest metadata elements. Once consensus was reached on 39 metadata elements, 47 new researchers were surveyed to evaluate the utility of the metadata framework. The survey consisted of 5-Likert multiple-choice questions and open-ended questions. Two more researchers were asked to use the metadata framework to annotate 8 type-2 diabetes mellitus phenotypes. More than 90% of the survey respondents rated metadata elements regarding phenotype definition and validation methods and metrics positively with a score of 4 or 5. Both researchers completed annotation of each phenotype within 60 min. Our thematic analysis of the narrative feedback indicates that the metadata framework was effective in capturing rich and explicit descriptions and enabling the search for phenotypes, compliance with data standards, and comprehensive validation metrics. Current limitations were its complexity for data collection and the entailed human costs.
PubMed: 37181728
DOI: 10.1093/jamiaopen/ooad032
-
Scientific Data Sep 2023
The expansive production of data in materials science, their widespread sharing and repurposing require educated support and stewardship. In order to ensure that this need helps rather than hinders scientific work, the implementation of the FAIR-data principles (Findable, Accessible, Interoperable, and Reusable) must not be too narrow. Besides, the wider materials-science community ought to agree on the strategies to tackle the challenges that are specific to its data, both from computations and experiments. In this paper, we present the result of the discussions held at the workshop on “Shared Metadata and Data Formats for Big-Data Driven Materials Science”. We start from an operative definition of metadata, and the features that a FAIR-compliant metadata schema should have. We will mainly focus on computational materials-science data and propose a constructive approach for the FAIRification of the (meta)data related to ground-state and excited-states calculations, potential-energy sampling, and generalized workflows. Finally, challenges with the FAIRification of experimental (meta)data and materials-science ontologies are presented together with an outlook of how to meet them.
PubMed: 37709811
DOI: 10.1038/s41597-023-02501-8
-
Scientific Data Jul 2023
The Minimum Information for High Content Screening Microscopy Experiments (MIHCSME) is a metadata model and reusable tabular template for sharing and integrating high content imaging data. It has been developed by combining the ISA (Investigations, Studies, Assays) metadata standard with a semantically enriched instantiation of REMBI (Recommended Metadata for Biological Images). The tabular template provides an easy-to-use practical implementation of REMBI, specifically for High Content Screening (HCS) data. In addition, ISA compliance enables broader integration with other types of experimental data, paving the way for visual omics and multi-Omics integration. We show the utility of MIHCSME for HCS data using multiple examples from the Leiden FAIR Cell Observatory, a Euro-Bioimaging flagship node for high content screening and the pilot node for implementing Findable, Accessible, Interoperable and Reusable (FAIR) bioimaging data throughout the Netherlands Bioimaging network.
PubMed: 37460560
DOI: 10.1038/s41597-023-02367-w
-
Scientific Data Sep 2023
It is essential to publish and make available environmental data gathered by emerging robotic platforms to contribute to the Global Ocean Observing System (GOOS), supported by the United Nations - Decade of Ocean Science for Sustainable Development (2021-2030). The transparency of these unique observational datasets needs to be supported by the corresponding robotic records. The data describing the observational platform behaviour and its performance are necessary to validate the environmental data and repeat consistently the in-situ robotic deployment. The Free and Open Source Software (FOSS), proposed in this manuscript, describes how, using the established approach in Earth Sciences, the data characterising marine robotic missions can be formatted and shared following the FAIR (Findable, Accessible, Interoperable, Reusable) principles. The manuscript is a step-by-step guide to render marine robotic telemetry FAIR and publishable. State-of-the-art protocols for metadata and data formatting are proposed, applied and integrated automatically using Jupyter Notebooks to maximise visibility and ease of use. The method outlined here aims to be a first fundamental step towards FAIR interdisciplinary observational science.
PubMed: 37704657
DOI: 10.1038/s41597-023-02495-3
-
Computational and Structural... 2023
Review
In the fast-evolving landscape of biomedical research, the emergence of big data has presented researchers with extraordinary opportunities to explore biological complexities. In biomedical research, big data imply also a big responsibility. This is not only due to genomics data being sensitive information but also due to genomics data being shared and re-analysed among the scientific community. This saves valuable resources and can even help to find new insights. To fully use these opportunities, detailed and correct metadata are imperative. This includes not only the availability of metadata but also their correctness. Metadata integrity serves as a fundamental determinant of research credibility, supporting the reliability and reproducibility of data-driven findings. Ensuring metadata availability, curation, and accuracy is therefore essential for bioinformatic research. Not only must metadata be readily available, but they must also be meticulously curated and ideally error-free. Motivated by an accidental discovery of a critical metadata error in patient data published in two high-impact journals, we aim to raise awareness of the need for correct, complete, and curated metadata. We describe how the metadata error was found and addressed, and present examples of metadata-related challenges in omics research, along with supporting measures, including tools for checking metadata and software to facilitate various steps from data analysis to published research.
PubMed: 37860229
DOI: 10.1016/j.csbj.2023.10.006
-
Data in Brief Oct 2023
The present article introduces Zanadamu, a comprehensive geo-temporal-referenced dataset that amalgamates all published stable isotope carbon and oxygen measurements on tooth enamel from African hominins, dated between 4.4 and 0.005 Ma. Zanadamu serves as a research tool for investigating hominin evolution by facilitating the examination of how different hominin species explored food resources and interacted with their local paleoenvironments. The dataset is structured in a machine-readable format, and its metadata organization allows for facile statistical analyses and comparisons with other types of isotopic records, including ancient and modern humans and other primates. Zanadamu is part of the AfriArch data initiative, which aims at compiling datasets for the study of ancient Africa. This is an active initiative, and we strive to update Zanadamu as novel data become available.
PubMed: 37701712
DOI: 10.1016/j.dib.2023.109522