Environmental Monitoring and Assessment Dec 2023
A scientifically informed approach to decision-making is key to ensuring the sustainable management of ecosystems, especially in the light of increasing human pressure on habitats and species. Protected areas, with their long-term institutional mandate for biodiversity conservation, play an important role as data providers, for example, through the long-term monitoring of natural resources. However, poor data management often limits the use and reuse of this wealth of information. In this paper, we share lessons learned in managing long-term data from the Italian Alpine national parks. Our analysis and examples focus on specific issues faced by managers of protected areas, which partially differ from those faced by academic researchers, predominantly owing to different mission, governance, and temporal perspectives. Rigorous data quality control, the use of appropriate data management tools, and acquisition of the necessary skills remain the main obstacles. Common protocols for data collection offer great opportunities for the future, and complete recovery and documentation of time series is an urgent priority. Notably, before data can be shared, protected areas should improve their data management systems, a task that can be achieved only with adequate resources and a long-term vision. We suggest strategies that protected areas, funding agencies, and the scientific community can embrace to address these problems. The added value of our work lies in promoting engagement with managers of protected areas and in reporting and analysing their concrete requirements and problems, thereby contributing to the ongoing discussion on data management and sharing through a bottom-up approach.
Topics: Humans; Ecosystem; Conservation of Natural Resources; Data Management; Environmental Monitoring; Biodiversity
PubMed: 38051448
DOI: 10.1007/s10661-023-11851-0
BMC Bioinformatics Nov 2023
Biclustering of biologically meaningful binary information is essential in many applications related to drug discovery, like protein-protein interactions and gene expression. However, for robust performance on recently emerging large health datasets, it is important for new biclustering algorithms to be scalable and fast. We present a rapid unsupervised biclustering (RUBic) algorithm that achieves this objective with a novel encoding and search strategy. RUBic significantly reduces computational overhead and shows significant computational benefits with respect to several state-of-the-art biclustering algorithms on both synthetic and experimental datasets. In 100 synthetic binary datasets, our method took [Formula: see text] s to extract 494,872 biclusters. In the human PPI database of size [Formula: see text], our method generates 1840 biclusters in [Formula: see text] s. On a central nervous system embryonic tumor gene expression dataset of size 712,940, our algorithm takes 101 min to produce 747,069 biclusters, while recent competing algorithms take significantly more time to produce the same result. RUBic is also evaluated on five different gene expression datasets and shows significant speed-up in execution time with respect to existing approaches for extracting significant KEGG-enriched biclusters. RUBic can operate in two modes: base mode generates maximal biclusters, while flex mode generates fewer clusters more quickly, based on their biological significance with respect to KEGG pathways. The code is available at https://github.com/CMATERJU-BIOINFO/RUBic for academic use only.
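To make the notion of a binary bicluster concrete: a bicluster in a binary matrix is a submatrix consisting entirely of ones (e.g., a set of proteins sharing the same interaction partners). The following minimal Python sketch enumerates such biclusters by seeding on row pairs; it illustrates the concept only and is not the RUBic encoding or search strategy (the matrix and function name are hypothetical):

```python
# Toy binary matrix: rows = proteins, cols = interaction partners
# (hypothetical data, for illustration only).
M = [
    [1, 1, 0, 1],
    [1, 1, 0, 0],
    [0, 1, 1, 1],
    [1, 1, 0, 1],
]

def biclusters_from_row_pairs(M):
    """Enumerate all-ones submatrices seeded by row pairs: for each pair of
    rows, the columns where both are 1 define a candidate bicluster, which
    is then extended with every other row sharing those columns."""
    n_rows, n_cols = len(M), len(M[0])
    found = set()
    for i in range(n_rows):
        for j in range(i + 1, n_rows):
            cols = tuple(c for c in range(n_cols) if M[i][c] and M[j][c])
            if len(cols) < 2:  # require at least a 2x2 submatrix
                continue
            rows = tuple(r for r in range(n_rows)
                         if all(M[r][c] for c in cols))
            found.add((rows, cols))
    return sorted(found)

for rows, cols in biclusters_from_row_pairs(M):
    print("rows", rows, "cols", cols)
```

A naive pairwise enumeration like this is quadratic in the number of rows; RUBic's contribution lies in making this kind of search tractable at the dataset sizes reported above.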
Topics: Humans; Algorithms; Databases, Factual; Cluster Analysis; Data Management; Gene Expression Profiling
PubMed: 37974081
DOI: 10.1186/s12859-023-05534-3
Medical Care Dec 2023
BACKGROUND
Data infrastructure for cancer research is centered on registries that are often augmented with payer or hospital discharge databases, but these linkages are limited. A recent alternative in some states is to augment registry data with All-Payer Claims Databases (APCDs). These linkages capture patient-centered economic outcomes, including insurance-driven outcomes that influence health equity, and can serve as a prototype for health economics research.
OBJECTIVES
To describe and assess the utility of a linkage between the Colorado APCD and Colorado Central Cancer Registry (CCCR) data for 2012-2017.
RESEARCH DESIGN, PARTICIPANTS, AND MEASURES
This cohort study of 91,883 insured patients evaluated the Colorado APCD-CCCR linkage on its suitability to assess demographics, area-level data, insurance, and out-of-pocket expenses 3 and 6 months after cancer diagnosis.
RESULTS
The linkage had high validity, with over 90% of patients in the CCCR linked to the APCD, but gaps in APCD health plans limited available claims at diagnosis. We highlight the advantages of the CCCR-APCD, such as granular race and ethnicity classification, area-level data, the ability to capture supplemental plans, medical and pharmacy out-of-pocket expenses, and transitions in insurance plans.
CONCLUSIONS
Linked data between registries and APCDs can be a cornerstone of a robust data infrastructure and spur innovations in health economics research on cost, quality, and outcomes. A larger infrastructure could comprise a network of state APCDs that maintain linkages for research and surveillance.
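At its core, a registry-to-claims linkage of this kind is a join between registry cases and claims on a shared patient identifier, from which linkage rates and out-of-pocket totals can be derived. A minimal Python sketch of the idea follows; the records and field names are hypothetical toy data, not the actual APCD-CCCR schema:

```python
# Hypothetical registry cases and insurance claims (toy data).
registry = [
    {"patient_id": "P1", "diagnosis": "breast"},
    {"patient_id": "P2", "diagnosis": "colon"},
    {"patient_id": "P3", "diagnosis": "lung"},
]
claims = [
    {"patient_id": "P1", "oop_cost": 120.0},
    {"patient_id": "P1", "oop_cost": 80.0},
    {"patient_id": "P3", "oop_cost": 200.0},
]

def link(registry, claims):
    """Left-join registry cases to claims on patient_id, flagging whether
    each case linked and summing out-of-pocket costs across its claims."""
    by_patient = {}
    for c in claims:
        by_patient.setdefault(c["patient_id"], []).append(c)
    linked = []
    for r in registry:
        matches = by_patient.get(r["patient_id"], [])
        linked.append({**r,
                       "linked": bool(matches),
                       "total_oop": sum(c["oop_cost"] for c in matches)})
    return linked

result = link(registry, claims)
rate = sum(r["linked"] for r in result) / len(result)
print(f"linkage rate: {rate:.0%}")  # prints "linkage rate: 67%"
```

Real linkages like the APCD-CCCR one add probabilistic matching, plan-coverage windows, and privacy safeguards on top of this basic join.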
Topics: Humans; Cohort Studies; Neoplasms; Registries; Data Management; Colorado
PubMed: 37963034
DOI: 10.1097/MLR.0000000000001904
Sensors (Basel, Switzerland) Nov 2023
The use of cloud computing, big data, IoT, and mobile applications in the public transportation industry has generated vast and complex data, whose large volume and variety pose several obstacles to effective, high-efficiency data sensing and processing in a real-time data-driven public transportation management system. To overcome these challenges and to guarantee optimal data availability for data sensing and processing in public transportation perception, a public transportation sensing platform is proposed to collect, integrate, and organize diverse data from different data sources. The proposed data perception platform connects multiple data systems and edge intelligent perception devices to enable the collection of various types of data, including passengers' traveling information and smart-card transaction data. To enable the efficient extraction of precise and detailed traveling behavior, an efficient field-level data lineage exploration method is proposed during logical plan generation and is integrated seamlessly into the FlinkSQL system. Furthermore, a row-level fine-grained permission control mechanism is adopted to support flexible data management. With these two techniques, the proposed data management system can support efficient processing of large amounts of data and conduct comprehensive analysis and application of business data from numerous sources, realizing the value of the data with high data safety. Through operational testing in real environments, the proposed platform has proven highly efficient and effective in managing organizational operations, data assets, the data life cycle, offline development, and backend administration across large volumes of diverse public transportation traffic data.
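Row-level permission control of the kind described above can be thought of as a per-role predicate applied to every record before any downstream processing. A minimal Python sketch follows; the record schema and policies are hypothetical, not the paper's FlinkSQL implementation:

```python
# Toy transit records (hypothetical schema).
records = [
    {"card_id": "C1", "line": "A", "fare": 2.5, "region": "north"},
    {"card_id": "C2", "line": "B", "fare": 3.0, "region": "south"},
    {"card_id": "C3", "line": "A", "fare": 2.5, "region": "south"},
]

# Each role maps to a predicate deciding which rows it may read.
policies = {
    "north_analyst": lambda row: row["region"] == "north",
    "auditor": lambda row: True,
}

def query(role, rows):
    """Apply the role's row filter before any further processing;
    unknown roles fall back to deny-all."""
    allow = policies.get(role, lambda row: False)
    return [r for r in rows if allow(r)]

print(len(query("north_analyst", records)))  # 1
print(len(query("auditor", records)))        # 3
print(len(query("unknown", records)))        # 0
```

In a SQL engine such as FlinkSQL, the same effect is typically achieved by injecting the predicate into the logical plan so the filter is applied at query time rather than in application code.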
PubMed: 38005614
DOI: 10.3390/s23229228
Database : the Journal of Biological... Nov 2023
Over the last couple of decades, there has been a rapid growth in the number and scope of agricultural genetics, genomics and breeding (GGB) databases and resources. The AgBioData Consortium (https://www.agbiodata.org/) currently represents 44 databases and resources (https://www.agbiodata.org/databases) covering model or crop plant and animal GGB data, ontologies, pathways, genetic variation and breeding platforms (referred to as 'databases' throughout). One of the goals of the Consortium is to facilitate FAIR (Findable, Accessible, Interoperable, and Reusable) data management and the integration of datasets, which requires data sharing along with structured vocabularies and/or ontologies. Two AgBioData working groups, focused on Data Sharing and Ontologies, respectively, conducted a Consortium-wide survey to assess the current status and future needs of the members in those areas. A total of 33 researchers responded to the survey, representing 37 databases. Results suggest that data-sharing practices by AgBioData databases are in a fairly healthy state, but it is not clear whether this is true for all metadata and data types across all databases, and that ontology use has not substantially changed since a similar survey was conducted in 2017. Based on our evaluation of the survey results, we recommend (i) providing training for database personnel in specific data-sharing techniques, as well as in ontology use; (ii) further study on what metadata is shared, and how well it is shared among databases; (iii) promoting an understanding of data sharing and ontologies in the stakeholder community; (iv) improving data sharing and ontologies for specific phenotypic data types and formats; and (v) lowering specific barriers to data sharing and ontology use by identifying sustainability solutions and by identifying, promoting, or developing data standards.
Combined, these improvements are likely to help AgBioData databases increase development efforts towards improved ontology use and data sharing via programmatic means. Database URL: https://www.agbiodata.org/databases.
Topics: Animals; Data Management; Plant Breeding; Genomics; Databases, Factual; Information Dissemination
PubMed: 37971715
DOI: 10.1093/database/baad076
Frontiers in Public Health 2023
INTRODUCTION
Non-Fungible Tokens (NFTs) are digital assets that are verified using blockchain technology to ensure authenticity and ownership. NFTs have the potential to revolutionize healthcare by addressing various issues in the industry.
METHOD
The goal of this study was to identify the applications of NFTs in healthcare. Our scoping review was conducted in 2023. We searched the Scopus, IEEE, PubMed, Web of Science, Science Direct, and Cochrane scientific databases using related keywords. The article selection process was based on Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA).
RESULTS
After applying inclusion and exclusion criteria, a total of 13 articles were chosen. The extracted data were then summarized and reported. The most common application of NFTs in healthcare was found to be health data management (46%), followed by supply chain management (31%). Furthermore, Ethereum was the main blockchain platform applied to NFTs in healthcare, accounting for 70%.
DISCUSSION
The findings from this review indicate that the NFTs currently used in healthcare could transform it. It also appears that researchers have not yet investigated the numerous potential uses of NFTs in the healthcare field, which could be explored in the future.
Topics: Humans; Data Management; Databases, Factual; Industry; Research Personnel; Technology
PubMed: 38074727
DOI: 10.3389/fpubh.2023.1266385
Journal of Medical Internet Research Aug 2023 (Review)
The health care industry has faced various challenges over the past decade as we move toward a digital future where services and data are available on demand. The systems of interconnected devices, users, data, and working environments are referred to as the Internet of Health Care Things (IoHT). IoHT devices have emerged in the past decade as cost-effective solutions with large scalability capabilities to address the constraints on limited resources. These devices cater to the need for remote health care services outside of physical interactions. However, IoHT security is often overlooked because the devices are quickly deployed and configured as solutions to meet the demands of a heavily saturated industry. During the COVID-19 pandemic, studies have shown that cybercriminals are exploiting the health care industry, and data breaches are targeting user credentials through authentication vulnerabilities. Poor password use and management and the lack of a multifactor authentication security posture within IoHT cause losses of millions, according to IBM reports. Therefore, it is important that health care authentication security moves toward adaptive multifactor authentication (AMFA) to replace traditional approaches to authentication. We identified a lack of taxonomy for data models that particularly focus on IoHT data architecture to improve the feasibility of AMFA. This viewpoint focuses on identifying key cybersecurity challenges in a theoretical framework for a data model that summarizes the main components of IoHT data. The data are to be used in modalities that are suited for health care users in modern IoHT environments and in response to the COVID-19 pandemic. To establish the data taxonomy, a review of recent IoHT papers was conducted to discuss the related work in IoHT data management and use in next-generation authentication systems.
Reports, journal articles, conference papers, and white papers were reviewed for IoHT authentication data technologies in relation to the problem statement of remote authentication and user management systems. Only publications written in English from the last decade (2012-2022) were included, to identify key issues within current health care practices and their management of IoHT devices. We discuss the components of the IoHT architecture from the perspective of data management and sensitivity to ensure privacy for all users. The data model addresses the security requirements of IoHT users, environments, and devices toward the automation of AMFA in health care. We found that in health care authentication, the most significant threats were data breaches owing to weak security options and poor user configuration of IoHT devices. We discuss the security requirements of the IoHT data architecture and identify impactful cybersecurity methods for health care devices and data, and the attacks against them. The data taxonomy provides a better understanding of, and solutions and improvements for, user authentication in remote working environments with respect to security features.
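The essence of adaptive multifactor authentication is that the number of required factors scales with a risk estimate derived from the login context. A minimal Python sketch of that idea follows; the signals, weights, and thresholds are hypothetical, not a production policy:

```python
def risk_score(ctx):
    """Sum simple risk signals from the login context (hypothetical weights)."""
    score = 0
    if ctx.get("new_device"):
        score += 2
    if ctx.get("unusual_location"):
        score += 2
    if ctx.get("failed_attempts", 0) > 2:
        score += 3
    return score

def required_factors(ctx):
    """Step up the factor requirement as estimated risk grows."""
    s = risk_score(ctx)
    if s == 0:
        return ["password"]
    if s <= 3:
        return ["password", "otp"]
    return ["password", "otp", "biometric"]

print(required_factors({}))  # low risk: password only
print(required_factors({"new_device": True, "unusual_location": True,
                        "failed_attempts": 3}))  # high risk: all three factors
```

A deployed AMFA system would draw these signals from device fingerprints, geolocation, and behavioral data described by the IoHT data model, rather than from a hand-built dictionary.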
Topics: Humans; Confidentiality; Telemedicine; Pandemics; COVID-19; Internet; Computer Security
PubMed: 37490633
DOI: 10.2196/44114
Journal of Chemical Information and... Nov 2023
Web ontologies are important tools in modern scientific research because they provide a standardized way to represent and manage web-scale amounts of complex data. In chemistry, a semantic database for chemical species is indispensable for its ability to interrelate and infer relationships, enabling a more precise analysis and prediction of chemical behavior. This paper presents OntoSpecies, a web ontology designed to represent chemical species and their properties. The ontology serves as a core component of The World Avatar knowledge graph chemistry domain and includes a wide range of identifiers, chemical and physical properties, chemical classifications and applications, and spectral information associated with each species. The ontology includes provenance and attribution metadata, ensuring the reliability and traceability of data. Most of the information about chemical species is sourced from PubChem and ChEBI data on the respective compound Web pages using a software agent, making OntoSpecies a comprehensive semantic database of chemical species able to solve novel types of problems in the field. Access to this reliable source of chemical data is provided through a SPARQL end point. The paper presents example use cases to demonstrate the contribution of OntoSpecies in solving complex tasks that require integrated semantically searchable chemical data. The approach presented in this paper represents a significant advancement in the field of chemical data management, offering a powerful tool for representing, navigating, and analyzing chemical information to support scientific research.
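The kind of question a SPARQL end point answers over such an ontology is triple pattern matching: find all subjects (or objects) that satisfy a partially specified triple. The following toy Python illustration shows the idea; the triples and property names are hypothetical, not actual OntoSpecies terms, and a real client would issue SPARQL queries against the end point instead:

```python
# A tiny in-memory "knowledge graph" of (subject, predicate, object) triples
# (hypothetical identifiers, for illustration only).
triples = [
    ("species:benzene", "hasFormula", "C6H6"),
    ("species:benzene", "hasMolarMass", "78.11"),
    ("species:toluene", "hasFormula", "C7H8"),
]

def match(pattern, triples):
    """Return variable bindings for each triple matching the pattern.
    Pattern positions starting with '?' are variables; note this toy
    version does not enforce equality of a repeated variable."""
    results = []
    for t in triples:
        binding = {}
        for p, v in zip(pattern, t):
            if p.startswith("?"):
                binding[p] = v
            elif p != v:
                break
        else:  # no mismatch: the triple satisfies the pattern
            results.append(binding)
    return results

# Analogue of "SELECT ?s WHERE { ?s hasFormula C6H6 }":
print(match(("?s", "hasFormula", "C6H6"), triples))
```

SPARQL generalizes this with joins over multiple patterns, filters, and inference over the ontology's class hierarchy, which is what makes queries like "all aromatic species with a boiling point below X" possible.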
Topics: Knowledge Discovery; Reproducibility of Results; Software; Databases, Factual; Semantics
PubMed: 37883649
DOI: 10.1021/acs.jcim.3c00820
ACS Sensors Mar 2024 (Review)
Sensing systems necessitate automation to reduce human effort, increase reproducibility, and enable remote sensing. In this perspective, we highlight different types of sensing systems with elements of automation, which are based on flow injection and sequential injection analysis, microfluidics, robotics, and other prototypes addressing specific real-world problems. Finally, we discuss the role of computer technology in sensing systems. Automated flow injection and sequential injection techniques offer precise and efficient sample handling and dependable outcomes. They enable continuous analysis of numerous samples, boosting throughput and saving time and resources. They enhance safety by minimizing contact with hazardous chemicals. Microfluidic systems are enhanced by automation, enabling precise parameter control and increased analysis speed. Robotic sampling and sample preparation platforms excel in the precise execution of intricate, repetitive tasks such as sample handling, dilution, and transfer. These platforms enhance efficiency by multitasking, use minimal sample volumes, and integrate seamlessly with analytical instruments. Other sensor prototypes utilize mechanical devices and computer technology to address real-world issues, offering efficient, accurate, and economical real-time solutions for analyte identification and quantification in remote areas. Computer technology is crucial in modern sensing systems, enabling data acquisition, signal processing, real-time analysis, and data storage. Machine learning and artificial intelligence enhance predictions from sensor data, supporting the Internet of Things with efficient data management.
Topics: Humans; Artificial Intelligence; Reproducibility of Results; Automation; Robotics; Microfluidics
PubMed: 38363106
DOI: 10.1021/acssensors.3c01887
Neuro-oncology Advances 2023 (Review)
With medical software platforms moving to cloud environments with scalable storage and computing, the translation of predictive artificial intelligence (AI) models to aid in clinical decision-making and facilitate personalized medicine for cancer patients is becoming a reality. Medical imaging, namely radiologic and histologic images, has immense analytical potential in neuro-oncology, and models utilizing integrated radiomic and pathomic data may yield a synergistic effect and provide a new modality for precision medicine. At the same time, the ability to harness multi-modal data is met with challenges in aggregating data across medical departments and institutions, as well as significant complexity in modeling the phenotypic and genotypic heterogeneity of pediatric brain tumors. In this paper, we review recent pathomic and integrated pathomic, radiomic, and genomic studies with clinical applications. We discuss current challenges limiting translational research on pediatric brain tumors and outline technical and analytical solutions. Overall, we propose that to empower the potential residing in radio-pathomics, systemic changes in cross-discipline data management and end-to-end software platforms to handle multi-modal data sets are needed, in addition to embracing modern AI-powered approaches. These changes can improve the performance of predictive models, and ultimately the ability to advance brain cancer treatments and patient outcomes through the development of such models.
PubMed: 37841693
DOI: 10.1093/noajnl/vdad119