- Heliyon Sep 2022
The Cretaceous and Neogene deposits of the Mamfe Basin, consisting of sandstone, shale and claystone, were studied using petrography and major, trace and rare earth element (REE) analyses to address the sediment source, depositional environment, prevailing paleoclimate and tectonic regime of the basin. The angular to subangular shape of the detrital grains reflects the mineralogical and textural immaturity of the sediments and the proximity of the sediment source. The sedimentary rocks contain significant amounts of lithic debris, organic matter and ostracods, as well as subrounded heavy minerals pointing to the igneous and metamorphic rocks bordering the Mamfe Basin. Plots of the major element ratios iron oxide/potassium oxide (Fe₂O₃/K₂O) against silicon dioxide/aluminium oxide (SiO₂/Al₂O₃) and sodium oxide/potassium oxide (Na₂O/K₂O) against SiO₂/Al₂O₃ are characteristic of greywacke and shale with few arkoses. The pronounced negative Eu anomaly of the chondrite-normalized REEs, along with the plots of La/Th vs Hf and Co/Th vs La/Sc, suggests that the sediments derive in general from felsic and intermediate source rocks, with only a subordinate contribution from mafic sources. The negative Yb anomaly suggests igneous fractionation under highly reducing conditions. Chemical index of alteration values of 47-70, combined with chemical index of weathering values of 0.6-84, suggest low to moderate weathering of the sediments in the basin. This result is further supported by index of chemical variability values of 0.6-100 and Zr/Sc ratios of 0.06-2.96. The REE distribution shows a substantial LREE content, a low HREE content and a noticeable (La/Yb) ratio (mean >9), with a low (Gd/Yb) ratio in the Cross River Formation (mean <2) and a moderate (Gd/Yb) ratio in the other formations (mean >2). This result implies that the sediments of the Ngeme, Nfaitok and Baso formations were derived from post-Archean rocks.
Geochemical paleoenvironmental proxies, including Sr/Cu, Sr/Ba, Ga/Rb vs Sr/Cu and SiO₂ vs K₂O + Na₂O + Al₂O₃, indicate arid to semi-arid conditions during deposition. Trace element ratios such as Sr/Cu, Sr/Ba, V/Ni, U/Th, Ni/Co, V/Sc and V/Cr indicate predominantly oxic conditions during deposition. In contrast, the presence of authigenic iron-rich minerals (pyrite, hematite, siderite and vivianite) suggests episodic reducing conditions in the basin. The study provides valuable information for evaluating the sediment source, depositional environment, tectonic regime and paleoclimatic conditions prevailing in the basin during deposition. The geochemistry of rocks from the Ngeme and Baso formations suggests a passive continental margin setting, whereas that of the Ngeme, Nfaitok and Cross River formations suggests an oceanic island arc tectonic setting.
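The chemical index of alteration (CIA) cited in this abstract is conventionally computed from molar proportions of four oxides. The sketch below illustrates that textbook formula and is not the authors' own workflow. It assumes the CaO value has already been corrected to the silicate-bound fraction (CaO*). The function name and the sample composition are hypothetical.

```python
# Molar masses (g/mol) of the oxides that enter the index.
MOLAR_MASS = {"Al2O3": 101.96, "CaO": 56.08, "Na2O": 61.98, "K2O": 94.20}


def chemical_index_of_alteration(wt_percent):
    """Return CIA = 100 * Al2O3 / (Al2O3 + CaO* + Na2O + K2O).

    `wt_percent` maps oxide names to weight percent. All terms are
    converted to molar proportions before the ratio is taken.
    Assumption: the CaO entry is already the silicate-bound CaO*.
    """
    moles = {oxide: wt_percent[oxide] / MOLAR_MASS[oxide] for oxide in MOLAR_MASS}
    return 100.0 * moles["Al2O3"] / sum(moles.values())


if __name__ == "__main__":
    # Hypothetical composition, for illustration only (not data from the study).
    sample = {"Al2O3": 15.0, "CaO": 2.0, "Na2O": 3.0, "K2O": 3.0}
    print(f"CIA = {chemical_index_of_alteration(sample):.1f}")
```

Higher values indicate that more of the mobile cations (Ca, Na, K) have been lost relative to aluminium, which is why the index serves as a weathering proxy.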
PubMed: 36097494
DOI: 10.1016/j.heliyon.2022.e10304
- ACS Omega Feb 2022
Organic matter (OM) is the material basis for hydrocarbon generation. Its type and abundance determine the hydrocarbon generation ability of source rocks, which is closely related to the provenance and sedimentary environment of source rocks. The tectonic backgrounds of the eastern and western subsags (ESS and WSS) of the Lishui Sag in the East China Sea Shelf Basin are significantly different and their influence on the OM in the source rocks is worthy of attention. This paper comprehensively analyzes the provenance and environmental characteristics and their influence on the features of the OM of Paleocene source rocks in the ESS and WSS. The study finds that the source rocks in the ESS have multiple sources. During the deposition period, as the salinity and paleoproductivity of the water column increased, the proportion of OM in the autochthonous components of the water column continued to increase, but the overall water column was in an oxidizing environment, resulting in a generally low abundance of OM. The provenance of the source rocks in the WSS was relatively simple and terrestrial. Also, the sedimentary environment had little effect on the type of OM. However, the whole water column of the WSS was in an anoxic environment, so the OM was better preserved, resulting in a higher abundance. Due to the influence of provenance and the sedimentary environment in different areas of the sag, the characteristics of OM in the source rocks are different, so relevant exploration strategies need to be adopted in actual exploration.
PubMed: 35224339
DOI: 10.1021/acsomega.1c05764
- Bioinformatics (Oxford, England) Jun 2021
MOTIVATION
Reproducibility is of central importance to the scientific process. The difficulty of consistently replicating and verifying experimental results is magnified in the era of big data, in which bioinformatics analysis often involves complex multi-application pipelines operating on terabytes of data. These processes result in thousands of possible permutations of data preparation steps, software versions and command-line arguments. Existing reproducibility frameworks are cumbersome and involve redesigning computational methods. To address these issues, we developed RepeatFS, a file system that records, replicates and verifies informatics workflows with no alteration to the original methods. RepeatFS also provides several other features to help promote analytical transparency and reproducibility, including provenance visualization and task automation.
RESULTS
We used RepeatFS to successfully visualize and replicate a variety of bioinformatics tasks consisting of over a million operations with no alteration to the original methods. RepeatFS correctly identified all software inconsistencies that resulted in replication differences.
AVAILABILITY AND IMPLEMENTATION
RepeatFS is implemented in Python 3. Its source code and documentation are available at https://github.com/ToniWestbrook/repeatfs.
SUPPLEMENTARY INFORMATION
Supplementary data are available at Bioinformatics online.
Topics: Automation; Computational Biology; Reproducibility of Results; Software; Workflow
PubMed: 33230554
DOI: 10.1093/bioinformatics/btaa950
- Cytotechnology Jul 2002
Cultured cell lines have become an extremely valuable resource, both in academic research and in industrial biotechnology. However, their value is frequently compromised by misidentification and undetected microbial contamination. As detailed elsewhere in this volume, the technology, both simple and sophisticated, is available to remedy the problems of misidentification and contamination, given the will to apply it. Combined with proper records of the origin and history of the cell line, assays for authentication and contamination contribute to the provenance of the cell line. Detailed records should start from the initiation or receipt of the cell line, should incorporate data on the donor as well as the tissue from which the cell line was derived, should continue with details of maintenance, and should include any accidental as well as deliberate deviations from normal maintenance. Records should also contain details of authentication and regular checks for contamination. With this information, preferably stored in a database and suitably backed up, the provenance so created makes the cell line a much more valuable resource, fit for validation in industrial applications and more likely to provide reproducible experimental results when disseminated for research in other laboratories.
PubMed: 19003293
DOI: 10.1023/A:1022949730029
- Frontiers in Genetics 2022
Fair and equitable benefit sharing of genetic resources is an expectation of the Nagoya Protocol. Although the Nagoya Protocol does not yet formally apply to Digital Sequence Information ("DSI"), discussions are currently underway on including such data through ongoing Convention on Biological Diversity ("CBD") negotiations. While Indigenous Peoples and Local Communities ("IPLC") expect the value generated from genomic data to be subject to benefit sharing arrangements, a range of views are currently being expressed by Nation States, IPLC and other stakeholders. The use of DSI gives rise to unique considerations, creating a gray area as to how it should be considered under the Nagoya Protocol's Access and Benefit Sharing ("ABS") principles. One way for benefit sharing to be enhanced is through the connection of data to proper provenance information. A significant development is the use of digital labeling systems to ensure that the origin of samples is appropriately disclosed. The Traditional Knowledge and Biocultural Labels initiative offers a practical option for data provided to genomic databases. In particular, the BioCultural Labels ("BC Labels") are a mechanism for Indigenous communities to identify and maintain provenance, origin and authority over biocultural material and data generated from Indigenous land and waters held in research, cultural institutions and data repositories. This form of cultural metadata adds value to the research endeavor, and the creation of Indigenous fields within databases adds transparency and accountability to the research environment.
PubMed: 36212139
DOI: 10.3389/fgene.2022.1014044
- Patterns (New York, N.Y.) Aug 2020
Deep learning, a set of approaches using artificial neural networks, has generated rapid recent advancements in machine learning. Deep learning does, however, have the potential to reduce the reproducibility of scientific results. Model outputs are critically dependent on the data and processing approach used to initially generate the model, but this provenance information is usually lost during model training. To avoid a future reproducibility crisis, we need to improve our deep-learning model management. The FAIR principles for data stewardship and software/workflow implementation give excellent high-level guidance on ensuring effective reuse of data and software. We suggest some specific guidelines for the generation and use of deep-learning models in science and explain how these relate to the FAIR principles. We then present dtoolAI, a Python package that we have developed to implement these guidelines. The package implements automatic capture of provenance information during model training and simplifies model distribution.
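Recording provenance at training time can be illustrated with a minimal standard-library sketch. It does not use or reproduce the dtoolAI API. The function names, record fields and example values below are all hypothetical, chosen only to show how a dataset fingerprint and the training parameters could be stored alongside a model.

```python
import hashlib
import json


def provenance_record(training_data: bytes, hyperparameters: dict, code_version: str) -> dict:
    """Build a provenance record for one training run.

    The dataset is identified by its SHA-256 digest, so a later reader can
    check whether they hold exactly the same bytes the model was trained on.
    """
    return {
        "dataset_sha256": hashlib.sha256(training_data).hexdigest(),
        "hyperparameters": dict(hyperparameters),
        "code_version": code_version,
    }


def serialize_record(record: dict) -> str:
    """Serialize with sorted keys so identical runs give identical text."""
    return json.dumps(record, sort_keys=True, indent=2)


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    record = provenance_record(b"example training bytes", {"learning_rate": 0.01, "epochs": 5}, "v0.1")
    print(serialize_record(record))
```

Storing such a record next to the model weights keeps the link between the model and its inputs from being lost after training.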
PubMed: 33205122
DOI: 10.1016/j.patter.2020.100073
- Scientific Reports Jun 2021
Chronic wasting disease (CWD) is a fatal, contagious, neurodegenerative prion disease affecting both free-ranging and captive cervid species. CWD is spread via direct or indirect contact or oral ingestion of prions. In the gastrointestinal tract, prions enter the body through microfold cells (M-cells), and the abundance of these cells can be influenced by the gut microbiota. To explore potential links between the gut microbiota and CWD, we collected fecal samples from farmed and free-ranging white-tailed deer (Odocoileus virginianus) around the Midwest, USA. Farmed deer originated from farms that were depopulated due to CWD. Free-ranging deer were sampled during annual deer harvests. All farmed deer were tested for CWD via ELISA and IHC, and we used 16S rRNA gene sequencing to characterize the gut microbiota. We report significant differences in gut microbiota by provenance (Farm 1, Farm 2, Free-ranging), sex, and CWD status. CWD-positive deer from Farms 1 and 2 had increased abundances of Akkermansia, Lachnospiraceae UCG-010, and RF39 taxa. Overall, differences by provenance and sex appear to be driven by diet, while differences by CWD status may be linked to CWD pathogenesis.
Topics: Animals; Deer; Enzyme-Linked Immunosorbent Assay; Female; Gastrointestinal Microbiome; Male; Prions; RNA, Ribosomal, 16S; Wasting Disease, Chronic
PubMed: 34168170
DOI: 10.1038/s41598-021-89896-9
- Journal of Biomedical Semantics Sep 2018
BACKGROUND
Biomedical knowledge graphs have become important tools to computationally analyse the comprehensive body of biomedical knowledge. They represent knowledge as subject-predicate-object triples, in which the predicate indicates the relationship between subject and object. A triple can also contain provenance information, which consists of references to the sources of the triple (e.g. scientific publications or database entries). Knowledge graphs have been used to classify drug-disease pairs for drug efficacy screening, but existing computational methods have often ignored predicate and provenance information. Using this information, we aimed to develop a supervised machine learning classifier and determine the added value of predicate and provenance information for drug efficacy screening. To ensure the biological plausibility of our method we performed our research on the protein level, where drugs are represented by their drug target proteins, and diseases by their disease proteins.
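The triple representation described above can be sketched as a small data structure. This is an illustration only, not the authors' feature construction. The class, the function and every identifier in the example (`protein:A`, `source:1` and so on) are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """A subject-predicate-object statement with optional provenance."""
    subject: str
    predicate: str
    obj: str
    provenance: tuple = ()  # references to sources (publications, database entries)


def edge_features(triples, subject, obj):
    """Summarize the predicate and provenance information linking two nodes.

    Returns a dict of predicate counts and the number of distinct sources
    supporting any triple from `subject` to `obj`.
    """
    predicate_counts = {}
    sources = set()
    for triple in triples:
        if triple.subject == subject and triple.obj == obj:
            predicate_counts[triple.predicate] = predicate_counts.get(triple.predicate, 0) + 1
            sources.update(triple.provenance)
    return predicate_counts, len(sources)


if __name__ == "__main__":
    # Hypothetical graph fragment, for illustration only.
    graph = [
        Triple("protein:A", "interacts_with", "protein:B", ("source:1", "source:2")),
        Triple("protein:A", "inhibits", "protein:B", ("source:2",)),
    ]
    print(edge_features(graph, "protein:A", "protein:B"))
```

Counts like these are one simple way predicate and provenance information could be turned into numeric inputs for a supervised classifier.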
RESULTS
Using random forests with repeated 10-fold cross-validation, our method achieved an area under the ROC curve (AUC) of 78.1% and 74.3% for two reference sets. We benchmarked against a state-of-the-art knowledge-graph technique that does not use predicate and provenance information, obtaining AUCs of 65.6% and 64.6%, respectively. Classifiers that used only predicate information outperformed classifiers that used only provenance information, but using both performed best.
CONCLUSION
We conclude that both predicate and provenance information provide added value for drug efficacy screening.
Topics: Biological Ontologies; Computer Graphics; Drug Evaluation, Preclinical; False Negative Reactions; ROC Curve
PubMed: 30189889
DOI: 10.1186/s13326-018-0189-6
- Neuroinformatics Jan 2022
Results of computational analyses require transparent disclosure of their supporting resources, while the analyses themselves often can be very large scale and involve multiple processing steps separated in time. Evidence for the correctness of any analysis should include not only a textual description, but also a formal record of the computations which produced the result, including accessible data and software with runtime parameters, environment, and personnel involved. This article describes FAIRSCAPE, a reusable computational framework, enabling simplified access to modern scalable cloud-based components. FAIRSCAPE fully implements the FAIR data principles and extends them to provide fully FAIR Evidence, including machine-interpretable provenance of datasets, software and computations, as metadata for all computed results. The FAIRSCAPE microservices framework creates a complete Evidence Graph for every computational result, including persistent identifiers with metadata, resolvable to the software, computations, and datasets used in the computation; and stores a URI to the root of the graph in the result's metadata. An ontology for Evidence Graphs, EVI ( https://w3id.org/EVI ), supports inferential reasoning over the evidence. FAIRSCAPE can run nested or disjoint workflows and preserves provenance across them. It can run Apache Spark jobs, scripts, workflows, or user-supplied containers. All objects are assigned persistent IDs, including software. All results are annotated with FAIR metadata using the evidence graph model for access, validation, reproducibility, and re-use of archived data and software.
Topics: Metadata; Reproducibility of Results; Software; Workflow
PubMed: 34264488
DOI: 10.1007/s12021-021-09529-4
- PloS One 2024
This study, conducted in China in November 2020, aimed to explore the variation in growth traits among different provenances and families of Juglans mandshurica and to select elite materials. Seeds of 44 families from six J. mandshurica provenances in Heilongjiang and Jilin provinces were sown in the nursery and then transplanted to the field. At 5 years of age, seven growth traits were assessed, a comprehensive analysis was conducted, and provenances and families were selected. Analysis of variance revealed statistically significant (P < 0.01) differences in the seven growth traits among provenances and families, justifying further breeding efforts. The genetic coefficient of variation (GCV) ranged from 5.44% (branch angle) to 21.95% (tree height), whereas the phenotypic coefficient of variation (PCV) ranged from 13.74% (tapering) to 38.50% (branch number per node), indicating considerable variability across the traits. All the studied traits except stem straightness degree, branch angle and branch number per node showed high heritability (tree height, ground diameter, mean crown width and tapering: over 0.7±0.073), indicating that the variation in these traits is primarily driven by genetic factors. Correlation analysis revealed strong positive correlations (r > 0.8) between tree height and ground diameter (r = 0.86), tree height and mean crown width (r = 0.82), and ground diameter and mean crown width (r = 0.83). This suggests that these relationships can be used for more precise predictions of the growth and morphological characteristics of trees, as well as for the selection of superior materials. There was a strong correlation between temperature factors and growth traits. Based on the comprehensive scores in this study, Sanchazi was selected as the elite provenance. Using the top-percentile selection criteria, SC1, SC8, DJC15 and DQ18 were selected as elite families.
These selected families exhibit genetic gains of over 10% in tree height, ground diameter and mean crown width, signifying their significant potential in forestry for enhancing timber production and reducing production cycles, thereby contributing to sustainable forest management. In this study, the growth traits of J. mandshurica were found to exhibit stable variation, and there were correlations between these traits. The selected elite provenance and families of J. mandshurica showed faster growth, which is advantageous for the subsequent breeding and promotion of improved J. mandshurica varieties.
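The variation statistics reported in this abstract follow standard quantitative-genetics definitions. The sketch below uses the common textbook forms, in which GCV and PCV are the square roots of the genetic and phenotypic variances expressed as a percentage of the trait mean, and heritability is the ratio of the two variances. The abstract does not state which estimators the authors used, so the formulas, function names and example numbers here are assumptions for illustration.

```python
import math


def coefficient_of_variation(variance: float, trait_mean: float) -> float:
    """Return 100 * sqrt(variance) / mean, in percent.

    With the genetic variance this gives GCV. With the phenotypic
    variance it gives PCV.
    """
    return 100.0 * math.sqrt(variance) / trait_mean


def heritability(genetic_variance: float, phenotypic_variance: float) -> float:
    """Return the share of phenotypic variance attributable to genetic variance."""
    return genetic_variance / phenotypic_variance


if __name__ == "__main__":
    # Hypothetical variance components for one trait (not data from the study).
    var_genetic, var_phenotypic, mean_value = 0.30, 0.40, 3.2
    print(f"GCV = {coefficient_of_variation(var_genetic, mean_value):.2f}%")
    print(f"PCV = {coefficient_of_variation(var_phenotypic, mean_value):.2f}%")
    print(f"h2  = {heritability(var_genetic, var_phenotypic):.2f}")
```

Under these definitions PCV is never smaller than GCV for a given trait, because the phenotypic variance includes the genetic variance plus environmental and error components.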
Topics: Juglans; Plant Breeding; Trees; Forests; China
PubMed: 38451964
DOI: 10.1371/journal.pone.0298918