-
ACS Omega Feb 2022
Organic matter (OM) is the material basis for hydrocarbon generation. Its type and abundance determine the hydrocarbon generation ability of source rocks, which is closely related to the provenance and sedimentary environment of source rocks. The tectonic backgrounds of the eastern and western subsags (ESS and WSS) of the Lishui Sag in the East China Sea Shelf Basin are significantly different and their influence on the OM in the source rocks is worthy of attention. This paper comprehensively analyzes the provenance and environmental characteristics and their influence on the features of the OM of Paleocene source rocks in the ESS and WSS. The study finds that the source rocks in the ESS have multiple sources. During the deposition period, as the salinity and paleoproductivity of the water column increased, the proportion of OM in the autochthonous components of the water column continued to increase, but the overall water column was in an oxidizing environment, resulting in a generally low abundance of OM. The provenance of the source rocks in the WSS was relatively simple and terrestrial. Also, the sedimentary environment had little effect on the type of OM. However, the whole water column of the WSS was in an anoxic environment, so the OM was better preserved, resulting in a higher abundance. Due to the influence of provenance and the sedimentary environment in different areas of the sag, the characteristics of OM in the source rocks are different, so relevant exploration strategies need to be adopted in actual exploration.
PubMed: 35224339
DOI: 10.1021/acsomega.1c05764
-
Clinical Pharmacology and Therapeutics Apr 2020
The increasing volume and complexity of data now being captured across multiple settings and devices offers the opportunity to deliver a better characterization of diseases, treatments, and the performance of medicinal products in individual healthcare systems. Such data sources, commonly labeled as big data, are generally large, accumulating rapidly, and incorporate multiple data types and forms. Determining the acceptability of these data to support regulatory decisions demands an understanding of data provenance and quality in addition to confirming the validity of new approaches and methods for processing and analyzing these data. The Heads of Agencies and the European Medicines Agency Joint Big Data Taskforce was established to consider these issues from the regulatory perspective. This review reflects the thinking from its first phase and describes the big data landscape from a regulatory perspective and the challenges to be addressed in order that regulators can know when and how to have confidence in the evidence generated from big datasets.
Topics: Big Data; Data Science; Drug and Narcotic Control; Humans
PubMed: 31846513
DOI: 10.1002/cpt.1736
-
Bioinformatics (Oxford, England) Jun 2021
MOTIVATION
Reproducibility is of central importance to the scientific process. The difficulty of consistently replicating and verifying experimental results is magnified in the era of big data, in which bioinformatics analysis often involves complex multi-application pipelines operating on terabytes of data. These processes result in thousands of possible permutations of data preparation steps, software versions and command-line arguments. Existing reproducibility frameworks are cumbersome and involve redesigning computational methods. To address these issues, we developed RepeatFS, a file system that records, replicates and verifies informatics workflows with no alteration to the original methods. RepeatFS also provides several other features to help promote analytical transparency and reproducibility, including provenance visualization and task automation.
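The record, replicate, and verify idea described above can be illustrated with a minimal sketch. This is not RepeatFS and does not use its API: the function names, the record layout, and the toy `uppercase_step` are all hypothetical, and the "software version" is reduced to the Python interpreter version for brevity.

```python
import hashlib
import platform
import tempfile
from pathlib import Path
from typing import Callable


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def run_and_record(step: Callable[[Path, Path], None], src: Path, dst: Path) -> dict:
    """Run one processing step and return a provenance record (hypothetical layout)."""
    step(src, dst)
    return {
        "step": step.__name__,
        "python_version": platform.python_version(),
        "input_sha256": sha256_of(src),
        "output_sha256": sha256_of(dst),
    }


def verify_replication(step: Callable[[Path, Path], None], src: Path, dst: Path, record: dict) -> list:
    """Re-run the step and list every way the rerun differs from the record."""
    step(src, dst)
    problems = []
    if platform.python_version() != record["python_version"]:
        problems.append("software version differs")
    if sha256_of(src) != record["input_sha256"]:
        problems.append("input differs")
    if sha256_of(dst) != record["output_sha256"]:
        problems.append("output differs")
    return problems


def uppercase_step(src: Path, dst: Path) -> None:
    """Toy stand-in for an analysis tool: upper-case the input bytes."""
    dst.write_bytes(src.read_bytes().upper())


with tempfile.TemporaryDirectory() as tmp:
    src, dst = Path(tmp) / "reads.txt", Path(tmp) / "reads.upper.txt"
    src.write_bytes(b"acgt\n")
    record = run_and_record(uppercase_step, src, dst)
    problems = verify_replication(uppercase_step, src, dst, record)
    print(record)
    print("replication problems:", problems)
```

An empty problem list means the rerun reproduced the recorded output byte for byte; a version or checksum mismatch would be reported by name, which is the kind of inconsistency the abstract says was identified.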
RESULTS
We used RepeatFS to successfully visualize and replicate a variety of bioinformatics tasks consisting of over a million operations with no alteration to the original methods. RepeatFS correctly identified all software inconsistencies that resulted in replication differences.
AVAILABILITY AND IMPLEMENTATION
RepeatFS is implemented in Python 3. Its source code and documentation are available at https://github.com/ToniWestbrook/repeatfs.
SUPPLEMENTARY INFORMATION
Supplementary data are available at Bioinformatics online.
Topics: Automation; Computational Biology; Reproducibility of Results; Software; Workflow
PubMed: 33230554
DOI: 10.1093/bioinformatics/btaa950
-
Frontiers in Plant Science 2023
has the characteristics of rapid growth and high resistance. However, there is little research on molecular breeding of , which is essential to shortening breeding life and selecting quality varieties. Therefore, a crucial step before selective breeding can be carried out to increase the wood quality of is identifying genetic diversity and population structure using single nucleotide polymorphism (SNP) markers. In this study, the genetic diversity of 1 generation 196 families from 23 geographically defined was assessed using 1,677,732 SNP markers identified by whole genome resequencing. SNP annotation showed that the ratio of non-synonymous to synonymous coding mutations was 0.83. Principal component analysis (PCA), phylogenetic tree, and population structure analysis permitted the families to be categorized into three groups, one of which (G2) contains most of the Indonesian (IDN) and Papua New Guinea (PNG) families. Genetic relationship analysis showed that IDN was closely related to PNG. Genetic diversity analysis showed that He, PIC, I, and H mean values were 0.2502, 0.2027, 0.3815, and 0.2680, respectively. PCA analysis classified various provenances in QLD into two categories (G1 and G3). The genetic diversity of G3 was higher than that of G2. The results of genetic differentiation (Fst) showed that PNG region was divided into two groups (PNG1 and PNG2), the Fst (0.172) between QLD and PNG2 region was higher than QLD and PNG1, and the Fst (0.024) between IDN and PNG1 is smaller than IDN and PNG2. A Mantel test revealed a positive correlation between the genetic and geographic distance of . This study has a certain reference value for genetic identification, germplasm preservation, and breeding of . Also, it provides a basis for subsequent association analysis to explore excellent alleles and introduction.
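For readers unfamiliar with the diversity statistics named above, the following sketch computes expected heterozygosity (He), polymorphism information content (PIC), and Shannon's information index (I) for a single locus using their standard textbook definitions. It is an illustration only: the abstract does not say which software or exact formulas the authors used, and the allele frequencies below are made up.

```python
import math


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) for the allele frequencies at one locus."""
    return 1.0 - sum(p * p for p in freqs)


def polymorphism_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    cross = 0.0
    for i in range(len(freqs)):
        for j in range(i + 1, len(freqs)):
            cross += 2.0 * freqs[i] ** 2 * freqs[j] ** 2
    return expected_heterozygosity(freqs) - cross


def shannon_information_index(freqs):
    """I = -sum(p_i * ln p_i), skipping alleles with zero frequency."""
    return sum(p * math.log(1.0 / p) for p in freqs if p > 0)


# Hypothetical biallelic SNP with allele frequencies 0.8 and 0.2.
snp = [0.8, 0.2]
print(expected_heterozygosity(snp))
print(polymorphism_information_content(snp))
print(shannon_information_index(snp))
```

Genome-wide values such as those reported in the abstract would typically be means of per-locus values like these across all markers.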
PubMed: 38162312
DOI: 10.3389/fpls.2023.1278427
-
Progress in Biophysics and Molecular... Jan 2024
One of the foundational principles of recent developments in evolutionary biology has been the acknowledgement of homeostasis as an organising principle of cellular development from unicellular origins. Fundamentally, this concerns the balance between the inside of a biological entity and its environment. Given that the organ of balance is the ear, and that the evolutionary provenance of the vestibular system can be traced back to fish, music provides a rich foundation for evolutionary biological inquiry. This paper considers a specific dimensional relationship in sonic experience between noise, signal, redundancy and anticipation. Drawing on the physics of Bohm and more recent developments in Rowlands's nilpotent quantum mechanics, I argue that the relationship between these four parameters is not only that they represent aspects of sonic experience, but that they are dimensionally distinct, where noise can be considered to be scalar, a signal (or a note) is a vector (having magnitude and direction), redundancy is bi-vectorial (involving degrees of repetition of signals over time), and anticipation is tri-vectorial (involving reflexive consideration of different orders of redundancy). In outlining the dimensional distinction between these variables, an analysis is presented which considers the relationship between the Shannon entropy of different dimensions in music. This shows that the entropy of noise has a particular bearing on the entropy of the other dimensions. This dimensional relation is also reflected in biological evidence, where Torday has shown there to be a direct correlation between the effect of gravitational "noise" on cellular communication, and by extension the evolution of consciousness.
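As a rough illustration of what comparing Shannon entropy across levels of a musical sequence can look like, the sketch below computes the entropy of single notes and of successive note pairs. The mapping of these two levels onto the paper's "signal" and "redundancy" dimensions is an assumption made here for illustration; the abstract does not specify how the analysis was carried out, and the melody is invented.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Shannon entropy in bits: H = sum(p * log2(1 / p)) over observed symbols."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical melody as a list of note names.
melody = ["C", "D", "E", "C", "D", "E", "G", "E", "D", "C"]

# First level: distribution of individual notes.
note_entropy = shannon_entropy(melody)

# Second level: distribution of successive note pairs (repetition over time).
pairs = list(zip(melody, melody[1:]))
pair_entropy = shannon_entropy(pairs)

print(note_entropy, pair_entropy)
```

A sequence that repeats one note has zero entropy at both levels, while a sequence with many distinct transitions has higher pair entropy than note entropy.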
Topics: Animals; Music; Cell Communication; Homeostasis; Consciousness; Cell Differentiation
PubMed: 38103652
DOI: 10.1016/j.pbiomolbio.2023.11.006
-
Frontiers in Plant Science 2023
is an orchid with medicinal and nutritional properties that has received increasing attention because of its health benefits; however, there is limited information about the metabolic basis of these properties. In this report, secondary metabolites and the antioxidant activity of stem samples from three provenances were analyzed, using a UHPLC-QqQ-MS/MS-based metabolomics approach. In total, 411 metabolites were identified including 8 categories such as flavonoids and phenolic acids, 136 of which were differential metabolites. These differentially accumulated metabolites (DAMs) were mainly enriched in secondary metabolic pathways such as flavone, flavonol, tropane, piperidine, pyridine, isoquinoline alkaloid biosynthesis and tyrosine metabolism. The metabolomic profiling suggested that the quantity and content of flavonoid compounds accounted for the highest proportion of total metabolites. Hierarchical cluster analysis (HCA) showed that the marker metabolites of from the three provenances were mainly flavonoids, alkaloids and phenolic acids. Correlation analysis identified that 48 differential metabolites showed a significant positive correlation with antioxidant capacity (r ≥ 0.8 and p < 0.0092), and flavonoids were the main factors affecting the different antioxidant activities. It is worth noting that quercetin-3-O-sophoroside-7-O-rhamnoside and dihydropinosylvin methyl ether might be the main compounds causing the differences in antioxidant capacity of Yunnan provenance (YN), Zhejiang provenance (ZJ), and Guizhou provenance (GZ). These findings provide valuable information for screening varieties, quality control and product development of .
PubMed: 36760636
DOI: 10.3389/fpls.2023.1060242
-
JAMA Internal Medicine Jan 2024
IMPORTANCE
Both the commercial sector and academia play a vital role in medicine development. Ongoing debates exist on their contribution and the value of medicinal products entering the market.
OBJECTIVE
To identify the provenance and clinical benefit of medicines that entered the French market between 2008 and 2018.
DESIGN AND SETTING
In this cross-sectional study, the provenance of each medicine in the French market was established via a review of multiple sources documenting at least 2 matching findings per product. The clinical benefit was assigned using the matched scale developed from the Prescrire and Haute Autorité de Santé (HAS) gradings. The χ² test was used to analyze the proportions and frequencies of medicines graded by Prescrire and HAS by origin, therapeutic category, and clinical benefit.
MAIN OUTCOMES AND MEASURES
The origins and therapeutic categories of medicines. Clinical benefit based on Prescrire and HAS grading. Concordance of Prescrire and HAS grading.
RESULTS
Of the 632 medicines that entered the French market between 2008 and 2018, 464 originated (73%) in the commercial sector, and 168 originated (27%) in the academic setting or in collaboration with commercial enterprises. Prescrire graded psychotropic agents (13/14 [93%]), whereas HAS graded respiratory agents (24/25 [96%]) as the highest percentage of medicines that provided no added benefit. Prescrire graded 360 medicines (77.6%) that originated in the industry and 108 medicines (64.3%) that originated in the academic setting (P = .001) to have no added clinical benefit. HAS assigned such grading to 331 ([71.3%] industry) vs 104 ([61.9%] academia) (P = .02). Based on the Prescrire grading, academia invented more medicines delivering some added benefit 57 (33.9%) vs 98 (21.1%) invented by industry (P = .001). HAS grading on some added benefit 51 ([30.4%] academia) vs 121 ([26.1%] industry) did not reach statistical significance (P = .29). However, HAS grading on substantial added clinical benefit reached statistical significance in favor of academia (13 [7.7%] vs 12 [2.6%] in the industry; P = .003), whereas Prescrire grading did not (1.8% academia vs 1.3% industry; P = .64).
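The Prescrire comparison above (360 of 464 industry medicines vs 108 of 168 academic medicines graded as no added benefit) can be checked with a Pearson χ² test on the 2×2 table. The sketch below is a worked example using only the counts stated in the abstract; whether the authors applied a continuity correction is not stated, so the uncorrected form used here is an assumption.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and the P value for 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Rows: industry, academia. Columns: no added benefit, any other grading.
industry_none, industry_total = 360, 464
academia_none, academia_total = 108, 168

chi2, p_value = chi_square_2x2(
    industry_none, industry_total - industry_none,
    academia_none, academia_total - academia_none,
)
print(f"chi-square = {chi2:.2f}, P = {p_value:.4f}")
```

Under these assumptions the statistic is roughly 11.4 with a P value that rounds to the reported .001.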
CONCLUSIONS AND RELEVANCE
More than 70% of medicines that entered the French market during the 10-year period originated in the commercial sector. Although most medicines were not graded as providing clinical benefit, medicines originating in the academic setting were more likely to be graded as conferring clinical benefit than those originating in the commercial setting.
Topics: Humans; Cross-Sectional Studies; Drug Industry; Commerce; Pharmaceutical Preparations
PubMed: 37983026
DOI: 10.1001/jamainternmed.2023.6249
-
Plants (Basel, Switzerland) Nov 2023
(Lour.) Pers. is an important woody spice tree in southern China, and its fruit is a rich source of valuable essential oil. We surveyed and sampled germplasm resources from 36 provenances in nine Chinese provinces, and detected rich phenotypic diversity. The survey results showed that plants of SC-KJ, SC-HJ, and SC-LS provenance presented higher leaf area (LA); YN-SM and YN-XC plants had larger thousand-grain fresh weight (TFW); and HN-DX plants had the highest essential oil content (EOC). To explain the large differences in the phenotypes of among different habitats, we used Pearson's correlation analysis, multiple stepwise regression path analysis, and redundancy analysis to evaluate the phenotypic diversity of . It was found that compared to other traits, leaf and fruit traits had more significant geographical distributions, and that leaf phenotypes were correlated to fruit phenotypes. The results showed that elevation, latitude, longitude, total soil porosity (SP), soil bulk density (SBD), and average annual rainfall (AAR, mm) contributed significantly to the phenotypic diversity of . Geographical factors explained a higher percentage of variation in phenotypic diversity than did soil factors and climate factors. Plants of SC-KJ and HN-DX provenances could be important resources for domestication and breeding to develop new high-yielding varieties of this woody aromatic plant. This study describes significant phenotypic differences in related to adaptation to different environments, and provides a theoretical basis for the development of a breeding strategy and for optimizing cultivation.
PubMed: 37960137
DOI: 10.3390/plants12213781
-
Frontiers in Big Data 2022
Review
Data lakes are a fundamental building block for many industrial data analysis solutions and are becoming increasingly popular in research. Often associated with big data use cases, data lakes are, for example, used as central data management systems of research institutions or as the core entity of machine learning pipelines. The basic underlying idea of retaining data in its native format within a data lake facilitates a large range of use cases and improves data reusability, especially when compared to the schema-on-write approach applied in data warehouses, where data is transformed prior to the actual storage to fit a predefined schema. Storing such massive amounts of raw data, however, has its own challenges, ranging from general data modeling and indexing for concise querying to the integration of suitable and scalable compute capabilities. In this contribution, influential papers of the last decade have been selected to provide a comprehensive overview of developments and obtained results. The papers are analyzed with regard to the applicability of their input to data lakes that serve as central data management systems of research institutions. To achieve this, contributions to data lake architectures, metadata models, data provenance, workflow support, and FAIR principles are investigated. Last, but not least, these capabilities are mapped onto the requirements of two common research personae to identify open challenges. With that, potential research topics are determined, which have to be tackled toward the applicability of data lakes as central building blocks for research data management.
PubMed: 36072823
DOI: 10.3389/fdata.2022.945720
-
GigaScience Nov 2019
BACKGROUND
The automation of data analysis in the form of scientific workflows has become a widely adopted practice in many fields of research. Computationally driven data-intensive experiments using workflows enable automation, scaling, adaptation, and provenance support. However, there are still several challenges associated with the effective sharing, publication, and reproducibility of such workflows due to the incomplete capture of provenance and lack of interoperability between different technical (software) platforms.
RESULTS
Based on best-practice recommendations identified from the literature on workflow design, sharing, and publishing, we define a hierarchical provenance framework to achieve uniformity in provenance and support comprehensive and fully re-executable workflows equipped with domain-specific information. To realize this framework, we present CWLProv, a standard-based format to represent any workflow-based computational analysis to produce workflow output artefacts that satisfy the various levels of provenance. We use open source community-driven standards, interoperable workflow definitions in Common Workflow Language (CWL), structured provenance representation using the W3C PROV model, and resource aggregation and sharing as workflow-centric research objects generated along with the final outputs of a given workflow enactment. We demonstrate the utility of this approach through a practical implementation of CWLProv and evaluation using real-life genomic workflows developed by independent groups.
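To give a feel for structured provenance in the W3C PROV style, the sketch below builds a small dictionary whose layout loosely follows PROV-JSON (entities, activities, `used` and `wasGeneratedBy` relations) and traces an output back to its inputs. It is not CWLProv output and does not reproduce its research object structure; the `ex:` namespace, the identifiers, and both helper functions are hypothetical.

```python
import json


def build_prov_document(step_id, input_ids, output_ids):
    """Assemble a small PROV-JSON-style dictionary for one workflow step."""
    doc = {
        "prefix": {"ex": "http://example.org/"},  # hypothetical namespace
        "entity": {},
        "activity": {step_id: {}},
        "used": {},
        "wasGeneratedBy": {},
    }
    for n, entity_id in enumerate(input_ids, start=1):
        doc["entity"][entity_id] = {}
        doc["used"][f"_:use{n}"] = {"prov:activity": step_id, "prov:entity": entity_id}
    for n, entity_id in enumerate(output_ids, start=1):
        doc["entity"][entity_id] = {}
        doc["wasGeneratedBy"][f"_:gen{n}"] = {"prov:entity": entity_id, "prov:activity": step_id}
    return doc


def inputs_behind(doc, output_id):
    """Trace an output back to the inputs used by the activity that generated it."""
    activities = {
        rel["prov:activity"]
        for rel in doc["wasGeneratedBy"].values()
        if rel["prov:entity"] == output_id
    }
    return sorted(
        rel["prov:entity"]
        for rel in doc["used"].values()
        if rel["prov:activity"] in activities
    )


prov_doc = build_prov_document(
    "ex:align_step",
    ["ex:reads.fastq", "ex:reference.fa"],
    ["ex:aligned.bam"],
)
print(json.dumps(prov_doc, indent=2))
print(inputs_behind(prov_doc, "ex:aligned.bam"))
```

Retrospective provenance of this kind is what lets a reader ask which inputs and which step produced a given output file.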
CONCLUSIONS
The underlying principles of the standards utilized by CWLProv enable semantically rich and executable research objects that capture computational workflows with retrospective provenance such that any platform supporting CWL will be able to understand the analysis, reuse the methods for partial reruns, or reproduce the analysis to validate the published findings.
Topics: Genomics; Humans; Models, Theoretical; Software; Workflow
PubMed: 31675414
DOI: 10.1093/gigascience/giz095