MSphere Jun 2024
UNLABELLED
The gut microbiome has the potential to buffer temporal variations in resource availability and consumption, which may play a key role in the ability of animals to adapt to a broad range of habitats. We investigated the temporal composition and function of the gut microbiomes of wild common marmosets (Callithrix jacchus) exploiting a hot, dry environment, the Caatinga, in northeastern Brazil. We collected fecal samples during two time periods (July-August and February-March) for 2 years from marmosets belonging to eight social groups. We used 16S rRNA gene amplicon sequencing, metagenomic sequencing, and butyrate RT-qPCR to assess changes in the composition and potential function of their gut microbiomes. Additionally, we identified the plant, invertebrate, and vertebrate components of the marmosets' diet via DNA metabarcoding. Invertebrate, but not plant or vertebrate, consumption varied across the year. However, gut microbiome composition and potential function did not markedly vary across study periods or as a function of diet composition. Instead, the gut microbiome differed markedly in both composition and potential function across marmosets residing in different social groups. We highlight the likely role of factors, such as behavior, residence, and environmental heterogeneity, in modulating the structure of the gut microbiome.
IMPORTANCE
In a highly socially cohesive and cooperative primate, group membership more strongly predicts gut microbiome composition and function than diet.
PubMed: 38940510
DOI: 10.1128/msphere.00233-24

Bioinformatics (Oxford, England) Jun 2024
MOTIVATION
The World Health Organization estimates that there were over 10 million cases of tuberculosis (TB) worldwide in 2019, resulting in over 1.4 million deaths, with a worrisome increasing trend yearly. The disease is caused by Mycobacterium tuberculosis (MTB) through airborne transmission. Treatment of TB is estimated to be 85% successful; however, this drops to 57% if MTB exhibits multiple antimicrobial resistance (AMR), for which fewer treatment options are available.
RESULTS
We develop a robust machine-learning classifier using both linear and nonlinear models (i.e. LASSO logistic regression (LR) and random forests (RF)) to predict the phenotypic resistance of Mycobacterium tuberculosis (MTB) for a broad range of antibiotic drugs. We use data from the CRyPTIC consortium to train our classifier, which consists of whole genome sequencing and antibiotic susceptibility testing (AST) phenotypic data for 13 different antibiotics. To train our model, we assemble the sequence data into genomic contigs, identify all unique 31-mers in the set of contigs, and build a feature matrix M, where M[i, j] is equal to the number of times the ith 31-mer occurs in the jth genome. Due to the size of this feature matrix (over 350 million unique 31-mers), we build and use a sparse matrix representation. Our method, which we refer to as MTB++, leverages compact data structures and iterative methods to allow for the screening of all the 31-mers in the development of both LASSO LR and RF. MTB++ is able to achieve high discrimination (F-1 >80%) for the first-line antibiotics. Moreover, MTB++ had the highest F-1 score in all but three classes and was the most comprehensive since it had an F-1 score >75% in all but four (rare) antibiotic drugs. We use our feature selection to contextualize the 31-mers that are used for the prediction of phenotypic resistance, leading to some insights about sequence similarity to genes in MEGARes. Lastly, we give an estimate of the amount of data that is needed in order to provide accurate predictions.
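The feature matrix described above can be illustrated with a minimal sketch. It is not the MTB++ implementation: the function names, the toy genomes, and the dictionary-based sparse layout are assumptions chosen for readability, and k is set to 3 rather than 31. Reverse complements and the compact data structures mentioned in the abstract are not handled here.

```python
from collections import Counter


def kmer_counts(sequence, k):
    """Count every overlapping k-mer in one sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def build_sparse_matrix(genomes, k):
    """Return (kmer_index, entries).

    kmer_index maps each unique k-mer to a row number i.
    entries[(i, j)] is the number of times k-mer i occurs in genome j.
    Zero counts are not stored, which is what makes the layout sparse.
    """
    kmer_index = {}
    entries = {}
    for j, contigs in enumerate(genomes):
        counts = Counter()
        for contig in contigs:
            counts.update(kmer_counts(contig, k))
        for kmer, n in counts.items():
            # Assign the next free row number the first time a k-mer is seen.
            i = kmer_index.setdefault(kmer, len(kmer_index))
            entries[(i, j)] = n
    return kmer_index, entries


# Hypothetical toy input: two genomes, each given as a list of contigs.
genomes = [["ACGTACG"], ["ACGA", "CGTT"]]
kmer_index, entries = build_sparse_matrix(genomes, k=3)
```

Storing only nonzero (row, column) pairs keeps memory proportional to the number of k-mers actually observed per genome, which is the motivation for a sparse representation when the number of unique 31-mers runs into the hundreds of millions.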
AVAILABILITY
The models and source code are publicly available on Github at https://github.com/M-Serajian/MTB-Pipeline.
Topics: Mycobacterium tuberculosis; Machine Learning; Drug Resistance, Bacterial; Microbial Sensitivity Tests; Anti-Bacterial Agents; Whole Genome Sequencing; Genome, Bacterial; Humans
PubMed: 38940175
DOI: 10.1093/bioinformatics/btae243

Bioinformatics (Oxford, England) Jun 2024
UNLABELLED
Automated protein function prediction is a crucial and widely studied problem in bioinformatics. Computationally, protein function is a multilabel classification problem where only positive samples are defined and there is a large number of unlabeled annotations. Most existing methods rely on the assumption that the unlabeled set of protein function annotations are negatives, inducing the false negative issue, where potential positive samples are trained as negatives. We introduce a novel approach named PU-GO, wherein we address function prediction as a positive-unlabeled ranking problem. We apply empirical risk minimization, i.e. we minimize the classification risk of a classifier where class priors are obtained from the Gene Ontology hierarchical structure. We show that our approach is more robust than other state-of-the-art methods on similarity-based and time-based benchmark datasets.
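To make the positive-unlabeled idea concrete, the sketch below computes a generic non-negative PU risk estimate for one label. It is an illustration under stated assumptions, not the PU-GO ranking loss: the sigmoid surrogate loss, the function names, and passing the class prior in as a plain number (rather than deriving it from the Gene Ontology hierarchy) are all choices made for this example.

```python
import math


def sigmoid_loss(score, label):
    """Sigmoid surrogate loss; close to 0 when sign(score) agrees with label (+1 or -1)."""
    return 1.0 / (1.0 + math.exp(label * score))


def mean(values):
    return sum(values) / len(values)


def nn_pu_risk(pos_scores, unl_scores, prior):
    """Non-negative PU risk for one class.

    pos_scores: classifier scores of known positive samples.
    unl_scores: classifier scores of unlabeled samples.
    prior: assumed fraction of positives among all samples (hypothetical input).
    """
    pos_as_pos = mean([sigmoid_loss(s, +1) for s in pos_scores])
    pos_as_neg = mean([sigmoid_loss(s, -1) for s in pos_scores])
    unl_as_neg = mean([sigmoid_loss(s, -1) for s in unl_scores])
    # Unlabeled data mix positives and negatives, so subtract the expected
    # positive share; clamp at zero so the negative-risk term cannot go negative.
    neg_risk = max(0.0, unl_as_neg - prior * pos_as_neg)
    return prior * pos_as_pos + neg_risk
```

The key difference from treating unlabeled annotations as negatives is the correction term `prior * pos_as_neg`, which discounts the loss contributed by positives hidden in the unlabeled set.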
AVAILABILITY AND IMPLEMENTATION
Data and code are available at https://github.com/bio-ontology-research-group/PU-GO.
Topics: Proteins; Computational Biology; Gene Ontology; Databases, Protein; Algorithms
PubMed: 38940168
DOI: 10.1093/bioinformatics/btae237

Bioinformatics (Oxford, England) Jun 2024
BACKGROUND
Charting cellular trajectories over gene expression is key to understanding dynamic cellular processes and their underlying mechanisms. While advances in single-cell RNA-sequencing technologies and computational methods have pushed forward the recovery of such trajectories, trajectory inference remains a challenge due to the noisy, sparse, and high-dimensional nature of single-cell data. This challenge can be alleviated by increasing either the number of cells sampled along the trajectory (breadth) or the sequencing depth, i.e. the number of reads captured per cell (depth). Generally, these two factors are coupled due to an inherent breadth-depth tradeoff that arises when the sequencing budget is constrained due to financial or technical limitations.
RESULTS
Here we study the optimal allocation of a fixed sequencing budget to optimize the recovery of trajectory attributes. Empirical results reveal that reconstruction accuracy of internal cell structure in expression space scales with the logarithm of either the breadth or depth of sequencing. We additionally observe a power law relationship between the optimal number of sampled cells and the corresponding sequencing budget. For linear trajectories, non-monotonicity in trajectory reconstruction across the breadth-depth tradeoff can impact downstream inference, such as expression pattern analysis along the trajectory. We demonstrate these results for five single-cell RNA-sequencing datasets encompassing differentiation of embryonic stem cells, pancreatic beta cells, hepatoblast and multipotent hematopoietic cells, as well as induced reprogramming of embryonic fibroblasts into neurons. By addressing the challenges of single-cell data, our study offers insights into maximizing the efficiency of cellular trajectory analysis through strategic allocation of sequencing resources.
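The breadth-depth tradeoff can be sketched as a simple enumeration: with a fixed total number of reads, choosing more cells leaves fewer reads per cell. The function name, the budget, and the candidate cell counts below are hypothetical and only illustrate the constraint; they do not reproduce the study's analysis or its optimal allocations.

```python
def allocations(total_reads, cell_counts):
    """Pair each candidate number of cells (breadth) with the reads per cell
    (depth) that a fixed read budget allows."""
    plans = []
    for n_cells in cell_counts:
        depth = total_reads // n_cells
        if depth > 0:  # skip plans that leave no reads per cell
            plans.append((n_cells, depth))
    return plans


# Hypothetical budget of one million reads and three candidate breadths.
budget = 1_000_000
plans = allocations(budget, [100, 1_000, 10_000])
```

Each plan could then be scored by a reconstruction-accuracy measure of one's choosing to locate the best breadth for a given budget.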
Topics: Single-Cell Analysis; Sequence Analysis, RNA; Humans; Animals; High-Throughput Nucleotide Sequencing
PubMed: 38940162
DOI: 10.1093/bioinformatics/btae258

Bioinformatics (Oxford, England) Jun 2024
MOTIVATION
The study of bacterial genome dynamics is vital for understanding the mechanisms underlying microbial adaptation, growth, and their impact on host phenotype. Structural variants (SVs), genomic alterations of 50 base pairs or more, play a pivotal role in driving evolutionary processes and maintaining genomic heterogeneity within bacterial populations. While SV detection in isolate genomes is relatively straightforward, metagenomes present broader challenges due to the absence of clear reference genomes and the presence of mixed strains. In response, our proposed method, rhea, forgoes reference genomes and metagenome-assembled genomes (MAGs) by encompassing all metagenomic samples in a series (time or other metric) into a single co-assembly graph. The log fold change in graph coverage between successive samples is then calculated to call SVs that are thriving or declining.
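The coverage comparison between successive samples can be sketched as follows. This is not rhea's code: the node identifiers, the pseudocount, the base-2 logarithm, and the threshold of 1.0 are assumptions made for illustration, and the graph-structure patterns that the method also relies on are left out.

```python
import math


def log_fold_changes(coverage_by_sample, pseudocount=1.0):
    """coverage_by_sample: ordered list of dicts mapping graph node id -> coverage.

    Returns one dict per pair of successive samples with the log2 fold change
    of each node. The pseudocount (an assumption) avoids division by zero for
    nodes that are absent from one sample.
    """
    changes = []
    for earlier, later in zip(coverage_by_sample, coverage_by_sample[1:]):
        nodes = set(earlier) | set(later)
        changes.append({
            node: math.log2((later.get(node, 0.0) + pseudocount)
                            / (earlier.get(node, 0.0) + pseudocount))
            for node in nodes
        })
    return changes


def classify(lfc, threshold=1.0):
    """Label a node by its log fold change; the threshold is hypothetical."""
    if lfc >= threshold:
        return "thriving"
    if lfc <= -threshold:
        return "declining"
    return "stable"


# Hypothetical coverage of two graph nodes across two successive samples.
series = [{"n1": 3.0, "n2": 7.0}, {"n1": 15.0, "n2": 1.0}]
lfc = log_fold_changes(series)[0]
```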
RESULTS
We show rhea to outperform existing methods for SV and horizontal gene transfer (HGT) detection in two simulated mock metagenomes, particularly as the simulated reads diverge from reference genomes and an increase in strain diversity is incorporated. We additionally demonstrate use cases for rhea on series metagenomic data of environmental and fermented food microbiomes to detect specific sequence alterations between successive time and temperature samples, suggesting host advantage. Our approach leverages previous work in assembly graph structural and coverage patterns to provide versatility in studying SVs across diverse and poorly characterized microbial communities for more comprehensive insights into microbial gene flux.
AVAILABILITY AND IMPLEMENTATION
rhea is open source and available at: https://github.com/treangenlab/rhea.
Topics: Microbiota; Metagenome; Genome, Bacterial; Metagenomics; Gene Transfer, Horizontal; Bacteria; Algorithms
PubMed: 38940156
DOI: 10.1093/bioinformatics/btae224

Bioinformatics (Oxford, England) Jun 2024
MOTIVATION
Insertions and deletions (indels) influence the genetic code in fundamentally distinct ways from substitutions, significantly impacting gene product structure and function. Despite their influence, the evolutionary history of indels is often neglected in phylogenetic tree inference and ancestral sequence reconstruction, hindering efforts to comprehend biological diversity determinants and engineer variants for medical and industrial applications.
RESULTS
We frame determining the optimal history of indel events as a single Mixed-Integer Programming (MIP) problem, across all branch points in a phylogenetic tree adhering to topological constraints, and all sites implied by a given set of aligned, extant sequences. By disentangling the impact on ancestral sequences at each branch point, this approach identifies the minimal indel events that jointly explain the diversity in sequences mapped to the tips of that tree. MIP can recover alternate optimal indel histories, if available. We evaluated MIP for indel inference on a dataset comprising 15 real phylogenetic trees associated with protein families ranging from 165 to 2000 extant sequences, and on 60 synthetic trees at comparable scales of data and reflecting realistic rates of mutation. Across relevant metrics, MIP outperformed alternative parsimony-based approaches and reported the fewest indel events, on par or below their occurrence in synthetic datasets. MIP offers a rational justification for indel patterns in extant sequences; importantly, it uniquely identifies global optima on complex protein data sets without making unrealistic assumptions of independence or evolutionary underpinnings, promising a deeper understanding of molecular evolution and aiding novel protein design.
AVAILABILITY AND IMPLEMENTATION
The implementation is available via GitHub at https://github.com/santule/indelmip.
Topics: Phylogeny; INDEL Mutation; Evolution, Molecular; Algorithms; Computational Biology
PubMed: 38940131
DOI: 10.1093/bioinformatics/btae254

Frontiers in Bioscience (Landmark... Jun 2024
Review
Transcription factors (TFs) are essential proteins regulating gene expression by binding to specific nucleotide sequences upstream of genes. Among TF families, the forkhead box (FOX) proteins, characterized by a conserved DNA-binding domain, play vital roles in various cellular processes, including cancer. The FOXA subfamily, encompassing FOXA1, FOXA2, and FOXA3, stands out for its pivotal role in mammalian development. FOXA1, initially identified in the liver, exhibits diverse expression across multiple organ tissues and plays a critical role in cell proliferation, differentiation, and tumor development. Its structural composition includes transactivation domains and a DNA-binding domain, facilitating its function as a pioneer factor, which is crucial for chromatin interaction and the recruitment of other transcriptional regulators. The involvement of FOXA1 in sex hormone-related tumors underscores its significance in cancer biology. This review provides an overview of the multifaceted roles of FOXA1 in normal development and its implications in the pathogenesis of hormone-related cancers, particularly breast cancer and prostate cancer.
Topics: Humans; Hepatocyte Nuclear Factor 3-alpha; Male; Female; Breast Neoplasms; Prostatic Neoplasms; Gonadal Steroid Hormones; Neoplasms; Animals; Gene Expression Regulation, Neoplastic
PubMed: 38940052
DOI: 10.31083/j.fbl2906225

Frontiers in Bioscience (Landmark... Jun 2024
BACKGROUND
Hormone receptors exert their function through binding with their ligands, which results in cellular signaling activation mediated by genomic or non-genomic mechanisms. The intrinsic molecular communication between the tick and its host comprises an endocrine regulation involving hormones. In the present study, we performed a molecular and in silico analysis of a Membrane Associated Progesterone Receptor in Rhipicephalus microplus (RmMAPRC).
METHODS
The RmMAPRC protein sequence was analyzed with bioinformatics tools, and its structure was characterized by three-dimensional (3D) modeling and molecular docking. A semi-quantitative reverse transcription and polymerase chain reaction (sqRT-PCR) assessed the gene presence and relative expression in tick organs and embryonic cells.
RESULTS
Relative expression in salivary glands, ovaries, and embryonic cells showed overexpression of 3%, 13%, and 24%, respectively. Bioinformatic analysis revealed that RmMAPRC corresponded to a Progesterone Receptor Membrane Component 1 (RmPGRMC1) of ~23.7 kDa, with an N-terminal transmembrane domain and a C-terminal Cytochrome b5-like heme/steroid binding domain. The docking results suggest that RmPGRMC1 could bind to progesterone (P4), some progestins, and P4 antagonists. The phylogenetic reconstruction showed that spp. MAPRC receptors were clustered in a clade that includes , , and (RmMAPRC), and that mammalian and helminth MAPRC receptors clustered in two separate clades away from ticks.
CONCLUSIONS
The presence of RmPGRMC1 highlights the importance of transregulation as a conserved adaptive mechanism that has succeeded for arthropod parasites, making it a target for tick control.
Topics: Animals; Rhipicephalus; Receptors, Progesterone; Progesterone; Cattle; Molecular Docking Simulation; Host-Parasite Interactions; Female; Amino Acid Sequence; Protein Binding; Phylogeny
PubMed: 38940045
DOI: 10.31083/j.fbl2906238

Frontiers in Bioscience (Landmark... Jun 2024
Review
The endoplasmic reticulum (ER) plays an important role in the folding, assembly, and post-translational modification of proteins. ER homeostasis can be disrupted by the accumulation of misfolded proteins, elevated reactive oxygen species (ROS) levels, and abnormal Ca2+ signaling, a condition referred to as ER stress (ERS). Ferroptosis is a distinct form of programmed cell death mediated by iron-dependent phospholipid peroxidation and multiple signaling pathways. Its main characteristics are changes in mitochondrial structure, impairment of glutathione peroxidase 4 (GPX4), and excess accumulation of iron. ROS produced during ferroptosis can interfere with the activity of protein-folding enzymes, leading to the accumulation of large amounts of unfolded proteins and thus causing ERS. Conversely, increased ERS can promote ferroptosis through the accumulation of iron ions and lipid peroxides and the up-regulation of ferroptosis-related genes. At present, studies on the relationship between ferroptosis and ERS are one-sided and lack in-depth analysis of the interaction mechanism. This review explores the molecular mechanism of cross-talk between ferroptosis and ERS and aims to provide new strategies and targets for the treatment of liver diseases.
Topics: Ferroptosis; Humans; Endoplasmic Reticulum Stress; Liver Diseases; Reactive Oxygen Species; Animals; Signal Transduction; Iron; Lipid Peroxidation; Endoplasmic Reticulum
PubMed: 38940044
DOI: 10.31083/j.fbl2906221

Frontiers in Bioscience (Landmark... Jun 2024
BACKGROUND
Existing animal models for testing therapeutics in the skin are limited. Mouse and rat models lack similarity to human skin in structure and wound healing mechanism. Pigs are regarded as the best model with regard to similarity to human skin; however, these studies are expensive and time-consuming, and only small numbers of biologic replicates can be obtained. In addition, local-regional effects of treating closely adjacent wounds with different treatments make assessment of treatment effectiveness difficult in pig models. Therefore, a novel nude mouse model of xenografted porcine hypertrophic scar (HTS) cells was developed here. This model system was developed to test whether supplying hypo-pigmented cells with exogenous alpha melanocyte stimulating hormone (α-MSH) will reverse pigment loss.
METHODS
Dyschromic HTSs were created in red Duroc pigs. Epidermal scar cells (keratinocytes and melanocytes) were derived from regions of hyper-, hypo-, or normally pigmented scar or skin and were cryopreserved. Dermal fibroblasts (DFs) were isolated separately. Excisional wounds were created on nude mice and a grafting dome was placed. DFs were seeded on day 0 and formed a dermis. On day 3, epidermal cells were seeded onto the dermis. The grafting dome was removed on day 7 and hypo-pigmented xenografts were treated with synthetic α-MSH delivered with microneedling. On day 10, the xenografts were excised and saved. Sections were stained using hematoxylin and eosin (H&E) to assess xenograft structure. RNA was isolated and quantitative real-time polymerase chain reaction (qRT-PCR) was performed for the melanogenesis-related genes TYR, TYRP1, and DCT.
RESULTS
The seeding of HTS DFs formed a dermis that is similar in structure and cellularity to HTS dermis from the porcine model. When hyper-, hypo-, and normally-pigmented epidermal cells were seeded, a fully stratified epithelium was formed by day 14. H&E staining and measurement of the epidermis showed the average thickness to be 0.11 ± 0.07 µm vs. 0.06 ± 0.03 µm in normal pig skin. Hypo-pigmented xenografts that were treated with synthetic α-MSH showed increases in pigmentation and had increased gene expression of TYR, TYRP1, and DCT compared to untreated controls (TYR: 2.7 ± 1.1 vs. 0.3 ± 1.1; TYRP1: 2.6 ± 0.6 vs. 0.3 ± 0.7; DCT: 0.7 ± 0.9 vs. 0.3 ± 1-fold change from control; n = 3).
CONCLUSIONS
The developed nude mouse skin xenograft model can be used to study treatments for the skin. The cells that can be xenografted can be derived from patient samples or from pig samples and form a robust dual-skin layer containing epidermis and dermis that is responsive to treatment. Specifically, we found that hypo-pigmented regions of scar can be stimulated to make melanin by synthetic α-MSH.
Topics: Animals; Mice, Nude; Cicatrix, Hypertrophic; Mice; Disease Models, Animal; Swine; alpha-MSH; Humans; Skin; Fibroblasts; Melanocytes; Keratinocytes; Transplantation, Heterologous; Wound Healing; Skin Pigmentation
PubMed: 38940034
DOI: 10.31083/j.fbl2906230