-
Briefings in Bioinformatics, Mar 2022 (Review)
A transcriptome constructed from short-read RNA sequencing (RNA-seq) is an easily attainable proxy catalog of protein-coding genes when genome assembly is unnecessary, expensive or difficult. In the absence of a sequenced genome to guide the reconstruction process, the transcriptome must be assembled de novo using only the information available in the RNA-seq reads. Subsequently, the sequences must be annotated in order to identify sequence-intrinsic and evolutionary features in them (for example, protein-coding regions). Although straightforward at first glance, de novo transcriptome assembly and annotation can quickly prove to be challenging undertakings. In addition to familiarizing themselves with the conceptual and technical intricacies of the tasks at hand and the numerous pre- and post-processing steps involved, those interested must also grapple with an overwhelmingly large choice of tools. The lack of standardized workflows, fast pace of development of new tools and techniques and paucity of authoritative literature have served to exacerbate the difficulty of the task even further. Here, we present a comprehensive overview of de novo transcriptome assembly and annotation. We discuss the procedures involved, including pre- and post-processing steps, and present a compendium of corresponding tools.
Topics: Genome; High-Throughput Nucleotide Sequencing; Molecular Sequence Annotation; Sequence Analysis, RNA; Transcriptome; Workflow
PubMed: 35076693
DOI: 10.1093/bib/bbab563
-
Cell, Feb 2024 (Review)
Methods from artificial intelligence (AI) trained on large datasets of sequences and structures can now "write" proteins with new shapes and molecular functions de novo, without starting from proteins found in nature. In this Perspective, I will discuss the state of the field of de novo protein design at the juncture of physics-based modeling approaches and AI. New protein folds and higher-order assemblies can be designed with considerable experimental success rates, and difficult problems requiring tunable control over protein conformations and precise shape complementarity for molecular recognition are coming into reach. Emerging approaches incorporate engineering principles (tunability, controllability, and modularity) into the design process from the beginning. Exciting frontiers lie in deconstructing cellular functions with de novo proteins and, conversely, constructing synthetic cellular signaling from the ground up. As methods improve, many more challenges are unsolved.
Topics: Artificial Intelligence; Protein Conformation; Proteins; Protein Engineering; Deep Learning
PubMed: 38306980
DOI: 10.1016/j.cell.2023.12.028
-
Nature Reviews. Chemistry, Jan 2022
Natural metalloproteins perform many functions, ranging from sensing to electron transfer and catalysis, in which the position and property of each ligand and metal is dictated by protein structure. De novo protein design aims to define an amino acid sequence that encodes a specific structure and function, providing a critical test of the hypothetical inner workings of (metallo)proteins. To date, de novo metalloproteins have used simple, symmetric tertiary structures, uncomplicated by the large size and evolutionary marks of natural proteins, to interrogate structure-function hypotheses. In this Review, we discuss de novo design applications, such as proteins that induce complex, increasingly asymmetric ligand geometries to achieve function, as well as the use of more canonical ligand geometries to achieve stability. De novo design has been used to explore how proteins fine-tune redox potentials and catalyse both oxidative and hydrolytic reactions. With an increased understanding of structure-function relationships, functional proteins including O₂-dependent oxidases, fast hydrolases, and multi-proton/multi-electron reductases have been created. In addition, proteins can now be designed using xeno-biological metals or cofactors and principles from inorganic chemistry to derive new-to-nature functions. These results and the advances in computational protein design suggest a bright future for the de novo design of diverse, functional metalloproteins.
PubMed: 35811759
DOI: 10.1038/s41570-021-00339-5
-
Genes & Diseases, Nov 2023 (Review)
The de novo nucleotide biosynthetic pathway is a highly conserved and essential biochemical pathway in almost all organisms. Both purine nucleotides and pyrimidine nucleotides are necessary for cell metabolism and proliferation. Thus, the dysregulation of the nucleotide biosynthetic pathway contributes to the development of many human diseases, such as cancer. It has been shown that many enzymes in this pathway are overactivated in different cancers. In this review, we summarize and update the current knowledge on the nucleotide biosynthetic pathway, regulatory mechanisms, its role in tumorigenesis, and potential targeting opportunities.
PubMed: 37554216
DOI: 10.1016/j.gendis.2022.04.018
-
Cancer Science, Feb 2021 (Review)
Recent studies of the cancer genome have identified numerous patients harboring multiple mutations (MM) within individual oncogenes. These MM (de novo MM) in cis synergistically activate the mutated oncogene and promote tumorigenesis, indicating a positive epistatic interaction between mutations. The relatively frequent de novo MM suggest that intramolecular positive epistasis is widespread in oncogenes. Studies also suggest that negative and higher-order epistasis affects de novo MM. Comparison of de novo MM and MM associated with drug-resistant secondary mutations (secondary MM) revealed several similarities with respect to allelic configuration, mutational selection and functionality of individual mutations. Conversely, they have several differences, most notably the difference in drug sensitivities. Secondary MM usually confer resistance to molecularly targeted therapies, whereas several de novo MM are associated with increased sensitivity, implying that both can be useful as therapeutic biomarkers. Unlike secondary MM in which specific secondary resistant mutations are selected, minor (infrequent) functionally weak mutations are convergently selected in de novo MM, which may provide an explanation as to why such mutations accumulate in cancer. The third type of MM is MM from different subclones. This type of MM is associated with parallel evolution, which may contribute to relapse and treatment failure. Collectively, MM within individual oncogenes are diverse, but all types of MM are associated with cancer evolution and therapeutic response. Further evaluation of oncogenic MM is warranted to gain a deeper understanding of cancer genetics and evolution.
Topics: Carcinogenesis; Humans; Mutation; Neoplasms; Oncogenes
PubMed: 33073435
DOI: 10.1111/cas.14699
-
Drug Discovery Today, Nov 2021 (Review)
Molecular design strategies are integral to therapeutic progress in drug discovery. Computational approaches for de novo molecular design have been developed over the past three decades and, recently, thanks in part to advances in machine learning (ML) and artificial intelligence (AI), the drug discovery field has gained practical experience. Here, we review these learnings and present de novo approaches according to the coarseness of their molecular representation: that is, whether molecular design is modeled on an atom-based, fragment-based, or reaction-based paradigm. Furthermore, we emphasize the value of strong benchmarks, describe the main challenges to using these methods in practice, and provide a viewpoint on further opportunities for exploration and challenges to be tackled in the upcoming years.
Topics: Artificial Intelligence; Computer Simulation; Drug Design; Drug Discovery; Drug Evaluation, Preclinical; Humans; Machine Learning; Workflow
PubMed: 34082136
DOI: 10.1016/j.drudis.2021.05.019
-
Epigenomes, Aug 2022 (Review)
Every cell of an organism shares the same genome; even so, each cellular lineage owns a different transcriptome and proteome. The Polycomb group proteins (PcG) are essential regulators of gene repression patterning during development and homeostasis. However, it is unknown how the repressive complexes, PRC1 and PRC2, identify their targets and elicit new Polycomb domains during cell differentiation. Classical recruitment models consider the pre-existence of repressive histone marks; still, target binding overcomes the absence of both H3K27me3 and H2AK119ub. The CpG islands (CGIs), non-core proteins, and RNA molecules are involved in Polycomb recruitment. Nonetheless, it is unclear how targets are identified depending on the physiological context and developmental stage and which are the leading players stabilizing Polycomb complexes at domain nucleation sites. Here, we examine the features of sites and the accessory elements bridging its recruitment and discuss the first steps of Polycomb domain formation and transcriptional regulation, comprehended by the experimental reconstruction of the repressive domains through time-resolved genomic analyses in mammals.
PubMed: 35997371
DOI: 10.3390/epigenomes6030025
-
The New Phytologist, Jun 2018 (Review)
Contents: Summary; I. Introduction; II. Regeneration-initial cell: the origin of regeneration; III. Acquiring regeneration competency: the essential intermediate step for hormone-induced regeneration; IV. Hormonal induction of stem cell regulators: the program for de novo establishment of apical meristems; V. Conclusions and perspectives; Acknowledgements; Author contributions; References.
Summary: High cellular plasticity confers remarkable regeneration capacity to plants. Based on the activity of stem cells and their regulators, higher plants are capable of regenerating new individuals. De novo organogenesis exemplifies the regeneration of the whole plant body and is exploited widely in agriculture and biotechnology. In this Tansley insight article, we summarize recent advances that facilitate our understanding of the molecular mechanisms underlying de novo organogenesis. According to our current knowledge, this process can be divided into three steps, including activation of regeneration-initial cells, acquisition of competency and de novo establishment of apical meristems. The functions of stem cells and their regulators are critical to de novo organogenesis, whereas auxin and cytokinin act as triggers and linkers between different steps.
Topics: Meristem; Organogenesis; Plant Cells; Plant Growth Regulators; Regeneration; Stem Cells
PubMed: 29574802
DOI: 10.1111/nph.15106
-
Frontiers in Genetics, 2022 (Review)
Mosaicism, the existence of genetically distinct populations of cells in a particular organism, is an important cause of genetic disease. Mosaicism can appear as DNA mutations, epigenetic alterations of DNA, and chromosomal abnormalities. Neurodevelopmental or neuropsychiatric diseases, including autism, often arise from de novo mutations that are usually not present in either of the parents. De novo mutations might occur as early as in the parental germline, during embryonic or fetal development, and/or post-natally, through ageing and life. Mutation timing could lead to a mutation burden ranging from less than heterozygosity to approaching homozygosity. The developmental timing at which a somatic mutation is acquired will affect the mutation load and distribution throughout the body. In this review, we discuss the timing of de novo mutations, spanning from mutations in the germ lineage (all ages), to post-zygotic, embryonic, fetal, and post-natal events, through aging to death. These factors can determine the tissue-specific distribution and load of mutations, which can affect disease. The disease threshold burden of somatic mutations of a particular gene in any tissue will be important to define.
PubMed: 36226191
DOI: 10.3389/fgene.2022.983668
-
World Journal of Gastroenterology, May 2023
BACKGROUND
Endoscopy has developed rapidly in recent years and has enabled further investigation into the origin and features of intestinal tumors. The small size and concealed position of these tumors make it difficult to distinguish them from nonneoplastic polyps and carcinoma in adenoma (CIA). The invasive depth and metastatic potential determine the operation regimen, which in turn affects overall survival and distant prognosis. Previous studies have confirmed the malignant and clinicopathological features of de novo colorectal cancer (CRC).
AIM
To summarize the endoscopic features of de novo CRC and assess the risk factors that distinguish it from CIA, in order to assist diagnosis and treatment. The lack of such a summary prompted this retrospective study.
METHODS
In total, 167 patients with small-sized CRCs diagnosed by endoscopy were reviewed. Patients diagnosed with advanced CRC, or with other malignant cancers or chronic diseases that could affect distant outcomes, were excluded. After screening, 63 cases were excluded, including 33 de novo and 30 CIA cases. Patient information, including follow-up information, was obtained from an electronic hospital information system (HIS). The characteristics of the two groups and the risk factors for invasion depth were analyzed with SPSS 25.0 software.
RESULTS
Nearly half of the de novo CRCs were smaller than 1 cm (n = 16, 48.5%) and the majority were located in the distal colon (n = 26, 78.8%). The IIc type was the most common macroscopic type of de novo CRC. In a Pearson analysis, the degree of differentiation, the Sano, JNET, and Kudo types, the surrounding mucosa, and chicken skin mucosa (CSM) were correlated with the invasion depth (P < 0.001). CSM was a significant risk factor for deep invasion and disturbed the judgment of endoscopic ultrasound. A high degree of tumor budding and tumor-infiltrating lymphocytes accompanied malignancy. Finally, de novo CRCs had worse outcomes than CIA CRCs.
CONCLUSION
This is the first comprehensive study to analyze the features of de novo CRCs in order to distinguish them from nonneoplastic polyps. It is also the first study to pay attention to the effect of CSM on invasive depth measurement. This study emphasizes the high metastatic potential of de novo CRCs and highlights the need for more research on this tumor type.
Topics: Humans; Retrospective Studies; Colorectal Neoplasms; Endoscopy; Risk Factors; Adenoma
PubMed: 37274065
DOI: 10.3748/wjg.v29.i18.2836