-
International Journal of Molecular... Jun 2024
Lung adenocarcinoma (LUAD) is the most widespread cancer in the world, and its development is associated with complex biological mechanisms that are poorly understood. Here, we revealed a marked upregulation in the mRNA level of C1orf131 in LUAD samples compared to non-tumor tissue samples in The Cancer Genome Atlas (TCGA). Depletion of C1orf131 suppressed cell proliferation and growth, whereas it stimulated apoptosis in LUAD cells. Mechanistic investigations revealed that C1orf131 knockdown induced cell cycle dysregulation via the AKT and p53/p21 signalling pathways. Additionally, C1orf131 knockdown blocked cell migration through the modulation of epithelial-mesenchymal transition (EMT) in lung adenocarcinoma. Notably, we identified the C1orf131 protein nucleolar localization sequence, which included amino acid residues 137-142 (KKRKLT) and 240-245 (KKKRKG). Collectively, C1orf131 has potential as a novel therapeutic marker for patients in the future, as it plays a vital role in the progression of lung adenocarcinoma.
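The residue ranges quoted for the nucleolar localization sequence use 1-based, inclusive numbering, which is easy to get wrong with 0-based string slicing. The following is a minimal sketch of that conversion. The helper name `residues` and the 250-residue toy sequence are hypothetical; the toy sequence is alanine filler with the two reported motifs placed at the reported positions, not the real C1orf131 protein.

```python
def residues(seq: str, start: int, end: int) -> str:
    """Return residues start..end using 1-based, inclusive numbering."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("residue range outside the sequence")
    return seq[start - 1:end]


# Toy sequence (assumption, NOT the real protein): alanine filler with the
# two motifs from the abstract written at residues 137-142 and 240-245.
toy = list("A" * 250)
toy[136:142] = "KKRKLT"
toy[239:245] = "KKKRKG"
toy_seq = "".join(toy)

nls_1 = residues(toy_seq, 137, 142)
nls_2 = residues(toy_seq, 240, 245)
print(nls_1, nls_2)
```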
Topics: Humans; Adenocarcinoma of Lung; Proto-Oncogene Proteins c-akt; Signal Transduction; Lung Neoplasms; Cell Proliferation; Gene Expression Regulation, Neoplastic; Cell Movement; Cell Line, Tumor; Epithelial-Mesenchymal Transition; Apoptosis; Disease Progression; Tumor Suppressor Protein p53; Nuclear Proteins; A549 Cells
PubMed: 38928092
DOI: 10.3390/ijms25126381
-
International Journal of Molecular... Jun 2024
The secreted proteins of human body fluids have the potential to be used as biomarkers for diseases. These biomarkers can be used for early diagnosis and risk prediction, so the study of proteins secreted into human body fluids has great application value. In recent years, the deep-learning-based transformer language model has moved from the field of natural language processing (NLP) to proteomics, leading to the development of protein language models (PLMs) for protein sequence representation. Here, we propose a deep learning framework called ESM Predict Secreted Proteins (ESMSec) to predict proteins secreted into three types of human body fluid. ESMSec is based on the ESM2 model and an attention architecture. Specifically, the protein sequence data are first fed into the ESM2 model to extract feature information from the last hidden layer, and every input protein is encoded into a fixed 1000 × 480 matrix. Second, multi-head attention with a fully connected neural network is employed as the classifier to perform binary classification according to whether each protein is secreted into a given body fluid. Our experiments used three important and ubiquitous human body fluids. Experimental results show that ESMSec achieved average accuracies of 0.8486, 0.8358, and 0.8325 on the testing datasets for plasma, cerebrospinal fluid (CSF), and seminal fluid, respectively, which on average outperforms the state-of-the-art (SOTA) methods. These results demonstrate that ESM can improve prediction performance and has great potential for screening the secretion information of human body fluid proteins.
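The abstract states that every protein is encoded into a fixed 1000 × 480 matrix but does not say how sequences of different lengths are brought to that shape. The sketch below assumes truncation of longer proteins and zero-padding of shorter ones, which is a common choice but not confirmed by the source. The function name `to_fixed_matrix` is hypothetical, and the per-residue vectors are placeholder numbers rather than real ESM2 output.

```python
MAX_LEN = 1000   # number of rows in the fixed matrix (from the abstract)
EMBED_DIM = 480  # number of columns in the fixed matrix (from the abstract)


def to_fixed_matrix(embeddings, max_len=MAX_LEN, dim=EMBED_DIM):
    """Truncate or zero-pad per-residue vectors to exactly max_len rows.

    Truncation and zero-padding are assumptions; the abstract only gives
    the final matrix shape.
    """
    for row in embeddings:
        if len(row) != dim:
            raise ValueError("unexpected embedding width")
    rows = [list(r) for r in embeddings[:max_len]]
    # range() is empty when the protein already fills all rows.
    rows.extend([0.0] * dim for _ in range(max_len - len(rows)))
    return rows


# Placeholder embeddings standing in for ESM2 last-hidden-layer output.
short_protein = [[0.1] * EMBED_DIM for _ in range(350)]
long_protein = [[0.2] * EMBED_DIM for _ in range(1200)]

fixed_short = to_fixed_matrix(short_protein)
fixed_long = to_fixed_matrix(long_protein)
print(len(fixed_short), len(fixed_short[0]), len(fixed_long))
```

A fixed shape of this kind lets proteins be batched together before the attention classifier; in practice a padding mask would usually accompany the matrix so that padded rows do not contribute to attention.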
Topics: Humans; Body Fluids; Biomarkers; Deep Learning; Natural Language Processing; Proteomics; Proteins; Neural Networks, Computer; Computational Biology
PubMed: 38928078
DOI: 10.3390/ijms25126371
-
Bioengineering (Basel, Switzerland) Jun 2024
The significant growth of the global protein drug market, including fusion proteins, emphasizes the crucial role of optimizing amino acid sequences to enhance the productivity and bioefficacy. Among these fusion proteins, RBP-IIIA-IB, comprising retinol-binding protein in conjunction with the albumin domains, IIIA and IB, has displayed efficacy in alleviating liver fibrosis by inhibiting the activation of hepatic stellate cells (HSCs). This study aimed to address the issue of the low productivity in RBP-IIIA-IB. To induce structural changes, the linking sequence, EVDD, between domain IIIA and IB in RBP-IIIA-IB was modified to DGPG, AAAA, and GGPA. Among these, RBP-IIIA-AAAA-IB demonstrated an increase in yield (>4-fold) and a heightened inhibition of HSC activation. Furthermore, we identified amino acid residues that could form disulfide bonds when substituted with cysteine. Through the mutation of N453S-V480S in RBP-IIIA-AAAA-IB, the productivity further increased by over 9-fold, accompanied by an increase in anti-fibrotic activity. Overall, there was a more than 30-fold increase in the fusion protein's yield. These findings demonstrate the effectiveness of modifying linker sequences and introducing extra disulfide bonds to improve both the production yield and biological efficacy of fusion proteins.
PubMed: 38927853
DOI: 10.3390/bioengineering11060617
-
Genes Jun 2024
Comparative Study
Bromus (Poaceae Bromeae) is a forage grass with high adaptability and ecological and economic value. Here, we sequenced the chloroplast genomes of four Bromus species and compared them with those of four previously described species. The genome sizes ranged from 136,934 bp to 137,189 bp, with a typical quadripartite structure. The studied species had 129 genes, consisting of 83 protein-coding, 38 tRNA-coding, and 8 rRNA-coding genes. The highest GC content was found in the inverted repeat (IR) region (43.85-44.15%), followed by the large single-copy (LSC) region (36.25-36.65%) and the small single-copy (SSC) region (32.21-32.46%). There were 33 high-frequency codons, with those ending in A/U accounting for 90.91%. A total of 350 simple sequence repeats (SSRs) were identified, with single-nucleotide repeats being the most common (61.43%). A total of 228 forward and 141 palindromic repeats were identified. No reverse or complementary repeats were detected. The sequences were highly similar across species, especially in the protein-coding and inverted repeat regions. Seven highly variable regions were detected, which could be used for molecular marker development. The constructed phylogenetic tree indicates that Bromus is a monophyletic taxon closely related to Triticum. This comparative analysis of Bromus chloroplast genomes provides a scientific basis for species identification and phylogenetic studies.
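The GC-content figures and the percentages quoted above can be reproduced with simple counting. The sketch below shows one way to compute GC content for a region and checks that the gene, codon, and SSR figures in the abstract are mutually consistent. The function name `gc_content` and the ten-base toy fragment are illustrative assumptions, not data from the study.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C bases in a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(seq.count(base) for base in "GC") / len(seq)


# Toy fragment (assumption): not taken from any chloroplast genome.
toy_region = "ATGCGCATTA"
toy_gc = gc_content(toy_region)

# Consistency checks on numbers quoted in the abstract.
total_genes = 83 + 38 + 8                     # protein-coding + tRNA + rRNA
au_ending_codons = round(33 * 90.91 / 100)    # high-frequency codons ending in A/U
mono_ssrs = round(350 * 61.43 / 100)          # single-nucleotide SSRs

print(toy_gc, total_genes, au_ending_codons, mono_ssrs)
```

Under these figures, 30 of the 33 high-frequency codons end in A/U and about 215 of the 350 SSRs are single-nucleotide repeats.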
Topics: Genome, Chloroplast; Phylogeny; Microsatellite Repeats; Bromus; Base Composition
PubMed: 38927750
DOI: 10.3390/genes15060815
-
Genes Jun 2024
Peroxisome proliferator-activated receptor γ (PPARG) has various splicing variants and plays essential roles in the regulation of adipocyte differentiation and lipogenesis. However, little is known about the expression pattern of PPARG and its effect on milk fat synthesis in the buffalo mammary gland. In this study, we found that only two PPARG splicing variants were expressed in the buffalo mammary gland. Amino acid sequence characterization showed that the proteins encoded by both variants are endonuclear, non-secreted, hydrophilic proteins. Protein domain prediction found that only one of the two encoded proteins had PPAR ligand-binding domains (NR_LBD_PPAR), which may lead to functional differences between the two splice variants. RNA interference (RNAi) and overexpression of both variants were performed in buffalo mammary epithelial cells (BMECs). Results showed that the expression of six fatty acid synthesis-related genes was significantly modified (p < 0.05) by the RNAi and overexpression of the two variants. All kinds of fatty acids (FAs) detected in this study were significantly decreased (p < 0.05) after RNAi of either variant. Overexpression of either variant significantly decreased (p < 0.05) the saturated FA (SFA) content while significantly increasing (p < 0.05) the unsaturated FA (UFA) content, especially the monounsaturated FAs (MUFAs), in the BMECs. In conclusion, two PPARG splicing variants are expressed in the BMECs and can regulate FA synthesis by altering the expression of diverse fatty acid synthesis-related genes. This study revealed the expression characteristics and functions of the PPARG gene in buffalo mammary glands and provided a reference for further understanding of fat synthesis in buffalo milk.
Topics: Animals; Buffaloes; PPAR gamma; Mammary Glands, Animal; Female; Epithelial Cells; Alternative Splicing; Fatty Acids; Protein Isoforms; Milk
PubMed: 38927715
DOI: 10.3390/genes15060779
-
Genes Jun 2024
The high-throughput proteomics data generated by increasingly sensitive mass spectrometers greatly contribute to a better understanding of the molecular and cellular mechanisms operating in living organisms. Nevertheless, proteomics analyses rely on accurate genomic and protein annotations, and some information may be lost if these resources are incomplete. Here, we show that most proteomics data may be recovered by interconnecting genomics and proteomics approaches (i.e., following a proteogenomic strategy), resulting, in turn, in improved gene/protein models. In this study, we generated proteomics data from Leishmania donovani (HU3 strain) promastigotes that allowed us to detect 1908 proteins in this developmental stage on the basis of the currently annotated proteins available in public databases. However, when the proteomics data were searched against all possible open reading frames in the genome, twenty new protein-coding genes could be annotated. Additionally, 43 previously annotated proteins were extended at their N-terminal ends to accommodate peptides detected in the proteomics data. Also, different post-translational modifications (phosphorylation, acetylation, and methylation, among others) were found to occur in a large number of proteins. Furthermore, a detailed comparative analysis of two experimental proteomes served to illustrate how inaccurate conclusions can be drawn if proteomes are compared solely on the basis of the listed proteins identified in each proteome. Finally, we have created data entries (based on freely available repositories) to provide and maintain updated gene/protein models. Raw data are available via ProteomeXchange with the identifier PXD051920.
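Searching spectra against "all possible open reading frames" requires enumerating ORFs on both strands and in all three reading frames. The sketch below illustrates that enumeration only; it is not the pipeline used in the study. It assumes an ORF runs from an ATG to the first in-frame stop codon (proteogenomic searches often use stop-to-stop stretches instead), and the function names, the `min_codons` parameter, and the toy DNA string are all hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOPS = {"TAA", "TAG", "TGA"}


def reverse_complement(dna: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def six_frame_orfs(dna: str, min_codons: int = 3):
    """List (strand, frame, orf) for ATG..stop stretches in all six frames.

    min_codons is the minimum number of codons before the stop codon.
    The ATG-to-stop definition is an assumption made for this sketch.
    """
    dna = dna.upper()
    found = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
            start = None
            for idx, codon in enumerate(codons):
                if codon == "ATG" and start is None:
                    start = idx
                elif codon in STOPS and start is not None:
                    if idx - start >= min_codons:
                        orf = "".join(codons[start:idx + 1])
                        found.append((strand, frame, orf))
                    start = None
    return found


# Toy sequence (assumption): one short ORF on the forward strand, frame 2.
orfs = six_frame_orfs("CCATGAAATTTGGGTAACC")
print(orfs)
```

Each ORF found this way would then be translated and added to the search database, so that peptides mapping outside the annotated protein set can still be matched.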
Topics: Leishmania donovani; Proteogenomics; Protozoan Proteins; Genome, Protozoan; Protein Processing, Post-Translational; Proteomics; Proteome; Molecular Sequence Annotation
PubMed: 38927711
DOI: 10.3390/genes15060775
-
Genes Jun 2024
Comparative Study
We identified five distinct full-length human mineralocorticoid receptor (MR) genes containing either 984 amino acids (MR-984) or 988 amino acids (MR-988), which can be distinguished by the presence or absence of Lys, Cys, Ser, and Trp (KCSW) in their DNA-binding domain (DBD) and mutations at codons 180 and 241 in their amino-terminal domain (NTD). Two human MR-KCSW genes contain either (Val-180, Val-241) or (Ile-180, Val-241) in their NTD, and three human MR-984 genes contain either (Ile-180, Ala-241), (Val-180, Val-241), or (Ile-180, Val-241). Human MR-KCSW with (Ile-180, Ala-241) has not been cloned. In contrast, chimpanzees contain four MRs: two MR-988s with KCSW in their DBD and two MR-984s without KCSW in their DBD. Chimpanzee MRs only contain (Ile-180, Val-241) in their NTD. A chimpanzee MR with either (Val-180, Val-241) or (Ile-180, Ala-241) in the NTD has not been cloned. Gorillas and orangutans each contain one MR-988 with KCSW in the DBD and one MR-984 without KCSW, and these MRs only contain (Ile-180, Val-241) in their NTD. A gorilla MR or orangutan MR with either (Val-180, Val-241) or (Ile-180, Ala-241) in the NTD has not been cloned. Together, these data suggest that human MRs with (Val-180, Val-241) or (Ile-180, Ala-241) in the NTD evolved after humans and chimpanzees diverged from their common ancestor. Considering the multiple functions in human development of the MR in kidney, brain, heart, skin, and lungs, as well as MR activity in interaction with the glucocorticoid receptor, we suggest that the evolution of human MRs that are absent in chimpanzees may have been important in the evolution of humans from chimpanzees. Investigation of the physiological responses to corticosteroids mediated by the MR in humans, chimpanzees, gorillas, and orangutans may provide insights into the evolution of humans and their closest relatives.
Topics: Animals; Receptors, Mineralocorticoid; Humans; Pan troglodytes; Evolution, Molecular; Gorilla gorilla; Phylogeny; Pongo; Amino Acid Sequence; Protein Domains
PubMed: 38927703
DOI: 10.3390/genes15060767
-
Genes Jun 2024
Identifying germ cells and their marker gene expression is important for studying sex-related mechanisms in fish. The vasa gene, encoding an ATP-dependent RNA helicase, is recognized as a molecular marker of germ cells and plays a crucial role in germ cell development. The catfish studied here, an important freshwater economic fish species in China, shows significant sexual dimorphism, with females growing faster than males. However, the molecular mechanisms underlying these sex differences, especially those involving the vasa gene, remain poorly understood in this fish. In this work, the vasa gene sequence of this species was obtained through RT-PCR and rapid amplification of cDNA ends (RACE), and its expression in embryos and tissues was analyzed using qRT-PCR and in situ hybridization. Letrozole (LT) treatment of larval fish was also conducted to investigate its influence on the gene. The results revealed that the open reading frame (ORF) of the gene was 1989 bp, encoding 662 amino acids. The SaVasa protein contains 10 conserved domains unique to the DEAD-box protein family and shows a highest sequence identity of 95.92% with its counterpart in another species. In embryos, the gene is highly expressed from the two-cell stage to the blastula stage, with a gradually decreasing trend from the gastrula stage to the heart-beating stage. Furthermore, vasa expression was initially detected at the end of the cleavage furrow during the two-cell stage, later condensing into four symmetrical cell clusters as embryonic development proceeded. At the gastrula stage, vasa-positive cells increased and began to migrate towards the dorsal side of the embryo. In tissues, the gene is predominantly expressed in the ovaries, with little or no expression in the other tissues examined. Moreover, it was expressed in phase I-V oocytes in the ovaries, as well as in spermatogonia and spermatocytes in the testis, implying a germ cell-specific expression pattern.
In addition, LT significantly upregulated the expression of the gene in a concentration-dependent manner during the key gonadal differentiation period of the fish. Notably, at 120 days post-hatching (dph) after LT treatment, its expression was the lowest in the testis and ovary of the high-concentration group. Collectively, findings from gene structure, protein sequence, phylogenetic analysis, RNA expression patterns, and response to LT suggest that the gene is maternally inherited with conserved features, serves as a potential marker gene for germ cells in this species, and might participate in LT-induced early embryonic development and gonadal development processes of the fish. This would provide a basis for further research on the application of germ cell markers and the molecular mechanisms of sex differences in this species.
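The ORF length and protein length reported above are related by simple codon arithmetic, which the following sketch makes explicit. It assumes the reported 1989 bp includes the stop codon, which is the usual convention but is not stated in the abstract; the function name `protein_length` is hypothetical.

```python
def protein_length(orf_bp: int) -> int:
    """Amino acids encoded by an ORF whose length includes the stop codon."""
    if orf_bp <= 3 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3 and longer than one codon")
    return orf_bp // 3 - 1  # total codons minus the stop codon


encoded = protein_length(1989)
print(encoded)
```

With 1989 bp there are 663 codons, and removing the stop codon leaves the 662 amino acids reported.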
Topics: Animals; Letrozole; Female; Male; Fish Proteins; DEAD-box RNA Helicases; Catfishes; Gene Expression Regulation, Developmental; Germ Cells; Phylogeny
PubMed: 38927693
DOI: 10.3390/genes15060756
-
Genes Jun 2024
Anthocyanidin reductase (ANR) is a key enzyme regulating anthocyanin synthesis and accumulation in plants. Here, lychee ANR genes were globally identified, their sequence and phylogenetic characteristics were analyzed, and their spatiotemporal expression patterns were characterized. A total of 51 ANR family members were identified in the lychee genome. The encoded proteins ranged in length from 87 aa to 289 aa, in molecular weight from 9.49 kDa to 32.40 kDa, and in isoelectric point (pI) from 4.83 to 9.33. Most of the members were acidic proteins, and most were located in the cytoplasm. The 51 family members were unevenly distributed across 11 chromosomes, and their exon structures and conserved motifs differed markedly from one another. Promoters in over 90% of members contained anaerobically induced response elements, and 88% contained photoresponsive elements. Most family members had low expression in nine lychee tissues and organs (root, young leaf, bud, female flower, male flower, pericarp, pulp, seed, and calli), and some members showed tissue-specific expression patterns. The expression of one gene decreased as anthocyanin accumulated in 'Feizixiao' and 'Ziniangxi' pericarp and was negatively correlated with pericarp coloring. This gene was heterologously expressed in tobacco K326, and its function was verified. This study provides a basis for the further study of ANR function, particularly its role in lychee pericarp coloration.
Topics: Litchi; Plant Proteins; Gene Expression Regulation, Plant; Phylogeny; Multigene Family; Anthocyanins; Genome, Plant
PubMed: 38927692
DOI: 10.3390/genes15060757
-
Genes May 2024
MYB transcription factors (TFs) play vital roles in plant growth, development, and response to adversity. Although the MYB gene family has been studied in many plant species, little is known about the function of R2R3 MYB TFs in the response of sweet potato to abiotic stresses. In this study, an R2R3 MYB gene, IbMYB330, was isolated from sweet potato (Ipomoea batatas). IbMYB330 was ectopically expressed in tobacco, and its function was characterized by overexpression in transgenic plants. The IbMYB330 protein is 268 amino acids long and contains two highly conserved MYB domains. The molecular weight and isoelectric point of IbMYB330 are 29.24 kDa and 9.12, respectively. The expression of IbMYB330 in sweet potato is tissue-specific, and levels in the root were significantly higher than those in the leaf and stem. Its expression was strongly induced by PEG-6000, NaCl, and H2O2. Ectopic expression of IbMYB330 led to increased transcript levels of four stress-related genes. Moreover, compared to the wild type (WT), transgenic tobacco overexpressing IbMYB330 showed enhanced tolerance to drought and salt stress: CAT activity, POD activity, proline content, and protein content increased, while MDA content decreased. Taken together, our study demonstrated that IbMYB330 plays a role in enhancing the resistance of sweet potato to stresses. These findings lay the groundwork for future research on the R2R3-MYB genes of sweet potato and indicate that IbMYB330 may be a candidate gene for improving abiotic stress tolerance in crops.
Topics: Ipomoea batatas; Nicotiana; Plants, Genetically Modified; Transcription Factors; Plant Proteins; Gene Expression Regulation, Plant; Droughts; Salt Tolerance; Stress, Physiological; Salt Stress
PubMed: 38927629
DOI: 10.3390/genes15060693