-
Cell Aug 2023
Many regions in the human genome vary in length among individuals due to variable numbers of tandem repeats (VNTRs). To assess the phenotypic impact of VNTRs genome-wide, we applied a statistical imputation approach to estimate the lengths of 9,561 autosomal VNTR loci in 418,136 unrelated UK Biobank participants and 838 GTEx participants. Association and statistical fine-mapping analyses identified 58 VNTRs that appeared to influence a complex trait in UK Biobank, 18 of which also appeared to modulate expression or splicing of a nearby gene. Non-coding VNTRs at TMCO1 and EIF3H appeared to generate the largest known contributions of common human genetic variation to risk of glaucoma and colorectal cancer, respectively. Each of these two VNTRs was associated with a >2-fold range of risk across individuals. These results reveal a substantial and previously unappreciated role of non-coding VNTRs in human health and gene regulation.
Topics: Humans; Calcium Channels; Colorectal Neoplasms; Genome, Human; Glaucoma; Minisatellite Repeats; Polymorphism, Genetic; Eukaryotic Initiation Factor-3
PubMed: 37527660
DOI: 10.1016/j.cell.2023.07.002 -
Nucleic Acids Research Nov 2023
SINE-VNTR-Alu (SVA) retrotransposons are evolutionarily young and still-active transposable elements (TEs) in the human genome. Several pathogenic SVA insertions have been identified that directly mutate host genes to cause neurodegenerative and other types of diseases. However, due to their sequence heterogeneity and complex structures as well as limitations in sequencing techniques and analysis, SVA insertions have been less well studied than other mobile element insertions. Here, we identified polymorphic SVA insertions from 3646 whole-genome sequencing (WGS) samples of >150 diverse populations and constructed a polymorphic SVA insertion reference catalog. Using 20 long-read samples, we also assembled reference and polymorphic SVA sequences and characterized the internal hexamer/variable-number-tandem-repeat (VNTR) expansions as well as differing SVA activity for SVA subfamilies and human populations. In addition, we developed a module to annotate both reference and polymorphic SVA copies. By characterizing the landscape of both reference and polymorphic SVA retrotransposons, our study enables more accurate genotyping of these elements and facilitates the discovery of pathogenic SVA insertions.
Topics: Humans; Alu Elements; Genome, Human; Minisatellite Repeats; Retroelements; Short Interspersed Nucleotide Elements
PubMed: 37823611
DOI: 10.1093/nar/gkad821 -
Journal of Infection in Developing... Nov 2023
Review
INTRODUCTION
Mycobacterium tuberculosis genotyping has impacted evolutionary studies worldwide. Nonetheless, its application and the knowledge generated depend on the genetic marker evaluated and the detection technologies that have evolved over the years. Here we describe the timeline of main genotypic methods related to M. tuberculosis in Latin America and the main findings obtained.
METHODOLOGY
Systematic searches through the PubMed database were performed from 1993 to May 2021. A total of 345 articles met the inclusion criteria and were selected.
RESULTS
Spacer oligonucleotide typing (spoligotyping) was the most widely used method in Latin America, with decreasing use in parallel with increasing use of mycobacterial interspersed repetitive unit-variable number tandem repeat (MIRU-VNTR) and whole genome sequencing (WGS). Among the countries, Brazil, Mexico, and Argentina had the most publications, and a considerable part of the articles were in collaboration with Latin American or non-Latin American institutions; a small proportion of studies needed partnerships to perform the genotypic methods. The genotypic methods allowed the identification of M. tuberculosis genotypes with greater capacity for clonal expansion and revealed the predominance of the Euro-American lineage in Latin America. There was a notable presence of the Beijing family in Peru and Colombia.
CONCLUSIONS
The data obtained demonstrated the importance of expanding collaborative networks of tuberculosis (TB) research groups to countries with low productivity in this area, the commitment of the few Latin American countries to advance TB research, as well as the inestimable value of building a Latin America database, considering ease of population mobility between countries.
Topics: Humans; Latin America; Genotype; Polymorphism, Restriction Fragment Length; Bacterial Typing Techniques; Tuberculosis; Mycobacterium tuberculosis; Minisatellite Repeats
PubMed: 37956372
DOI: 10.3855/jidc.17840 -
The Malaysian Journal of Medical... Dec 2023
Review
Forensic DNA typing has been widely accepted in courts all over the world because DNA profiling is a very powerful tool for identifying individuals on the basis of their unique genetic makeup. DNA evidence can not only identify the presence of specific biospecimens at a crime scene but also exonerate suspects who are innocent of a crime. Technological advancements in DNA profiling, including the development of validated kits and statistical methods, have made this tool more precise in forensic investigations. Validated combined DNA index system (CODIS) short tandem repeat (STR) kits, which require very small amounts of DNA, are therefore used routinely, coupled with real-time polymerase chain reaction (PCR) and statistical methods, to identify human remains, establish paternity, or match suspected crime scene biospecimens. The road to modern DNA profiling has been long, and it has taken scientists decades of work and fine-tuning to develop the highly accurate testing and analyses that are used today. This review discusses the various DNA polymorphisms and their utility in human identity testing.
PubMed: 38239252
DOI: 10.21315/mjms2023.30.6.2 -
BMC Genomics Jul 2023
BACKGROUND
Prevention and care of drug-resistant Mycobacterium tuberculosis is a major challenge in Ethiopia. The World Health Organization has designated Ethiopia as one of the 30 high-burden multi-drug resistant tuberculosis (MDR-TB) countries. There is limited information regarding the genetic diversity and transmission dynamics of MDR-TB in Ethiopia.
OBJECTIVE
To investigate the molecular epidemiology and transmission dynamics of MDR-TB strains in the Amhara region using whole genome sequencing (WGS).
METHODS
Forty-five MDR-TB clinical isolates from the Amhara region were collected between 2016 and 2018 and characterized using WGS and 24-loci Mycobacterium Interspersed Repetitive Units Variable Number of Tandem Repeats (MIRU-VNTR) typing. Clusters were defined using a maximum distance of 12 single nucleotide polymorphisms (SNPs) or alleles as the upper threshold of genomic relatedness. A distance of five or fewer SNPs or alleles, or an identical 24-loci VNTR typing pattern, was taken as a surrogate marker for recent transmission.
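The thresholds described in the methods can be illustrated with a minimal Python sketch. It assumes single-linkage grouping (an isolate joins a cluster if it lies within the threshold of any member), which the abstract does not specify. The function names, isolate labels, and pairwise distances are hypothetical and are not data from the study.

```python
from itertools import combinations


def cluster_by_distance(isolates, distances, threshold=12):
    """Group isolates by single linkage (an assumption, not stated in the abstract).

    `distances` maps frozenset({a, b}) to the pairwise SNP or allele distance.
    Returns only groups with two or more isolates; singletons are unique strains.
    """
    parent = {name: name for name in isolates}

    def find(x):
        # Follow parent links to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(isolates, 2):
        if distances[frozenset((a, b))] <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in isolates:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


def recent_transmission_pairs(distances, threshold=5):
    """Pairs at or below the stricter threshold used as a recent-transmission marker."""
    return sorted(tuple(sorted(pair)) for pair, d in distances.items() if d <= threshold)


# Hypothetical pairwise distances for four isolates.
example_isolates = ["A", "B", "C", "D"]
example_distances = {
    frozenset(("A", "B")): 3,
    frozenset(("A", "C")): 10,
    frozenset(("B", "C")): 14,
    frozenset(("A", "D")): 40,
    frozenset(("B", "D")): 42,
    frozenset(("C", "D")): 38,
}
example_clusters = cluster_by_distance(example_isolates, example_distances)
example_recent = recent_transmission_pairs(example_distances)
```

Under single linkage, B and C fall in the same cluster even though they differ by 14, because both lie within 12 of A. A stricter rule such as complete linkage would separate them, so the choice of linkage affects the reported clustering rate.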
RESULTS
Forty-one of the 45 isolates were analyzed by WGS and 44% (18/41) of the isolates were distributed into 4 clusters. Of the 41 MDR-TB isolates, 58.5% were classified as lineage 4, 36.5% as lineage 3 and 5% as lineage 1. Overall, the TUR genotype (54%) was predominant among the MDR-TB strains. 41% (17/41) of the isolates were clustered into four WGS groups and the remaining isolates were unique strains. The predominant cluster (Cluster 1) was composed of nine isolates belonging to lineage 4, and of these, four isolates were in recent transmission links.
CONCLUSIONS
Clustering of the majority of MDR-TB strains and the predominance of the TUR lineage in the Amhara region raise concerns about possible ongoing transmission. Efforts to strengthen TB laboratory capacity for diagnosis, intensify active case finding, and expand contact tracing activities are needed in order to improve rapid diagnosis and initiate early treatment. This would interrupt the transmission chain and stop the spread of MDR-TB in the Amhara region.
Topics: Humans; Antitubercular Agents; Tuberculosis; Mycobacterium tuberculosis; Ethiopia; Molecular Epidemiology; Tuberculosis, Multidrug-Resistant; Genotype; Whole Genome Sequencing; Minisatellite Repeats
PubMed: 37460951
DOI: 10.1186/s12864-023-09502-2 -
Archives of Razi Institute Dec 2023
In the present research, we aimed to determine the characteristics of E. faecalis strains collected from an Iranian Children's Hospital over four years. Sixty-seven E. faecalis isolates were investigated by virulence gene detection, variable-number tandem repeat (VNTR) typing, and multiple-locus variable-number tandem repeat analysis (MLVA) typing. A high frequency of virulence genes belonged to gelatinase (73%) and Enterococcus faecalis (62%). The MLVA of 67 E. faecalis isolates revealed 52 VNTR patterns and 38 MLVA types (MTs). Furthermore, the isolates were genetically diverse, with MT1 as the major MT in different wards of the Children's Hospital.
Topics: Enterococcus faecalis; Iran; Humans; Gram-Positive Bacterial Infections; Minisatellite Repeats; Child; Hospitals, Pediatric; Genetic Variation; Virulence Factors
PubMed: 38828180
DOI: 10.32592/ARI.2023.78.6.1873 -
Nature Communications Sep 2023
Markedly expanded tandem repeats (TRs) have been correlated with ~60 diseases. TR diversity has been considered a clue toward understanding missing heritability. However, haplotype-resolved long TRs remain mostly hidden or blacked out because their complex structures (TRs composed of various units and minisatellites containing >10-bp units) make them difficult to determine accurately with existing methods. Here, using a high-precision algorithm to determine complex TR structures from long, accurate reads of PacBio HiFi, an investigation of 270 Japanese control samples yields several genome-wide findings. Approximately 322,000 TRs are difficult to impute from the surrounding single-nucleotide variants. Greater genetic divergence of TR loci is significantly correlated with more events of younger replication slippage. Complex TRs are more abundant than single-unit TRs, and a tendency for complex TRs to consist of <10-bp units and single-unit TRs to be minisatellites is statistically significant at loci with ≥500-bp TRs. Of note, 8909 loci with extended TRs (>100 bp longer than the mode) contain several known disease-associated TRs and are considered candidates for association with disorders. Overall, complex TRs and minisatellites are found to be abundant and diverse, even in genetically small Japanese populations, yielding insights into the landscape of long TRs.
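The "extended TR" criterion (an allele more than 100 bp longer than the mode) can be sketched as follows. This is an illustration under stated assumptions: the mode is taken over the allele lengths observed at a single locus, ties between equally common lengths are broken by choosing the shortest, and the function names and example lengths are hypothetical. The abstract does not describe how the authors computed the mode.

```python
from collections import Counter


def modal_length(lengths):
    """Most common allele length; ties go to the shortest length (an assumption)."""
    counts = Counter(lengths)
    top = max(counts.values())
    return min(length for length, n in counts.items() if n == top)


def extended_alleles(lengths, margin_bp=100):
    """Return allele lengths exceeding the modal length by more than `margin_bp`."""
    mode = modal_length(lengths)
    return [length for length in lengths if length - mode > margin_bp]


# Hypothetical allele lengths (bp) at one locus.
example_lengths = [500, 500, 500, 520, 600, 601, 650]
example_extended = extended_alleles(example_lengths)
```

Here the 600-bp allele is exactly 100 bp longer than the mode and is therefore not flagged, since the criterion is a strict inequality.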
Topics: Humans; Genome, Human; Tandem Repeat Sequences; Minisatellite Repeats; Algorithms; Genetic Drift
PubMed: 37709751
DOI: 10.1038/s41467-023-41262-1 -
Genome Biology Jul 2023
Roughly 3% of the human genome is composed of variable-number tandem repeats (VNTRs): arrays of motifs of at least six bases. These loci are highly polymorphic, yet current approaches that define and merge variants based on alignment breakpoints do not capture their full diversity. Here we present a method, vamos (VNTR Annotation using efficient Motif Sets), that instead annotates VNTRs using repeat composition under different levels of motif diversity. Applying vamos to 74 haplotype-resolved human assemblies, we estimate 7.4-16.7 alleles per locus, compared to 4.0-5.5 alleles per locus estimated by breakpoint-based approaches.
Topics: Humans; Minisatellite Repeats
PubMed: 37501141
DOI: 10.1186/s13059-023-03010-y -
BMC Genomics Nov 2023
BACKGROUND
As a population genetic tool, mitochondrial DNA is commonly divided into the ~1-kb control region (CR), in which single nucleotide variant (SNV) diversity is relatively high, and the coding region, in which selective constraint is greater and diversity lower, but which provides an informative phylogeny. In some species, the CR contains variable tandemly repeated sequences that are understudied due to heteroplasmy. Domestic cats (Felis catus) have a recent origin and therefore traditional CR-based analysis of populations yields only a small number of haplotypes.
RESULTS
To increase resolution we used Nanopore sequencing to analyse 119 cat mitogenomes via a long-amplicon approach. This greatly improves discrimination (from 15 to 87 distinct haplotypes in our dataset) and defines a phylogeny showing similar starlike topologies within all major clades (haplogroups), likely reflecting post-domestication expansion. We sequenced RS2, a CR tandem array of 80-bp repeat units, placing RS2 array structures within the phylogeny and increasing overall haplotype diversity. Repeat number varies between 3 and 12 (median: 4) with over 30 different repeat unit types differing largely by SNVs. Five SNVs show evidence of independent recurrence within the phylogeny, and seven are involved in at least 11 instances of rapid spread along repeat arrays within haplogroups.
CONCLUSIONS
In defining mitogenome variation our study provides key information for the forensic genetic analysis of cat hair evidence, and for the first time a phylogenetically informed picture of tandem repeat variation that reveals remarkably dynamic mutation processes at work in the mitochondrion.
Topics: Cats; Animals; Genome, Mitochondrial; Genetic Variation; Minisatellite Repeats; Mitochondria; Mutation
PubMed: 37978434
DOI: 10.1186/s12864-023-09789-1 -
Veterinary Microbiology Dec 2023
Mycoplasma iowae is a globally distributed and economically important avian pathogen that mostly infects turkeys. Currently, multi-locus sequence typing (MLST) serves as the gold standard method for strain identification in M. iowae. However, additional robust genotyping methods are required to effectively monitor M. iowae infections and conduct epidemiological investigations. The first aim of this study was to develop high-resolution genotyping assays that specifically target M. iowae, namely a multiple-locus variable number of tandem-repeats analysis (MLVA) and a core genome multi-locus sequence typing (cgMLST) schema. The second aim was to determine the relationships among a diverse selection of M. iowae strains and clinical isolates using a previously published assay and the newly developed assays. The MLVA was designed based on analyses of tandem-repeat (TR) regions in the six serotype reference strains (I, J, K, N, Q and R). The cgMLST schema was developed based on the coding sequences (CDSs) common to 95% of the 99 isolates examined. The samples were also typed with a previously published MLST assay for comparison with the developed methods. Out of 94 TR regions identified, 17 alleles were selected for further evaluation by PCR. Finally, seven alleles were chosen to establish the MLVA assay. Additionally, whole genome sequence analyses identified a total of 676 CDSs shared by 95% of the isolates, all of which were included in the developed cgMLST schema. The MLVA discriminated 19 distinct genotypes (GT), while the cgMLST assay distinguished 79 sequence types (ST), with Simpson's diversity indices of 0.810 (MLVA) and 0.989 (cgMLST). The applied assays consistently identified the same main clusters among the diverse selection of isolates, thereby demonstrating their suitability for various genetic analyses and their ability to yield congruent results.
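For readers unfamiliar with the discriminatory measure quoted above, the following Python sketch computes Simpson's index of diversity from a list of genotype assignments. It assumes the finite-sample form commonly used for typing methods, D = 1 - Σ n_j(n_j - 1) / (N(N - 1)); the abstract does not state which form the authors used. The function name and the genotype labels are hypothetical.

```python
from collections import Counter


def simpson_diversity(genotypes):
    """Simpson's index of diversity for a list of genotype labels.

    Uses D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1)), where n_j is the number
    of isolates with genotype j and N is the total number of isolates.
    This finite-sample form is an assumption, not confirmed by the abstract.
    """
    counts = list(Counter(genotypes).values())
    total = sum(counts)
    if total < 2:
        raise ValueError("at least two isolates are required")
    same_type_pairs = sum(n * (n - 1) for n in counts)
    return 1.0 - same_type_pairs / (total * (total - 1))


# Hypothetical assignments: four isolates, three genotypes.
example_index = simpson_diversity(["GT1", "GT1", "GT2", "GT3"])
```

The index is the probability that two isolates drawn at random without replacement have different types, so a value of 0.989 indicates that nearly every pair of isolates is separated, whereas 0.810 indicates a coarser scheme.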
Topics: Animals; Multilocus Sequence Typing; Mycoplasma iowae; Genotype; Genotyping Techniques; Tandem Repeat Sequences; Minisatellite Repeats; Phylogeny
PubMed: 37925876
DOI: 10.1016/j.vetmic.2023.109909