-
Molecular Biotechnology, Aug 1996 (Review)
This review outlines the various methods available for submitting sequence data to the EMBL Nucleotide Sequence Database. Depending on the type of sequence data and the facilities available to the submitter, one method may be more suitable than another. Recent developments have been the World Wide Web submission tool, procedures for bulk submissions and genome projects.
Topics: Base Sequence; Databases, Factual
PubMed: 8887360
DOI: 10.1007/BF02762322 -
Vaccine, Mar 2008 (Review)
Modern electroporation has been widely and successfully used in gene therapies and drug submissions in large animals, including humans. DNA vaccine submission is now focused on muscle electroporation, which has been shown to be a promising application. Here we review the potential of this application and discuss some difficulties encountered in practice.
Topics: Animals; Electroporation; Humans; Muscles; Vaccination; Vaccines, DNA
PubMed: 18313811
DOI: 10.1016/j.vaccine.2008.01.047 -
Nucleic Acids Research, Jan 2020
GenBank® (www.ncbi.nlm.nih.gov/genbank/) is a comprehensive, public database that contains over 6.25 trillion base pairs from over 1.6 billion nucleotide sequences for 450 000 formally described species. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. Recent updates include a new version of Genome Workbench that supports GenBank submissions, new submission wizards for viral genomes, enhancements to BankIt and improved handling of taxonomy for sequences from pathogens.
Topics: Computational Biology; Databases, Nucleic Acid; Genomics; Molecular Sequence Annotation; National Institutes of Health (U.S.); Software; United States; Web Browser
PubMed: 31665464
DOI: 10.1093/nar/gkz956 -
Nucleic Acids Research, Jan 2019
GenBank® (www.ncbi.nlm.nih.gov/genbank/) is a comprehensive database that contains publicly available nucleotide sequences for 420 000 formally described species. Most GenBank submissions are made using BankIt, the NCBI Submission Portal, or the tool tbl2asn, and are obtained from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun (WGS) and environmental sampling projects. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Nucleotide database, which links to related information such as taxonomy, genomes, protein sequences and structures, and biomedical journal literature in PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. Recent updates include an expansion of sequence identifier formats to accommodate expected database growth, submission wizards for ribosomal RNA, and the transfer of Expressed Sequence Tag (EST) and Genome Survey Sequence (GSS) data into the Nucleotide database.
Topics: Computational Biology; Databases, Nucleic Acid; Genomics; Humans; Information Storage and Retrieval; Software Design; Web Browser
PubMed: 30365038
DOI: 10.1093/nar/gky989 -
Clinical Trials (London, England), Feb 2023
BACKGROUND
In clinical trial development, it is a critical step to submit applications, amendments, supplements, and reports on medicinal products to regulatory agencies. The electronic common technical document is the standard format to enable worldwide regulatory submission. There is a growing trend of using R for clinical trial analysis and reporting as part of regulatory submissions, where R functions, analysis scripts, analysis results, and all proprietary code dependencies are required to be included. One unmet and significant gap is the lack of tools, guidance, and publicly available examples to prepare submission R programs following the electronic common technical document specification.
METHODS
We introduce a simple and sufficient R package, pkglite, to convert analysis scripts and associated proprietary dependency R packages into a compact, text-based file, which makes the submission document self-contained, easy to restore, transfer, review, and submit following the electronic common technical document specification and regulatory guidelines (e.g. the study data technical conformance guide from the US Food and Drug Administration). The pkglite R package is published on Comprehensive R Archive Network and developed on GitHub.
RESULTS
As a tool, pkglite can pack and unpack multiple R packages with their dependencies to facilitate the reproduction and make it an off-the-shelf tool for both sponsors and reviewers. As a grammar, pkglite provides an explicit trace of the packing scope using the concept of file specifications. As a standard, pkglite offers an open file format to represent and exchange R packages as a text file. We use a mock-up example to demonstrate the workflow of using pkglite to prepare submission programs following the electronic common technical document specification.
CONCLUSION
pkglite and the proposed workflow enable the sponsor to submit well-organized R scripts following the electronic common technical document specification. The workflow has been used in the first publicly available R-based submission to the US Food and Drug Administration by the R Consortium R submission working group (https://www.r-consortium.org/blog/2022/03/16/update-successful-r-based-test-package-submitted-to-fda).
Topics: United States; Humans; United States Food and Drug Administration; Electronics
PubMed: 36169229
DOI: 10.1177/17407745221123244 -
Nucleic Acids Research, Jan 2018
GenBank® (www.ncbi.nlm.nih.gov/genbank/) is a comprehensive database that contains publicly available nucleotide sequences for 400 000 formally described species. These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun and environmental sampling projects. Most submissions are made using BankIt, the National Center for Biotechnology Information (NCBI) Submission Portal, or the tool tbl2asn. GenBank staff assign accession numbers upon data receipt. Daily data exchange with the European Nucleotide Archive and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through the NCBI Nucleotide database, which links to related information such as taxonomy, genomes, protein sequences and structures, and biomedical journal literature in PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. Recent updates include changes to sequence identifiers, submission wizards for 16S and Influenza sequences, and an Identical Protein Groups resource.
Topics: Animals; Computational Biology; Databases, Nucleic Acid; Europe; Genomics; Humans; Information Dissemination; Information Storage and Retrieval; Internet; Japan; National Library of Medicine (U.S.); Orthomyxoviridae; Proteomics; RNA, Ribosomal; Sequence Alignment; United States
PubMed: 29140468
DOI: 10.1093/nar/gkx1094 -
Nucleic Acids Research, Jan 2017
GenBank (www.ncbi.nlm.nih.gov/genbank/) is a comprehensive database that contains publicly available nucleotide sequences for 370 000 formally described species. These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or the NCBI Submission Portal. GenBank staff assign accession numbers upon data receipt. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Nucleotide database, which links to related information such as taxonomy, genomes, protein sequences and structures, and biomedical journal literature in PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. Recent updates include changes to policies regarding sequence identifiers, an improved 16S submission wizard, targeted loci studies, the ability to submit methylation and BioNano mapping files, and a database of anti-microbial resistance genes.
Topics: Animals; DNA Methylation; Databases, Nucleic Acid; Genome, Bacterial; Genomics; Humans; RNA, Ribosomal, 16S; Sequence Analysis, DNA; beta-Lactamases
PubMed: 27899564
DOI: 10.1093/nar/gkw1070 -
Biochemical Pharmacology, Oct 2015
Recent reports have highlighted studies in biomedical research that cannot be reproduced, tending to undermine the credibility, relevance and sustainability of the research process. To address this issue, a number of factors can be monitored to improve the overall probability of reproducibility. These include: (i) shortcomings in experimental design and execution that involve hypothesis conceptualization, statistical analysis, and data reporting; (ii) investigator bias and error; (iii) validation of reagents including cells and antibodies; and (iv) fraud. Historically, research data that have undergone peer review and are subsequently published are then subject to independent replication via the process of self-correction. This often leads to refutation of the original findings and retraction of the paper by which time considerable resources have been wasted in follow-on studies. New NIH guidelines focused on experimental conduct and manuscript submission are being widely adopted in the peer-reviewed literature. These, in their various iterations, are intended to improve the transparency and accuracy of data reporting via the use of checklists that are often accompanied by "best practice" guidelines that aid in validating the methodologies and reagents used in data generation. The present Editorial provides background and context to a newly developed checklist for submissions to Biochemical Pharmacology that is intended to be clear, logical, useful and unambiguous in assisting authors in preparing manuscripts and in facilitating the peer review process. While currently optional, development of this checklist based on user feedback will result in it being mandatory within the next 12 months.
Topics: Biomedical Research; Checklist; Guideline Adherence; Guidelines as Topic; Manuscripts, Medical as Topic; Peer Review, Research; Periodicals as Topic; Pharmacology
PubMed: 26208784
DOI: 10.1016/j.bcp.2015.06.023 -
Nucleic Acids Research, Jan 2016
ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/) at the National Center for Biotechnology Information (NCBI) is a freely available archive for interpretations of clinical significance of variants for reported conditions. The database includes germline and somatic variants of any size, type or genomic location. Interpretations are submitted by clinical testing laboratories, research laboratories, locus-specific databases, OMIM®, GeneReviews™, UniProt, expert panels and practice guidelines. In NCBI's Variation submission portal, submitters upload batch submissions or use the Submission Wizard for single submissions. Each submitted interpretation is assigned an accession number prefixed with SCV. ClinVar staff review validation reports with data types such as HGVS (Human Genome Variation Society) expressions; however, clinical significance is reported directly from submitters. Interpretations are aggregated by variant-condition combination and assigned an accession number prefixed with RCV. Clinical significance is calculated for the aggregate record, indicating consensus or conflict in the submitted interpretations. ClinVar uses data standards, such as HGVS nomenclature for variants and MedGen identifiers for conditions. The data are available on the web as variant-specific views; the entire data set can be downloaded via ftp. Programmatic access for ClinVar records is available through NCBI's E-utilities. Future development includes providing a variant-centric XML archive and a web page for details of SCV submissions.
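The programmatic access mentioned above can be illustrated with a minimal Python sketch. It only builds an E-utilities `esearch` request URL for the `clinvar` database and classifies an accession by the SCV/RCV prefixes described in the abstract; it makes no network call. The function names, the search term `BRCA1[gene]`, and the accession strings are illustrative assumptions, not taken from the paper.

```python
from urllib.parse import urlencode

# Base address of NCBI's E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def clinvar_esearch_url(term, retmax=20):
    """Build an esearch URL that queries the ClinVar database.

    `term` is an Entrez query string; `retmax` caps the number of
    record IDs returned. The URL is only constructed, not fetched.
    """
    params = {
        "db": "clinvar",
        "term": term,
        "retmax": retmax,
        "retmode": "json",
    }
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def accession_kind(accession):
    """Classify a ClinVar accession by its three-letter prefix.

    SCV marks a single submitted interpretation; RCV marks the
    aggregate record for a variant-condition combination.
    """
    prefix = accession[:3].upper()
    if prefix == "SCV":
        return "submitted interpretation"
    if prefix == "RCV":
        return "aggregate variant-condition record"
    return "unknown"


if __name__ == "__main__":
    # Hypothetical query and accessions, for illustration only.
    print(clinvar_esearch_url("BRCA1[gene]"))
    print(accession_kind("SCV000000001"))
    print(accession_kind("RCV000000001"))
```

Fetching the URL and parsing the JSON response is left out so that the sketch stays independent of network access and of NCBI's usage limits.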
Topics: Databases, Genetic; Disease; Genes; Genetic Variation; Genome, Human; Humans
PubMed: 26582918
DOI: 10.1093/nar/gkv1222 -
Bioinformatics (Oxford, England), Nov 2021
SUMMARY
Many aspects of the global response to the COVID-19 pandemic are enabled by the fast and open publication of SARS-CoV-2 genetic sequence data. The European Nucleotide Archive (ENA) is the European recommended open repository for genetic sequences. In this work, we present a tool for submitting raw sequencing reads of SARS-CoV-2 to ENA. The tool features a single-step submission process, a graphical user interface, tabular-formatted metadata and the possibility to remove human reads prior to submission. A Galaxy wrap of the tool allows users with little or no bioinformatics knowledge to do bulk sequencing read submissions. The tool is also packed in a Docker container to ease deployment.
AVAILABILITY AND IMPLEMENTATION
CLI ENA upload tool is available at github.com/usegalaxy-eu/ena-upload-cli (DOI 10.5281/zenodo.4537621); Galaxy ENA upload tool at toolshed.g2.bx.psu.edu/view/iuc/ena_upload/382518f24d6d and github.com/galaxyproject/tools-iuc/tree/master/tools/ena_upload (development); and ENA upload Galaxy container at github.com/ELIXIR-Belgium/ena-upload-container (DOI 10.5281/zenodo.4730785).
Topics: Humans; Software; SARS-CoV-2; Nucleotides; Pandemics; COVID-19
PubMed: 34096994
DOI: 10.1093/bioinformatics/btab421