-
Reproduction (Cambridge, England) Jun 2023
IN BRIEF
Adverse trends in reproductive function are a concern in humans, companion, livestock, and wildlife species. This study indicates that equine populations are at risk of a comparable decline in sperm progressive motility.
ABSTRACT
There is increasing evidence reporting geographically sensitive adverse trends in human semen quality, with parallel trends observed in the dog sentinel. Despite significant economic and welfare complications associated with poor testicular function, trends in current equine populations are undetermined. Given the predictive value of sperm progressive motility (PMOT) in male factor infertility and fertilisation potential, research determining trends in this parameter is warranted. This research analysed trends in stallion sperm PMOT through systematic review and meta-regression. Using a comprehensive search strategy, Scopus, Embase (Ovid), Medline (Ovid), and VetMed (CAB direct) were scoped for eligible data. Using best practices, 230 meta-data points from 229 articles published from 1991 to 2021 were collated for meta-regression analysis. Sperm PMOT declined significantly between 1984 and 2019 (simple linear regression: b -0.340, P = 0.017; meta-regression: b -0.610, P ≤ 0.001). Overall and yearly PMOT declines were predicted at 33.51 and 0.96%, respectively (1984: 63.69 ± 5.07%; 2019: 42.35 ± 3.69%). Trends remained consistent irrespective of sensitivity analyses. Yearly and overall declines were stronger in western (yearly: 0.75%, overall: 26.29%) compared to non-western (yearly: 0.46%, overall: 10.65%) populations. Adverse trends contribute vital data to the debate surrounding declining semen quality, supporting the use of equines as novel comparative models for human reproduction. Results could have significant economic, health, and welfare consequences for equine breeding sectors. A comparable decline in human, dog, and horse sperm quality is indicative of a common environmental aetiology, indicating the need for a holistic One Health approach in determining causes and developing preventative strategies.
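The overall and yearly declines quoted above can be reproduced from the two predicted endpoint values. The following is a minimal sketch of that arithmetic, assuming the overall decline is expressed relative to the 1984 value and the yearly figure is the overall decline spread evenly over the 35-year span; the function and variable names are illustrative and do not come from the study.

```python
def relative_decline(start: float, end: float) -> float:
    """Percentage decline from start to end, relative to start."""
    return (start - end) / start * 100.0


# Predicted PMOT values (%) reported in the abstract.
pmot_1984 = 63.69
pmot_2019 = 42.35
years = 2019 - 1984  # 35-year span

overall = relative_decline(pmot_1984, pmot_2019)
yearly = overall / years  # assumption: decline spread evenly across years

print(f"overall: {overall:.2f}%, yearly: {yearly:.2f}%")
```

Under these assumptions the sketch matches the reported 33.51% overall and 0.96% yearly declines.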
Topics: Male; Horses; Animals; Humans; Dogs; Semen Analysis; Semen; Sperm Motility; Spermatozoa; Infertility, Male; Sperm Count
PubMed: 37000597
DOI: 10.1530/REP-22-0490 -
PeerJ 2023
The emerging field of environmental DNA (eDNA) research lacks universal guidelines for ensuring data produced are FAIR (findable, accessible, interoperable, and reusable), despite growing awareness of the importance of such practices. In order to better understand these data usability challenges, we systematically reviewed 60 peer-reviewed articles conducting a specific subset of eDNA research: metabarcoding studies in marine environments. For each article, we characterized approximately 90 features across several categories: general article attributes and topics, methodological choices, types of metadata included, and availability and storage of sequence data. Analyzing these characteristics, we identified several barriers to data accessibility, including a lack of common context and vocabulary across the articles, missing metadata, supplementary information limitations, and a concentration of both sample collection and analysis in the United States. While some of these barriers require significant effort to address, we also found many instances where small choices made by authors and journals could have an outsized influence on the discoverability and reusability of data. Promisingly, articles also showed consistency and creativity in data storage choices as well as a strong trend toward open access publishing. Our analysis underscores the need to think critically about data accessibility and usability as marine eDNA metabarcoding studies, and eDNA projects more broadly, continue to proliferate.
Topics: DNA, Environmental; Biodiversity; DNA Barcoding, Taxonomic
PubMed: 36992947
DOI: 10.7717/peerj.14993 -
Dermatology (Basel, Switzerland) 2023
BACKGROUND
While skin cancers are less prevalent in people with skin of color, they are more often diagnosed at later stages and have a poorer prognosis. The use of artificial intelligence (AI) models can potentially improve early detection of skin cancers; however, the lack of skin color diversity in training datasets may only widen the pre-existing racial discrepancies in dermatology.
OBJECTIVE
The aim of this study was to systematically review the technique, quality, accuracy, and implications of studies using AI models trained or tested in populations with skin of color for classification of pigmented skin lesions.
METHODS
PubMed was used to identify any studies describing AI models for classification of pigmented skin lesions. Only studies that used training datasets with at least 10% of images from people with skin of color were eligible. Outcomes on study population, design of AI model, accuracy, and quality of the studies were reviewed.
RESULTS
Twenty-two eligible articles were identified. The majority of studies were trained on datasets obtained from Chinese (7/22), Korean (5/22), and Japanese populations (3/22). Seven studies used diverse datasets containing Fitzpatrick skin type I-III in combination with at least 10% from black Americans, Native Americans, Pacific Islanders, or Fitzpatrick IV-VI. AI models producing binary outcomes (e.g., benign vs. malignant) reported an accuracy ranging from 70% to 99.7%. Accuracy of AI models reporting multiclass outcomes (e.g., specific lesion diagnosis) was lower, ranging from 43% to 93%. Reader studies, where dermatologists' classification is compared with AI model outcomes, reported similar accuracy in one study, higher AI accuracy in three studies, and higher clinician accuracy in two studies. A quality review revealed that dataset description and variety, benchmarking, public evaluation, and healthcare application were frequently not addressed.
CONCLUSIONS
While this review provides promising evidence of accurate AI models in populations with skin of color, the majority of the studies reviewed were obtained from East Asian populations and therefore provide insufficient evidence to comment on the overall accuracy of AI models for darker skin types. Large discrepancies remain in the number of AI models developed in populations with skin of color (particularly Fitzpatrick type IV-VI) compared with those of largely European ancestry. A lack of publicly available datasets from diverse populations is likely a contributing factor, as is the inadequate reporting of patient-level metadata relating to skin color in training datasets.
Topics: Humans; Artificial Intelligence; Melanoma; Sensitivity and Specificity; Skin Neoplasms; Skin Pigmentation; Racial Groups
PubMed: 36944317
DOI: 10.1159/000530225 -
Gut Microbes 2023
Growth failure is among the most prevalent and devastating consequences of prematurity. Up to half of all extremely preterm neonates struggle to grow despite modern nutrition practices. Although elegant preclinical models suggest causal roles for the gut microbiome, these insights have not yet translated into biomarkers that identify at-risk neonates or therapies that prevent or treat growth failure. This systematic review aims to identify features of the neonatal gut microbiota that are positively or negatively associated with early postnatal growth. We identified 860 articles, of which 14 were eligible for inclusion. No two studies used the same definitions of growth, ages at stool collection, and statistical methods linking microbiota to metadata. In all, 58 different taxa were associated with growth, with little consensus among studies. Two or more studies reported positive associations with Enterobacteriaceae, , , , and , and negative associations with , and . was positively associated with growth in five studies and negatively associated with growth in three studies. To gain insight into how the various definitions of growth could impact results, we performed an exploratory secondary analysis of 245 longitudinally sampled preterm infant stools, linking microbiota composition to multiple clinically relevant definitions of neonatal growth. Within this cohort, every definition of growth was associated with a different combination of microbiota features. Together, these results suggest that the lack of consensus in defining neonatal growth may limit our capacity to detect consistent, meaningful clinical associations that could be leveraged into improved care for preterm neonates.
Topics: Infant; Infant, Newborn; Humans; Infant, Premature; Gastrointestinal Microbiome; Feces; Microbiota; Enterobacteriaceae
PubMed: 36927287
DOI: 10.1080/19490976.2023.2190301 -
PLOS Digital Health May 2022
OBJECTIVES
Federated learning (FL) allows multiple institutions to collaboratively develop a machine learning algorithm without sharing their data. Organizations instead share model parameters only, allowing them to benefit from a model built with a larger dataset while maintaining the privacy of their own data. We conducted a systematic review to evaluate the current state of FL in healthcare and discuss the limitations and promise of this technology.
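The parameter-sharing idea described above can be illustrated with a small sketch. It assumes a centralized aggregation server that averages parameters weighted by each site's sample count (federated averaging), which is one common scheme and not necessarily the one used in any reviewed study; the two datasets, the linear model, and all names are hypothetical.

```python
from typing import List, Sequence, Tuple

Params = List[float]  # [slope, intercept] of a toy linear model


def local_step(params: Params, data: Sequence[Tuple[float, float]],
               lr: float = 0.05) -> Params:
    """One gradient-descent step for y ~ w*x + b on a site's private data."""
    w, b = params
    n = len(data)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
    return [w - lr * grad_w, b - lr * grad_b]


def federated_average(site_params: List[Params], site_sizes: List[int]) -> Params:
    """Server-side aggregation: parameter average weighted by sample count."""
    total = sum(site_sizes)
    return [
        sum(p[i] * n for p, n in zip(site_params, site_sizes)) / total
        for i in range(len(site_params[0]))
    ]


# Hypothetical private datasets held by two institutions; only the
# updated parameters, never these records, are passed to the server.
site_a = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
site_b = [(3.0, 7.0), (4.0, 9.0)]
sites = [site_a, site_b]

global_params: Params = [0.0, 0.0]
for _ in range(500):
    updates = [local_step(global_params, d) for d in sites]
    global_params = federated_average(updates, [len(d) for d in sites])

print(global_params)
```

Because both toy datasets lie on the line y = 2x + 1, the aggregated parameters should approach a slope of 2 and an intercept of 1 even though neither site sees the other's data.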
METHODS
We conducted a literature search using PRISMA guidelines. At least two reviewers assessed each study for eligibility and extracted a predetermined set of data. The quality of each study was determined using the TRIPOD guideline and PROBAST tool.
RESULTS
Thirteen studies were included in the full systematic review. Most were in the field of oncology (6 of 13; 46.2%), followed by radiology (5 of 13; 38.5%). The majority evaluated imaging results, performed a binary classification prediction task via offline learning (n = 12; 92.3%), and used a centralized topology, aggregation server workflow (n = 10; 76.9%). Most studies were compliant with the major reporting requirements of the TRIPOD guidelines. In all, 6 of 13 studies (46.2%) were judged at high risk of bias using the PROBAST tool, and only 5 studies used publicly available data.
CONCLUSION
Federated learning is a growing field in machine learning with many promising uses in healthcare. Few studies have been published to date. Our evaluation found that investigators can do more to address the risk of bias and increase transparency by adding steps for data homogeneity or sharing required metadata and code.
PubMed: 36812504
DOI: 10.1371/journal.pdig.0000033 -
PloS One 2023
Several studies applying Machine Learning to deception detection have been published in the last decade. A rich and complex set of settings, approaches, theories, and results is now available. Therefore, one may find it difficult to identify trends, successful paths, gaps, and opportunities for contribution. The present literature review aims to provide the state of research regarding deception detection with Machine Learning. We followed the PRISMA protocol and retrieved 648 articles from ACM Digital Library, IEEE Xplore, Scopus, and Web of Science. 540 of them were screened (108 were duplicates). A final corpus of 81 documents has been summarized as mind maps. Metadata was extracted and encoded as Python dictionaries to support a statistical analysis scripted in the Python programming language, available as a collection of Jupyter Lab Notebooks in a GitHub repository. Neural Networks, Support Vector Machines, Random Forest, Decision Tree and K-nearest Neighbor are the five most explored techniques. The studies report a detection performance ranging from 51% to 100%, with 19 works reaching an accuracy rate above 90%. Monomodal, Bimodal, and Multimodal approaches were exploited and achieved various accuracy levels for detection. Bimodal and Multimodal approaches have become a trend over Monomodal ones, although there are high-performance examples of the latter. Of the studies that exploit language and linguistic features, 75% are dedicated to English. The findings include observations of the following: language and culture, emotional features, psychological traits, cognitive load, facial cues, complexity, performance, and Machine Learning topics. We also present a dataset benchmark. The main conclusions are that labeled datasets from real-life data are scarce and that there is still room for new approaches for deception detection with Machine Learning, especially if focused on languages and cultures other than English-based. Further research would greatly contribute by providing new labeled and multimodal datasets for deception detection, both for English and other languages.
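As an illustration of how study metadata encoded as Python dictionaries can feed a simple statistical tally, the sketch below counts techniques and the share of English-language studies. The three records, their field names, and their values are entirely hypothetical and are not taken from the authors' notebooks.

```python
from collections import Counter

# Hypothetical study records; field names and values are illustrative only.
studies = [
    {"id": "S1", "techniques": ["Neural Networks", "Support Vector Machines"],
     "modality": "Multimodal", "language": "English"},
    {"id": "S2", "techniques": ["Random Forest"],
     "modality": "Monomodal", "language": "Spanish"},
    {"id": "S3", "techniques": ["Neural Networks"],
     "modality": "Bimodal", "language": "English"},
]

# Count how often each technique appears across all records.
technique_counts = Counter(t for s in studies for t in s["techniques"])

# Fraction of records whose language field is English.
english_share = sum(s["language"] == "English" for s in studies) / len(studies)

print(technique_counts.most_common(1))
print(f"English share: {english_share:.0%}")
```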
Topics: Neural Networks, Computer; Research Design; Publications; Machine Learning; Deception
PubMed: 36757928
DOI: 10.1371/journal.pone.0281323 -
Journal of Medical Internet Research Feb 2023
BACKGROUND
In patient care, data are historically generated and stored in heterogeneous databases that are domain specific and often noninteroperable or isolated. As the amount of health data increases, the number of isolated data silos is also expected to grow, limiting the accessibility of the collected data. Medical informatics is developing ways to move from siloed data to a more harmonized arrangement in information architectures. This paradigm shift will allow future research to integrate medical data at various levels and from various sources. Currently, comprehensive requirements engineering is working on data integration projects in both patient care- and research-oriented contexts, and it is significantly contributing to the success of such projects. In addition to various stakeholder-based methods, document-based requirement elicitation is a valid method for improving the scope and quality of requirements.
OBJECTIVE
Our main objective was to provide a general catalog of functional requirements for integrating medical data into knowledge management environments. We aimed to identify where integration projects intersect to derive consistent and representative functional requirements from the literature. On the basis of these findings, we identified which functional requirements for data integration exist in the literature and thus provide a general catalog of requirements.
METHODS
This work began by conducting a literature-based requirement elicitation based on a broad requirement engineering approach. Thus, in the first step, we performed a web-based systematic literature review to identify published articles that dealt with the requirements for medical data integration. We identified and analyzed the available literature by applying the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. In the second step, we screened the results for functional requirements using the requirements engineering method of document analysis and derived the requirements into a uniform requirement syntax. Finally, we classified the elicited requirements into a category scheme that represents the data life cycle.
RESULTS
Our 2-step requirements elicitation approach yielded 821 articles, of which 61 (7.4%) were included in the requirement elicitation process. There, we identified 220 requirements, which were covered by 314 references. We assigned the requirements to different data life cycle categories as follows: 25% (55/220) to data acquisition, 35.9% (79/220) to data processing, 12.7% (28/220) to data storage, 9.1% (20/220) to data analysis, 6.4% (14/220) to metadata management, 2.3% (5/220) to data lineage, 3.2% (7/220) to data traceability, and 5.5% (12/220) to data security.
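The category shares above can be checked directly from the reported counts. This short sketch recomputes each percentage from the counts given in the paragraph; the dictionary layout is only an illustrative choice.

```python
# Requirement counts per data life cycle category, as reported above.
counts = {
    "data acquisition": 55,
    "data processing": 79,
    "data storage": 28,
    "data analysis": 20,
    "metadata management": 14,
    "data lineage": 5,
    "data traceability": 7,
    "data security": 12,
}

total = sum(counts.values())

# Share of each category as a percentage, rounded to one decimal place.
shares = {name: round(n / total * 100, 1) for name, n in counts.items()}

for name, share in shares.items():
    print(f"{name}: {share}% ({counts[name]}/{total})")
```

The counts sum to the 220 requirements stated in the results, so the eight categories partition the catalog without overlap.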
CONCLUSIONS
The aim of this study was to present a cross-section of functional data integration-related requirements defined in the literature by other researchers. The aim was achieved with 220 distinct requirements from 61 publications. We concluded that scientific publications are, in principle, a reliable source of information for functional requirements with respect to medical data integration. Finally, we provide a broad catalog to support other scientists in the requirement elicitation phase.
Topics: Humans; Knowledge Management; Publications; Data Collection; Systems Analysis; Information Storage and Retrieval
PubMed: 36757764
DOI: 10.2196/41344 -
BMJ Open Jan 2023
OBJECTIVE
Various studies have been published to better understand the underlying spatial and temporal dynamics of COVID-19. This review sought to identify different spatial and spatio-temporal modelling methods that have been applied to COVID-19 and examine influential covariates that have been reportedly associated with its risk in Africa.
DESIGN
Systematic review using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines.
DATA SOURCES
Thematically mined keywords were used to identify refereed studies conducted between January 2020 and February 2022 from the following databases: PubMed, Scopus, MEDLINE via ProQuest, CINAHL via EBSCOhost and Coronavirus Research Database via ProQuest. A manual search through the reference list of studies was also conducted.
ELIGIBILITY CRITERIA FOR SELECTING STUDIES
Peer-reviewed studies that demonstrated the application of spatial and temporal approaches to COVID-19 outcomes.
DATA EXTRACTION AND SYNTHESIS
A standardised extraction form based on critical appraisal and data extraction for systematic reviews of prediction modelling studies checklist was used to extract the meta-data of the included studies. A validated scoring criterion was used to assess studies based on their methodological relevance and quality.
RESULTS
Among 2065 hits in five databases, title and abstract screening yielded 827 studies of which 22 were synthesised and qualitatively analysed. The most common socioeconomic variable was population density. HIV prevalence was the most common epidemiological indicator, while temperature was the most common environmental indicator. Thirteen studies (59%) implemented diverse formulations of spatial and spatio-temporal models incorporating unmeasured factors of COVID-19 and the subtle influence of time and space. Cluster analyses were used across seven studies (32%) to explore COVID-19 variation and determine whether observed patterns were random.
CONCLUSION
COVID-19 modelling in Africa is still in its infancy, and a range of spatial and spatio-temporal methods have been employed across diverse settings. Strengthening routine data systems remains critical for generating estimates and understanding factors that drive spatial variation in vulnerable populations and temporal variation in pandemic progression.
PROSPERO REGISTRATION NUMBER
CRD42021279767.
Topics: Humans; COVID-19; Africa; Cluster Analysis
PubMed: 36697047
DOI: 10.1136/bmjopen-2022-067134 -
PloS One 2023
The use of cannabis for medicinal purposes has increased globally over the past decade since patient access to medicinal cannabis has been legislated across jurisdictions in Europe, the United Kingdom, the United States, Canada, and Australia. Yet, evidence relating to the effect of medical cannabis on the management of symptoms for a suite of conditions is only just emerging. Although there is considerable engagement from many stakeholders to add to the evidence base through randomized controlled trials, many gaps in the literature remain. Data from real-world and patient-reported sources can provide opportunities to address this evidence deficit. These real-world data can be captured from a variety of sources, such as routinely collected health care and health services records, which include but are not limited to patient-generated data from medical, administrative and claims data, patient-reported data from surveys, wearable trackers, patient registries, and social media. In this systematic scoping review, we seek to understand the utility of online user-generated text for studying the use of cannabis as a medicine. We aimed to systematically search published literature to examine the extent, range, and nature of research that utilises user-generated content to examine cannabis as a medicine. The objective of this methodological review is to synthesise primary research that uses social media discourse and internet search engine queries to answer the following questions: (i) In what way is online user-generated text used as a data source in the investigation of cannabis as a medicine? (ii) What are the aims, data sources, methods, and research themes of studies using online user-generated text to discuss the medicinal use of cannabis? We conducted a manual search of primary research studies which used online user-generated text as a data source using the MEDLINE, Embase, Web of Science, and Scopus databases in October 2022. Editorials, letters, commentaries, surveys, protocols, and book chapters were excluded from the review. Forty-two studies were included in this review: twenty-two studies used manually labelled data, four studies used existing meta-data (Google Trends/geo-location data), two studies used data that was manually coded using crowdsourcing services, two used automated coding supplied by a social media analytics company, and fifteen used computational methods for annotating data. Our review reflects a growing interest in the use of user-generated content for public health surveillance. It also demonstrates the need for the development of a systematic approach for evaluating the quality of social media studies and highlights the utility of automatic processing and computational methods (machine learning technologies) for large social media datasets. This systematic scoping review has shown that user-generated content as a data source for studying cannabis as a medicine provides another means to understand how cannabis is perceived and used in the community. As such, it provides another potential 'tool' with which to engage in pharmacovigilance of not only cannabis as a medicine but also other novel therapeutics as they enter the market.
Topics: Humans; Social Media; Cannabis; Medicine; Delivery of Health Care; United Kingdom
PubMed: 36662832
DOI: 10.1371/journal.pone.0269143 -
Environmental Research Mar 2023 (Review)
Assessing health outcomes associated with exposure to polychlorinated biphenyls (PCBs) is important given their persistent and ubiquitous nature. PCBs are classified as a Group 1 carcinogen, but the full range of potential noncancer health effects from exposure to PCBs has not been systematically summarized and evaluated. We used systematic review methods to identify and screen the literature using combined manual review and machine learning approaches. A protocol was developed that describes the literature search strategy and Populations, Exposures, Comparators, and Outcomes (PECO) criteria used to facilitate subsequent screening and categorization of literature into a systematic evidence map of PCB exposure and noncancer health endpoints across 15 organs/systems. A comprehensive literature search yielded 62,599 records. After electronic prioritization steps, 17,037 studies were manually screened at the title and abstract level. An additional 900 studies identified by experts or supplemental searches were also included. After full-text screening of 3889 references, 1586 studies met the PECO criteria. Relevant study details such as the endpoints assessed, exposure duration, and species were extracted into literature summary tables. This review compiles and organizes the human and mammalian studies from these tables into an evidence map for noncancer health endpoints and PCB mixture exposure to identify areas of robust research as well as areas of uncertainty that would benefit from future investigation. Summary data are available online as interactive visuals with downloadable metadata. Sufficient research is available to inform PCB hazard assessments for most organs/systems, but the amount of data to inform associations with specific endpoints differs. Furthermore, despite many years of research, sparse data exist for inhalation and dermal exposures, which are highly relevant human exposure routes. 
This evidence map provides a foundation for future systematic reviews and noncancer hazard assessments of PCB mixtures and for strategic planning of research to inform areas of greater uncertainty.
Topics: Animals; Humans; Carcinogens; Mammals; Polychlorinated Biphenyls; Uncertainty
PubMed: 36580985
DOI: 10.1016/j.envres.2022.115148