Sensors (Basel, Switzerland) Jul 2023
Review
This paper presents reported machine learning approaches in the field of Brillouin distributed fiber optic sensors (DFOSs). The increasing popularity of Brillouin DFOSs stems from their capability to continuously monitor temperature and strain along kilometer-long optical fibers, rendering them attractive for industrial applications, such as the structural health monitoring of large civil infrastructures and pipelines. In recent years, machine learning has been integrated into the Brillouin DFOS signal processing, resulting in fast and enhanced temperature, strain, and humidity measurements without increasing the system's cost. Machine learning has also contributed to enhanced spatial resolution in Brillouin optical time domain analysis (BOTDA) systems and shorter measurement times in Brillouin optical frequency domain analysis (BOFDA) systems. This paper provides an overview of the applied machine learning methodologies in Brillouin DFOSs, as well as future perspectives in this area.
Topics: Fiber Optic Technology; Optical Devices; Optical Fibers; Humidity; Machine Learning
PubMed: 37448034
DOI: 10.3390/s23136187
American Journal of Epidemiology Nov 2023
Review
Deep learning methods are increasingly being applied to problems in medicine and health care. However, few epidemiologists have received formal training in these methods. To bridge this gap, this article introduces the fundamentals of deep learning from an epidemiologic perspective. Specifically, this article reviews core concepts in machine learning (e.g., overfitting, regularization, and hyperparameters); explains several fundamental deep learning architectures (convolutional neural networks, recurrent neural networks); and summarizes training, evaluation, and deployment of models. Conceptual understanding of supervised learning algorithms is the focus of the article; instructions on the training of deep learning models and applications of deep learning to causal learning are out of this article's scope. We aim to provide an accessible first step towards enabling the reader to read and assess research on the medical applications of deep learning and to familiarize readers with deep learning terminology and concepts to facilitate communication with computer scientists and machine learning engineers.
Topics: Humans; Deep Learning; Epidemiologists; Neural Networks, Computer; Algorithms; Machine Learning
PubMed: 37139570
DOI: 10.1093/aje/kwad107
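The abstract above lists overfitting, regularization, and hyperparameters among its core concepts. As a minimal sketch (not taken from the article), the following shows how an L2 penalty shrinks a fitted coefficient. The function name `ridge_slope`, the toy data, and the penalty values are all illustrative assumptions; the penalty strength plays the role of a hyperparameter chosen by the analyst.

```python
def ridge_slope(xs, ys, penalty):
    """Closed-form slope w of y ~ w*x (no intercept) minimizing
    sum((y - w*x)^2) + penalty * w^2."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)


# Toy data, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

# A larger penalty pulls the slope toward zero (stronger regularization).
for penalty in (0.0, 1.0, 10.0):
    print(penalty, round(ridge_slope(xs, ys, penalty), 3))
```

With a penalty of zero this reduces to ordinary least squares; increasing the penalty trades some fit on the training data for a smaller coefficient.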
The British Journal of Radiology Oct 2023
Review
Data drift refers to differences between the data used in training a machine learning (ML) model and that applied to the model in real-world operation. Medical ML systems can be exposed to various forms of data drift, including differences between the data sampled for training and used in clinical operation, differences between medical practices or context of use between training and clinical use, and time-related changes in patient populations, disease patterns, and data acquisition, to name a few. In this article, we first review the terminology used in ML literature related to data drift, define distinct types of drift, and discuss in detail potential causes within the context of medical applications with an emphasis on medical imaging. We then review the recent literature regarding the effects of data drift on medical ML systems, which overwhelmingly shows that data drift can be a major cause of performance deterioration. We then discuss methods for monitoring data drift and mitigating its effects with an emphasis on pre- and post-deployment techniques. Some of the potential methods for drift detection and issues around model retraining when drift is detected are included. Based on our review, we find that data drift is a major concern in medical ML deployment and that more research is needed so that ML models can identify drift early, incorporate effective mitigation strategies and resist performance decay.
Topics: Machine Learning; Medical Informatics Computing
PubMed: 36971405
DOI: 10.1259/bjr.20220878
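The abstract above mentions methods for drift detection without naming one. As a hedged illustration only, a two-sample Kolmogorov-Smirnov comparison of one feature's distribution at training time and in operation is one common choice. The function names, the simulated feature values, and the 0.05 significance level below are assumptions made for this sketch, not details from the article.

```python
import math
import random
from bisect import bisect_right


def ks_statistic(reference, live):
    """Largest gap between the two empirical CDFs (two-sample KS statistic)."""
    ref, cur = sorted(reference), sorted(live)
    gap = 0.0
    for x in ref + cur:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        gap = max(gap, abs(cdf_ref - cdf_cur))
    return gap


def drift_detected(reference, live, alpha=0.05):
    """Flag drift when the statistic exceeds the large-sample critical value."""
    n, m = len(reference), len(live)
    critical = math.sqrt(-0.5 * math.log(alpha / 2)) * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, live) > critical


# Simulated values of a single image-derived feature (hypothetical data).
rng = random.Random(0)
training_feature = [rng.gauss(0.0, 1.0) for _ in range(500)]
shifted_feature = [rng.gauss(0.8, 1.0) for _ in range(500)]  # e.g. after a protocol change

print(drift_detected(training_feature, shifted_feature))
```

A univariate test like this only covers one feature at a time; it says nothing about whether model performance has actually degraded, which would need labeled data to confirm.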
International Journal of Molecular... Jul 2023
Although and are essential food-fermenting bacteria, they are also opportunistic pathogens. Despite these species being commercially crucial, their taxonomy is still based on inaccurate identification methods. In this study, we present a novel approach for identifying two important species, . and . , by combining matrix-assisted laser desorption/ionization and time-of-flight mass spectrometer (MALDI-TOF MS) data using machine-learning techniques. After on- and off-plate protein extraction, we observed that the BioTyper database misidentified or could not differentiate species. Although species exhibited very similar protein profiles, these species can be differentiated on the basis of the results of a statistical analysis. To classify . , . , and non-target species, machine learning was used for 167 spectra, which led to the listing of potential species-specific mass-to-charge (m/z) loci. Machine-learning techniques including artificial neural networks, principal component analysis combined with the K-nearest neighbor, support vector machine (SVM), and random forest were used. The model that applied the Radial Basis Function kernel algorithm in SVM achieved classification accuracy of 1.0 for training and test sets. The combination of MALDI-TOF MS and machine learning can efficiently classify closely related species, enabling accurate microbial identification.
Topics: Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization; Weissella; Machine Learning
PubMed: 37446188
DOI: 10.3390/ijms241311009
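The abstract above reports that an SVM with a Radial Basis Function (RBF) kernel performed best. For readers unfamiliar with the kernel, here is a minimal sketch of the standard RBF similarity between two feature vectors. The peak-intensity vectors and the `gamma` value are invented for illustration and are not the study's data or settings.

```python
import math


def rbf_kernel(a, b, gamma=0.5):
    """Standard RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * squared_distance)


# Hypothetical normalized peak intensities at three m/z positions.
spectrum_a = [0.9, 0.1, 0.4]
spectrum_b = [0.8, 0.2, 0.5]  # close to spectrum_a
spectrum_c = [0.1, 0.9, 0.0]  # far from spectrum_a

print(rbf_kernel(spectrum_a, spectrum_b))  # near 1 for similar spectra
print(rbf_kernel(spectrum_a, spectrum_c))  # smaller for dissimilar spectra
```

The kernel equals 1 for identical inputs and decays toward 0 as the vectors move apart; `gamma` controls how quickly, and is typically tuned as a hyperparameter.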
BMC Medical Informatics and Decision... Jul 2023
INTRODUCTION
Esophageal cancer (EC) is a significant global health problem, with an estimated 7th highest incidence and 6th highest mortality rate. Timely diagnosis and treatment are critical for improving patients' outcomes, as over 40% of patients with EC are diagnosed after metastasis. Recent advances in machine learning (ML) techniques, particularly in computer vision, have demonstrated promising applications in medical image processing, assisting clinicians in making more accurate and faster diagnostic decisions. Given the significance of early detection of EC, this systematic review aims to summarize and discuss the current state of research on ML-based methods for the early detection of EC.
METHODS
We conducted a comprehensive systematic search of five databases (PubMed, Scopus, Web of Science, Wiley, and IEEE) using search terms such as "ML", "Deep Learning (DL)", "Neural Networks (NN)", "Esophagus", "EC" and "Early Detection". After applying inclusion and exclusion criteria, 31 articles were retained for full review.
RESULTS
The results of this review highlight the potential of ML-based methods in the early detection of EC. The average accuracy of the reviewed methods in the analysis of endoscopic and computed tomography (CT) images of the esophagus was over 89%, indicating a high impact on early detection of EC. Additionally, the highest percentage of clinical images used in the early detection of EC with the use of ML was related to white light imaging (WLI) images. Among all ML techniques, methods based on convolutional neural networks (CNN) achieved higher accuracy and sensitivity in the early detection of EC compared to other methods.
CONCLUSION
Our findings suggest that ML methods may improve accuracy in the early detection of EC, potentially supporting radiologists, endoscopists, and pathologists in diagnosis and treatment planning. However, the current literature is limited, and more studies are needed to investigate the clinical applications of these methods in early detection of EC. Furthermore, many studies suffer from class imbalance and biases, highlighting the need for validation of detection algorithms across organizations in longitudinal studies.
Topics: Humans; Deep Learning; Early Detection of Cancer; Machine Learning; Neural Networks, Computer; Esophageal Neoplasms
PubMed: 37460991
DOI: 10.1186/s12911-023-02235-y
Drug Discovery Today Dec 2023
Review
Data availability, data security, and privacy concerns often hamper optimal performance efficiency of machine learning (ML) techniques. Therefore, novel techniques for the utilization of private/sensitive data in the field of drug discovery have been proposed for ML model-building tasks. Some examples of the different techniques are secure multiparty computation, distributed deep learning, homomorphic encryption, blockchain-based peer-to-peer networking, differential privacy, and federated learning, as well as combinations of such techniques. In this paper, we present an overview of these techniques for decentralized ML to illustrate its benefits and drawbacks in the field of drug discovery.
Topics: Privacy; Drug Discovery; Machine Learning
PubMed: 37935330
DOI: 10.1016/j.drudis.2023.103820
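Of the techniques listed in the abstract above, federated learning is perhaps the simplest to illustrate. A minimal sketch of the sample-weighted parameter averaging step used in federated averaging follows. The function name `federated_average`, the two sites, and their parameter values are hypothetical, and a real deployment would add secure communication and many training rounds.

```python
def federated_average(client_updates):
    """Average parameter vectors, weighting each client by its sample count.

    client_updates: list of (parameters, n_samples) pairs, one per site.
    Only parameters leave each site; the raw data stays local.
    """
    total = sum(n for _, n in client_updates)
    size = len(client_updates[0][0])
    return [
        sum(params[i] * n for params, n in client_updates) / total
        for i in range(size)
    ]


# Two hypothetical sites with locally trained parameters.
site_updates = [
    ([0.2, 1.0], 100),  # site A: 100 compounds
    ([0.4, 0.0], 300),  # site B: 300 compounds
]
global_params = federated_average(site_updates)
print(global_params)
```

Weighting by sample count lets larger sites contribute proportionally more to the shared model, which is one design choice among several.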
Scientific Reports Sep 2023
For many machine learning applications in drug discovery, only limited amounts of training data are available. This typically applies to compound design and activity prediction and often restricts machine learning, especially deep learning. For low-data applications, specialized learning strategies can be considered to limit required training data. Among these is meta-learning that attempts to enable learning in low-data regimes by combining outputs of different models and utilizing meta-data from these predictions. However, in drug discovery settings, meta-learning is still in its infancy. In this study, we have explored meta-learning for the prediction of potent compounds via generative design using transformer models. For different activity classes, meta-learning models were derived to predict highly potent compounds from weakly potent templates in the presence of varying amounts of fine-tuning data and compared to other transformers developed for this task. Meta-learning consistently led to statistically significant improvements in model performance, in particular, when fine-tuning data were limited. Moreover, meta-learning models generated target compounds with higher potency and larger potency differences between templates and targets than other transformers, indicating their potential for low-data compound design.
Topics: Drug Discovery; Electric Power Supplies; Machine Learning
PubMed: 37752164
DOI: 10.1038/s41598-023-43046-5
Biomedicine & Pharmacotherapy =... Sep 2023
Review
Traditional bulk sequencing methods are limited to measuring the average signal in a group of cells, potentially masking heterogeneity and rare populations. The single-cell resolution, however, enhances our understanding of complex biological systems and diseases, such as cancer, the immune system, and chronic diseases. However, the single-cell technologies generate massive amounts of data that are often high-dimensional, sparse, and complex, thus making analysis with traditional computational approaches difficult or infeasible. To tackle these challenges, many are turning to deep learning (DL) methods as potential alternatives to the conventional machine learning (ML) algorithms for single-cell studies. DL is a branch of ML capable of extracting high-level features from raw inputs in multiple stages. Compared to traditional ML, DL models have provided significant improvements across many domains and applications. In this work, we examine DL applications in genomics, transcriptomics, spatial transcriptomics, and multi-omics integration, and address whether DL techniques will prove to be advantageous or if the single-cell omics domain poses unique challenges. Through a systematic literature review, we have found that DL has not yet revolutionized the most pressing challenges of the single-cell omics field. However, using DL models for single-cell omics has shown promising results (in many cases outperforming the previous state-of-the-art models) in data preprocessing and downstream analysis. Although developments of DL algorithms for single-cell omics have generally been gradual, recent advances reveal that DL can offer valuable resources in fast-tracking and advancing single-cell research.
Topics: Deep Learning; Transcriptome; Genomics; Machine Learning; Gene Expression Profiling
PubMed: 37393865
DOI: 10.1016/j.biopha.2023.115077
Journal of Chemical Information and... Dec 2023
Topics: Machine Learning; Algorithms
PubMed: 38073434
DOI: 10.1021/acs.jcim.3c01807
Computers in Biology and Medicine Oct 2023
The CRISPR/Cas9 system is a powerful tool for genome editing. Numerous studies have shown that sgRNAs can strongly affect the efficiency of editing. However, it is still not clear what rules should be followed for designing sgRNAs with high cleavage efficiency. At present, several machine learning or deep learning methods have been developed to predict the cleavage efficiency of sgRNAs; however, the prediction accuracy of these tools is still not satisfactory. Here we propose a fusion framework of deep learning and machine learning, which first deals with the primary sequence and secondary structure features of the sgRNAs using both convolutional neural network (CNN) and recurrent neural network (RNN), and then uses the features extracted by the deep neural network to train a conventional machine learning model with LGBM. As a result, the new approach outperformed previous methods. The Spearman's correlation coefficient between predicted and measured sgRNA cleavage efficiency of our model (0.917) is improved by over 5% compared with the most advanced method (0.865), and the mean square error reduces from 7.89 × 10 to 4.75 × 10. Finally, we developed an online tool, CRISep (http://www.cuilab.cn/CRISep), to evaluate the availability of sgRNAs based on our models.
Topics: Deep Learning; RNA, Guide, CRISPR-Cas Systems; Machine Learning; Neural Networks, Computer
PubMed: 37696181
DOI: 10.1016/j.compbiomed.2023.107476
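The "over 5%" improvement quoted in the abstract above appears to be a relative gain in Spearman's correlation rather than an absolute difference. The following quick check uses only the two coefficients stated in the abstract; the variable names are illustrative.

```python
# Spearman's correlation coefficients as stated in the abstract.
baseline_rho = 0.865  # most advanced previous method
fusion_rho = 0.917    # proposed fusion framework

absolute_gain = fusion_rho - baseline_rho
relative_gain = absolute_gain / baseline_rho

print(f"absolute gain: {absolute_gain:.3f}")
print(f"relative gain: {relative_gain:.1%}")
```

The absolute difference is about 0.052, which corresponds to a relative gain of roughly 6%, consistent with the abstract's wording.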