- Journal of Environmental Management Oct 2023
Modern wastewater treatment plants base their biological processes on advanced control systems which ensure compliance with discharge limits and minimize energy consumption responding to information from on-line probes. The correct readings of probes are particularly crucial for intermittent aeration controllers, which rely on real-time measurements of ammonia and oxygen in biological tanks. These data are also an important resource for developing artificial intelligence algorithms that can identify process or sensor anomalies, thus guiding the choices of plant operators and automatic process controllers. However, using anomaly detection and classification algorithms in real-time wastewater treatment is challenging because of the noisy nature of sensor measurements, the difficulty of obtaining labeled real-plant data, and the complex and interdependent mechanisms that govern biological processes. This work aims at thoroughly exploring the performance of machine learning methods in detecting and classifying the main anomalies in plants operating with intermittent aeration. Using oxygen, ammonia and aeration power measurements from a set of plants in Italy, we perform both binary and multiclass classification, and we compare them through a rigorous validation procedure that includes a test on an unknown dataset, proposing a new evaluation protocol. The classification methods explored are support vector machine, multilayer perceptron, random forest, and two gradient boosting methods (LightGBM and XGBoost). The best performance was achieved using the gradient boosting ensemble algorithms, with up to 96% of anomalies detected and up to 84% and 62% of anomalies classified correctly on the first and second datasets respectively.
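The two headline figures above (share of anomalies detected, share of anomalies classified correctly) can be illustrated with a minimal sketch. The abstract does not define either rate precisely, so the definitions below are assumptions: a true anomaly counts as detected if it receives any non-normal label, and as correctly classified if it receives its exact label. The class names and sample data are hypothetical.

```python
NORMAL = "normal"  # hypothetical label for anomaly-free samples


def detection_and_classification_rates(y_true, y_pred):
    """Return (detection rate, classification rate) over the true anomalies.

    Detection is the binary view: the anomaly was flagged as *some* anomaly.
    Classification is the multiclass view: the anomaly got its exact label.
    """
    anomalies = [(t, p) for t, p in zip(y_true, y_pred) if t != NORMAL]
    if not anomalies:
        raise ValueError("y_true contains no anomalous samples")
    detected = sum(1 for _, p in anomalies if p != NORMAL)
    classified = sum(1 for t, p in anomalies if p == t)
    return detected / len(anomalies), classified / len(anomalies)


# Hypothetical labels, for illustration only.
y_true = ["normal", "sensor_fault", "sensor_fault", "process_fault", "normal", "process_fault"]
y_pred = ["normal", "sensor_fault", "process_fault", "process_fault", "sensor_fault", "normal"]

detection_rate, classification_rate = detection_and_classification_rates(y_true, y_pred)
print(f"detected: {detection_rate:.2f}, correctly classified: {classification_rate:.2f}")
```

Under these definitions the classification rate can never exceed the detection rate, which is consistent with the ordering of the figures reported in the abstract.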
Topics: Artificial Intelligence; Ammonia; Machine Learning; Neural Networks, Computer; Algorithms; Water Purification; Support Vector Machine
PubMed: 37473555
DOI: 10.1016/j.jenvman.2023.118594
- European Radiology Experimental Mar 2024
Review
An increasingly strong connection between artificial intelligence and medicine has enabled the development of predictive models capable of supporting physicians' decision-making. Artificial intelligence encompasses much more than machine learning, which has nevertheless been its most cited and used sub-branch over the last decade. Since most clinical problems can be modeled through machine learning classifiers, it is essential to discuss their main elements. This review aims to give primary educational insights on the most accessible and widely employed classifiers in the field of radiology, distinguishing between "shallow" learning (i.e., traditional machine learning) algorithms, including support vector machines, random forest and XGBoost, and "deep" learning architectures, including convolutional neural networks and vision transformers. In addition, the paper outlines the key steps for classifier training and highlights the differences between the most common algorithms and architectures. Although the choice of an algorithm depends on the task and dataset at hand, general guidelines for classifier selection are proposed in relation to task analysis, dataset size, explainability requirements, and available computing resources. Considering the enormous interest in these innovative models and architectures, the problem of the interpretability of machine learning algorithms is finally discussed, providing a future perspective on trustworthy artificial intelligence.
Relevance statement: The growing synergy between artificial intelligence and medicine fosters predictive models aiding physicians. Machine learning classifiers, from shallow learning to deep learning, are offering crucial insights for the development of clinical decision support systems in healthcare. Explainability is a key feature of models that leads systems toward integration into clinical practice.
Key points:
• Training a shallow classifier requires extracting disease-related features from regions of interest (e.g., radiomics).
• Deep classifiers implement automatic feature extraction and classification.
• The classifier selection is based on data and computational resources availability, task, and explanation needs.
Topics: Artificial Intelligence; Deep Learning; Algorithms; Machine Learning; Neural Networks, Computer
PubMed: 38438821
DOI: 10.1186/s41747-024-00428-2
- International Journal of Medical... Aug 2023
AIM
Revision hip arthroplasty has a less favorable outcome than primary total hip arthroplasty and an understanding of the timing of total hip arthroplasty failure may be helpful. The aim of this study is to develop a combined deep learning (DL) and machine learning (ML) approach to automatically detect hip prosthetic failure from conventional plain radiographs.
METHODS
Two cohorts of patients (of 280 and 352 patients) were included in the study, for model development and validation, respectively. The analysis was based on one antero-posterior and one lateral radiographic view obtained from each patient during routine post-surgery follow-up. After pre-processing, three images were obtained: the original image, the acetabulum image and the stem image. These images were analyzed through convolutional neural networks aiming to predict prosthesis failure. Deep features of the three images were extracted for each model and two feature-based pipelines were developed: one utilizing only the features of the original image (original image pipeline) and the other concatenating the features of the three images (3-image pipeline). The obtained features were either used directly or reduced through principal component analysis. Both support vector machine (SVM) and random forest (RF) classifiers were considered for each pipeline.
RESULTS
The SVM applied to the 3-image pipeline provided the best performance, with an accuracy of 0.958 ± 0.006 in the internal validation and an F1-score of 0.874 in the external validation set. The explainability analysis, besides identifying the features of the complete original images as the major contributor, highlighted the role of the acetabulum and stem images on the prediction.
CONCLUSIONS
This study demonstrated the potential of the developed DL-ML procedure, based on plain radiographs, for detecting hip prosthesis failure.
Topics: Humans; Arthroplasty, Replacement, Hip; Deep Learning; Hip Prosthesis; Prosthesis Failure; Machine Learning
PubMed: 37220702
DOI: 10.1016/j.ijmedinf.2023.105095
- Scientific Reports Mar 2024
Research on machine learning (ML) has become incredibly popular during the past few decades. However, researchers not familiar with statistics might find it difficult to understand how to evaluate the performance of ML models and compare them with each other. Here, we introduce the most common evaluation metrics used for the typical supervised ML tasks, including binary, multi-class, and multi-label classification, regression, image segmentation, object detection, and information retrieval. We explain how to choose a suitable statistical test for comparing models, how to obtain enough values of the metric for testing, and how to perform the test and interpret its results. We also present a few practical examples of comparing convolutional neural networks used to classify X-rays with different lung infections and detect cancer tumors in positron emission tomography images.
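As a minimal sketch of the binary-classification metrics this kind of tutorial covers, the following computes accuracy, precision, recall, and F1 from the confusion counts. The function name and the sample labels are hypothetical and are not taken from the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    # Guard against empty denominators (e.g. no positive predictions).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical ground truth and predictions, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Accuracy alone can be misleading on imbalanced data, which is one reason precision, recall, and F1 are usually reported alongside it.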
Topics: Image Processing, Computer-Assisted; Machine Learning; Neural Networks, Computer; Supervised Machine Learning; Positron-Emission Tomography
PubMed: 38480847
DOI: 10.1038/s41598-024-56706-x
- GigaScience Jan 2024
BACKGROUND
Machine learning (ML) has emerged as a vital asset for researchers to analyze and extract valuable information from complex datasets. However, developing an effective and robust ML pipeline can present a real challenge, demanding considerable time and effort, thereby impeding research progress. Existing tools in this landscape require a profound understanding of ML principles and programming skills. Furthermore, users are required to engage in the comprehensive configuration of their ML pipeline to obtain optimal performance.
RESULTS
To address these challenges, we have developed a novel tool called Machine Learning Made Easy (MLme) that streamlines the use of ML in research, specifically focusing on classification problems at present. By integrating 4 essential functionalities (Data Exploration, AutoML, CustomML, and Visualization), MLme fulfills the diverse requirements of researchers while eliminating the need for extensive coding efforts. To demonstrate the applicability of MLme, we conducted rigorous testing on 6 distinct datasets, each presenting unique characteristics and challenges. Our results consistently showed promising performance across different datasets, reaffirming the versatility and effectiveness of the tool. Additionally, by utilizing MLme's feature selection functionality, we successfully identified significant markers for CD8+ naive (BACH2), CD16+ (CD16), and CD14+ (VCAN) cell populations.
CONCLUSION
MLme serves as a valuable resource for leveraging ML to facilitate insightful data analysis and enhance research outcomes, while alleviating concerns related to complex coding scripts. The source code and a detailed tutorial for MLme are available at https://github.com/FunctionalUrology/MLme.
Topics: Humans; Data Analysis; Machine Learning; Research Personnel; Software
PubMed: 38206587
DOI: 10.1093/gigascience/giad111
- Scientific Reports Aug 2023
There is growing interest in canine behavioral research specifically for working dogs. Here we take advantage of a dataset of a Transportation Safety Administration olfactory detection cohort of 628 Labrador Retrievers to perform Machine Learning (ML) prediction and classification studies of behavioral traits and environmental effects. Data were available for four time points over a 12 month foster period after which dogs were accepted into a training program or eliminated. Three supervised ML algorithms had robust performance in correctly predicting which dogs would be accepted into the training program, but poor performance in distinguishing those that were eliminated (~ 25% of the cohort). The 12 month testing time point yielded the best ability to distinguish accepted and eliminated dogs (AUC = 0.68). Classification studies using Principal Components Analysis and Recursive Feature Elimination using Cross-Validation revealed the importance of olfaction and possession-related traits for an airport terminal search and retrieve test, and possession, confidence, and initiative traits for an environmental test. Our findings suggest which tests, environments, behavioral traits, and time course are most important for olfactory detection dog selection. We discuss how this approach can guide further research that encompasses cognitive and emotional, and social and environmental effects.
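The AUC reported above can be read as the probability that a randomly chosen accepted dog receives a higher model score than a randomly chosen eliminated dog. A minimal sketch of that pairwise definition follows. The labels and scores are hypothetical, and the study's own computation is not described in the abstract.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts, over every (positive, negative) pair, how often the positive
    sample has the higher score; ties count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = accepted into training, 0 = eliminated.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.6, 0.4, 0.5, 0.3]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so a value of 0.68 indicates only modest discrimination.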
Topics: Dogs; Animals; Smell; Machine Learning; Supervised Machine Learning; Algorithms; Mental Processes
PubMed: 37528118
DOI: 10.1038/s41598-023-39112-7
- Brain : a Journal of Neurology Mar 2024
Topics: Humans; Parkinson Disease; Machine Learning; Support Vector Machine
PubMed: 38428999
DOI: 10.1093/brain/awae043
- Critical Care (London, England) May 2024
Review
BACKGROUND
Sepsis, an acute and potentially fatal systemic response to infection, significantly impacts global health by affecting millions annually. Prompt identification of sepsis is vital, as treatment delays lead to increased fatalities through progressive organ dysfunction. While recent studies have delved into leveraging Machine Learning (ML) for predicting sepsis, focusing on aspects such as prognosis, diagnosis, and clinical application, there remains a notable deficiency in the discourse regarding feature engineering. Specifically, the role of feature selection and extraction in enhancing model accuracy has been underexplored.
OBJECTIVES
This scoping review has two primary objectives: (1) to identify pivotal features for predicting sepsis across a variety of ML models, providing valuable insights for future model development, and (2) to assess model efficacy through performance metrics including AUROC, sensitivity, and specificity.
RESULTS
The analysis included 29 studies across diverse clinical settings such as Intensive Care Units (ICU), Emergency Departments, and others, encompassing 1,147,202 patients. The review highlighted the diversity in prediction strategies and timeframes. It was found that feature extraction techniques notably outperformed others in terms of sensitivity and AUROC values, thus indicating their critical role in improving sepsis prediction models.
CONCLUSION
Key dynamic indicators, including vital signs and critical laboratory values, are instrumental in the early detection of sepsis. Applying feature selection methods significantly boosts model precision, with models like Random Forest and XGBoost showing promising results. Furthermore, Deep Learning (DL) models reveal unique insights, spotlighting the pivotal role of feature engineering in sepsis prediction, which could greatly benefit clinical practice.
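Sensitivity and specificity, two of the performance metrics named in the objectives, depend on the decision threshold applied to a model's risk score. The sketch below shows that trade-off. The risk scores, labels, and thresholds are hypothetical and do not come from any of the reviewed studies.

```python
def sensitivity_specificity(labels, risk_scores, threshold):
    """Return (sensitivity, specificity) when scores >= threshold flag sepsis."""
    tp = fn = tn = fp = 0
    for label, score in zip(labels, risk_scores):
        flagged = score >= threshold
        if label == 1:
            if flagged:
                tp += 1
            else:
                fn += 1
        else:
            if flagged:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data: 1 = sepsis, 0 = no sepsis.
labels = [1, 1, 1, 0, 0, 0, 0]
risk_scores = [0.8, 0.55, 0.3, 0.6, 0.2, 0.1, 0.4]

for threshold in (0.5, 0.25):
    sens, spec = sensitivity_specificity(labels, risk_scores, threshold)
    print(f"threshold={threshold}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Lowering the threshold catches more true cases at the cost of more false alarms. AUROC summarizes this trade-off across all thresholds.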
Topics: Humans; Sepsis; Machine Learning
PubMed: 38802973
DOI: 10.1186/s13054-024-04948-6
- Database : the Journal of Biological... Jan 2024
Review
Post-translational modifications are crucial molecular regulatory mechanisms that regulate diverse cellular processes. Malonylation of proteins, a reversible post-translational modification of lysine (K) residues, is linked to a variety of biological functions, such as cellular regulation and pathogenesis. This modification plays a crucial role in metabolic pathways, mitochondrial functions, fatty acid oxidation and other life processes. Accurately identifying malonylation sites is crucial to understanding the molecular mechanism of malonylation, but experimental identification can be challenging and costly. Recently, approaches based on machine learning (ML) have been suggested to address this issue. It has been demonstrated that these procedures improve accuracy while lowering costs and time constraints. However, these approaches also have specific shortcomings, including inappropriate feature extraction from protein sequences, high-dimensional features and inefficient underlying classifiers. As a result, there is an urgent need for effective predictors and calculation methods. In this study, we provide a comprehensive analysis and review of existing prediction models, tools and benchmark datasets for predicting malonylation sites in protein sequences, followed by a comparison study. The review consists of the specifications of benchmark datasets, an explanation of features and encoding methods, descriptions of the prediction approaches and their embedded ML or deep learning models, and the description and comparison of the existing tools in this domain. To evaluate and compare the prediction capability of the tools, a new dataset was extracted from the most recently updated database, and the tools were assessed on it. Finally, a hybrid architecture consisting of several classifiers, including classical ML models and a deep learning model, has been proposed to combine the prediction results. This approach performs better than all prediction tools included in this study (the source code of the models presented in this manuscript is available at https://github.com/Malonylation). Database URL: https://github.com/A-Golshan/Malonylation.
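One common way to combine the outputs of several classifiers is soft voting, i.e. averaging their predicted probabilities. The sketch below illustrates that idea only. The abstract does not say how the proposed hybrid architecture combines its models, so the averaging rule, the threshold, and the sample probabilities are all assumptions.

```python
def soft_vote(probabilities_per_model, threshold=0.5):
    """Average per-site probabilities across models and threshold the mean.

    probabilities_per_model: one list per model, each holding that model's
    predicted probability of malonylation for the same candidate sites.
    Returns (averaged probabilities, boolean predictions).
    """
    if not probabilities_per_model:
        raise ValueError("at least one model is required")
    n_sites = len(probabilities_per_model[0])
    if any(len(model) != n_sites for model in probabilities_per_model):
        raise ValueError("all models must score the same number of sites")
    n_models = len(probabilities_per_model)
    averaged = [
        sum(model[i] for model in probabilities_per_model) / n_models
        for i in range(n_sites)
    ]
    return averaged, [p >= threshold for p in averaged]


# Hypothetical outputs of three classifiers for three candidate lysine sites.
model_outputs = [
    [0.9, 0.2, 0.60],
    [0.7, 0.4, 0.30],
    [0.8, 0.1, 0.45],
]

averaged, predictions = soft_vote(model_outputs)
print(averaged, predictions)
```

Averaging lets a confident model outweigh an uncertain one, unlike a simple majority vote over hard labels.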
Topics: Deep Learning; Lysine; Machine Learning; Protein Processing, Post-Translational; Proteins
PubMed: 38245002
DOI: 10.1093/database/baad094
- Scientific Reports Sep 2023
Both machine learning and physiologically-based pharmacokinetic models are becoming essential components of the drug development process. Integrating the predictive capabilities of physiologically-based pharmacokinetic (PBPK) models within machine learning (ML) pipelines could offer significant benefits in improving the accuracy and scope of drug screening and evaluation procedures. Here, we describe the development and testing of a self-contained machine learning module capable of faithfully recapitulating summary pharmacokinetic (PK) parameters produced by a full PBPK model, given a set of input drug-specific and regimen-specific information. Because of its widespread use in characterizing the disposition of orally administered drugs, the PBPK model chosen to demonstrate the methodology was an open-source implementation of a state-of-the-art compartmental and transit model called OpenCAT. The model was tested for drug formulations spanning a large range of solubility and absorption characteristics, and was evaluated for concordance against predictions of OpenCAT and relevant experimental data. In general, the values predicted by the ML models were within 20% of those of the PBPK model across the range of drug and formulation properties. However, summary PK parameter predictions from both the ML model and full PBPK model were occasionally poor with respect to those derived from experiments, suggesting deficiencies in the underlying PBPK model.
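The "within 20%" agreement criterion above can be made concrete as a relative-error check of the surrogate ML prediction against the PBPK reference value. The sketch below assumes the error is measured relative to the PBPK value, which the abstract does not state explicitly. The parameter values are hypothetical.

```python
def fraction_within(reference, predicted, tolerance=0.20):
    """Fraction of predictions whose relative error vs. the reference
    is at most `tolerance` (0.20 means within 20%)."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    if not reference:
        raise ValueError("no values to compare")
    hits = sum(
        1 for r, p in zip(reference, predicted)
        if abs(p - r) <= tolerance * abs(r)
    )
    return hits / len(reference)


# Hypothetical summary PK parameter values (e.g. exposure) for four drugs.
pbpk_values = [100.0, 50.0, 10.0, 200.0]   # full PBPK model output
ml_values = [110.0, 65.0, 9.0, 170.0]      # ML surrogate predictions

agreement = fraction_within(pbpk_values, ml_values)
print(f"{agreement:.0%} of ML predictions fall within 20% of the PBPK model")
```

Agreement with the PBPK model measures only how faithfully the surrogate reproduces it. As the abstract notes, both can still disagree with experimental data.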
Topics: Drug Evaluation, Preclinical; Machine Learning; Solubility
PubMed: 37696914
DOI: 10.1038/s41598-023-42165-3