Mathematical Biosciences and... Jul 2023
Fake news has already become a severe problem on social media, with substantially more detrimental impacts on society than previously thought. Research on multi-modal fake news detection has substantial practical significance, since online fake news that includes multimedia elements is more likely to mislead users and to propagate widely than text-only fake news. However, existing multi-modal fake news detection methods have the following problems: 1) They usually use traditional CNN models and their variants to extract image features, which cannot fully extract high-quality visual features. 2) They usually adopt simple concatenation to fuse inter-modal features, leading to unsatisfactory detection results. 3) Most fake news exhibits a large disparity in feature similarity between images and texts, yet existing models do not fully utilize this aspect. Thus, we propose a novel model (TGA) based on transformers and multi-modal fusion to address the above problems. Specifically, we extract text and image features with different transformers and fuse the features by attention mechanisms. In addition, we utilize the degree of feature similarity between texts and images in the classifier to improve the performance of TGA. Experimental results on public datasets show the effectiveness of TGA*. * Our code is available at https://github.com/PPEXCEPED/TGA.
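The fusion described above can be illustrated with a minimal numpy sketch (a stand-in, not the authors' implementation; the function names, shapes, and mean-pooling step are assumptions): cross-attention between modalities, plus a text-image cosine similarity appended as an extra classifier feature.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(q, kv):
    # each query token attends over the other modality's tokens
    scores = q @ kv.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ kv

def fuse_with_similarity(text_feats, img_feats):
    # text_feats: (Lt, d) transformer text features; img_feats: (Li, d) image features
    t2i = cross_attention(text_feats, img_feats)       # text attends to image
    i2t = cross_attention(img_feats, text_feats)       # image attends to text
    t_vec, i_vec = t2i.mean(axis=0), i2t.mean(axis=0)  # pooled per modality
    # degree of text-image similarity, fed to the classifier as an extra feature
    sim = t_vec @ i_vec / (np.linalg.norm(t_vec) * np.linalg.norm(i_vec) + 1e-8)
    return np.concatenate([t_vec, i_vec, [sim]])
```

The final element of the fused vector lies in [-1, 1], so a disparity between modalities becomes directly visible to the classifier.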
PubMed: 37679154
DOI: 10.3934/mbe.2023657
BMC Medical Informatics and Decision... Jul 2022
OBJECTIVE
Named entity recognition (NER) is a key and fundamental part of many medical and clinical tasks, including the establishment of a medical knowledge graph, decision-making support, and question answering systems. When extracting entities from electronic health records (EHRs), NER models mostly apply long short-term memory (LSTM) and have achieved strong performance in clinical NER. However, these LSTM-based models often require increased network depth to capture long-distance dependencies. Therefore, LSTM-based models that achieve high accuracy generally require long training times and extensive training data, which has obstructed their adoption in clinical scenarios with limited training time.
METHOD
Inspired by Transformer, we combine Transformer with Soft Term Position Lattice to form soft lattice structure Transformer, which models long-distance dependencies similarly to LSTM. Our model consists of four components: the WordPiece module, the BERT module, the soft lattice structure Transformer module, and the CRF module.
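The pipeline ends in a CRF module, which selects the highest-scoring tag sequence. A minimal numpy sketch of standard Viterbi decoding (the usual CRF inference step, not the authors' code) illustrates that component:

```python
import numpy as np

def viterbi_decode(emissions, transitions):
    # emissions: (T, K) per-token tag scores; transitions: (K, K) tag-to-tag scores
    T, K = emissions.shape
    score = emissions[0].copy()          # best score ending in each tag at step 0
    back = np.zeros((T, K), dtype=int)   # backpointers for path recovery
    for t in range(1, T):
        # total[i, j]: best path ending in tag i at t-1, then moving to tag j
        total = score[:, None] + transitions + emissions[t][None, :]
        back[t] = total.argmax(axis=0)
        score = total.max(axis=0)
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):        # trace backpointers to recover the path
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```

With zero transition scores the decoder reduces to per-token argmax, which makes it easy to sanity-check.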
RESULT
Our experiments demonstrated that this approach increased the F1 score by 1-5% on the CCKS NER task compared with other LSTM-CRF-based models, while consuming less training time. Additional evaluations showed that the lattice structure Transformer performs well in recognizing long medical terms, abbreviations, and numbers. The proposed model achieves a 91.6% F-measure in recognizing long medical terms and a 90.36% F-measure in recognizing abbreviations and numbers.
CONCLUSIONS
By using the soft lattice structure Transformer, the method proposed in this paper captures Chinese word lattice information, making the model suitable for Chinese clinical medical records. Transformers with multilayer soft lattice Chinese word construction can capture potential interactions between Chinese characters and words.
Topics: China; Electronic Health Records; Humans; Natural Language Processing
PubMed: 35908055
DOI: 10.1186/s12911-022-01924-4
Diagnostics (Basel, Switzerland) Jul 2021
Over the past decade, convolutional neural networks (CNNs) have shown very competitive performance in medical image analysis tasks, such as disease classification, tumor segmentation, and lesion detection. CNNs have great advantages in extracting local features of images. However, due to the locality of the convolution operation, they cannot deal with long-range relationships well. Recently, transformers have been applied to computer vision and achieved remarkable success on large-scale datasets. Compared with natural images, multi-modal medical images have explicit and important long-range dependencies, and effective multi-modal fusion strategies can greatly improve the performance of deep models. This prompts us to study transformer-based structures and apply them to multi-modal medical images. Existing transformer-based network architectures require large-scale datasets to achieve better performance. However, medical imaging datasets are relatively small, which makes it difficult to apply pure transformers to medical image analysis. Therefore, we propose TransMed for multi-modal medical image classification. TransMed combines the advantages of CNNs and transformers to efficiently extract low-level features of images and establish long-range dependencies between modalities. We evaluated our model on two datasets, parotid gland tumor classification and knee injury classification. Combining our contributions, we achieve an improvement of 10.1% and 1.9% in average accuracy, respectively, outperforming other state-of-the-art CNN-based models. The results of the proposed method are promising and have tremendous potential to be applied to a large number of medical image analysis tasks. To the best of our knowledge, this is the first work to apply transformers to multi-modal medical image classification.
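The two-stage idea can be caricatured in a toy numpy sketch (a stand-in, not TransMed itself; the patch size, scalar pooling, and embedding are illustrative assumptions): a convolutional/pooling stage turns image patches into low-level tokens, and a self-attention stage relates all tokens globally.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cnn_stage(img, patch=4):
    # stand-in for the CNN: pool each patch into one low-level feature value
    H, W = img.shape
    return np.array([img[i:i+patch, j:j+patch].mean()
                     for i in range(0, H, patch)
                     for j in range(0, W, patch)])

def transformer_stage(tokens, W_embed):
    # stand-in for the Transformer: self-attention gives every patch token
    # access to every other one (long-range dependencies across the volume)
    x = np.outer(tokens, W_embed)               # embed scalar tokens to d dims
    attn = softmax(x @ x.T / np.sqrt(x.shape[-1]))
    return attn @ x
```

The point of the split is data efficiency: the pooling stage needs no large dataset, and attention only has to operate on a short token sequence.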
PubMed: 34441318
DOI: 10.3390/diagnostics11081384
Frontiers in Neuroscience 2023
A hybrid UNet and Transformer (HUT) network is introduced to combine the merits of the UNet and Transformer architectures, improving brain lesion segmentation from MRI and CT scans. HUT overcomes the limitations of conventional approaches by utilizing two parallel stages: one based on UNet and the other on Transformers. The Transformer-based stage captures global dependencies and long-range correlations. It uses intermediate feature vectors from the UNet decoder and improves segmentation accuracy by enhancing the attention and relationship modeling between voxel patches derived from the 3D brain volumes. In addition, HUT incorporates self-supervised learning on the transformer network, which allows the transformer to learn by maintaining consistency between the classification layers across different patch resolutions and augmentations. This improves the convergence rate of training and the overall segmentation capability. Experimental results on benchmark datasets, including ATLAS and ISLES2018, demonstrate HUT's advantage over state-of-the-art methods. HUT achieves higher Dice scores and lower Hausdorff Distance scores in single-modality and multi-modality lesion segmentation. HUT outperforms the state-of-the-art network SPiN in single-modality MRI segmentation on the Anatomical Tracings of Lesions After Stroke (ATLAS) dataset by 4.84% in Dice score and by a large margin of 40.7% in Hausdorff Distance score. HUT also performed well on CT perfusion brain scans in the Ischemic Stroke Lesion Segmentation (ISLES2018) dataset, demonstrating an improvement over the recent state-of-the-art network USSLNet of 3.3% in Dice score and 12.5% in Hausdorff Distance score. With the analysis of both single- and multi-modality datasets (ATLASR12 and ISLES2018), we show that HUT can perform and generalize well on different datasets. Code is available at: https://github.com/vicsohntu/HUT_CT.
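The parallel two-stage layout can be sketched structurally in numpy (a caricature under assumed shapes, not HUT itself): decoder features pass through unchanged while a transformer branch re-weights them globally, and the two paths are combined.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def transformer_branch(feats):
    # feats: (N, d) intermediate feature vectors from the UNet decoder, treated
    # as voxel-patch tokens; self-attention adds global context to each token
    attn = softmax(feats @ feats.T / np.sqrt(feats.shape[-1]))
    return attn @ feats

def hybrid_forward(unet_feats):
    # parallel stages: the UNet path keeps local detail; the transformer path
    # contributes long-range correlations; here they are combined by summation
    return unet_feats + transformer_branch(unet_feats)
```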
PubMed: 38105927
DOI: 10.3389/fnins.2023.1298514
Proceedings of Machine Learning Research Jul 2022
Transformers have emerged as a preferred model for many tasks in natural language processing and vision. Recent efforts on training and deploying Transformers more efficiently have identified many strategies to approximate the self-attention matrix, a key module in a Transformer architecture. Effective ideas include various prespecified sparsity patterns, low-rank basis expansions and combinations thereof. In this paper, we revisit classical Multiresolution Analysis (MRA) concepts such as wavelets, whose potential value in this setting remains underexplored thus far. We show that simple approximations based on empirical feedback, together with design choices informed by modern hardware and implementation challenges, eventually yield an MRA-based approach for self-attention with an excellent performance profile across most criteria of interest. We undertake an extensive set of experiments and demonstrate that this multi-resolution scheme outperforms most efficient self-attention proposals and is favorable for both short and long sequences. Code is available at https://github.com/mlpen/mra-attention.
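As a toy illustration of attention computed at a coarse resolution (not the paper's MRA method, which adaptively mixes resolutions; the block-averaging step is an assumption), keys and values can be pooled per block before the softmax, shrinking the attention matrix:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def full_attention(q, k, v):
    # exact scaled dot-product attention, cost O(len(q) * len(k))
    return softmax(q @ k.T / np.sqrt(q.shape[-1])) @ v

def coarse_attention(q, k, v, block):
    # coarsest level of a multiresolution scheme: one averaged key/value per
    # block, shrinking the attention matrix by a factor of `block`
    n, d = k.shape
    kb = k.reshape(n // block, block, d).mean(axis=1)
    vb = v.reshape(n // block, block, d).mean(axis=1)
    return softmax(q @ kb.T / np.sqrt(q.shape[-1])) @ vb
```

When keys and values are constant within each block, the coarse result matches full attention exactly; the approximation error comes only from within-block variation, which finer levels would refine.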
PubMed: 37139473
DOI: No ID Found
Scientific Reports Dec 2023
Three-dimensional electron back-scattered diffraction (EBSD) microscopy is a critical tool in many applications in materials science, yet its data quality can fluctuate greatly during the arduous collection process, particularly via serial sectioning. Fortunately, 3D EBSD data is inherently sequential, opening up the opportunity to use transformers, state-of-the-art deep learning architectures that have made breakthroughs in a plethora of domains, for data processing and recovery. To be more robust to errors and to accelerate 3D EBSD data collection, we introduce a two-step method that recovers missing slices in a 3D EBSD volume, using an efficient transformer model and a projection algorithm to process the transformer's outputs. Overcoming the computational and practical hurdles of deep learning with scarce high-dimensional data, we train this model using only synthetic 3D EBSD data with self-supervision and obtain superior recovery accuracy on real 3D EBSD data compared with existing methods.
PubMed: 38040823
DOI: 10.1038/s41598-023-47936-6
Sensors (Basel, Switzerland) Aug 2023
Energy consumption forecasting models allow for improvements in building performance and reductions in energy consumption. Energy efficiency has become a pressing concern in recent years due to increasing energy demand and concerns over climate change. This paper addresses energy consumption forecasting as a crucial ingredient in the technology to optimize building system operations and identify energy efficiency upgrades. The work proposes a modified multi-head transformer model for multivariate time series that uses a learnable weighting feature attention matrix to combine all input variables and properly forecast building energy consumption. The proposed multivariate transformer-based model is compared with two recurrent neural network models, showing robust performance and a lower mean absolute percentage error. Overall, this paper highlights the superior performance of the modified transformer-based model for multivariate energy consumption forecasting, allowing it to be incorporated into future forecasting tasks and to trace future energy consumption scenarios according to current building usage, playing a significant role in more sustainable and energy-efficient building use.
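One plausible reading of a "learnable weighting feature attention matrix" is a softmax-normalized matrix that mixes the input variables before the transformer; this minimal numpy sketch is an interpretation under that assumption, not the paper's exact formulation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def weighted_variable_mix(x, W):
    # x: (T, V) multivariate time series; W: (V, V) learnable weighting matrix
    A = softmax(W, axis=-1)  # each row: a convex combination over input variables
    return x @ A.T           # (T, V) mixed series fed to the transformer
```

With a strongly diagonal W the mix approaches the identity, so the model can learn to fall back to per-variable forecasting when mixing does not help.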
PubMed: 37571622
DOI: 10.3390/s23156840
Sensors (Basel, Switzerland) Oct 2023
Medical image segmentation is crucial for medical image processing and the development of computer-aided diagnostics. In recent years, deep Convolutional Neural Networks (CNNs) have been widely adopted for medical image segmentation and have achieved significant success. UNet, which is based on CNNs, is the mainstream method used for medical image segmentation. However, its performance suffers owing to its inability to capture long-range dependencies. Transformers were initially designed for Natural Language Processing (NLP) and sequence-to-sequence applications, and have demonstrated the ability to capture long-range dependencies. However, their ability to acquire local information is limited. Hybrid architectures of CNNs and Transformers, such as TransUNet, have been proposed to benefit from the Transformer's long-range dependencies and the CNN's low-level details. Nevertheless, automatic medical image segmentation remains a challenging task due to factors such as blurred boundaries, low-contrast tissue environments, and, in the context of ultrasound, issues like speckle noise and attenuation. In this paper, we propose a new model that combines the strengths of both CNNs and Transformers, with network architectural improvements designed to enrich the feature representation captured by the skip connections and the decoder. To this end, we devised a new attention module called Three-Level Attention (TLA). This module is composed of an Attention Gate (AG), channel attention, and a spatial normalization mechanism. The AG preserves structural information, whereas channel attention helps to model the interdependencies between channels. Spatial normalization employs the spatial coefficient of the Transformer to improve spatial attention, akin to TransNorm. To further improve the skip connections and reduce the semantic gap, skip connections between the encoder and decoder were redesigned in a manner similar to the UNet++ dense connection.
Moreover, deep supervision using a side-output channel was introduced, analogous to BASNet, which was originally used for saliency prediction. Two datasets from different modalities, a CT scan dataset and an ultrasound dataset, were used to evaluate the proposed architecture. The experimental results showed that our model consistently improved the prediction performance of the UNet across different datasets.
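The channel-attention part of TLA can be sketched in the usual squeeze-and-excitation style (a generic sketch with assumed weight shapes, not the authors' exact module):

```python
import numpy as np

def channel_attention(feat, w1, w2):
    # feat: (C, H, W) feature map; w1: (r, C) and w2: (C, r) excitation weights
    z = feat.mean(axis=(1, 2))               # squeeze: global average per channel
    h = np.maximum(w1 @ z, 0.0)              # excitation MLP with ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(w2 @ h)))   # sigmoid gate in (0, 1) per channel
    return feat * gate[:, None, None]        # reweight channels by interdependency
```

The bottleneck dimension r < C forces the gate to be computed from channel interdependencies rather than per-channel statistics alone.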
Topics: Diagnosis, Computer-Assisted; Electric Power Supplies; Image Processing, Computer-Assisted; Natural Language Processing; Neural Networks, Computer
PubMed: 37896682
DOI: 10.3390/s23208589
Journal of Cardiothoracic Surgery Feb 2024
Review
Artificial intelligence (AI) is a transformative technology with many benefits, but also risks when applied to healthcare, and to cardiac surgery in particular. Surgeons must be aware of AI and its application through generative pre-trained transformers (GPT/ChatGPT) to fully understand what this offers to clinical care, decision making, training, research and education. Clinicians must appreciate that the advantages and potential for transformative change in practice are balanced by risks typified by validation, ethical challenges and medicolegal concerns. ChatGPT should be seen as a tool to support and enhance the skills of surgeons, rather than a replacement for their experience and judgment. Human oversight and intervention will always be necessary to ensure patient safety and to make complex decisions that may require a refined understanding of individual patient circumstances.
Topics: Humans; Artificial Intelligence; Cardiac Surgical Procedures; Heart Transplantation; Educational Status; Patient Safety
PubMed: 38409178
DOI: 10.1186/s13019-024-02541-0
Computers in Biology and Medicine Aug 2022
The accurate identification of Drug-Target Interactions (DTIs) remains a critical turning point in drug discovery and understanding of the binding process. Despite recent advances in computational solutions to overcome the challenges of in vitro and in vivo experiments, most of the proposed in silico methods still focus on binary classification, overlooking the importance of characterizing DTIs with unbiased binding strength values to properly distinguish primary interactions from those with off-targets. Moreover, several of these methods usually simplify the entire interaction mechanism, neglecting the joint contribution of the individual units of each binding component and the interacting substructures involved, and have yet to focus on more explainable and interpretable architectures. In this study, we propose an end-to-end Transformer-based architecture for predicting drug-target binding affinity (DTA) using 1D raw sequential and structural data to represent the proteins and compounds. This architecture exploits self-attention layers to capture the biological and chemical context of the proteins and compounds, respectively, and cross-attention layers to exchange information and capture the pharmacological context of the DTIs. The results show that the proposed architecture is effective in predicting DTA, achieving superior performance both in correctly predicting the interaction strength value and in correctly discriminating the rank order of binding strength, compared with state-of-the-art baselines. The combination of multiple Transformer-Encoders was found to result in robust and discriminative aggregate representations of the proteins and compounds for binding affinity prediction, in which the addition of a Cross-Attention Transformer-Encoder was identified as an important block for improving the discriminative power of these representations.
Overall, this research study validates the applicability of an end-to-end Transformer-based architecture in the context of drug discovery, capable of self-providing different levels of potential DTI and prediction understanding due to the nature of the attention blocks. The data and source code used in this study are available at: https://github.com/larngroup/DTITR.
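The cross-attention exchange between the two encoders can be sketched as follows (a generic scaled dot-product sketch with assumed shared projection matrices, not the DTITR code):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attn(x, y, Wq, Wk, Wv):
    # tokens in x query tokens in y, pulling in the other entity's context
    q, k, v = x @ Wq, y @ Wk, y @ Wv
    return softmax(q @ k.T / np.sqrt(q.shape[-1])) @ v

def exchange(protein, compound, Wq, Wk, Wv):
    # bidirectional exchange: protein tokens attend to compound tokens and vice
    # versa, capturing the pharmacological context of the interaction
    return (cross_attn(protein, compound, Wq, Wk, Wv),
            cross_attn(compound, protein, Wq, Wk, Wv))
```

Each direction preserves its own sequence length while mixing in the other entity's features, which is what lets a downstream regression head see both binding partners jointly.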
Topics: Drug Development; Drug Discovery; Proteins; Software
PubMed: 35777085
DOI: 10.1016/j.compbiomed.2022.105772