Journal of Imaging, Aug 2023
This paper presents a system that utilizes vision transformers and multimodal feedback modules to facilitate navigation and collision avoidance for the visually impaired. By implementing vision transformers, the system achieves accurate object detection, enabling the real-time identification of objects in front of the user. Semantic segmentation and the algorithms developed in this work generate a trajectory vector for every object identified by the vision transformer and detect objects that are likely to intersect with the user's walking path. Audio and vibrotactile feedback modules are integrated to convey collision warnings through multimodal feedback. The dataset used to create the model was captured in both indoor and outdoor settings, under different weather conditions, at different times across multiple days, resulting in 27,867 photos across 24 different classes. Classification results showed good performance (95% accuracy), supporting the efficacy and reliability of the proposed model. The design and control methods of the multimodal feedback modules for collision warning are also presented, while experimental validation of their usability and efficiency remains future work. The demonstrated performance of the vision transformer and the presented algorithms, in conjunction with the multimodal feedback modules, shows promise for the system's feasibility and applicability in navigation assistance for individuals with vision impairment.
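As an illustration of the path-intersection idea described above, the following is a minimal sketch (not the authors' algorithm; the frame-to-frame centre tracking, the fixed-width walking corridor, and all thresholds are assumptions):

```python
import numpy as np

def will_intersect_path(prev_center, curr_center, frame_width,
                        corridor_frac=0.3, horizon_frames=30):
    """Extrapolate an object's image-plane motion from two successive
    bounding-box centres and flag it if it is headed into a central
    'walking corridor' within the look-ahead horizon.

    prev_center, curr_center: (x, y) pixel coordinates of the box centre
    frame_width: image width in pixels
    corridor_frac: fraction of the width treated as the user's path (assumed)
    horizon_frames: number of frames to extrapolate ahead (assumed)
    """
    prev = np.asarray(prev_center, dtype=float)
    curr = np.asarray(curr_center, dtype=float)
    velocity = curr - prev                      # per-frame displacement

    half_corridor = 0.5 * corridor_frac * frame_width
    centre_x = 0.5 * frame_width

    for t in range(1, horizon_frames + 1):
        x, _ = curr + t * velocity              # linear extrapolation
        if abs(x - centre_x) <= half_corridor:
            return True                         # predicted to cross the user's path
    return False

# Example: an object drifting from the right edge towards the image centre.
print(will_intersect_path((600, 240), (570, 240), frame_width=640))  # True
```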
PubMed: 37623693
DOI: 10.3390/jimaging9080161

Pacific Symposium on Biocomputing, 2022
The established approach to unsupervised protein contact prediction estimates coevolving positions using undirected graphical models, training a Potts model on a multiple sequence alignment. Increasingly large Transformers are being pretrained on unlabeled, unaligned protein sequence databases and are showing competitive performance on protein contact prediction. We argue that attention is a principled model of protein interactions, grounded in real properties of protein family data. We introduce an energy-based attention layer, factored attention, which in a certain limit recovers a Potts model, and use it to contrast Potts models and Transformers. We show that the Transformer leverages hierarchical signal in protein family databases that is not captured by single-layer models. This raises the exciting possibility of developing powerful structured models of protein family databases.
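For reference, the Potts model in question assigns each aligned sequence x = (x_1, ..., x_L) a probability through per-position fields h_i and pairwise couplings J_ij (standard formulation; the notation is generic, not taken from this paper):

```latex
E(x) = -\sum_{i=1}^{L} h_i(x_i) \;-\; \sum_{1 \le i < j \le L} J_{ij}(x_i, x_j),
\qquad
p(x) = \frac{\exp\!\bigl(-E(x)\bigr)}{Z}
```

Coevolving (contacting) position pairs are typically read off from the magnitudes of the learned couplings J_ij.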
Topics: Attention; Computational Biology; Humans; Proteins; Sequence Alignment
PubMed: 34890134
DOI: No ID Found

Journal of Medical Internet Research, Dec 2023
BACKGROUND
Electronic health records (EHRs) in unstructured formats are valuable sources of information for research in both the clinical and biomedical domains. However, before such records can be used for research purposes, sensitive health information (SHI) must in many cases be removed to protect patient privacy. Rule-based and machine learning-based methods have been shown to be effective for deidentification, but very few studies have investigated combining transformer-based language models with rules.
OBJECTIVE
The objective of this study is to develop a hybrid deidentification pipeline for Australian EHR text notes using rules and transformers. The study also aims to investigate the impact of pretrained word embeddings and transformer-based language models.
METHODS
In this study, we present a hybrid deidentification pipeline called OpenDeID, which is developed using an Australian multicenter EHR-based corpus called OpenDeID Corpus. The OpenDeID corpus consists of 2100 pathology reports with 38,414 SHI entities from 1833 patients. The OpenDeID pipeline incorporates a hybrid approach of associative rules, supervised deep learning, and pretrained language models.
RESULTS
The OpenDeID pipeline achieved a best F-score of 0.9659 by fine-tuning the Discharge Summary BioBERT model and incorporating various preprocessing and postprocessing rules. It has been deployed at a large tertiary teaching hospital, where it has processed over 8000 unstructured EHR text notes in real time.
CONCLUSIONS
OpenDeID is a hybrid pipeline for deidentifying SHI entities in unstructured EHR text notes, and it has been evaluated on a large multicenter corpus. External validation will be undertaken as part of our future work to evaluate the effectiveness of the pipeline.
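As a rough illustration of how rules and a transformer-based tagger can be combined in such a pipeline, here is a minimal sketch (not the OpenDeID implementation; the regex rules, the off-the-shelf NER model, and the span-masking policy are all assumptions):

```python
import re
from transformers import pipeline

# General-domain stand-in model; OpenDeID fine-tunes Discharge Summary BioBERT,
# which is not what is loaded here.
ner = pipeline("token-classification", model="dslim/bert-base-NER",
               aggregation_strategy="simple")

# A couple of illustrative associative rules (assumed, not OpenDeID's rule set).
RULES = [
    (re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"), "DATE"),
    (re.compile(r"\b\d{10}\b"), "PHONE"),
]

def deidentify(text):
    """Collect SHI spans from rules and the transformer, then mask them."""
    spans = []
    for pattern, label in RULES:
        spans += [(m.start(), m.end(), label) for m in pattern.finditer(text)]
    spans += [(e["start"], e["end"], e["entity_group"]) for e in ner(text)]

    # Replace from the end so earlier offsets stay valid.
    # Overlapping spans are not merged in this sketch.
    for start, end, label in sorted(spans, key=lambda s: s[0], reverse=True):
        text = text[:start] + f"[{label}]" + text[end:]
    return text

print(deidentify("Seen by Dr. Jane Smith on 03/11/2022, contact 0412345678."))
```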
Topics: Humans; Australia; Data Anonymization; Electronic Health Records; Algorithms; Hospitals, Teaching
PubMed: 38055317
DOI: 10.2196/48145

Social Network Analysis and Mining, 2023
Since the COVID-19 pandemic, healthcare services, particularly remote and automated healthcare consultations, have gained increased attention. Medical bots, which provide medical advice and support, are becoming increasingly popular. They offer numerous benefits, including 24/7 access to medical counseling, reduced appointment wait times through quick answers to common questions or concerns, and cost savings from fewer visits or tests required for diagnosis and treatment plans. The success of medical bots depends on the quality of their learning, which in turn depends on an appropriate corpus within the domain of interest. Arabic is one of the most commonly used languages for sharing content on the internet. However, implementing medical bots in Arabic faces several challenges, including the language's morphological composition, the diversity of dialects, and the need for an appropriate and sufficiently large corpus in the medical domain. To address this gap, this paper introduces the largest Arabic healthcare Q&A dataset, called MAQA, consisting of over 430,000 questions distributed across 20 medical specializations. Furthermore, this paper adopts three deep learning models, namely LSTM, Bi-LSTM, and Transformer models, for experimenting with and benchmarking the proposed MAQA corpus. The experimental results demonstrate that the more recent Transformer model outperforms the traditional deep learning models, achieving an average cosine similarity of 80.81% and a BLEU score of 58%.
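For reference, a minimal sketch of the two reported metrics, average cosine similarity and BLEU, as they might be computed between a generated answer and a reference answer (illustrative only; the TF-IDF representation, whitespace tokenization, and English example are assumptions, not the paper's setup):

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def score_pair(reference, generated):
    """Cosine similarity over TF-IDF vectors plus sentence-level BLEU."""
    tfidf = TfidfVectorizer().fit_transform([reference, generated])
    cos = cosine_similarity(tfidf[0], tfidf[1])[0, 0]
    bleu = sentence_bleu([reference.split()], generated.split(),
                         smoothing_function=SmoothingFunction().method1)
    return cos, bleu

cos, bleu = score_pair("take paracetamol every six hours",
                       "take paracetamol every 6 hours")
print(f"cosine={cos:.2f}  BLEU={bleu:.2f}")
```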
PubMed: 37096241
DOI: 10.1007/s13278-023-01077-w

JMIR Medical Informatics, Nov 2023
Review
BACKGROUND
Transformer-based models are gaining popularity in medical imaging and cancer imaging applications. Many recent studies have demonstrated the use of transformer-based models for brain cancer imaging applications such as diagnosis and tumor segmentation.
OBJECTIVE
This study aims to review how different vision transformers (ViTs) contributed to advancing brain cancer diagnosis and tumor segmentation using brain image data. This study examines the different architectures developed for enhancing the task of brain tumor segmentation. Furthermore, it explores how the ViT-based models augmented the performance of convolutional neural networks for brain cancer imaging.
METHODS
This review performed the study search and study selection following the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) guidelines. The search comprised 4 popular scientific databases: PubMed, Scopus, IEEE Xplore, and Google Scholar. The search terms were formulated to cover the interventions (ie, ViTs) and the target application (ie, brain cancer imaging). Title and abstract screening for study selection was performed by 2 reviewers independently and validated by a third reviewer. Data extraction was performed by 2 reviewers and validated by a third reviewer. Finally, the data were synthesized using a narrative approach.
RESULTS
Of the 736 retrieved studies, 22 (3%) were included in this review. These studies were published in 2021 and 2022. The most commonly addressed task in these studies was tumor segmentation using ViTs. No study reported early detection of brain cancer. Among the different ViT architectures, Shifted Window transformer-based architectures have recently become the most popular choice of the research community. Among the included architectures, UNet transformer and TransUNet had the highest number of parameters and thus needed a cluster of as many as 8 graphics processing units for model training. The brain tumor segmentation challenge data set was the most popular data set used in the included studies. ViT was used in different combinations with convolutional neural networks to capture both the global and local context of the input brain imaging data.
CONCLUSIONS
It can be argued that the computational complexity of transformer architectures is a bottleneck in advancing the field and enabling clinical transformations. This review provides the current state of knowledge on the topic, and the findings of this review will be helpful for researchers in the field of medical artificial intelligence and its applications in brain cancer.
PubMed: 37976086
DOI: 10.2196/47445

Advances in Neural Information Processing Systems, Dec 2022
Transformers have made remarkable progress towards modeling long-range dependencies within the medical image analysis domain. However, current transformer-based models suffer from several disadvantages: (1) existing methods fail to capture important image features due to their naive tokenization schemes; (2) the models suffer from information loss because they consider only single-scale feature representations; and (3) the segmentation label maps generated by the models are not accurate enough because rich semantic contexts and anatomical textures are not taken into account. In this work, we present CASTformer, a novel type of adversarial transformer for 2D medical image segmentation. First, we take advantage of a pyramid structure to construct multi-scale representations and handle multi-scale variations. We then design a novel class-aware transformer module to better learn the discriminative regions of objects with semantic structures. Lastly, we utilize an adversarial training strategy that boosts segmentation accuracy and correspondingly allows a transformer-based discriminator to capture high-level semantically correlated content and low-level anatomical features. Our experiments demonstrate that CASTformer dramatically outperforms previous state-of-the-art transformer-based approaches on three benchmarks, obtaining absolute improvements of 2.54%-5.88% in Dice over previous models. Further qualitative experiments provide a more detailed picture of the model's inner workings, shed light on the challenges of improving transparency, and demonstrate that transfer learning can greatly improve performance and reduce the amount of medical imaging data needed for training, making CASTformer a strong starting point for downstream medical image analysis tasks.
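As a rough sketch of the adversarial training strategy in its generic form (a generator-discriminator loop over segmentation maps; the stand-in networks, loss weighting, and shapes below are assumptions, not CASTformer's actual architecture or losses):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in networks only; CASTformer's class-aware transformer generator and
# transformer-based discriminator are not reproduced here.
seg_net = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(8, 2, 1))                       # 2-class logits
disc = nn.Sequential(nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU(),
                     nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                     nn.Linear(8, 1))                              # real/fake score

opt_g = torch.optim.Adam(seg_net.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-4)
bce, ce = nn.BCEWithLogitsLoss(), nn.CrossEntropyLoss()

def train_step(image, mask):
    """image: (B,1,H,W) float; mask: (B,H,W) integer labels in {0,1}."""
    logits = seg_net(image)                                        # (B,2,H,W)
    fake_pair = torch.cat([image, logits.softmax(dim=1)], dim=1)
    real_pair = torch.cat(
        [image, F.one_hot(mask, 2).permute(0, 3, 1, 2).float()], dim=1)

    ones = torch.ones(image.size(0), 1)
    zeros = torch.zeros(image.size(0), 1)

    # Discriminator sees (image, ground truth) as real, (image, prediction) as fake.
    d_loss = bce(disc(real_pair), ones) + bce(disc(fake_pair.detach()), zeros)
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator combines the segmentation loss with an adversarial term.
    g_loss = ce(logits, mask) + 0.1 * bce(disc(fake_pair), ones)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return g_loss.item(), d_loss.item()

# Smoke test with random tensors standing in for an image/mask batch.
print(train_step(torch.randn(2, 1, 64, 64), torch.randint(0, 2, (2, 64, 64))))
```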
PubMed: 37533756
DOI: No ID Found

Journal of Digital Imaging, Dec 2022
The limited availability of medical imaging datasets is a critical limitation when using "data-hungry" deep learning to gain performance improvements. To deal with this issue, transfer learning has become a de facto standard, where a convolutional neural network (CNN), typically pre-trained on natural images (e.g., ImageNet), is fine-tuned on medical images. Meanwhile, pre-trained transformers, which are self-attention-based models, have become the de facto standard in natural language processing (NLP) and the state of the art in image classification due to their powerful transfer learning abilities. Inspired by the success of transformers in NLP and image classification, large-scale transformers (such as the vision transformer) are trained on natural images. Based on these recent developments, this research aims to explore the efficacy of pre-trained natural image transformers for medical images. Specifically, we analyze a pre-trained vision transformer on the CheXpert and pediatric pneumonia datasets. We use standard CNN models, including VGGNet and ResNet, as baselines. By examining the acquired representations and results, we find that transfer learning from the pre-trained vision transformer yields improved results compared with pre-trained CNNs, demonstrating the greater transfer ability of transformers in medical imaging.
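A minimal sketch of the kind of transfer learning setup explored here, taking an ImageNet-pre-trained vision transformer and fine-tuning it for a two-class medical imaging task (illustrative; the timm checkpoint name, image size, and hyperparameters are assumptions, not the paper's exact configuration):

```python
import timm
import torch
from torch import nn

# Assumed checkpoint; the paper compares an ImageNet-pre-trained ViT against
# VGGNet/ResNet baselines, but its exact variant and training schedule differ.
model = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=2)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5, weight_decay=0.01)
criterion = nn.CrossEntropyLoss()

def finetune_step(images, labels):
    """images: (B,3,224,224) X-rays replicated to 3 channels; labels: (B,)."""
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(images), labels)   # model(images) returns (B,2) logits
    loss.backward()
    optimizer.step()
    return loss.item()

# Smoke test with random tensors standing in for pneumonia/normal images.
print(finetune_step(torch.randn(4, 3, 224, 224), torch.randint(0, 2, (4,))))
```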
Topics: Humans; Child; Neural Networks, Computer; Radiography; Machine Learning
PubMed: 35819537
DOI: 10.1007/s10278-022-00666-z

Research Square, Jun 2023
Accurately predicting cellular activities of proteins based on their primary amino acid sequences would greatly improve our understanding of the proteome. In this paper, we present CELL-E, a text-to-image transformer model that generates 2D probability density images describing the spatial distribution of proteins within cells. Given an amino acid sequence and a reference image for cell or nucleus morphology, CELL-E predicts a more refined representation of protein localization, as opposed to previous methods that rely on pre-defined, discrete class annotations of protein localization to subcellular compartments.
PubMed: 37398207
DOI: 10.21203/rs.3.rs-2963881/v1

Optics Express, Aug 2022
Chiral metamaterials with circular dichroism (CD) or asymmetric transmission (AT) draw enormous attention for their attractive applications in polarization transformers, circular polarizers, and biosensing. In this study, a feasible trilayer chiral metamaterial (TCM) is designed and investigated in theory and simulation. The proposed TCM is composed of a nanoslit layer and a Babinet-complementary nanorod layer separated by a nanoslit spacer. Owing to symmetry breaking by the tilted nanoslit in the metal film, the TCM shows simultaneous CD and AT effects in the near-infrared region. The simulated electric charge distributions prove that the chirality arises from the excitation of asymmetric electric dipole resonant modes due to the coupling of adjacent unit cells. Moreover, CD and AT can be tuned by the tilt angle of the nanoslit and the thickness of the spacer, and the fitted functions are consistent with the theoretical formulas based on transmittance matrix analysis. The proposed nanostructure offers a potential strategy for manipulating metamaterials with simultaneous CD and AT effects, enabling a multitude of exciting applications such as ultra-sensitive polarization transformers and biosensors.
Topics: Biosensing Techniques; Circular Dichroism; Metals; Nanostructures
PubMed: 36242144
DOI: 10.1364/OE.464798

eLife, Jan 2023
Review
Recent developments in deep learning, coupled with an increasing number of sequenced proteins, have led to a breakthrough in life science applications, in particular in protein property prediction. There is hope that deep learning can close the gap between the number of sequenced proteins and the number of proteins with properties known from lab experiments. Language models from the field of natural language processing have gained popularity for protein property prediction and have led to a new computational revolution in biology, where old prediction results are being improved regularly. Such models can learn useful multipurpose representations of proteins from large open repositories of protein sequences and can be used, for instance, to predict protein properties. The field of natural language processing is growing quickly because of developments in a class of models built on one particular architecture: the Transformer. We review recent developments and the use of large-scale Transformer models in applications for predicting protein characteristics, including how such models can be used to predict, for example, post-translational modifications. We review the shortcomings of other deep learning models and explain how Transformer models have quickly proven to be a very promising way to unravel information hidden in sequences of amino acids.
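For reference, the architecture in question is built around scaled dot-product self-attention, whose standard formulation (from the original Transformer architecture, not specific to this review) is:

```latex
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V
```

Here Q, K, and V are learned query, key, and value projections of the input token embeddings (in this setting, embeddings of individual amino acids) and d_k is the key dimension; stacking many such layers with feed-forward blocks yields the protein language models discussed in the review.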
Topics: Deep Learning; Amino Acid Sequence; Amino Acids; Biological Science Disciplines; Language
PubMed: 36651724
DOI: 10.7554/eLife.82819