Research Updates

Dataset of miRNA-disease relations extracted from textual data using transformer-based neural networks

Published: 2024 Aug 05
Authors: Sumit Madan, Lisa Kühnel, Holger Fröhlich, Martin Hofmann-Apitius, Juliane Fluck
Source: Database-Oxford

Abstract:

MicroRNAs (miRNAs) play important roles in post-transcriptional processes and regulate major cellular functions. Abnormal regulation of miRNA expression has been linked to numerous human diseases, such as respiratory diseases, cancer, and neurodegenerative diseases. The latest miRNA-disease associations are predominantly found in unstructured biomedical literature. Retrieving these associations manually can be cumbersome and time-consuming because the number of publications keeps growing. We propose a deep learning-based text mining approach that extracts normalized miRNA-disease associations from biomedical literature. To train the deep learning models, we build a new training corpus that is extended by distant supervision utilizing multiple external databases. A quantitative evaluation shows that the workflow achieves an area under the receiver operating characteristic curve of 98% on a holdout test set for the detection of miRNA-disease associations. We demonstrate the applicability of the approach by extracting new miRNA-disease associations from biomedical literature (PubMed and PubMed Central). Through quantitative analysis and evaluation on three different neurodegenerative diseases, we show that our approach can effectively extract miRNA-disease associations not yet available in public databases. Database URL: https://zenodo.org/records/10523046.
© The Author(s) 2024. Published by Oxford University Press.
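
As a reading aid for the reported metric, the following is a minimal sketch of how an area under the receiver operating characteristic curve can be computed from classifier scores, using the rank-sum (Mann-Whitney U) identity. It is not the authors' evaluation code. The function name `auroc` and the toy labels and scores are assumptions made purely for illustration.

```python
def auroc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: iterable of 0/1 ground-truth values (1 = association present).
    scores: iterable of classifier scores, higher = more likely positive.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)

    # Assign 1-based ranks, giving tied scores the average of their ranks.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative example")

    # U statistic of the positive class, normalized by the number of pos/neg pairs.
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


# Toy data (hypothetical): two sentences without and two with a true association.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auroc = auroc(toy_labels, toy_scores)
print(f"AUROC on toy data: {toy_auroc:.2f}")
```

The value can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half.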