Research Updates

A comprehensive dataset of annotated oral cavity images for diagnosis of oral cancer and oral potentially malignant disorders.

Published: 2024 Jul 12
Authors: N S Piyarathne, S N Liyanage, R M S G K Rasnayaka, P V K S Hettiarachchi, G A I Devindi, F B A H Francis, D M D R Dissanayake, R A N S Ranasinghe, M B D Pavithya, I B Nawinne, R G Ragel, R D Jayasinghe
Source: ORAL ONCOLOGY

Abstract:

This study addresses a critical gap: the lack of publicly accessible oral cavity image datasets for developing machine learning (ML) and artificial intelligence (AI) technologies for the diagnosis and prognosis of oral cancer (OCA) and oral potentially malignant disorders (OPMD), with a particular focus on the high prevalence and delayed diagnosis in Asia. Following ethical approval and informed written consent, images of the oral cavity were captured with mobile phone cameras, and clinical data were extracted from the hospital records of patients attending the Dental Teaching Hospital, Peradeniya, Sri Lanka. After data management and hosting, clinicians categorized and annotated the images using a custom-made software tool developed by the research team. The dataset comprises 3000 high-quality, anonymized images obtained from 714 patients, classified into four distinct categories: healthy, benign, OPMD, and OCA. Images were annotated with polygon-shaped oral cavity and lesion boundaries. Each image is accompanied by patient metadata, including age, sex, diagnosis, and risk factor profiles such as smoking, alcohol, and betel chewing habits. Researchers can use the annotated images in the COCO format, along with the patients' metadata, to enhance ML and AI algorithm development.

Copyright © 2024 The Author(s). Published by Elsevier Ltd. All rights reserved.
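As a rough illustration of how COCO-format polygon annotations like those described in the abstract might be consumed, the sketch below counts annotations per category and sums the annotated polygon area per image. It is only a minimal sketch under stated assumptions: the `SAMPLE_COCO` record, the file names, the category ids, and the helper names `polygon_area` and `summarize` are all hypothetical, and the released dataset's actual file layout and label names may differ.

```python
import json
from collections import Counter

# Hypothetical miniature COCO-style record. The real dataset's file names,
# category ids, and label names are not given in the abstract and may differ.
SAMPLE_COCO = {
    "images": [
        {"id": 1, "file_name": "img_0001.jpg", "width": 640, "height": 480},
        {"id": 2, "file_name": "img_0002.jpg", "width": 640, "height": 480},
    ],
    "categories": [
        {"id": 1, "name": "healthy"},
        {"id": 2, "name": "benign"},
        {"id": 3, "name": "OPMD"},
        {"id": 4, "name": "OCA"},
    ],
    "annotations": [
        # COCO stores each polygon as a flat list [x1, y1, x2, y2, ...].
        {"id": 10, "image_id": 1, "category_id": 3,
         "segmentation": [[100, 100, 200, 100, 200, 150, 100, 150]]},
        {"id": 11, "image_id": 2, "category_id": 4,
         "segmentation": [[0, 0, 4, 0, 4, 3]]},
    ],
}


def polygon_area(flat_xy):
    """Shoelace area of a polygon given as [x1, y1, x2, y2, ...]."""
    xs, ys = flat_xy[0::2], flat_xy[1::2]
    n = len(xs)
    twice_area = sum(
        xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i] for i in range(n)
    )
    return abs(twice_area) / 2.0


def summarize(coco):
    """Count annotations per category name and sum polygon area per image."""
    names = {c["id"]: c["name"] for c in coco["categories"]}
    counts = Counter(names[a["category_id"]] for a in coco["annotations"])
    areas = {}
    for ann in coco["annotations"]:
        area = sum(polygon_area(p) for p in ann["segmentation"])
        areas[ann["image_id"]] = areas.get(ann["image_id"], 0.0) + area
    return counts, areas


if __name__ == "__main__":
    # Round-trip through JSON to mimic reading an annotation file from disk.
    coco = json.loads(json.dumps(SAMPLE_COCO))
    counts, areas = summarize(coco)
    print(dict(counts))
    print(areas)
```

With a real annotation file, the `json.loads(json.dumps(...))` round-trip would be replaced by `json.load` on the opened file. Patient metadata such as age, sex, and risk factors would have to be joined separately, since the abstract does not say how those fields are stored.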