Data Augmentation for Medical Image Classification Based on Gaussian-Laplacian Pyramid Blending with a Similarity Measure
Publication date: 2023 Aug 21
Authors:
Abhinav Kumar, Anshul Sharma, Amit Kumar Singh, Sanjay Kumar Singh, Sonal Saxena
Source:
Disease Models & Mechanisms
Abstract:
Breast cancer is a devastating disease that affects women worldwide, and computer-aided algorithms have shown potential for automating cancer diagnosis. Recently, generative artificial intelligence (GenAI) has opened new possibilities for addressing the challenges of labeled-data scarcity and accurate prediction in critical applications. However, a lack of diversity, together with unrealistic and unreliable data, has a detrimental impact on performance. Therefore, this study proposes an augmentation scheme to address the scarcity of labeled data and the data imbalance in medical datasets. The approach integrates the Gaussian-Laplacian pyramid and pyramid blending with a similarity measure. To maintain the structural properties of images and capture the variability among patient images of the same category, similarity-metric-based intermixing is introduced, which helps maintain the overall quality and integrity of the dataset. Subsequently, a significantly modified deep learning approach that leverages transfer learning through concatenated pre-trained models is applied to classify breast cancer histopathological images. The effectiveness of the proposal, including the impact of data augmentation, is demonstrated through a detailed analysis of three different medical datasets, which shows significant performance improvement over baseline models. The proposal has the potential to contribute to the development of a more accurate and reliable approach to breast cancer diagnosis.
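The abstract does not give implementation details, so the following is only a minimal sketch of how pyramid blending guided by a similarity measure might look, not the authors' method. Several choices are assumptions made for illustration: images are 2-D grayscale NumPy arrays of equal size, a 5-tap binomial kernel stands in for the Gaussian filter, Pearson correlation stands in for the unspecified similarity measure, the most similar same-class image is chosen as the blending partner, and the mask is a simple left/right split. All function names are hypothetical.

```python
import numpy as np

# 5-tap binomial kernel, a common approximation of a Gaussian (assumption).
KERNEL = np.array([1, 4, 6, 4, 1], dtype=np.float64) / 16.0

def blur(img):
    """Separable 5-tap blur with reflected borders."""
    h, w = img.shape
    padded = np.pad(img, 2, mode="reflect")
    rows = sum(KERNEL[k] * padded[:, k:k + w] for k in range(5))  # horizontal pass
    return sum(KERNEL[k] * rows[k:k + h, :] for k in range(5))    # vertical pass

def down(img):
    """Blur, then keep every second pixel."""
    return blur(img)[::2, ::2]

def up(img, shape):
    """Zero-insert to the target shape, then blur (x4 restores brightness)."""
    out = np.zeros(shape)
    out[::2, ::2] = img
    return 4.0 * blur(out)

def gaussian_pyramid(img, levels):
    pyr = [img.astype(np.float64)]
    for _ in range(levels):
        pyr.append(down(pyr[-1]))
    return pyr

def laplacian_pyramid(img, levels):
    pyr, cur = [], img.astype(np.float64)
    for _ in range(levels):
        nxt = down(cur)
        pyr.append(cur - up(nxt, cur.shape))  # detail lost by downsampling
        cur = nxt
    pyr.append(cur)  # coarsest level is kept as-is
    return pyr

def collapse(pyr):
    """Rebuild an image from a Laplacian pyramid, coarse to fine."""
    cur = pyr[-1]
    for lap in reversed(pyr[:-1]):
        cur = up(cur, lap.shape) + lap
    return cur

def blend(a, b, mask, levels=3):
    """Blend a (where mask is 1) with b (where mask is 0) level by level."""
    la, lb = laplacian_pyramid(a, levels), laplacian_pyramid(b, levels)
    gm = gaussian_pyramid(mask, levels)
    return collapse([m * x + (1.0 - m) * y for x, y, m in zip(la, lb, gm)])

def similarity(a, b):
    """Pearson correlation, a stand-in for the paper's similarity measure."""
    a0, b0 = a - a.mean(), b - b.mean()
    denom = np.sqrt((a0 ** 2).sum() * (b0 ** 2).sum())
    return float((a0 * b0).sum() / denom) if denom > 0 else 0.0

def augment(anchor, same_class_images, levels=3):
    """Blend the anchor with its most similar same-class image (assumed policy)."""
    partner = max(same_class_images, key=lambda im: similarity(anchor, im))
    mask = np.zeros(anchor.shape)
    mask[:, : anchor.shape[1] // 2] = 1.0  # left half from anchor, right from partner
    return blend(anchor, partner, mask, levels)
```

Calling `augment(image, other_images_of_same_class)` returns one synthetic image of the same shape as `image`. Because the mask is smoothed at every pyramid level, the seam between the two source images is spread across spatial frequencies rather than appearing as a hard edge. Restricting partners to similar images of the same class is one plausible way to keep the blended result structurally consistent with its label.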