Review
DOI: 10.3760/cma.j.cn101909-20240510-00102
On the countermeasures to hallucination of medical large language model: literature review and experience synthesis
Huang Ziyang
Dong Chao
Jiang Huizhen
Li Zhe
Li Yaguang
Ma Lian
Ma Dandan
Zhang Xinping
Ye Xiangyang
Chen Jieqing
Zhou Xiang
Author affiliations
Huang Ziyang, Dong Chao, Jiang Huizhen, Li Zhe, Li Yaguang, Ma Lian, Ma Dandan, Zhang Xinping, Ye Xiangyang, Chen Jieqing, Zhou Xiang: Department of Information Center, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College & State Key Laboratory of Complex Severe and Rare Diseases, Beijing 100730, China
Zhou Xiang (also): Department of Critical Care Medicine, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College & State Key Laboratory of Complex Severe and Rare Diseases, Beijing 100730, China
ABSTRACT

As large language models (LLMs) are developed and applied more widely, the problems caused by their hallucinations have become increasingly apparent, and they are especially dangerous in medicine. Understanding the causes of hallucinations and mitigating them is crucial to deploying and promoting medical LLMs. Drawing on a literature review and practical experience, this article describes the sources, types, and evaluation of LLM hallucinations and discusses countermeasures that can be taken during the generation and training stages. Practice shows that, in medical scenarios, retrieval-augmented generation (RAG) is an important means of mitigating hallucinations. Medical LLMs have broad application prospects, but continuous innovation is needed to keep reducing hallucinations, improve the accuracy and computational performance of LLMs, and contribute to the advancement of healthcare and the realization of the Healthy China strategy.

Keywords: Large language model; Medical field; Model hallucination; Retrieval-augmented generation; Knowledge base
Zhou Xiang, Email: nc.defhcabmupgnaixuohz
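Cite this article

Huang Ziyang, Dong Chao, Jiang Huizhen, et al. On the countermeasures to hallucination of medical large language model: literature review and experience synthesis[J]. 数字医学与健康, 2025, 3(1): 54-58. DOI: 10.3760/cma.j.cn101909-20240510-00102.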

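Large language models (LLMs) have developed at a remarkable pace. Over the past few years the field has made striking progress and breakthroughs: a steady stream of new model architectures and training methods has brought a qualitative leap in natural language processing and understanding [1,2]. LLMs have not only changed how people search for information online but have also shown great potential in medicine [3,4]. As their performance and adoption continue to rise, expectations for how they perform in medicine have risen as well [5]. Yet, promising as medical LLMs are, challenges remain. An LLM may produce answers that seem plausible but are in fact incorrect or outdated, a phenomenon generally called hallucination [6]. Hallucination creates application risks in education [7], law, journalism, and other fields, and it is especially sensitive and dangerous in healthcare, where it can lead to serious consequences [8]. An "irrelevant answer" is a typical hallucination, also called an input-conflicting hallucination: the medical LLM misunderstands the user's input and generates a response that departs from it. For example, a user asks, "My eyes have been very itchy lately. Is it because of an allergy?" and the model replies, "With eye allergies, the itching usually starts at around 10 p.m. each night and eases somewhat during the day." The answer clearly does not match the question. In short, studying hallucination and finding countermeasures is essential to deploying and promoting medical LLMs.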
References
[1] Brown TB, Mann B, Ryder N, et al. Language models are few-shot learners[J]. arXiv:2005.14165. DOI: 10.48550/arXiv.2005.14165.
[2] Du YL, Li S, Torralba A, et al. Improving factuality and reasoning in language models through multiagent debate[J]. arXiv:2305.14325. DOI: 10.48550/arXiv.2305.14325.
[3] Han T, Adams LC, Papaioannou JM, et al. MedAlpaca -- an open-source collection of medical conversational AI models and training data[J]. arXiv:2304.08247. DOI: 10.48550/arXiv.2304.08247.
[4] Chen RS. Prospects for the application of healthcare big data combined with large language models[J]. Journal of Sichuan University (Medical Sciences), 2023,54(5):855-856. DOI: 10.12182/20230960301.
[5] Singhal K, Tu T, Gottweis J, et al. Towards expert-level medical question answering with large language models[J]. arXiv:2305.09617. DOI: 10.48550/arXiv.2305.09617.
[6] Gunjal A, Yin J, Bas E. Detecting and preventing hallucinations in large vision language models[C]//Proceedings of the AAAI Conference on Artificial Intelligence. 2024,38(16):18135-18143.
[7] Chu LY, Pan XL, Chen XD. Ethic risks and responses in the application of large language models in education[J]. Journal of Suzhou University (Educational Science Edition), 2024,12(1):87-96. DOI: 10.19563/j.cnki.sdjk.2024.01.009.
[8] Tian S, Jin Q, Yeganova L, et al. Opportunities and challenges for ChatGPT and large language models in biomedicine and health[J]. Brief Bioinform, 2023,25(1):bbad493. DOI: 10.1093/bib/bbad493.
[9] Ji Z, Lee N, Frieske R, et al. Survey of hallucination in natural language generation[J]. ACM Computing Surveys, 2023,55(12):1-38. DOI: 10.1145/3571730.
[10] Mo ZY, Pan DQ, Liu H, et al. Analysis on AIGC false information problem and root cause from the perspective of information quality[J]. Document, Information & Knowledge, 2023,40(4):32-40. DOI: 10.13366/j.dik.2023.04.032.
[11] McKenna N, Li T, Cheng L, et al. Sources of hallucination by large language models on inference tasks[J]. arXiv:2305.14552. DOI: 10.48550/arXiv.2305.14552.
[12] Yin Z, Sun Q, Guo Q, et al. Do large language models know what they don't know?[J]. arXiv:2305.18153. DOI: 10.48550/arXiv.2305.18153.
[13] Yue ZY, Ye X, Liu RH. A survey of language model based pre-training technology[J]. Journal of Chinese Information Processing, 2021,35(9):15-29. DOI: 10.3969/j.issn.1003-0077.2021.09.002.
[14] Chen JF. Exploration of the applicability of large language models in clinical medicine[J]. Medicine & Philosophy, 2023,44(21):1-6. DOI: 10.12014/j.issn.1002-0772.2023.21.01.
[15] Lewis P, Perez E, Piktus A, et al. Retrieval-augmented generation for knowledge-intensive NLP tasks[J]. arXiv:2005.11401. DOI: 10.48550/arXiv.2005.11401.
[16] Tian YL, Wang XX, Wang YT, et al. RAG-PHI: RAG-driven parallel human and parallel intelligence[J]. Chinese Journal of Intelligent Science and Technology, 2024,6(1):41-51. DOI: 10.11959/j.issn.2096-6652.2024015.
[17] Xiong G, Jin Q, Lu Z, et al. Benchmarking retrieval-augmented generation for medicine[J]. arXiv:2402.13178. DOI: 10.48550/arXiv.2402.13178.
[18] Zarrieß S, Voigt H, Schüz S. Decoding methods in neural language generation: a survey[J]. Information, 2021,12(9):355. DOI: 10.3390/info12090355.
[19] Duan J, Cheng H, Wang S, et al. Shifting attention to relevance: towards the uncertainty estimation of large language models[J]. arXiv:2307.01379. DOI: 10.48550/arXiv.2307.01379.
[20] Schulman J, Wolski F, Dhariwal P, et al. Proximal policy optimization algorithms[J]. arXiv:1707.06347. DOI: 10.48550/arXiv.1707.06347.
[21] Yan H, Gao Y, Fei C, et al. Data and model architecture in base model training[C]//Proceedings of the 22nd Chinese National Conference on Computational Linguistics (Volume 2: Frontier Forum). 2023:1-15.
[22] Zhang KL, Ren XH, Zhuang L, et al. Research and construction of Chinese medicine knowledge base[J]. Journal of Chinese Information Processing, 2022,36(10):45-53. DOI: 10.3969/j.issn.1003-0077.2022.10.005.
Notes
All authors declare no conflict of interest.