Zhang Quanhui; He Ju; Ren Jie; Zhang Ying; Lu Yan

doi:10.3760/cma.j.cn115259-20210817-01034

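• Medical Education Assessment Column •

A comparative study of equating methods applied in standardized competence test for clinical medicine undergraduates

Chinese Journal of Medical Education, 2022, 42(7): 577-580. DOI: 10.3760/cma.j.cn115259-20210817-01034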

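Abstract

Objective

To analyze examinee responses from two annual administrations of the Standardized Competence Test for Clinical Medicine Undergraduates (hereafter, the competence test) using equating methods under classical test theory (CTT) and item response theory (IRT), and to identify the equating method best suited to this test.

Methods

Under CTT, four methods were used: Tucker observed-score linear equating, Levine observed-score linear equating, equipercentile equating, and smoothed equipercentile equating. Under IRT, three calibration methods (separate calibration, concurrent calibration, and fixed common-item parameter calibration) were each applied to the one-parameter and two-parameter models. The stability of the resulting 10 sets of equating results was assessed by the standard error of equating.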
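As a rough illustration of the first CTT method named above, the sketch below implements Tucker observed-score linear equating for a common-item (anchor) design using the standard textbook formulas. The function name, argument names, and toy scores are assumptions made for illustration; the article does not describe its software or data layout, so this should not be read as the authors' implementation.

```python
from statistics import fmean


def _var(a):
    """Population variance of a sequence."""
    m = fmean(a)
    return fmean([(v - m) ** 2 for v in a])


def _cov(a, b):
    """Population covariance of two equal-length sequences."""
    ma, mb = fmean(a), fmean(b)
    return fmean([(p - ma) * (q - mb) for p, q in zip(a, b)])


def tucker_linear_equating(x, vx, y, vy, w1=None):
    """Return a function mapping a Form X score onto the Form Y scale.

    x, vx: total and anchor scores of group 1 (took Form X)
    y, vy: total and anchor scores of group 2 (took Form Y)
    w1:    weight of group 1 in the synthetic population
           (assumed default: its share of the combined sample)
    """
    if w1 is None:
        w1 = len(x) / (len(x) + len(y))
    w2 = 1.0 - w1

    # Regression slopes of total score on anchor score within each group
    g1 = _cov(x, vx) / _var(vx)
    g2 = _cov(y, vy) / _var(vy)

    # Group differences on the anchor: mean and variance
    d_mean = fmean(vx) - fmean(vy)
    d_var = _var(vx) - _var(vy)

    # Means and variances of X and Y in the synthetic population
    mu_x = fmean(x) - w2 * g1 * d_mean
    mu_y = fmean(y) + w1 * g2 * d_mean
    var_x = _var(x) - w2 * g1**2 * d_var + w1 * w2 * g1**2 * d_mean**2
    var_y = _var(y) + w1 * g2**2 * d_var + w1 * w2 * g2**2 * d_mean**2

    slope = (var_y / var_x) ** 0.5
    return lambda score: slope * (score - mu_x) + mu_y


# Toy data (hypothetical): both groups perform identically on the anchor,
# but Form Y totals run 5 points higher, so Form Y is the easier form.
x, vx = [50, 60, 70, 80], [10, 12, 15, 18]
y, vy = [55, 65, 75, 85], [10, 12, 15, 18]
equate = tucker_linear_equating(x, vx, y, vy)
print(equate(65))  # 70.0 for these toy data
```

Because the two toy groups are equal on the anchor, the whole 5-point gap is attributed to form difficulty, and a Form X score of 65 maps to 70 on the Form Y scale. With real data the anchor differences are nonzero and the synthetic-population weights start to matter.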

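Results

The equating error of the CTT methods ranged from 0.7 to 1.6 and that of the IRT methods from 0.2 to 0.6, so the IRT methods had smaller errors. Among the CTT methods, Tucker observed-score linear equating had the smallest error (0.7) and smoothed equipercentile equating the largest (1.6). Among the IRT methods, the one-parameter model gave better equating results than the two-parameter model; within the one-parameter model, fixed common-item parameter calibration had the smallest error (0.2).

Conclusions

Fixed common-item parameter calibration under the IRT one-parameter model can be chosen for equating the competence test. After equating, year-2 scores were adjusted upward while the passing standard remained unchanged, which effectively made scores comparable and safeguarded the fairness of the test.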

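Cite this article: Zhang Quanhui, He Ju, Ren Jie, et al. A comparative study of equating methods applied in standardized competence test for clinical medicine undergraduates [J]. Chinese Journal of Medical Education, 2022, 42(7): 577-580. DOI: 10.3760/cma.j.cn115259-20210817-01034.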

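Copyright belongs to the Chinese Medical Association.

Without authorization, articles in this journal may not be reproduced or excerpted, and the journal's layout design may not be used.

Unless otherwise stated, articles published in this journal do not represent the views of the Chinese Medical Association or the journal's editorial board.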

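The Standardized Competence Test for Clinical Medicine Undergraduates (hereafter, the competence test) is an academic achievement test for full-time undergraduate students of clinical medicine at medical schools. It was jointly initiated by the National Medicine Examination Center and the National Center for Health Professions Education Development and was first formally administered in 2020. To keep the test scientifically sound and to make scores fair across years, equating schemes were explored from the test's initial design. Equating uses educational measurement models to convert scores from different years onto a common scale so that they are interchangeable and comparable; it removes the effect of differences in difficulty on scores, thereby safeguarding fairness and supporting sound score reporting. As a key component of examination evaluation, equating has long received wide attention, and current research rests on two theoretical frameworks: classical test theory (CTT) and item response theory (IRT). Research and practice on equating began earlier abroad and later in China. Over the past five years, domestic work on equating has been concentrated in education, covering both equating theory^[1,2,3] and its application^[4,5,6]. In medical examinations, studies of equating theory and application are rare, and only a few examinations have used IRT for item analysis^[7,8]. Those studies applied CTT and IRT only to item analysis and did not address equating in practice, so introducing equating into medical examinations and comparing equating methods is much needed. This study applies different equating models under both the CTT and IRT frameworks to examinee response data, with the aim of identifying the best equating approach for the competence test.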

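Contributor Information

Zhang Quanhui

Department of Information and Assessment, National Medicine Examination Center, Beijing 100097, China

He Ju

National Medicine Examination Center, Beijing 100097, China

Ren Jie

Institute of Language Testing and Talent Evaluation, Beijing Language and Culture University, Beijing 100083, China

Zhang Ying

Department of Examination Management, National Medicine Examination Center, Beijing 100097, China

Lu Yan

Department of Development Research, National Medicine Examination Center, Beijing 100097, China

Corresponding author

Lu Yan

Department of Development Research, National Medicine Examination Center, Beijing 100097, China

Email: luyan810206@163.com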

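Keywords

Clinical medicine; Competence test; Classical test theory; Item response theory; Equating

Author statements

Author contributions

Zhang Quanhui: drafting the article, conducting the study, and interpreting the data; He Ju and Zhang Ying: critical review; Ren Jie: statistical analysis; Lu Yan: study design and data collection

Conflicts of interest

All authors declare no conflicts of interest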

History

Published: 2022-07-01

Received: 2021-08-17

Editor

Yuan Wei

A comparative study of equating methods applied in standardized competence test for clinical medicine undergraduates

Zhang Quanhui, He Ju, Ren Jie, Zhang Ying, Lu Yan

Published 2022-07-01

Cite as Chin J Med Edu, 2022, 42(7): 577-580. DOI: 10.3760/cma.j.cn115259-20210817-01034

Abstract

Objective

This study analyzes equating methods based on classical test theory (CTT) and item response theory (IRT) as applied to the Standardized Competence Test for clinical medicine undergraduates, in order to identify a more suitable equating method.

Methods

The study uses four equating methods based on CTT and six based on IRT. The CTT methods are the Tucker observed-score linear method, the Levine observed-score linear method, the unsmoothed equipercentile method, and the smoothed equipercentile method. For the one-parameter and two-parameter IRT models, three calibration methods are used: separate calibration with linking, concurrent calibration, and fixed item parameter calibration. The stability of the 10 sets of equating results is assessed by the standard error of equating.
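To make the idea of separate calibration with linking concrete for the one-parameter model, the sketch below uses a simple mean-shift link. Under the one-parameter (Rasch) model the ability scale is fixed only up to an additive constant, so two separately calibrated forms can be aligned by shifting one set of difficulties until the common items have the same mean. The function names and difficulty values are hypothetical, and the article does not state which linking procedure was actually used.

```python
from math import exp
from statistics import fmean


def rasch_probability(theta, b):
    """Probability of a correct response under the one-parameter (Rasch) model."""
    return 1.0 / (1.0 + exp(-(theta - b)))


def mean_shift_constant(anchor_b_base, anchor_b_new):
    """Constant that moves the new calibration onto the base-year scale.

    Both lists hold difficulty estimates of the same common items,
    in the same order, from the two separate calibrations.
    """
    return fmean(anchor_b_base) - fmean(anchor_b_new)


def link_to_base_scale(values, shift):
    """Apply the shift to difficulties (or abilities) from the new calibration."""
    return [v + shift for v in values]


# Hypothetical difficulty estimates for three common items
anchor_base = [-0.5, 0.0, 0.7]   # from the base-year calibration
anchor_new = [-0.8, -0.3, 0.4]   # same items, new-year calibration

shift = mean_shift_constant(anchor_base, anchor_new)
linked = link_to_base_scale(anchor_new, shift)
print(round(shift, 3), [round(b, 3) for b in linked])
```

Fixed item parameter calibration pursues the same goal differently: the common items' difficulties are held at their base-year values while the new form is estimated, so no separate shift step is needed.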

Results

The standard error of equating is 0.7 to 1.6 for the CTT methods and 0.2 to 0.6 for the IRT methods, so the IRT methods have smaller errors. Among the four CTT methods, the Tucker observed-score linear method has the smallest standard error (0.7) and the smoothed equipercentile method has the largest (1.6). Among the six IRT methods, the one-parameter model gives better results than the two-parameter model, and within the one-parameter model fixed item parameter calibration has the smallest standard error (0.2).

Conclusions

Fixed item parameter calibration under the one-parameter IRT model can be selected as the equating method for this test. After equating, year-2 scores are adjusted upward while the passing standard remains unchanged, which effectively achieves score comparability and ensures the fairness of the test.

Key words:

Clinical medicine; Competence test; Classical test theory; Item response theory; Equating

Contributor Information

Zhang Quanhui

Department of Information and Assessment, National Medicine Examination Center, Beijing 100097, China

He Ju

National Medicine Examination Center, Beijing 100097, China

Ren Jie

Institute of Language Testing and Talent Evaluation, Beijing Language and Culture University, Beijing 100083, China

Zhang Ying

Department of Examination Management, National Medicine Examination Center, Beijing 100097, China

Lu Yan

Department of Development Research, National Medicine Examination Center, Beijing 100097, China
