How does NBPME demonstrate that the examinations are legally defensible?
Adherence to Professional Standards
The Standards for Educational and Psychological Testing (1999) is a comprehensive technical guide that provides criteria for evaluating tests, testing practices, and the effects of test use. It was developed jointly by the American Psychological Association (APA), the American Educational Research Association (AERA), and the National Council on Measurement in Education (NCME). By professional consensus (including a review by the National Board of Medical Examiners, NBME), the guidelines presented in the Standards have come to define the necessary components of quality testing. As stated in Standard 14.14, “The content domain to be covered by a credentialing test should be defined clearly and justified in terms of the importance of the content for credential-worthy performance in an occupation or profession. A rationale should be provided to support a claim that the knowledge or skills being assessed are required for credential-worthy performance in an occupation and are consistent with the purpose for which the licensing or licensure program was instituted… Some form of job or practice analysis provides the primary basis for defining the content domain…” (p. 161, Standards for Educational & Psychological Testing).
NBPME conducts two separate practice analysis studies, one for Parts I and II and a second for Part III. The studies are completed at five-year intervals. Podiatric physicians are surveyed on the knowledge and skills necessary for practice. That information is used to determine the content to be tested and the percentage of questions in each content area. Practice analysis studies were most recently conducted in 2015 for Part III and in 2016 for Parts I and II.
American Educational Research Association, American Psychological Association, National Council on Measurement in Education, & Joint Committee on Standards for Educational and Psychological Testing (US). (1999). Standards for educational and psychological testing. American Educational Research Association.
Parts I and II
- Podiatric Medical College Faculty submit questions (with accompanying references) to Prometric.
- The pool of questions (items) is reviewed by a panel of practicing podiatric physicians and two podiatric medical school faculty members in each content area.
- The primary responsibility of the faculty is to check each item for clarity and to confirm that its reference is current.
- The primary responsibility of the DPM members is to judge each item on three points:
  - its relationship to the tasks performed by a DPM in practice.
  - its priority with regard to importance in practice.
  - its estimated difficulty (easy, medium, or hard).
- For Part I, the questions (items) also are reviewed by a content specialist in one of the basic sciences for accuracy and currency. This individual is a medical school faculty member.
- Prometric assembles the test from approved questions according to the content specification.
Items are written and reviewed by panels of DPM practitioners who have been trained to prepare effective test items. A second panel of DPMs reviews each form of the test before it is published.
Post Test Administration
Double Scoring Ensures Accuracy
At the examinee level, each computer-based test undergoes two independent scorings. Each test is first scored at the testing site and subsequently rescored when the data arrive back at Prometric. If the scores do not match exactly, the examinee’s record is held until the results can be reconciled. Irregularities that may have occurred at the testing site are also noted, and any examinees who may have experienced irregular testing conditions (such as hardware or software failures or power interruptions) receive a thorough review of their responses. Scores for these examinees are not released until all irregular conditions have been considered and the resolution processing rules have been applied fairly, to ensure equity in the test administration process.
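The hold-and-release logic described above can be sketched in a few lines of Python. This is only an illustration: the `ExamRecord` fields, the `release_status` function, and the status strings are hypothetical names chosen for this example and do not describe Prometric's actual systems.

```python
from dataclasses import dataclass


@dataclass
class ExamRecord:
    """Hypothetical record for one examinee's test administration."""
    examinee_id: str
    site_score: int        # score computed at the testing site
    central_score: int     # score recomputed after the data are returned
    irregularity: bool = False  # e.g. hardware failure or power interruption


def release_status(record: ExamRecord) -> str:
    """Decide whether a score can be released or must be held for review."""
    # The two independent scorings must match exactly.
    if record.site_score != record.central_score:
        return "hold: score mismatch"
    # Records with reported irregular conditions are reviewed before release.
    if record.irregularity:
        return "hold: irregularity review"
    return "release"
```

Under these assumptions, a record is released only when both scorings agree and no irregular testing condition was reported.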
Each item is statistically analyzed to determine how many candidates answered it correctly and whether it discriminated between high-scoring and low-scoring candidates (that is, whether the high-scoring candidates tended to answer correctly and the low-scoring candidates did not). If the analysis flags an item, content experts review that item for accuracy.
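As a rough illustration of this kind of item analysis, the Python sketch below computes two classical statistics for each item: the proportion of candidates answering correctly, and an upper-lower discrimination index (the proportion correct in the highest-scoring group minus the proportion correct in the lowest-scoring group). The function names, the 27% group size, and the flagging thresholds are assumptions made for this example; the source does not state which statistics or cutoffs are actually used.

```python
def item_analysis(responses, group_fraction=0.27):
    """Compute difficulty and discrimination for each item.

    responses: one list per candidate, each holding 0/1 scores per item.
    group_fraction: share of candidates in the high and low groups
                    (0.27 is a common convention, assumed here).
    """
    n_candidates = len(responses)
    n_items = len(responses[0])
    # Rank candidates by total score, highest first.
    ranked = sorted(responses, key=sum, reverse=True)
    group_size = max(1, round(n_candidates * group_fraction))
    high, low = ranked[:group_size], ranked[-group_size:]

    results = []
    for i in range(n_items):
        p = sum(r[i] for r in responses) / n_candidates  # proportion correct
        p_high = sum(r[i] for r in high) / group_size
        p_low = sum(r[i] for r in low) / group_size
        results.append({"item": i, "p": p, "d": p_high - p_low})
    return results


def flag_items(results, min_p=0.20, max_p=0.95, min_d=0.15):
    """Return indices of items to send for expert review.

    The thresholds are illustrative assumptions, not published criteria.
    """
    return [
        r["item"]
        for r in results
        if r["p"] < min_p or r["p"] > max_p or r["d"] < min_d
    ]
```

An item that low-scoring candidates answer correctly more often than high-scoring candidates gets a negative discrimination index and would be flagged under these assumed thresholds.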
At the conclusion of the above analyses, and after the scores are mailed to the candidates, each dean receives a report that compares the performance of first-time candidates at that school with the national examination data.
Reliability refers to the consistency of test scores, the consistency with which candidates are classified as either passing or failing, and the degree to which test scores are free from errors of measurement. Errors of measurement result from factors unrelated to the test itself, such as fatigue, heightened attention, personal interests, and other characteristics of the candidate. Because of measurement error, a person’s score will not be perfectly consistent from one occasion to the next.
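One common way to quantify these ideas for a test of right/wrong items is the Kuder-Richardson 20 (KR-20) coefficient, together with the standard error of measurement (SEM) derived from it. The Python sketch below is offered only as an illustration of the concept; the source does not say which reliability index is reported for these examinations, and the function names are chosen for this example.

```python
from math import sqrt
from statistics import pvariance


def kr20(responses):
    """KR-20 reliability for dichotomous (0/1) item scores.

    responses: one list per candidate, each holding 0/1 scores per item.
    """
    n = len(responses)
    k = len(responses[0])
    totals = [sum(r) for r in responses]
    var_total = pvariance(totals)  # population variance of total scores
    if k < 2 or var_total == 0:
        raise ValueError("need at least two items and some score variance")
    # Sum of item variances p * (1 - p) across all items.
    sum_pq = 0.0
    for i in range(k):
        p = sum(r[i] for r in responses) / n
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)


def standard_error_of_measurement(responses):
    """SEM = SD of total scores * sqrt(1 - reliability)."""
    totals = [sum(r) for r in responses]
    return sqrt(pvariance(totals)) * sqrt(1 - kr20(responses))
```

A reliability near 1 corresponds to a small SEM, meaning a candidate's score would be expected to change little from one occasion to the next.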
Determination of Passing Scores
The National Board and its test consultant, Prometric, determine passing scores using the Angoff Method, a widely accepted criterion-referenced approach. The important feature of criterion-referenced standard setting is that it is based on an expected level of competence, regardless of how many candidates in a particular group pass or fail. This differs from a norm-referenced approach, in which a set proportion of test takers falls above or below the passing score.
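In its basic form, the Angoff Method asks each judge to estimate, for every item, the probability that a minimally competent candidate would answer correctly; the passing score is then the sum, across items, of the judges' average estimates. The Python sketch below shows that basic calculation. It is an illustration under stated assumptions: the source does not describe which variant of the method is used, and the function names and sample ratings are hypothetical.

```python
def angoff_cut_score(ratings):
    """Compute a raw passing score from judges' Angoff ratings.

    ratings[j][i] is judge j's estimated probability (0.0 to 1.0) that a
    minimally competent candidate answers item i correctly.
    """
    n_judges = len(ratings)
    n_items = len(ratings[0])
    # Average the judges' estimates for each item.
    item_means = [
        sum(judge[i] for judge in ratings) / n_judges
        for i in range(n_items)
    ]
    # The expected raw score of a minimally competent candidate.
    return sum(item_means)


def passes(raw_score, cut_score):
    """Criterion-referenced decision: depends only on the candidate's own
    score and the fixed standard, not on how other candidates performed."""
    return raw_score >= cut_score
```

For example, with two judges rating three items as `[[0.5, 0.75, 1.0], [0.5, 0.25, 0.5]]`, the item averages are 0.5, 0.5, and 0.75, giving a raw cut score of 1.75 out of 3. Because the standard is fixed in advance, every candidate in a group could pass, or none could.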