In a typical ability test, groups of items are administered, and the number of items an examinee answers correctly is used to estimate his/her ability. The more items an individual answers correctly, the greater his/her ability is assumed to be. However, because everyone responds to every item, most examinees are administered items that are either too easy or too difficult. Adding these items to the test is similar to adding constants to a score; they provide relatively little information about the examinee's ability level.
In Computer Adaptive Testing (CAT), the estimated ability level of the examinee is used to predict the probability of getting an item correct. Because nothing is known about an examinee at the outset, he/she is assumed to be of average ability, so CAT begins by administering an item of average difficulty. An examinee who correctly answers the first item is then given a more difficult item; if that item is answered correctly, the computer administers an even more difficult item. Conversely, an examinee who gets the first item wrong is administered an easier question. In short, the computer interactively adjusts the difficulty of the items to the examinee's responses.
CAT consistently administers items appropriate to the examinee, which maximizes the information gained about the examinee's level of ability. CAT is commonly based upon item response theory, which enables the test developer to calculate the reliability of a test taker's ability estimate. CAT stops administering items when certain criteria are met, such as when the standard error of the ability estimate falls below a set threshold, indicating a reliable assessment. Other stopping criteria include time and the number of items administered.
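The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular CAT product works: it assumes the one-parameter (Rasch) item response model, picks the unused item with the most information at the current ability estimate, and re-estimates ability with a simple grid-based posterior mean under a standard normal prior. The function names, the 0.4 standard error threshold and the 30-item cap are all made up for the example.

```python
import math

def p_correct(theta, b):
    """Rasch (1PL) probability that ability theta answers an item of difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def item_information(theta, b):
    """Information a Rasch item provides at theta; largest when b is close to theta."""
    p = p_correct(theta, b)
    return p * (1.0 - p)

def estimate_ability(responses, grid):
    """Posterior mean and standard deviation of ability on a grid (standard normal prior)."""
    weights = []
    for theta in grid:
        w = math.exp(-0.5 * theta * theta)  # unnormalized prior density
        for b, correct in responses:
            p = p_correct(theta, b)
            w *= p if correct else (1.0 - p)
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(grid, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, weights)) / total
    return mean, math.sqrt(var)

def run_cat(pool, answer, se_threshold=0.4, max_items=30):
    """Administer items adaptively; `answer(b)` returns True if the item was answered correctly."""
    grid = [i / 10.0 for i in range(-40, 41)]  # ability scale from -4 to 4 (assumed)
    remaining = list(pool)
    responses = []
    theta, se = 0.0, 1.0  # start by assuming average ability
    while remaining and len(responses) < max_items and se > se_threshold:
        # Choose the unused item that is most informative at the current estimate.
        b = max(remaining, key=lambda d: item_information(theta, d))
        remaining.remove(b)
        responses.append((b, answer(b)))
        theta, se = estimate_ability(responses, grid)
    return theta, se, responses

# Hypothetical usage: a pool of 61 difficulties and a simulated examinee
# who answers every item easier than 1.0 correctly.
pool = [i / 10.0 for i in range(-30, 31)]
theta, se, responses = run_cat(pool, answer=lambda b: b < 1.0)
print(len(responses), round(theta, 2), round(se, 2))
```

The test ends as soon as either stopping rule is met, so a well-targeted examinee typically sees far fewer items than the pool contains. A time limit, the third criterion mentioned above, could be added as one more condition on the loop.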
One basic requirement of CAT is that the content domain be one-dimensional. In other words, CAT can only be used to measure a single ability or skill. Where multiple skills/abilities need to be assessed, it is necessary to develop a separate CAT for each domain.
Assuming there is only one skill/ability to be assessed, the challenge that remains is development of a high-quality item pool. CAT developers must ensure that the test measures the examinee's true ability level. Because examinees (i.e., applicants) may be of high or low ability levels, the CAT must be able to assess across the entire range of ability represented in the applicant population. This is accomplished by the development of many items for low-ability examinees, average-ability examinees and high-ability examinees (as well as points in between). Some have argued that an effective CAT can be developed with only 100 high-quality items distributed evenly across ability levels, although more items are always preferred. For very "high stakes" exams, or those covering a very broad domain, many more items may be necessary.
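To see why the spread of difficulties in the pool matters, here is a rough sketch that compares two hypothetical 100-item pools. It again assumes the Rasch model, under which an item's information at a given ability is p(1 - p), and it assumes an ability scale running from about -3 to 3; neither assumption comes from the text above, and the helper names are invented for the example.

```python
import math

def rasch_information(theta, b):
    """Information of a Rasch item of difficulty b at ability theta."""
    p = 1.0 / (1.0 + math.exp(-(theta - b)))
    return p * (1.0 - p)

def pool_information(theta, difficulties):
    """Total information the whole pool could provide at ability theta."""
    return sum(rasch_information(theta, b) for b in difficulties)

def spaced_pool(n_items, low, high):
    """Difficulties spread evenly between low and high (inclusive)."""
    step = (high - low) / (n_items - 1)
    return [low + i * step for i in range(n_items)]

# Two hypothetical 100-item pools: one covering the full ability range,
# one clustered around average difficulty.
even_pool = spaced_pool(100, -3.0, 3.0)
clustered_pool = spaced_pool(100, -0.5, 0.5)

for theta in (-2.5, 0.0, 2.5):
    print(theta,
          round(pool_information(theta, even_pool), 1),
          round(pool_information(theta, clustered_pool), 1))
```

In this sketch the clustered pool offers more information for average examinees but less at the extremes, which is the trade-off an evenly distributed pool is meant to avoid.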