Respond by Day 6 to at least two of your colleagues’ postings in one or more of the following ways:
- Ask a probing question.
- Share an insight from having read your colleague’s posting.
- Offer and support an opinion.
- Validate an idea with your own experience.
- Make a suggestion.
- Expand on your colleague’s posting.
Classmate 1 (Stephanie):
“Job performance evaluations are a valued tool utilized by many organizations. The evaluations aid decisions about whether to reevaluate a position and its responsibilities, whether a team member has earned a raise, how a team member is progressing in a role, etc. Typically, evaluations are conducted by supervisors, managers, team leads, etc., and sometimes multiple evaluations are conducted and combined into a common assessment. This can introduce rater errors that make combining the evaluations problematic.
In an ideal evaluation process, every rater would agree on every aspect and the process would be swift and painless, but that is not the case for most job performance evaluations. My organization conducts observational evaluations in which skills are assessed, as well as behaviors and execution of duties. We use these observations to assess progression in a position and to decide whether the person is a good fit for the position or whether we need to move them somewhere more appropriate to their skill level. One way to combine multiple raters’ evaluations would be to use a form that assesses quality characteristics such as consistency of work, quality of work, communication, etc. These items would be measured on a Likert scale with the choices poor, satisfactory, good, and excellent. Fisher, Weiss, and Dawis (1968) state that different methods of scaling applied to the same data result in differing conclusions. For our evaluation to be reliable, the scaling method must therefore be the same, and applied consistently, across raters.
Using multiple raters in the evaluation process leaves more room for rater error. Anastasi and Urbina (1997) list multiple rater errors that affect the process, including the influence of the evaluated person’s traits, reluctance to give unfavorable scores, etc. McCrae (1994) goes so far as to favor self-report questionnaires over observer ratings, stating that they are convenient and that researchers believe individuals have more knowledge of their own behavior and feelings than external observers do. He does note that those who prefer observer ratings believe them to be more objective and less susceptible to falsification (McCrae, 1994).
One suggestion for improving the reliability of a multi-rater assessment is to keep the number of raters low. Howard suggests that “the impact of adding raters is smaller than improving measures” and that additional raters do not improve the accuracy of the rating. Another suggestion is to utilize the power of observation. Rothstein (1990) states that observation of worker performance is necessary for reliable and accurate ratings. Seeing employee behaviors and execution of job duties firsthand benefits the evaluation process because it leaves little room for error; what you see is just that.”
Classmate 2 (Kimberly):
“Performance assessments are used to measure performance in education, work, and everyday life. The traditional performance review that receives the most attention consists of a one- or two-dimensional process. In a one-dimensional review, a supervisor reviews an employee’s performance and tells the employee how he or she did. While some supervisor/employee dialogue may occur, it is limited to the review process. In a two-dimensional review, both the supervisor and employee review the employee’s performance, discuss the results of the two evaluations, consider each other’s input to the discussion, and plan the performance emphasis for the future. Final accountability for the review rests with the supervisor (Wellins, Byham, & Milson, 1991).
Measurement experts use the word construct when they talk about the “concept or characteristic that a test is designed to measure” (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 2014, p. 11), and performance assessment is a better-suited approach to measuring some constructs.
Multi-rater performance review offers a wide range of communication opportunities, which provide multiple chances for positive assistance. Systematic changes at every level of the organization are key to multi-rater feedback and imply delegating leadership within the organization. This means taking away supervisors’ responsibility for the primary performance review. To retain supervisory support for multi-rater performance review, supervisory jobs must have duties added to replace the performance review accountability that is delegated to multiple sources of performance review—e.g., facilitation of the performance review process or a focus on more strategic issues as day-to-day operational issues are handled more by teams (Schuster & Zingheim, 1992).
Psychometrics depends on models that represent test takers’ performance as an overall score or a category (e.g., pass-fail) and that translate test takers’ responses into their standing on the construct(s) of interest. According to Tucker (2015), performance assessment presents challenges for those models. Standard psychometric models were developed for multiple-choice assessments, with many discrete test items that are scored as either right or wrong (i.e., scored dichotomously) and that are organized into tests designed to measure one clearly defined construct. Performance assessment is complex and not so easily modeled, which makes it more difficult to confidently associate test-taker performance with a score or a performance category (Tucker, 2015).”