Explaining the Weak Relationship Between Job Performance and Ratings of Job Performance

Published online by Cambridge University Press: 07 January 2015

Kevin R. Murphy*
Affiliation: The Pennsylvania State University
* E-mail: krm10@psu.edu. Address: Department of Psychology, The Pennsylvania State University, Moore Building, University Park, PA 16802

Abstract

Ratings of job performance are widely viewed as poor measures of job performance. Three models of the performance–performance rating relationship offer very different explanations and solutions for this seemingly weak relationship. One-factor models suggest that measurement error is the main difference between performance and performance ratings and they offer a simple solution—that is, the correction for attenuation. Multifactor models suggest that the effects of job performance on performance ratings are often masked by a range of systematic nonperformance factors that also influence these ratings. These models suggest isolating and dampening the effects of these nonperformance factors. Mediated models suggest that intentional distortions are a key reason that ratings often fail to reflect ratee performance. These models suggest that raters must be given both the tools and the incentive to perform well as measurement instruments and that systematic efforts to remove the negative consequences of giving honest performance ratings are needed if we hope to use performance ratings as serious measures of job performance.
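
As a rough illustration of the one-factor logic, the classical (Spearman) correction for attenuation divides an observed correlation by the square root of the product of the two measures' reliabilities. The sketch below assumes that standard formula; the function name and the numeric values are hypothetical and are not taken from the article.

```python
import math


def correct_for_attenuation(r_observed, rel_x, rel_y=1.0):
    """Classical correction for attenuation.

    r_observed: observed correlation between two measures
    rel_x, rel_y: reliability estimates for each measure, in (0, 1]
    Returns the estimated correlation between the underlying true scores.
    """
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must be in (0, 1]")
    return r_observed / math.sqrt(rel_x * rel_y)


# Hypothetical values, chosen only to make the arithmetic easy to follow:
# an observed predictor-rating correlation of 0.30 and a rating
# reliability of 0.36, correcting for unreliability in the ratings only.
corrected = correct_for_attenuation(0.30, 0.36)
print(round(corrected, 2))
```

Under a one-factor model this single adjustment is the whole remedy. The multifactor and mediated models described in the abstract imply that systematic nonperformance influences and intentional distortion would remain after such a correction.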

Type
Focal Article
Copyright
Copyright © Society for Industrial and Organizational Psychology 2008 
