British Journal of Educational Psychology

Volume 27 Issue 1 (February 1957), Pages 1-75



  • 1.—
    A number of essay‐type English examination papers were administered on different occasions to groups of eleven‐year‐old children. Seven examiners were employed in marking the scripts, and a complex experimental design enabled a variety of statistical analyses to be carried out.
  • 2.—
    In another area the validity of similar papers was investigated by comparing the results with a criterion of success in secondary schools after the children concerned had completed two years of their course. The correlations obtained were also compared with those between an objective test of English and the criterion.
  • 3.—
    The results of the reliability investigation indicated that:
  • (a) 
    There are differences between examiners with respect to the mean mark and the variance of marks awarded for English papers of the essay type.
  • (b) 
    The seven papers given varied in their level of difficulty, as judged by the mean mark awarded.
  • (c) 
    English papers of the type described are subject to practice effects.
  • (d) 
    There is clear evidence from this experiment that examiners and children have idiosyncrasies which interact.
  • (e) 
    The error variance attributable to examiners is greatly reduced on the more objective English question paper (Part II) compared with the essay (Part I), although the fluctuation due to the children alone is of almost the same relative size in both parts.
  • (f) 
    There were significant differences between examiners regarding their consistency of marking.
  • (g) 
    Compared with the figures of 0·980 for the strictly objective Moray House English tests and 0·970 for the slightly less objective National Foundation English tests, the reliability of the ‘old type’ English paper considered here was found to be approximately 0·88.
  • (h) 
    The equivalent test‐retest reliabilities for the essay alone were 0·766 when the same examiner re‐marked and 0·719 when a different examiner re‐marked.
  • (j) 
    The upper limit to the reliability of the ‘older type’ of English paper, including an essay, is set by the figure of 0·923 calculated on the assumption of perfect marking.
  • (k) 
    Slightly over half the children opted to write on a different essay topic when attempting the same paper a second time. An analysis of the children's choices, however, showed that for some of the papers their choice was severely limited by the inclusion of one or more unpopular topics.
  • 4.—
    The results of the validity investigation indicated that the ‘old type’ English examination provides a significantly less satisfactory forecast of subsequent success in secondary schools than is obtained from the results of an objective test.
  • 5.—
    The introduction of new types of item into objective tests is suggested as a means of creating a less undesirable ‘backwash’ on the primary school curriculum. Tests of this kind, it is claimed, could maintain the high reliability and validity of the more conventional objective tests.
