This blog post is the second in a series presented by Evidence Based Education, creators of Assessment Academy. The blog series explores the four pillars of great assessment: purpose, validity, reliability and value.
There is no such thing as a valid assessment!
Validity is perhaps the most commonly used word in discussions about the quality of any assessment. Yet, despite its frequent use, it is often misunderstood and can be very misleading.
A common misconception about validity is that it is a property of an assessment itself. In reality, there is no such thing as ‘a valid assessment’. There is, however, such a thing as ‘an assessment which is valid for a specific purpose’: validity is all about the inferences you make based on the information generated.
Two key questions
Researchers such as Samuel Messick (1989) have suggested there are two key questions to be asked of any assessment:
The scientific question (technical accuracy): Is the test any good as a measure of the big idea, characteristic, or attribute it purports to assess?
The ethical question (social value): Should the test be used for its present purpose?
In many cases, there are two reasons that assessments end up not quite hitting their target: construct under-representation and construct-irrelevant variance.
Construct under-representation: the assessment fails to capture important aspects of the construct (the target of the assessment). Examples include:
a German assessment of applying verb endings correctly which only tests the present tense
a maths assessment of simplifying and manipulating algebraic expressions that does not test expanding products of two or more binomials
Construct-irrelevant variance: the assessment outcomes are influenced by things other than just the construct. Examples include:
in the German assessment mentioned above, inaccessible vocabulary used in the questions affects the measurement of the intended construct
in the maths assessment mentioned above, a question asks the pupil to first work out a percentage. Although percentages are a mathematical concept, the question is no longer assessing only the intended topic (manipulating algebraic expressions).
When we talk of validity and great assessments, we are referring to the assessment’s ability to support the claims we want to make based on the information generated.
One of the key validity checks we can do when assessing the quality of an assessment is to consider: is there either construct under-representation or construct-irrelevant variance in this assessment? Defining the construct – saying what is and isn’t included in it – is a vital part of a robust assessment process. It is one way in which we can avoid construct under-representation and construct-irrelevant variance.
Ensuring that an appropriate and meaningful range of marks is used to represent performance at particular levels of achievement is another aspect of improving the validity of an assessment. If there are 50 marks available on an assessment task, but no student is awarded more than 35 marks or less than 20, is the assessment really out of 50?
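The mark-range check above can be sketched in a few lines of Python. This is purely a hypothetical illustration: the function name, the awarded marks, and the maximum of 50 are all invented for the example.

```python
# Hypothetical check: does an assessment actually use its full mark range?
# The mark data below are invented for illustration only.

def effective_range(marks, max_available):
    """Return (span of marks actually awarded, span nominally available)."""
    used = max(marks) - min(marks)
    return used, max_available

# Marks awarded on a task nominally out of 50
marks_awarded = [22, 25, 28, 30, 31, 33, 35, 20, 27, 29]

used, available = effective_range(marks_awarded, 50)
print(f"Marks span {min(marks_awarded)}-{max(marks_awarded)}: "
      f"only {used} of the {available} available marks are in use")
```

If the awarded marks cluster between 20 and 35, as here, the task behaves more like a 15-mark assessment than a 50-mark one, which is exactly the concern raised above.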
Assessment validity is all about the inferences you make based on the information generated. It is therefore important to ask: does this assessment allow you to make valid inferences?
Validity and reliability form the foundation of great assessment and should be considered side by side. In the next blog post in this series, we will explore reliability and its relationship to validity.