| The Difference Between Validity and Reliability
Reflection Week 8, Linda Norman |
| When choosing or authoring a test, it is important to evaluate the validity and reliability of the assessment. Linn and Gronlund, in Measurement and Assessment in Teaching, define validity as an "evaluation of the adequacy and appropriateness of the interpretations and uses of assessment results." They explain reliability as the consistency of assessment results. For an assessment to be effective, both conditions are necessary. Teachers, when choosing an assessment, often neglect to evaluate the validity or reliability of the tool.
An assessment must address its intended use. When measuring validity, one must look at the content, construct, criterion, and consequences. If the assessment does not cover the content that is being measured, the results will not help interpret what is being accomplished.
I do not give weekly assessments in my Learning Center. However, a district team that I was on wrote assessments to be given in May. These technology assessments measure what we think each seventh and eighth grader should know at the end of the year. To assure validity, we looked at our content to make sure that we were assessing what we were teaching in seventh and eighth grade. The following summer we reviewed the assessments to see if there were any unintended effects. We found several; for example, one tech standard did not fit naturally into the seventh grade curriculum. So, we revised the tests for the following school year.
Validity matters because it allows one to make reasonably accurate predictions from the assessment. If we had not evaluated the assessments the following summer, our tests would not have helped us understand whether the technology content was being delivered effectively and appropriately. For instance, the graphing skill that did not fit naturally into the seventh grade curriculum was moved to eighth grade, where it is used in math class because graphing is part of that curriculum. The consequence of the tech content not matching the classroom objectives had been that students had no practical application of graphing in their classroom content.
Reliability was important in our testing because two different buildings were using the same assessment. We needed consistency in how the assessments were administered, how they were scored, and how the scores were used. I feel this is the one area of the assessments that we have not totally achieved. It has improved every year, with the worst year being the first year that we administered them. Our instrument was teacher created, and maybe that is why the degree of reliability is lower. Since it was a teacher-created test, there was no manual, and as each new question arose the teacher decided how to answer it. Time was another factor in the reliability. The first assessment was so long that students were pulled from classes for more than sixty minutes. Our assessment the second year took only sixty minutes, which is one class period. Now, after administering them for three years, both buildings are comfortable with the interpretation and application of the tool.
Creating these assessments gave me an important understanding of how critical it is to have both validity and reliability. "We cannot expect assessment results to be perfectly consistent," state Linn and Gronlund. If assessments are important enough to give, though, then we must strive to use ones that pass the validity and reliability tests.
|