Kathryn Schott
M.A.E. Action Research
Impacts of Social Desirability on the Accuracy of Public Formative Assessment
Purpose
The purpose of the study was to analyze to what degree social desirability impacts the accuracy of formative assessment in a public setting. Second grade students were given a public and private formative assessment immediately after a math lesson. Math lessons were given in both whole-group and small-group settings. A correlation was run between public and private assessment scores, for both whole-group and small-group data, to determine if students were accurately expressing their level of understanding during public formative assessments.
Hypotheses
1) Public formative assessment will not be accurate in expressing students’ true level of understanding, as expressed in private formative assessments.
2) The whole group public data collection will be more accurate than small group public data collection.
Method
First, I taught a math lesson lasting approximately ten minutes. The content of the math lesson was different each time I collected data. In the whole group setting, the entire class received the math lesson at once. In the small group setting, the small groups received the math lesson one group at a time, all within the same day.
At the end of the lesson, students were instructed to hold their fingers in the air to represent how well they understood the lesson content. Students held 1, 2, 3, or 4 fingers in the air. I then took a picture of the students and their fingers. Immediately afterward, I gave each student a private, paper formative assessment consisting of 2 questions. Several students at each table put up cardboard privacy offices to prevent anyone from cheating.
Data was collected over a period of four weeks. Each week I collected data once from the whole group and once from the small groups, for a total of four whole group and four small group data collection sessions. Students were randomly assigned to small groups each time I collected small group data. Each small group consisted of 6 students.
Results
Bivariate, two-tailed correlation analyses were conducted using Spearman’s rho correlation coefficient on the whole group data and the small group data. According to the statistical analyses, there was no statistically significant correlation between the number of fingers held in the air and independently completed test scores in either the whole group or the small group setting. The small group data produced a correlation coefficient that was farther from being statistically significant than that of the whole group data. The whole group data was 0.09 away from being statistically significant. The lack of correlation means there was a disconnect between the number of fingers held up in the air and performance on independently taken quizzes: students displayed a number of fingers that did not correspond to the scores they achieved on independently taken, teacher-scored quizzes.
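The kind of analysis described above can be sketched in a few lines of Python. This is only a minimal illustration, not the procedure or software actually used in the study: the function names are my own, the finger counts and quiz scores are invented placeholder values (not study data), and the sketch computes only the Spearman coefficient, not the two-tailed significance test.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Find the end of the run of tied values starting at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither list is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's rho is the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical example only: fingers shown (1-4) and quiz score (0-2 correct)
# for eight imaginary students. These are NOT the study's data.
fingers = [4, 3, 4, 2, 4, 3, 1, 4]
quiz_scores = [1, 2, 0, 2, 2, 1, 0, 1]

rho = spearman_rho(fingers, quiz_scores)
print(f"Spearman's rho (hypothetical data): {rho:.3f}")
```

A coefficient near zero would indicate that the public finger ratings carry little information about quiz performance. For the accompanying two-tailed p-value, a statistics package such as SciPy (`scipy.stats.spearmanr`) would be a reasonable choice.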
Implications for K-12 Learning
The results of my study suggest that public formative assessment, whether obtained in a small group or whole group setting, is prone to social desirability bias. Whole group public formative assessments appear less prone to social desirability bias than small group public formative assessments, but the bias still plays an influential role. Public formative assessments are handy for a quick judgement of how well students understand a lesson, but they should be used with caution: teachers should expect that students may display a higher level of understanding than their true level. The results also suggest that socially desirable responses are more frequent in small group settings, regardless of true understanding. Teachers should be hesitant to rely on small group public formative assessments, since no correlation was found between public formative assessments and independent performance.