Validity evidence and measurement properties in technology enhanced items
By: Wools, S., Drijvers, P., Feskens, R., Molenaar, D. & Van der Scheer, E. | 02-01-2023
Abstract
This study addresses the question of how to evaluate the validity of results from international large-scale assessment programs (ILSAs) that incorporate technology-enhanced items, with special attention to the comparability of results between countries. Within ILSAs, validity and comparability are of utmost importance. These two cornerstones of methodology are closely connected to the two main goals of ILSAs: providing within-country trend comparisons and between-country relative comparisons. The introduction of digital assessment in general, and the use of technology-enhanced items more specifically, offers the possibility of improving the authenticity, and with that the validity, of the measurement. In addition, technology-enhanced items can yield traces of process data that could be used to make statements not only about the proficiency of students, but also about the strategies they used to arrive at a response. At the same time, the use of technology-enhanced items could affect the comparability of country results and thereby jeopardize the between-country comparisons, the second main goal of ILSAs. The study includes an interpretation and use argument to guide the validation studies needed to draw conclusions about the use of technology-enhanced items in TIMSS 2019. These validation studies, both qualitative and quantitative, aim to gather validity evidence. It is concluded that the technology-enhanced items do not differ psychometrically from other digital items and that no additional differential item functioning (DIF) occurs. However, the qualitative studies show that the potential to achieve a better measure of problem solving has not yet been realized.
The study concludes that the current technology-enhanced items are still elementary, so these items do not allow conclusions about the validity of advanced technology-enhanced assessments. For ease of reading, the report is split into two parts. Part A, this report, describes the interpretation and use argument and the validity argument. Part B reports the validity evidence in detail in seven separate studies.