Specialized Technical Reports

Reliability in Measurement: Unilevel and Multilevel Approaches
Georgios D. Sideridis
The purpose of the present paper was to evaluate the internal consistency reliability of the General Aptitude Test assuming clustered and non-clustered data using commercial software (Mplus). Participants were 2,000 testees who were selected using random sampling from a larger pool of examinees (more than 65,000). The measure involved four factors, namely (a) planning for learning, (b) promoting learning, (c) supporting learning, and (d) professional responsibilities, and was hypothesized to comprise a unidimensional instrument assessing generalized skills and competencies. Intra-class correlation coefficients and variance ratio statistics suggested the need to incorporate a clustering variable (i.e., university) when evaluating the factor structure of the measure. Results indicated that single-level reliability estimation significantly overestimated the reliability observed across persons and underestimated the reliability at the clustering variable (university). Single-level reliability was also, at times, lower than the lowest acceptable levels, leading to a conclusion of unreliability, whereas multilevel reliability was low at the between person level but excellent at the between ...

Admissions Testing for University Entrance: A Literature Review
Ioannis Tsaousis
TR133-2016
Introduction
Standardized testing for university admission has seen extraordinary growth over the past century and affects the lives of millions of young individuals around the globe. Early in their educational career, students are organized and categorized on the basis of their scores on various educational achievement tests, and successful student performance on admissions tests remains a vital step in gaining access to postsecondary education (House & Keeley, 1997).
According to Noble and Camara (2003), today more than 80% of all four-year universities and institutions in the US require an admissions test, and more than 91% of non-open institutions require one. Furthermore, more than 1.5 million students take admissions tests in the US per year, and this number is increasing constantly. University admissions tests provide a standardized and objective measure of student achievement and generalized skills. These tests are norm-referenced, which means that the individual's score is compared with the scores of a large group of test takers. Furthermore, a standardized test has set certain rules for administration and scoring, such that everyone taking the test receives th ...

Comparisons between the 5 Common Test Forms of the Computer based GAT Quantitative test and the Paper & Pencil
Khurrem Jehangir
TR142-2016
Abstract
This study investigates and compares the dimensionality and factor structure of test data on the two modes of assessment being administered in the NCA for the Quantitative section of the General Aptitude Test (GAT). The GAT test is administered by the NCA as an assessment for admission to institutions of higher education and is administered across the Kingdom in two forms, a Paper and Pencil Test (PPT) and a Computer Based Test (CBT). The computer based test is NOT an adaptive test. The cohort of examinees selected for the comparative analysis comprises those examinees who received similar test forms (almost all items in the two tests were the same) in both modes of assessment. Besides the dimensionality structure, the score reliability of the tests is also estimated and compared. Results were compared for 5 CBT test forms with the 5 corresponding (those with same items) PPT test forms.
The results from the comparison of rival confirmatory factor models (and reliability studies) showed that the test data are essentially unidimensional for both modes of administration and therefore further comparative studies such as item fit and DIF analysis can be done across ...

Comparisons between the 5 similar Test Forms of the Computer Based GAT Verbal test and the Paper & Pencil GAT V
Khurrem Jehangir
TR137-2016
Abstract
This study investigates and compares the dimensionality and factor structure of test data on the two modes of assessment being administered in the NCA for the Verbal section of the General Aptitude Test (GAT). The GAT test is administered by the NCA as an assessment for admission to institutions of higher education and is administered across the Kingdom in two forms, a Paper and Pencil Test (PPT) and a Computer Based Test (CBT). The computer based test is NOT an adaptive test. The cohort of examinees selected for the comparative analysis comprises those examinees who received similar test forms (almost all items in the two tests were the same) in both modes of assessment. Besides the dimensionality structure, the score reliability of the tests is also estimated and compared. Results were compared for 5 CBT test forms with the 5 corresponding (those with same items) PPT test forms. The results from the comparison of rival confirmatory factor models (and reliability studies) showed that the test data are essentially unidimensional for both modes of administration and therefore further comparative studies such as item fit and DIF analysis can be done across the tw ...

Examining the Effect of Response Latency on Item Parameterization Using an IRT-MIMIC model
Ioannis Tsaousis
The analysis of response time has received increasing attention during the last decades, since evidence from several studies supports the argument that there is a direct relationship between item response time and test performance.
The aim of this study was to investigate whether item response latency might have an effect on item parameters, and more specifically, on item difficulties. To examine this research question, data from 8,475 individuals completing the computerized version of the Postgraduate General Aptitude Test (PAGAT) were analyzed. To determine the extent to which response latency affects item difficulty, we used a Multiple Indicators Multiple Causes (MIMIC) model, in which every item in a scale was linked to its corresponding covariate (i.e., item response latency) and the average scale response latency was linked to the latent variable. Since we were interested in examining the effect of these covariates on the scale's item difficulties, we ran the MIMIC model within an IRT framework (2-PL model). The results supported the hypothesis that item response latency is related to item characteristics such as item difficulty, since it was found that item response ...

Examining the Effect of Time-of-Day and Time-of-Week on PGAT Performance
Ioannis Tsaousis
TR155-2016
Abstract
The aim of this study was twofold: first, to examine whether students' performance on the computerized version of the PGAT is related to the time-of-day administration (i.e., morning vs. evening); second, to examine whether students' performance on the computerized version of the PGAT is related to the time-of-week administration (i.e., beginning of the week vs. end of the week). A total of 8,475 individuals from different parts of the Kingdom of Saudi Arabia participated in this study by completing the computerized version of the Postgraduate General Aptitude Test (PAGAT). The results from this study showed that time-of-day did not affect test performance. Particularly, when we analyzed the data within a multilevel framework using time-of-day as a cluster variable and morning vs.
afternoon session as a predictor at Level 2, we found no variability in either the overall PGAT scale score or the individual scale scores (i.e., verbal, numerical, and advanced math) that could be attributed to the administration of the test at different hours of the day. We also examined whether day of administration (i.e., at the beginning of the week vs. at the end of the week) has an effect ...

Multilevel Modeling of Batteries of Tests Consisting of Binary Items
Tenko Raykov
TR144-2016
Introduction
This report responds to the request by NCA for a description and discussion of multilevel factor analysis (MFA) procedures relevant for test construction and development, which are also exemplified on data from KSA (specifically, from the GAT-M test battery). MFA is a very comprehensive area (e.g., Muthén & Muthén, 2016). Thus, to be practical, this report and its particular analytic activities conducted on the data set provided by NCA need to be tailored to the research question(s) raised along with the request, which are of relevance to the Institute. This question(s) is specifically focused on the structure of the GAT-M overall test, consisting (initially) of p0 = 44 items, with a particular view of examining (a) gender differences in its corresponding latent ability (if supported by the analysis findings), and especially (b) potential differential item functioning with respect to gender. Since the J = 13 regions involved in this study are in actual fact the administrative regions of KSA, the variable Region is considered a 'fixed effect' in the remainder of this report; this is in the sense that its values in the data set are treated as represent ...

Situational Judgment Tests (SJTs) as Personnel Selection Tools: A Literature Review
Ioannis Tsaousis
TR141-2016
Introduction
During the last decade, assessment methods based on situational tests have become increasingly popular and dominate the field of personnel selection and appraisal (Campion, Ployhart, & MacKenzie, 2014; Motowidlo, Hooper, & Jackson, 2006). These methods include assessment centers, work sample tests, situational interviews, situational judgment tests (SJTs), etc. In these types of assessment, hypothetical scenarios (usually describing a critical situation) are presented and participants are asked to identify among several options the most appropriate response. According to this theory, the most dynamic characteristics of the environment (the situation) on the one hand, and the most prominent individual qualities (e.g., abilities and effort) on the other hand, can provide valuable information about the intentions and dispositions of the individual. Thus, knowing how a respondent would react to a hypothetical situation could help experts to make valid predictions about future actions and/or decisions. This kind of information could be very useful during a selection or evaluation process, where the ultimate goal is to make valid predictions for future behavi ...

The Effects of Pretesting on Computer Based Testing Using the GAT
Georgios Sideridis
TR152-2016
Abstract
The purpose of the present study was to explore how subsequent testing influences examinee performance using the novel methodology of latent transition analysis. Participants were 5,091 examinees who took the CBT test more than once. Four groups were created and empirically tested using latent profile analysis, with subtests comprising four aggregates reflecting quartile performance levels. Thus, four ability groups were created based on their performance on the quantitative component of the CBT.
After testing how individuals transitioned between ability classes, results indicated that examinees were mostly stable when they belonged to the high ability group at time 1, with only 4.5% of them moving to the immediately lower ability group. Individuals transitioned mostly from lower ability groups to higher ability groups, with the largest numbers transitioning to the ability group right above the one examinees were in at time 1. Thus, the hypothesis of a practice effect is most likely supported. Interesting effects were observed with individuals in the above average group transitioning greatly (51% of them) to the highest ability group. Findings tha ...

The Effects of Repeated Testing on Person Performance on the Computerized Version of the GAT and PGAT Tests
Georgios Sideridis
TR157-2016
Abstract
The purpose of the present studies was to explore how subsequent testing influences examinee performance using the novel methodology of latent transition analysis. Participants in Study 1 were 5,091 examinees who took the computerized version of the GAT test across three time intervals. Four groups of individuals were created and empirically tested using latent class analysis with the predictors being 4 ability testlets (from low to high ability items). The four groups that emerged resembled 4 levels of performance such as those in the interquartile range. Results indicated that individuals were very stable at high ability levels, with most movement observed in the above average group moving towards high ability. Interestingly, the most unstable group comprised low achievers who moved towards all possible groups, even the highest ability one. Transitioning from Time 2 to Time 3 essentially replicated the transitioning from Time 1 to Time 2. Study 2 attempted to replicate the above findings with the computerized version of the PGAT, which measures 3 general domains.
Results from the latent profile analysis suggested that besides a low and a high ability ...