Back in March, Tobin White, an Associate Professor at the UC Davis School of Education, discussed his assessment of current testing practices at DJUSD. According to Professor White, while the district uses two exams, the OLSAT and the TONI, to identify students eligible for the AIM program, the OLSAT identifies about 24% of the AIM students, the TONI 49% of AIM students – and 27% are identified through some other test.
The TONI is used to retest students who come within five points of qualifying for enrollment on the OLSAT test or who have been identified with various “risk factors.” Professor White’s research determined that students administered the TONI were “six times more likely to qualify than those taking only the OLSAT.” They were also nine times more likely, according to Professor White, to score in the 99th percentile.
He writes, “These are radically different measures, yet they are being treated as equivalent in program placement decisions.”
He adds, citing research, “The TONI was not designed to replace broad-based intelligence tests but rather to provide an alternative method of assessment when a subject’s cognitive, language, or motor impairments rendered traditional tests of intelligence inappropriate and ineffectual.”
About ten days ago, the Vanguard exchanged emails with the professor to clarify some of his thinking and conclusions. This is the second part of those answers.
The Vanguard asked on what basis he would say that the TONI should only be used for kids with severe language impairments such as aphasia. How, then, do we test ethnic minorities who may not have strong English language skills, either because their parents are non-native speakers or they are low SES?
Professor White responded, “I’ve never said the TONI should be used only in cases of severe language impairments such as aphasia. I’ve said it should be used in the variety of cases where there is clear reason to expect a child to be disadvantaged by verbal and written tests: severe language impairments, certainly, but also limited proficiency in English, dyslexia, deafness.”
He continued, “My primary basis for making this claim has been to follow the advice of the people who created the test. In my presentations to the School Board and the AIM advisory, and in my written report, I quoted the authors of the test, Brown, Sherbenou & Johnsen.”
These authors wrote the following in the examiner’s manual that accompanies the TONI testing material: “The TONI was not designed to replace broad-based intelligence tests but rather to provide an alternative method of assessment when a subject’s cognitive, language, or motor impairments rendered traditional tests of intelligence inappropriate and ineffectual” (Brown, Sherbenou & Johnsen, p. viii).
Later they said, “…individuals who are deaf and hearing impaired…people with aphasia, dyslexia, and other disorders related to spoken and written language, and…people who have excellent language skills but who are not proficient with written or spoken English. The TONI-3 was built with these populations in mind” (Brown, Sherbenou & Johnsen, p. 14).
He stated, “Again, as I said above, the issue is that the TONI is a coarser measure than language-based, multidimensional tests precisely because it provides less information—it is unidimensional. It is an important resource for addressing concerns of language bias, but when those concerns aren’t present, it’s probably not the right instrument to use when a multidimensional alternative is available.”
The Vanguard asked whether he believed that qualifying through a test other than the OLSAT, such as the TONI, does not suggest giftedness.
Professor White responded, “I believe that both the OLSAT and the TONI are good initial screens for giftedness. A high score on either one suggests the possibility of giftedness, and is a good reason to examine additional information—other tests of intelligence and/or achievement, samples of student work, input from teachers and parents.”
The Vanguard asked if he believed that single domain tests are appropriate for identification for gifted programs.
Professor White stated, “When they are warranted, certainly, but as I’ve already stated, they should be used in combination with, not simply in place of, other appropriate measures.”
The Vanguard asked if he believed there are cultural and/or socioeconomic factors that impact how intelligence is shown on these tests.
“Yes, absolutely,” he said. “I believe that all intelligence tests include elements of bias.”
Professor White explained, “The data from DJUSD clearly suggest that the OLSAT has been biased in favor of Asian students (who are overrepresented in the OLSAT-identified pool relative to the overall district population), against Hispanic or Latino and African-American students (each of whom are underrepresented in the OLSAT-identified pool), and more or less neutral relative to White students (who are represented in the OLSAT-identified pool and the overall district population at fairly similar rates).”
He added, “Performance on the OLSAT may also be improved through test preparation, which clearly advantages students who have access to that preparation over others who don’t, so that’s also a concern.”
He cautioned, “Keep in mind, though, that just because the TONI and other non-verbal tests should limit language bias, they do not necessarily remove cultural bias. Images or figures, just like words, can be culture-laden or culturally specific, as can gesture and pantomime (which are the ways a TONI administrator can communicate with test-takers in place of verbal communication).”
Professor White also offered the following clarification:
In the 2011-12 and 2012-13 school years, a total of 426 3rd grade students were administered a TONI, out of 1286 total enrolled in those years combined—that’s 33.1%, almost exactly a third. So this is how the math works: if you somehow manage to pick out the one-third of the 3rd grade students who were not identified by the OLSAT but have the highest likelihood of qualifying (or of scoring at the 99th percentile), that likelihood could at most be higher by a factor of three (the inverse of the sample-to-total population ratio, one third) than for the population as a whole. So the apples-to-apples comparison, in this “best case” scenario, is the theoretical factor of three, versus the actual factors of six (qualifying) and nine (scoring at 99th percentile) I found in the data.
Or, to put it another way, with raw numbers: of the 426 students administered a TONI in those two years, 185 (43% of all those tested, and 14% of all third graders) achieved a qualifying score, and 121 (28% of all those tested, and 9% of all third graders) scored in the 99th percentile. Even if the TONI were also administered to all the remainder of the 1286 third graders in those two years and not a single one scored at 95 or above, there would still have been more than twice as many qualifiers as through universal testing on the OLSAT (89, 7% of all third graders) and three times as many at 99 as on the OLSAT (41, 3% of all third graders).
And of course, it’s quite a stretch to think that the district would be so skilled at identifying “strong candidates” for the program as to precisely select that third most likely to qualify before even administering them the TONI. And more to the point, we know that many if not most were selected for rescreening because of risk factors—an appropriate reason to rescreen them by some means, for sure, but not to think that their likelihood of qualifying for AIM was higher than that of other students simply because they had those risk factors.
The simple takeaway is that even taking into account any criteria by which you might select students to take it, the TONI yields qualifying and ceiling scores at much higher rates than the OLSAT.
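For readers who want to check the arithmetic, the figures Professor White cites can be reproduced in a few lines of Python. This is only a sketch built from the counts quoted above. The variable names are ours, and it assumes the factors of six and nine compare the TONI rate among the 426 students tested with the OLSAT rate among all 1286 third graders, which is how we read his explanation rather than something he spelled out.

```python
# Counts quoted in Professor White's clarification (2011-12 and 2012-13 combined).
THIRD_GRADERS = 1286     # total 3rd grade enrollment
TONI_TESTED = 426        # students administered the TONI
TONI_QUALIFIED = 185     # achieved a qualifying TONI score
TONI_99TH = 121          # scored in the 99th percentile on the TONI
OLSAT_QUALIFIED = 89     # qualified through universal OLSAT testing
OLSAT_99TH = 41          # scored in the 99th percentile on the OLSAT

# Share of third graders who took the TONI (about one third).
share_tested = TONI_TESTED / THIRD_GRADERS

# "Best case" bound: if every potential qualifier were in the tested group,
# the rate in that group could exceed the overall rate by at most 1 / share.
max_factor = 1 / share_tested

# Assumption: TONI rates use the tested group as the denominator,
# OLSAT rates use all third graders (universal testing).
qualify_factor = (TONI_QUALIFIED / TONI_TESTED) / (OLSAT_QUALIFIED / THIRD_GRADERS)
ceiling_factor = (TONI_99TH / TONI_TESTED) / (OLSAT_99TH / THIRD_GRADERS)

# Raw-count comparison, which does not depend on any denominator.
qualifier_count_ratio = TONI_QUALIFIED / OLSAT_QUALIFIED
ceiling_count_ratio = TONI_99TH / OLSAT_99TH

print(f"Share of third graders given the TONI: {share_tested:.1%}")
print(f"Theoretical maximum factor: {max_factor:.2f}")
print(f"Observed factor, qualifying: {qualify_factor:.2f}")
print(f"Observed factor, 99th percentile: {ceiling_factor:.2f}")
print(f"TONI vs. OLSAT qualifiers (raw counts): {qualifier_count_ratio:.2f}")
print(f"TONI vs. OLSAT 99th percentile (raw counts): {ceiling_count_ratio:.2f}")
```

Under those assumptions, the theoretical ceiling comes out at about 3.0, while the observed factors are roughly 6.3 for qualifying and 8.9 for scoring in the 99th percentile, in line with the "six" and "nine" he reports. The raw-count ratios are a little over 2 and just under 3.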
—David M. Greenwald reporting