Hot topic: Testing the limits
by Sherry Bithell

A hallmark component of college admissions, the Scholastic Aptitude Test (SAT) has generated considerable controversy lately. When the College Board -- the nonprofit that sponsors the SAT -- introduced a new version of the test in March 2005, some educators expressed concern about the test's increased demands, including its new essay portion and longer duration. Last October, scoring errors were discovered on more than 5,000 tests -- 4,411 of which had been scored substantially lower than they should have been. And this spring, officials from several colleges reported significant drops in the average SAT scores of applicants whose grade point averages and class ranks had remained the same or had risen.

Gary Skaggs is an associate professor in the Research and Evaluation Program in the School of Education. He teaches courses in educational and psychological measurement and research and statistical methods. His research interests include item-response theory, test equating and scaling, item bias, and standard setting.

Despite all the attention, the SAT is only one of many tests that students take before they reach college. According to FairTest: The National Center for Fair & Open Testing, America's public schools annually administer more than 100 million standardized exams to measure IQ, achievement, and progress.

The emphasis placed on the results of such tests raises the question, "Are standardized tests effective?" This simple question has several answers, according to Associate Professor Gary Skaggs of the School of Education in the College of Liberal Arts and Human Sciences.

True or false?


The chief advantage of standardized testing is that it provides a level playing field, Skaggs says. "Imagine, instead, a system in which each institution has its own entrance exam and test scores are determined by the institution's own subjective scorers. There would be plenty of chances for abuse."

This advantage is offset by a sizeable shortcoming, however. "Some knowledge, skills, and abilities cannot be tested in a large-scale, objective format," adds Skaggs. "Some important higher-level reasoning skills, for instance, are not amenable to multiple-choice tests."

First introduced in 1926, the SAT was created to measure the reasoning skills deemed necessary to succeed in college, and Skaggs agrees that a correlation typically exists between SAT scores and freshman GPAs. However, of standardized tests overall, he notes, "I think the tests can measure most students' abilities or achievement fairly well. The problem is the word 'most' -- some students simply do not take tests well."

As an example, Skaggs cites the story of his wife, who, as an undergraduate, entered her finals with straight As but almost flunked because of severe test anxiety. She went on to excel in graduate school and to earn a doctorate, demonstrating that tests may not be accurate predictors for everyone. "There are many students like her, so I think there needs to be an alternate way to measure students so that poor test-takers can succeed."

Another disadvantage of standardized testing is that the scores reflect a student's performance on a given day, when illness or personal problems may influence the outcome.

Skaggs' primary concern with the SAT in particular is echoed by educators and administrators across the country: the issue of equity. SAT-preparation courses -- which have been shown to raise a student's scores -- can be costly, and not all students have access to them, especially students from lower-income families. Moreover, although many high schools offer test-preparation courses, they often are held after school, and disadvantaged students may not be able to attend, Skaggs points out.

"Can you compare SAT scores of students with different degrees of test preparation, and are students from low-income families disadvantaged? And if a student has had the best that the test-preparation industry has to offer, is his or her score really an accurate reflection of the student's readiness for higher education?"

Taught by testing

Paradoxically, according to FairTest's evaluation and Skaggs' own assessment, students are more frequently being readied for higher education by being taught how to take tests. "A huge emphasis is placed on testing in K-12," he notes. "Testing is the main vehicle for evaluating the quality of public education as well as individual student progress, and that is certainly too much reliance on a single test."

In particular, Skaggs is referring to the No Child Left Behind (NCLB) Act's state test requirements. He worries that the punitive nature of the NCLB, which imposes increasingly harsh sanctions on schools that fail to meet the act's "Adequate Yearly Progress" standards, will result in the mislabeling of good schools. "Scores on state tests tend to rise rapidly in the first few years as school teachers and administrators become familiar with the tests and how to prepare students," he explains. "Once this boost is realized, progress tends to be made more slowly. But the NCLB requires the same level of progress each year. At some point, you will have schools with high passing rates that do not meet their adequate yearly progress requirements."

Another potentially negative consequence of the NCLB is that pressure to meet the act's standards has led schools to reduce or eliminate subjects that are not tested -- physical education, art, music, and even recess -- a strategy that allows more instructional time for the tested subjects: reading, writing, math, science, and social studies.

In addition, many teachers -- in light of the strict demands of the state tests, along with pressure from their own schools -- feel that they are required to "teach to the test."

Teaching to the curricular objectives that are tested is not problematic, Skaggs says. "The problem I see happening is that teachers are being forced to teach subjects the way in which they are tested. For instance, SOL [Virginia's Standards of Learning] reading tests have specifications on the types of reading passages that are used, such as length, vocabulary, and subject matter. Teachers then feel compelled to teach children to read only those types of reading materials."

Do the ends justify the test?

The merit of testing, Skaggs believes, ultimately rests upon the purpose of the test. A statewide test cannot adequately measure an individual student's strengths and weaknesses, he says, simply because it is too short. A state test can, however, adequately measure the progress of students in a state, district, or school in mastering the curriculum as a whole.

As for the efficacy of standardized tests in measuring an individual student's potential, Skaggs thinks that relying on a single score on a single test will hurt some students, particularly those who have not had access to the best test preparation or who do not test well.

"If the purpose of the test is to determine if a student will move on to the next grade or will graduate or will be admitted into some program -- if the test is given to make an important decision about the student -- then there should also be alternative measures, such as teacher recommendations or a portfolio of the student's work," says Skaggs.

In the final analysis, it seems that the only means of accurately assessing an individual's progress is stepping back from standardization and bringing back the human touch.

