February 1999

Should the Census Use Sampling?

As the Supreme Court contemplates the case involving statistical sampling to adjust for undercounts in the 2000 census, a sample of OR analysts weigh in with their opinions

By Peter R. Horner

What should the country do when science brings about a situation lawmakers never considered? The answer seems obvious: pass a new law to account for the science. History is full of relevant examples. The Internet, to take a recent and obvious one, was the Wild West of the late 1980s and early 1990s, a lawless place where he who had the biggest gun and fastest draw won. Over the past several years, however, state and federal legislators have restored a sense of order to the 'Net by passing a series of laws designed to bring it in line with more conventional means of communication and commerce.

The issue of science getting ahead of the law is nothing new, but it certainly becomes dicier when the new science treads on old political turf. Such is the case with the ongoing controversy over whether the 2000 U.S. population census should include sampling-based methods to adjust for undercounts. Sometime within the next few months the U.S. Supreme Court is expected to issue a ruling on the matter. At stake is the allocation of members of the House of Representatives among states, and of federal funds to state and local governments.

Background of the Case

Before going any further, some background information is in order. The legal history that follows is distilled from a variety of sources, most notably the web sites of the Census 2000 Initiative, the Southeastern Legal Foundation and the American Statistical Association's Blue Ribbon Panel on the Census (see references).

The principals in the lawsuit before the Supreme Court include the House of Representatives itself and the U.S. government as represented by the Clinton administration.
House Republicans and the Southeastern Legal Foundation have challenged both the legal and the scientific justification for using sampling. Meanwhile, a number of groups representing ethnic minorities, the poor and large cities have filed court briefs arguing that enumeration without sampling-based, non-response follow-up not only decreases accuracy but also deprives these groups and governments of representation and funding to which they are entitled.

Article I, Section 2, Paragraph 3 of the U.S. Constitution states that Congress shall conduct an "actual enumeration" of the population "in such manner as they shall by law direct," to be used in determining the number of House seats each state will have. It is important to note that we're talking about changing the law, not the Constitution.

Beginning in 1970, using post-enumeration sampling surveys, the Census Bureau determined that the population had been undercounted. In addition, it appeared that certain ethnic minorities and poorer people were more likely to be undercounted. Because of people's increasing mobility and a trend away from single-family households, the Bureau's experts also predicted that undercounts would get worse.

Conventional wisdom among politicians and social scientists holds that correcting these undercounts would increase population counts in places that tend to vote Democratic. It would certainly increase counts in large cities and in the least developed rural areas, which would affect the allocation of federal funds under revenue-sharing programs. Therefore, the way people are counted is widely viewed as significantly affecting both money and political power, which is why the issue has attracted such intense political attention.

Congress amended the census legislation in 1976 to mandate sampling for adjustment for purposes other than apportionment, and to investigate its potential effects on apportionment.
The Census Bureau evaluated the 1980 undercount with sampling-based surveys but did not adjust the official counts. In 1987, the Reagan administration decided that there would be no undercount adjustment in the 1990 census, because, in its view, the science wasn't solid enough.

In 1992 Congress mandated a study by an expert panel of the National Academy of Sciences to recommend how to conduct the 2000 census. The panel recommended post-enumeration follow-up surveys based on sampling, and an additional sampling-based survey to provide a cross-validation estimate of undercounts. Meanwhile, Congress passed legislation in 1995 and 1997 that prohibited using sampling instead of full enumeration. Some congressmen have stated publicly that sampling is "not scientifically valid."

In 1997, the House of Representatives and the Southeastern Legal Foundation sued to block the use of the Census Bureau's sampling-based plan. In August 1998, U.S. District Courts in Washington, D.C., and Virginia decided cases in favor of the plaintiffs on narrow legal grounds, holding that the legislation prohibited any method other than full enumeration, at least as applied to apportionment of congressional districts.

The U.S. government, in its appeal to the Supreme Court, asserted that the method it proposed would both increase accuracy and reduce cost, and that expert scientific opinion supports this claim. House Republicans and the Southeastern Legal Foundation argued that the government's plan deliberately increased the enumeration undercount to reduce cost, and that the resulting problems with accuracy are more attributable to the plan than to inherent problems in enumeration.

Time to Stand Up and Be Counted

Politicians and lawyers have dominated the debate to this point. What about operations researchers and management scientists, people who use statistical sampling every day, people who make their living studying public policy issues from a cost-benefit point of view?
What do they have to say about the subject of statistical sampling and the census?

"Given that the country now appears to be run largely by polls, any one of which asks roughly one American in 150,000 for his or her views, the use of sampling as a modest adjunct to a direct census count does not seem a terribly radical step," says Arnold Barnett of MIT, who does applied statistical work on health and safety. "Whether the language of the Constitution expressly prohibits such sampling, however, is not a matter on which INFORMS people have any particular insight."

Random sampling, says Barnett, is no more controversial to statisticians than stethoscopes are to doctors.

"In the case of the census," Barnett continues, "I assume sampling would work roughly as follows: A preliminary attempt to count people might suggest that, in a given city, 1,000 buildings are abandoned and uninhabited. To pursue the issue further, the Census Bureau might choose 100 of these buildings at random and actually visit them. If 15 of the buildings were found actually to have residents, and an average of three apiece, then 45 people missed by the initial inquiry would have been located. And, given that only 10 percent of the 'abandoned' buildings had been canvassed, it would be reasonable to assume that visiting all the buildings would have found about 450 people.

"The only legitimate cause for suspicion might be that the 100 buildings to be visited were not chosen at random, but were selected because, say, they had an especially high chance that people lived there. However, if the sampling procedures were specified well in advance and monitored during the census by outside parties, then this potential difficulty could wither away."

Politicians and the public often question whether sampling is "valid." As one OR analyst we talked to noted, besides sounding "jargony," the term has context-specific meanings.
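For readers who want to see the arithmetic, Barnett's abandoned-building illustration can be written out as a short calculation. The Python sketch below is only an illustration of that scale-up, not a description of any actual Census Bureau procedure. The function name `scale_up_estimate` and its parameter names are hypothetical, and the sketch assumes the visited buildings are a simple random sample of the full list.

```python
def scale_up_estimate(total_units, sampled_units, occupied_in_sample, avg_residents):
    """Scale a count found in a simple random sample up to the full list.

    Hypothetical helper for illustration only. It assumes the sampled units
    were drawn at random from the full list, the condition Barnett stresses.
    """
    if not 0 < sampled_units <= total_units:
        raise ValueError("sampled_units must be between 1 and total_units")
    if not 0 <= occupied_in_sample <= sampled_units:
        raise ValueError("occupied_in_sample must be between 0 and sampled_units")

    # People actually located in the buildings that were visited.
    found_in_sample = occupied_in_sample * avg_residents

    # Each visited building stands in for total_units / sampled_units buildings.
    estimated_total = found_in_sample * total_units / sampled_units
    return found_in_sample, estimated_total


# Barnett's figures: 1,000 buildings listed as abandoned, 100 visited,
# 15 found occupied, an average of three residents apiece.
found, estimated = scale_up_estimate(1000, 100, 15, 3)
print(f"located in sample: {found}, estimated citywide: {estimated:.0f}")
```

With Barnett's figures the sketch reproduces his numbers: 45 people located in the sample and an estimate of about 450 for all 1,000 buildings. A different random draw of 100 buildings would give a somewhat different estimate, and a non-random draw would undermine it, which is why Barnett wants the procedure specified in advance.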
If the definition of valid is "able to give a more accurate estimate of the population size given a specific budget of resources available to make that estimate," then most OR analysts would probably agree that sampling is, indeed, "valid." Of course, critics can use a different definition and claim, rightly, that a sampling-based approach to estimating the population is almost guaranteed to give the wrong answer. The same could be said for direct enumeration, but it is inconvenient for critics to point that out.

As the debate heats up, it's interesting to keep the following points in mind:
With regard to the final point, Jonathan Caulkins of Carnegie Mellon's Heinz School of Public Policy and Management offers this: "It is often said that 'the Devil is in the details,' but with census sampling the Devil may be in the debates over the details. Even if essentially every statistician agreed that some form of sampling would be preferred to direct enumeration, we cannot expect unanimous consensus concerning exactly how that sampling should be done.

"It seems possible that two equally or nearly equally valid sampling approaches might lead to population estimates different enough to matter for political or budgetary purposes. For example, one approach might assign one more congressional seat to one state than another does. If so, then it is not hard to imagine acrimonious court cases pitting dueling statistical experts against each other in a way that makes the lay observer mistrustful of sample-based estimates, statisticians, and perhaps even science and mathematics more generally."

Ed Kaplan, professor of Management Sciences and Public Health at the Yale School of Management, agrees that the possible results, not the methodology, are really what's causing all the fuss over sampling.

"If the goal is to estimate the population of the country as well as the distribution of various features of that population (race, income, employment, etc.)," Kaplan says, "there is no question that, properly employed, statistical sampling can be used to improve the accuracy of the existing approach. The objections raised, of course, are more due to the anticipated consequences of such statistical corrections than due to the 'science' underlying sampling itself.

"If it was demonstrated that the employment of sampling would not change greatly the results of the census on the apportionment of congressional seats, for example, then the opposition would not be nearly so strong.
There will always be those who take the word 'enumeration' literally, but this argument is of course a joke, as the 'enumeration' currently invoked is itself an imperfect sample.

"Importantly, this cuts two ways: if it was demonstrated that the employment of sampling would not change the consequences of the exercise, I suspect many proponents of sampling would also disappear."

Proponents might disappear, but politicians won't. Depending on the Supreme Court's decision, it isn't hard to imagine future Congresses and administrations continuing the fight for decades. When votes, political power and money are at stake, politicians will go to the mat. Count on it.
References
Peter R. Horner is the editor of OR/MS Today.

OR/MS Today copyright © 1999 by the Institute for Operations Research and the Management Sciences. All rights reserved.