Science is Badly Used in Risk Assessment

May 15, 1998 • Speeches

If you like Ken Starr, you’re bound to like Environmental Protection Agency risk assessments. The Starr investigation initially focused on a failed real estate deal in Arkansas. Then, with official sanction, it grew into a great, amorphous mass with its pseudopods spread everywhere, and its fingers pointing at more and more people. No matter how decisively a pseudopod is cut off or a pointing finger bent back, the investigation continues. It finds a new rumor, a new source of information, a new titillation, a new innuendo, and new or regenerated pseudopods reach out even farther.

Without venturing a guess about the motivation of the investigators, their behavior suggests that they are convinced that a crime has been committed. They don’t know what the crime was, where or when it happened, or who did it, but they’re convinced it happened.

How very much like an EPA risk assessment. The idea that environmental chemicals cause human diseases is so ingrained that any chemical can be—and in many cases, must be—subjected to test after test. Negative test results can be of little consequence; equivocal ones are relentlessly pursued, and positive ones are ballyhooed. That interpretation of test results turns science on its head.

The null hypothesis is at the center of science. When a scientist suggests or has reason to believe that A causes B, the null hypothesis of that statement is that A does not cause B. Convincing evidence for A causing B comes from testing and faulting the null hypothesis. In science, results that support the null hypothesis are given a bit more weight than ones that don’t, in part because scientists don’t like over-reaching and then having to retract.

In risk assessment, it’s just the opposite. How many times have you read a review of a series of animal tests—some positive, some negative, some equivocal—or of a series of epidemiology studies—some positive, some negative, some equivocal? How many times has the review ended with a comment along the lines that although the evidence is not convincing, the few positive results can’t be discounted? Why not?

More pointedly, since risk assessment depends on the tools of science, why does risk assessment assign priority to positive results, no matter how few or how weak, and shove aside the null hypothesis? The answer is simple. Long ago, a series of policy decisions was made at EPA about how to interpret data. EPA and many other people dignify those policy decisions by calling them “science policy.” I think they are more honestly called assumptions.

Karl Popper identified “falsification” as the central element in science. Science advances as scientists develop hypotheses—ideas or theories—about how the world works, and design tests of those hypotheses by experiment or other comparison to sensible reality. It is the second step—the testing—that sets science apart from other kinds of scholarly endeavor.

Paul Davies, a physicist, put it this way in emphasizing the importance of tests and falsification:

A powerful theory is one that is highly vulnerable to falsification, and so it can be tested in many detailed and specific ways. If the theory passes those tests, our confidence in the theory is reinforced. A theory that is too vague or general, or makes predictions concerning only circumstances beyond our ability to test, is of little value.

Risk assessments begin with the reasonable assumption that at some dose under some circumstances, a chemical will cause an adverse biological effect of one kind or another. We can call that assumption a hypothesis, and we can test it in animals. The result of the test is an observation that we can use to formulate a hypothesis about the likely effect of that chemical on humans.

So we have a risk assessment, which is a hypothesis, but we have no way to test it. As a policy decision, EPA is interested in cancer risks that are thousands of times below the level of detection by epidemiology. For non-cancer effects, EPA is interested in exposures 100 or 1,000 times below the lowest dose that produces an adverse effect in animals. Risks at those levels can’t be directly tested either.
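The arithmetic behind that non-cancer approach can be sketched in a few lines of Python. This is a minimal illustration, not EPA's procedure: the function name `dose_of_interest`, the starting dose of 10 mg/kg/day, and the units are assumptions introduced here. Only the factors of 100 and 1,000 come from the text.

```python
def dose_of_interest(lowest_effect_dose, factor):
    """Divide the lowest animal dose with an adverse effect by a fixed factor."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return lowest_effect_dose / factor


# Hypothetical value, for illustration only (mg/kg/day).
lowest_effect_dose = 10.0

for factor in (100, 1000):
    result = dose_of_interest(lowest_effect_dose, factor)
    print(f"factor {factor}: {result} mg/kg/day")
```

The resulting doses sit far below anything observed in the animal test, which is why an effect at those levels cannot be checked directly.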

Whatever they are called, policy choices or science policy choices or assumptions, they are necessary for risk assessment, which, to use Alvin Weinberg’s term, is a “trans‐​science” activity. Trans‐​science refers to questions that can be asked scientifically, but that cannot be answered by science. For instance, the question, “How many people die from environmental exposures to chloroform?” is a question about causation, but there is no way to measure the number of people who die from environmental exposures to chloroform each year, if there are any.

The impossibility of directly testing the risk doesn’t mean that the hypotheses are useless. People interested in using risk assessments for public policy could devise tests and experiments to determine mechanisms of action, understand the biochemistry of those reactions, and factor them into the risk assessment process. Keeping in mind the importance of the null hypothesis in science, they could evaluate scientific information and apply it to the assessment of human risks.

Well, that’s fine, but what do we do until we have detailed information about biochemistry, molecular biology, and toxicology? When faced with that problem nearly 30 years ago, the fledgling EPA developed a series of assumptions to guide its risk assessments. That was understandable at the time because we didn’t know very much then. It is not understandable now.

So far as I know, the United States is unique among all the countries of the world in assuming that the risks from essentially all carcinogens can be modeled with a linear no-threshold model. As a result, EPA worries about exposures to dioxin that are 160 to 1,600 times lower than the threshold for worry in other countries. Arguably, we’re safer. I haven’t heard, however, of Canadians pouring across the border seeking refuge from government-permitted higher exposures.

It’s clear from looking at its predictions that the linear no-threshold model produces estimates that can never be falsified. EPA focuses on some upper confidence limit on risk, rather than the best estimate, in its regulations of carcinogens. If it were possible to measure the predicted risk, and if the risk were found to be smaller than the upper confidence limit, it would not falsify the estimate. After all, the risk could be much smaller and be within the confidence limits. In fact, it can be zero and be within the confidence limits. Is that the best that can be done? After decades of risk assessment research?
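The falsification problem can be made concrete with a small Python sketch. It assumes the simplest possible form of a linear no-threshold model, risk equal to a slope times dose, and it uses made-up numbers; the names `lnt_risk`, `consistent_with_upper_bound`, `upper_slope`, and `dose` are hypothetical and are not drawn from any EPA document.

```python
def lnt_risk(slope, dose):
    """Linear no-threshold model: risk is proportional to dose at every dose."""
    return slope * dose


def consistent_with_upper_bound(observed_risk, upper_slope, dose):
    """An upper-bound estimate is contradicted only by a risk above the bound."""
    return 0.0 <= observed_risk <= lnt_risk(upper_slope, dose)


# Hypothetical numbers, chosen only for illustration.
upper_slope = 1e-2  # upper confidence limit on risk per unit dose
dose = 1e-4         # environmental dose, in the same dose units

bound = lnt_risk(upper_slope, dose)

# Every measured risk from the bound down to zero "agrees" with the estimate.
for observed in (bound, bound / 10, bound / 1000, 0.0):
    print(observed, consistent_with_upper_bound(observed, upper_slope, dose))
```

In this sketch the only measurement that could contradict the estimate is a risk larger than the bound. A risk of zero passes the check just as easily as a risk equal to the bound.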

For years, EPA clung to the assertion that the linear, no‐​threshold model applied to all mutagenic carcinogens and that many chemicals, even if they were not positive in a standard mutagenicity test, might be mutagenic under some unspecified conditions. The weight of scientific knowledge about mutagenic activities has forced EPA to concede that some carcinogens are not mutagenic.

Dioxin is one such chemical. That information has not forced EPA to abandon its linear, no‐​threshold model. On the contrary, in its 1994 “Dioxin Reassessment,” EPA made reference to the idea that every organism is exposed to a plethora of carcinogens and that any addition to that exposure will add to risk. Based, in part, on that idea, EPA fell back on its linear, no‐​threshold default model to estimate the cancer risk from dioxin. The EPA’s Science Advisory Board roundly criticized that estimate, referring to the great amount of information about the biochemical effects of dioxin that EPA ignored, and rejected EPA’s cancer risk estimate.

Soon after that rejection, scientists who supported EPA’s risk characterization had a letter published in Science that said the SAB had been generally supportive of EPA’s efforts and that the reassessment required only a little more “ripening.” Well, the SAB meeting about dioxin was three years ago today, and no re-draft of the Dioxin Reassessment has appeared. After three years, most things are described in terms of rot, not ripeness.

EPA proudly announced that the 1994 Dioxin Reassessment was a move away from myopic concentration on the carcinogenicity of dioxin to a broader consideration of the other toxicities associated with the chemical. EPA described a number of experiments that investigated dioxin’s immunotoxicity, and it focused on a study that reported immunotoxicity in juvenile marmosets that were exposed to dioxin levels not unlike the levels that a few humans had experienced in workplace accidents. The scientist who did the marmoset study was so surprised (and, I expect, worried) by the results that he repeated the measurements in men who had been highly exposed. There was no effect in men.

Food and Drug Administration scientists who reviewed the Dioxin Reassessment asked EPA scientists why the positive marmoset results were featured in it and the negative human results ignored. The answer was, reportedly, slow in coming, but EPA’s reason was that dioxin might be more immunotoxic in juveniles, and the marmoset data might be more important for risk assessments. Some people would agree with that logic; some wouldn’t. Fair enough, but both experiments should have been discussed, and EPA should have presented its reasons for its focus on the marmoset data. Data selection is not good science.

Summing up its analysis of the non‐​cancer toxicities, EPA said,

It is not currently possible to state exactly how or at what levels humans in the population will respond [to dioxin exposures], but the margin of exposure (MOE) between background levels and levels where effects are detectable in terms of TEQs [summary of the toxic effects expected from exposures to dioxin and dioxin‐​like compounds] is considerably smaller than previously estimated.

What’s that mean? I’m not sure, but the SAB said that EPA’s summary

states that we don’t know what will occur, or at what level this unknown [response] will occur, but we know that it will occur (in terms of TEQs) closer to background levels than previously estimated.

The SAB rejected EPA’s cancer risk assessment because of its reliance on the default model. It rejected EPA’s saying that levels of dioxin equal to or near those in the current environment were causing other toxic effects because of the absence of scientific underpinning.

But EPA, apparently, knows that dioxin is bad, very, very bad. EPA’s attempts to convince the world that dioxin is a terrible carcinogen ran afoul of the fact that dioxin isn’t a mutagen, but EPA latched onto additivity as a justification for its risk model. Even so, the carcinogenicity worry was losing steam. The agency shifted to non‐​cancer endpoints to strengthen its case. SAB didn’t find much to admire in that analysis either. Now EPA is taking its time to repair its Dioxin Reassessment. Does this remind anyone else of Mr. Starr’s constantly replenished reasons to continue his investigation?

All three of EPA’s risk assessments for dioxin, one in 1984 (or 5), one in 1987 (or 8), and one in 1994 bear the legend “Review Draft: Do Not Cite or Quote” or the equivalent. The clearest comment on EPA’s scientific failure is the SAB rejection of the 1994 draft, but throughout the long dioxin story, EPA has not completed a risk assessment for the “most potent carcinogen.” Government scientists estimate that dioxin regulations have cost $100 billion, largely in cleanup costs. That’s a lot of money under any circumstance. It’s really a lot of money based on a risk estimate that EPA doesn’t want cited or quoted.

Two years ago, EPA issued its third cancer testing guidelines. EPA made a great deal of noise about how these new guidelines would encourage the production of mechanistic information, and it suggested that that information would be considered in making decisions to move away from its linear, no-threshold default model. The $1 billion spent on dioxin research has not produced enough information to force such a move, but maybe $2 billion’s worth will.

In fact, EPA, in three cases, has moved away from a linear, no-threshold model: alpha-2u-globulin, some pesticides that affect the thyroid, and chloroform. It’s not a fast process. Ten years passed between EPA scientists’ publishing the data that showed the thyroid carcinogen risk should be modeled with a threshold and EPA’s policy decision based on that information.

The marvel, of course, is that EPA ever backs away from the linear, no‐​threshold default. It certainly doesn’t have to. It can invoke the additivity idea as it did with dioxin, and I don’t see how any amount of mechanism information can displace the default. Perhaps EPA will resolve the “mechanism” vs. “additivity” issue.

The EPA’s draft cancer guidelines remain in draft. It’s been over two years since the SAB held a public comment meeting on the guidelines, which remain somewhere in the bowels of EPA. It amazes me that regulations costing billions of dollars and tests costing millions are done to comply with draft risk assessments and draft cancer guidelines. Surely taxpayers and regulated industries deserve more respect than that shown by EPA’s inability to produce final documents.

If I told you that I recommend a paper called “The Science Charade in Toxic Risk Regulation,” you might guess that its author would be from Big Chemical or Big Oil or Big Food. The author is, however, from Big Government. Wendy Wagner was a lawyer for EPA before taking a position at Case Western Reserve University School of Law, and she published “The Science Charade” in the November 1995 Columbia Law Review.

Wagner writes,

…past [regulatory] failures are at least partly attributable to a pervasive “science charade” where agencies exaggerate the contributions made by science in setting toxic standards in order to avoid accountability for the underlying policy decisions.

In practice, this means that once the EPA identifies a chemical as a bad actor (a policy or personal decision), it has everything to gain by dressing up that decision in terms of science. Default assumptions, designed to err on the side of safety, sound scientific and hide policy decisions about error. Additivity sounds scientific, but it hides the possibility that a decision can be made that no amount of mechanistic information is going to set aside the default risk assessment.

She addresses trans‐​science issues in a number of places. For instance,

Major policy decisions that undergird a quantitative toxic standard are at best acknowledged as “agency judgments” or “health policies,” terms that receive no elaboration in the often hundreds of pages of agency explanations given for a … standard and appear in a context that gives readers the impression they are based on science.

“Agency judgments” and “health policies” were all anyone had in the 1970s, but we are approaching a new century. It’s hard to square EPA’s advertising itself as a science‐​based agency when it relies on judgments and policies chosen in a time of relative ignorance.

Being convinced that one is right or that the science is too frustrating or doesn’t lead to the correct decision can lead to situations in which, Wagner says,

significant public policy judgments are made by technocrats who have not been appointed as policymakers and who are unlikely to be held accountable for their decisions. In some cases a scientist may become convinced that the highly controversial issues will never be resolved unless she steps in and resolves the trans‐​scientific questions herself. In other cases a scientist may enjoy being the source of public policy, particularly when her hidden value choices are likely to be free from oversight by high level governmental officials and the public at large.

In a similar comment, she says

the science charade confers several benefits on scientists. First, it allows them to control access to the resolution of all questions that include even the slightest component of science, and to do so generally with minimal interference from lawyers and governmental officials.… Second, when problems concerning the regulation of toxics are characterized as appropriate questions for science, their resolution is believed to depend upon scientific research. The science charade thus may actually increase the amount of funding dedicated to research or, at the very least, highlight the importance of scientific research and potentially elevate the role played by scientists in determining the future of toxics regulation. Third, scientists may also enjoy the ability to participate actively in important national policy decisions without the inconvenient public accountability that attaches to most elected and appointed government officials.

The courts, she writes, offer many incentives for the continuation of the charade.

First, … [b]y correlating the survival rate of an agency standard with the extent of technical explanations garnered in its support, the courts offer agencies strong and virtually inescapable incentives to conceal policy choices under the cover of scientific judgments and citations.…
Second, …the tendency of many courts to defer to the agency as expert when the issue is framed as scientific in nature. In fact, if an agency can represent to the court that its technical explanations for a toxic standard lie on the ‘frontiers of scientific inquiry,’ a term that could easily encompass trans‐​scientific issues, the agency decision is subject only to the most cursory review. By insisting on technical justifications on the one hand, and pledging not to scrutinize the accuracy of the technical explanations on the other, the courts not only fail to prevent the science charade, they make it almost obligatory.

My last quote from Wagner parallels my views about the effects of EPA policies on research.

The agencies’ science‐​bias in prioritizing substances for regulation virtually guarantees that greater regulation will ultimately follow advancements in scientific information and knowledge. A rational manufacturer with fiduciary obligations to shareholders rather than to the public is thus unlikely to undertake research voluntarily on the potential carcinogenicity or mutagenicity of a substance it produces if more aggressive regulation is likely to result.

I would add that given the tiny or zero risk from most, and probably all, environmental chemicals, I’m not sure that a rational manufacturer owes any obligation to the public other than assuring that his workers are safe.

EPA uses science badly in risk assessments and, as Wagner highlights, in policy decisions. What can be done to make things better?

It’s difficult to imagine that EPA will change internally. A scientist, David L. Lewis, who has been with EPA since its founding, said that EPA sets its regulations first and then finds science to justify them. Statements like that in a paper he published in Nature cost him. EPA said that the disclaimer he attached to the paper had been too weak and that he was trying to pass off his personal opinions as some sort of agency statement; it also accused him of violating the Hatch Act, which, among other things, forbids legislative lobbying by Executive Branch officials. Aided by a lawyer, Lewis fought back, and the EPA backed off and apologized.

The Washington Times quotes Lewis,

EPA scientists know what Carol Browner cannot seem to understand. Bad science does not protect the environment. She has let EPA’s science organization go without a permanent head for much of her administration, buried scientists in bureaucratic red tape, and aggressively punished those who speak out about the agency’s problems.

It’s easy enough to dismiss Lewis as a sore-head, but it’s impossible to disregard the poor performance of EPA in marshalling and understanding science so that it can convince the SAB or produce documents that can advance beyond the “Draft, do not cite or quote” stage. It’s unlikely that change at EPA will come from within.

The SAB’s record is spotty. It’s a huge organization, and different committees of the SAB behave differently. I think it would be far, far better to divorce the SAB from EPA. For a start, SAB staff should not be EPA employees, and they should not be housed in the same buildings. Although there are rules about conflicts of interest, I do not know how those rules can be used to keep out EPA grantees who benefit from the continued use of assumptions and EPA’s willingness to focus on finding or confirming risks to the exclusion of doing the best science and making use of it. Additionally, the role of EPA in selection or rejection of SAB members is opaque to me.

Congress could force EPA to be more timely by questioning any regulation that is based on documents stamped “Do not cite or quote.” Courts could do the same thing. Those actions would force EPA to move more quickly. As it is, time is EPA’s great ally. Industrial scientists who really know a subject are very likely to be pulled off into other projects by their companies as years pass. Then, when EPA revisits a long‐​dormant issue, EPA and EPA grantees are the only experts still active in that arena.

Ultimately, we can do well without the EPA. The country made it through almost two centuries without one, and as Robert Crandall of the Brookings Institution has shown again and again, the creation of EPA has had no discernible effect on the rate at which the environment has been cleaned up.

Testing of new chemicals can be handed over to third-party organizations, like Underwriters Laboratories, that would be formed to test and certify chemicals in the absence of EPA regulations. There’s convincing evidence that state and local governments do a much better job of dealing with old risks in the environment, such as waste sites. Those responsibilities would be taken up by other organizations if EPA disappeared.

Most complaints about EPA are directed at its costs. Well, costs are important. Environmental epidemiology struggles to find associations, but it’s easy to show that rich people live longer with fewer diseases. Simply lining up ZIP codes by average income produces a list that parallels the list of ZIP codes ranked by life expectancy. Economists disagree about whether regulatory costs of $5 million or $50 million cause one premature death, but taking money out of citizens’ pockets to pay for regulations reduces their wealth and their ability to provide for their own health.

I think, in addition, that we have paid a direct health cost for federal agency predispositions to blame everything that goes wrong with health on environmental chemicals. A number of epidemiology studies around waste sites found an increased risk of spina bifida. What was the cause? EPA and the Centers for Disease Control focused on environmental contaminants. Other people, especially in industry, said things like, “Wait a minute, those people tend to be poor. Maybe it’s their nutrition or something else other than environmental chemicals.” Now, we know that folate deficiencies are a major cause of spina bifida and that supplementing food with folate protects against that birth defect. Perhaps the assumed environmental connection didn’t slow the drawing of the low folate‐​spina bifida link. Perhaps it did. And, so far as I know, more than 15 years of epidemiology around waste sites have failed to produce convincing evidence of human harm.