Classic Study of the month

A randomized maintenance trial of olanzapine

A demonstration of the invalidity of the enriched design

The first dopamine blocker to receive a maintenance indication for bipolar illness was olanzapine, in 2004, about a decade ago. That indication was based on multiple maintenance trials, all of which were enriched. One study involved adding olanzapine to standard mood stabilizers (lithium or divalproex); another, presented here, involved monotherapy with olanzapine versus placebo. Other maintenance RCTs were also conducted in which olanzapine was compared to lithium in one study and to divalproex in another. This analysis focuses mainly on the monotherapy study, which was the basis for the FDA indication.


This study was organized and conducted by Eli Lilly specifically to obtain an FDA indication.

Read simply, as presented in the abstract, the results are that 225 patients received olanzapine compared to 136 patients who received placebo, for up to 48 weeks. After recovery from an acute manic episode with olanzapine, those who stayed on it had much more benefit than those who came off. The main outcome was time to relapse into a mood episode, which was much shorter with placebo (median of only 22 days) than with olanzapine (median of 174 days). The overall relapse frequency was 80% with placebo versus 47% with olanzapine.

This basic description of the study results seems stunning. Olanzapine is incredibly effective. There is nearly twice as much relapse if you don't take it as if you do.
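To put numbers on that comparison, the relative and absolute differences can be worked out from the two relapse frequencies quoted in the abstract. This is only a minimal sketch: the variable names are illustrative, and the only inputs taken from the study are the 80% and 47% figures given above.

```python
# Relapse frequencies as reported in the abstract (from the text above).
relapse_placebo = 0.80
relapse_olanzapine = 0.47

# Relative comparison: how many times more relapse occurs with placebo.
relapse_ratio = relapse_placebo / relapse_olanzapine

# Absolute comparison: difference between the two relapse frequencies.
relapse_difference = relapse_placebo - relapse_olanzapine

print(f"Relapse ratio (placebo / olanzapine): {relapse_ratio:.2f}")
print(f"Absolute difference: {relapse_difference * 100:.0f} percentage points")
```

The ratio comes out at about 1.7, which is the sense in which relapse is "nearly twice" as frequent with placebo; the absolute gap is 33 percentage points.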

But even with this simple description, the results should raise a question in the mind of the reader. Note that placebo-treated patients relapse, on average, only 22 days into the study.  That means that when the randomized study begins, and patients are given a double-blind pill and they don't know what they are receiving, the placebo patients relapse massively in just 3 weeks.  Why is that the case? 

If you start a research study designed to assess long-term relapse over one year of follow-up, why would all the patients in one arm suddenly relapse within weeks of starting the study?


Acute withdrawal effects

This brings us to a central demonstration of the invalidity of the enriched design: acute withdrawal effects.  In the case of this study, acute withdrawal relapse with placebo is obvious.

Let’s set the stage.  Before the maintenance study began, patients were recruited with an acute manic episode and they were treated with olanzapine. This was not double-blind or placebo controlled. It was open-label and unblinded, which means that the patients and doctors were engaged in standard clinical treatment, like you would conduct outside of research. The point was not to show that olanzapine was effective in acute mania; this was already proven. The point was to select out olanzapine responders and make them better so that you could then put them into the maintenance trial to see how long they would stay well. Thus, at the beginning of the maintenance trial, everyone is well.  But a few months earlier, they were all in acute manic episodes, and responded to olanzapine. 

The preselection process is the central place where the enriched design can become invalid.   Let’s review who was treated with olanzapine before the maintenance study began, and what happened to them.  

In the acute mania phase, before the maintenance trial began, 731 patients were recruited and treated with olanzapine. On average they had two mood episodes in the prior year, one manic and one depressed. They had been in their current manic episode for about one month on average (median 31 days). As noted, 361 patients entered the maintenance study (49.4%, 361/731), which means that about one-half of the sample (50.6%, 370/731) did not respond to or tolerate olanzapine for the acute manic episode.

So automatically the maintenance trial of 225 olanzapine and 136 placebo patients is actually a study in which twice as many patients were included originally, but half of them were excluded because they didn't do well with olanzapine. 

So this is a preselected group of olanzapine responders, representing only about one-half of patients who have manic episodes.
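The preselection arithmetic can be checked directly from the counts given above. This is a small sketch with illustrative variable names; the only inputs are the 731 recruited patients and the 225 and 136 patients randomized to each arm.

```python
# Counts taken from the text above.
recruited_acute_phase = 731   # treated open-label with olanzapine for acute mania
randomized_olanzapine = 225   # entered the maintenance trial on olanzapine
randomized_placebo = 136      # entered the maintenance trial on placebo

# Patients who made it into the randomized maintenance phase.
randomized_total = randomized_olanzapine + randomized_placebo

# Patients who never reached randomization (did not respond or tolerate).
excluded_before_randomization = recruited_acute_phase - randomized_total

pct_randomized = 100 * randomized_total / recruited_acute_phase
pct_excluded = 100 * excluded_before_randomization / recruited_acute_phase

print(f"Randomized: {randomized_total} of {recruited_acute_phase} ({pct_randomized:.1f}%)")
print(f"Excluded:   {excluded_before_randomization} of {recruited_acute_phase} ({pct_excluded:.1f}%)")
```

So 361 of 731 patients (49.4%) were randomized, and 370 (50.6%) were excluded before the maintenance comparison even began.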

Then, if they improved with olanzapine, they were put into the randomized trial after 2-4 weeks of staying well.  The average amount of time before entering the maintenance trial was 2 weeks (median 15.9 days for olanzapine, 16.7 days for placebo). 

So, to restate it in clinical terms:  Suppose you had patients with acute mania, lasting for one month. You then treat them with olanzapine, and some improve and get well. Two weeks later, you stop olanzapine.  What do you think would happen?

This fancy expensive randomized trial gives you a definitive answer: They would relapse in 3 weeks (median 22 days).  

Have you just proven that olanzapine is effective as a maintenance treatment for prevention of new mood episodes over 1 year or longer in the long-term treatment of bipolar illness? Have you proven olanzapine is a “mood stabilizer”?

No. All you have shown is that if you improve with olanzapine for an acute manic episode, you should not stop the dopamine blocker 2 weeks later.  That doesn’t mean you should stay on it for 2 years or 20 years.  

You have only shown an acute discontinuation effect during the manic episode, not a prevention effect for new mood episodes.


A picture instead of words

The figure tells the story. All patients are well initially; this means that they’re in the one-half of the original acutely manic sample that responded to olanzapine. The y-axis represents the proportion of patients remaining well, starting at 1.0, meaning that 100% are well. The x-axis is days to relapse. Looking at the placebo arm, you’ll notice a steep fall as soon as the study begins, with one-half relapsing in 22 days, and about 80% relapsing by about 90 days. From 3 months onward, the placebo line is basically flat, as is the olanzapine line. In other words, all the action is in the first 3 months, namely, massive relapse rates with placebo, and less so with olanzapine. After 3 months, nothing seems to be happening.

This graph supports visually what is described above conceptually and quantitatively. This study is not really a study of long-term relapse at 6 months to one year of new mood episodes, but rather of short-term discontinuation relapse in 3 months or less back into the acute manic episode that existed only a few weeks before the maintenance study began. 

It doesn’t, therefore, provide evidence of long-term maintenance prophylaxis of new mood episodes in bipolar illness.  

What about the FDA?

If all this is true, readers might be wondering why the FDA didn't figure it out. If this study is so invalid, why did the FDA give a maintenance indication for olanzapine based on this design?

The answer is complex, but we can begin with the fact that the FDA accepts the enriched design as valid; FDA statisticians do not accept the critiques made here about the invalidity of the preselection process.  And indeed, there are ways in which the enriched design can be seen as potentially valid, if conducted in a different way than presented here, but that is a larger story (it relates to other topics, like oncology).  

The FDA approved olanzapine with a low threshold of scientific evidence, partly because there were no prior dopamine blockers with indications for maintenance treatment of bipolar illness. When there are no or few proven treatments, the FDA is somewhat more liberal in approving new treatments. But after the approval, and during the peer review process for publication, important questions were raised about this study. The PL editor will disclose here, as he has in public lectures previously, that he was one of the peer reviewers for this study when it was submitted to the Archives of General Psychiatry, the highest ranking general psychiatry journal. In that anonymous peer review, the PL editor made the critiques described above, especially the massive acute withdrawal relapse rates with placebo. Based on that peer review and those of other reviewers, that journal rejected this study for publication. Thus, this study was good enough for an FDA indication, and hence for general medical practice, but it wasn't good enough for publication in the Archives of General Psychiatry. (Readers will note that it was eventually published in a different journal two years after the same study passed FDA review for an indication.) The questions raised in the scientific community by some researchers influenced the FDA to some extent: in October 2005, about a year after giving approval, the FDA held an advisory committee meeting to get advice on whether and how the olanzapine-style trial could be made more valid. The main focus was on the 2-4 week period of remission from acute mania required before the study began. The FDA wanted to suggest a 6-month or longer period of remission before entry into a maintenance trial. This suggestion was reasonable, in the PL view, because it is consistent with the natural history of bipolar illness, as described in By the Numbers.
It takes up to 6 months to get out of the acute phase for manic or depressive episodes, so if you want to be certain you are preventing new mood episodes in the maintenance phase, not just preventing relapse back into the same acute mood episode that was present before the randomized maintenance study started, then 6 months or longer is a wise time frame.

FDA meetings are publicly available, so readers can read the minutes of the 2005 meeting online if they like. A few academic experts, who claimed to be present on their “own dime,” joined pharmaceutical industry representatives and some patient advocates in opposing the FDA suggestion. The opposition from the pharmaceutical industry was obvious: to require 6 months or longer of treatment, even before the maintenance study officially begins, would be very expensive for them, and difficult to complete. Perhaps they knew that if half the patients drop out with just a few weeks of treatment, as happened in this olanzapine study, very few patients would remain in treatment 6 months later, and thus their maintenance trials would have tiny samples and would fail. The academic experts and patient advocates took the view that the FDA request was an example of “stigma,” putting too high a standard on psychiatric research, beyond what is the case with other medical conditions. They didn't appear to realize that the enriched design is, in fact, a very low standard of scientific proof, as discussed in this issue of PL. In fact, in the PL editor’s review of the medical literature on this topic, there is no other medical discipline in which the FDA approves long-term maintenance treatments based on enriched randomized discontinuation designs. All other medical specialties have a higher standard: you cannot preselect your patients at all for treatment response. Thus, in the PL view, the academic experts in particular were uninformed in their attack on the FDA suggestion, and the upshot, as can be read in the minutes, was that the FDA advisory committee decided against the idea of having a long period of remission before enriched maintenance trials in psychiatry.
All agreed, though, that 2-4 weeks was much too short, and a new standard was set of 2-4 months of remission, which has become the way other dopamine blockers have been assessed since that time (like aripiprazole, quetiapine, and lurasidone).

In the PL view, 2-4 months is better than 2-4 weeks, but it still isn’t long enough to avoid bias against placebo due to acute withdrawal relapse, and it still does not address the basic problem that preselection for drug response biases the results in favor of the drug.

Who stays well?

Even if you took these results at face value, and believed that they are valid, one still should ask an important question: Not how good was olanzapine compared to placebo, but how good was olanzapine, period?  

The benefits suggested in the olanzapine group may not be as great as they seem, when you switch your focus from the relative effect of comparison to placebo and instead look at the absolute effect of overall benefit.  

Recall that these patients were preselected to respond to olanzapine already. Recall also that the natural history of bipolar illness is that the average patient has one episode per year (see “By the Numbers”). In this study, patients were manic just before the study began, and they were required to have had at least 2 other manic episodes in the prior 6 years of treatment. Thus, their natural history was selected such that they were likely to relapse in 1 year of follow-up, and definitely would relapse within 2 years.

How did olanzapine do in these patients?  

About one-half (47%) relapsed during the study, and they did so, on average, by about 6 months (median 174 days).  

So, in a patient population selected to relapse by natural history once in 1-2 years, one-half of them relapsed within 6 months, despite being preselected to respond to olanzapine and staying on olanzapine after getting better from an acute manic episode.  This relapse rate doesn’t bode well.  At that pace, everyone would have relapsed in 1-2 years of follow-up, just as would be expected if they were completely untreated. 
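The phrase "at that pace" can be made concrete with a rough extrapolation. The sketch below assumes a constant relapse hazard (an exponential model), which is an assumption of this illustration and not something reported by the trial; the only input taken from the text is the 174-day median time to relapse with olanzapine. The function name is hypothetical.

```python
# Median time to relapse on olanzapine, from the text above.
MEDIAN_DAYS_TO_RELAPSE = 174

def fraction_relapsed(days, median_days=MEDIAN_DAYS_TO_RELAPSE):
    """Fraction relapsed by `days`, ASSUMING a constant relapse hazard.

    Under that assumption the fraction still well halves every
    `median_days`, so the fraction relapsed is 1 - 0.5 ** (days / median_days).
    """
    return 1 - 0.5 ** (days / median_days)

relapsed_at_median = fraction_relapsed(174)
relapsed_at_1_year = fraction_relapsed(365)
relapsed_at_2_years = fraction_relapsed(730)

print(f"By 174 days: {relapsed_at_median:.0%}")
print(f"By 1 year:   {relapsed_at_1_year:.0%}")
print(f"By 2 years:  {relapsed_at_2_years:.0%}")
```

Under this simplifying assumption, roughly three-quarters of patients would have relapsed by 1 year and well over 90% by 2 years. That is in line with the natural-history expectation described above, but it is an extrapolation, not an observed result.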

(Remember that the placebo group isn’t a valid assessment of non-treatment in this study because of the invalidity of the enriched design, which led to massive placebo relapse just 3 weeks into the study due to acute withdrawal effects, as described above.)

The PL Bottom Line

  • The olanzapine trial does not show long-term efficacy in prevention of mood episodes in bipolar illness, despite FDA indication.   
  • It mainly reflects massive and immediate placebo relapse due to an acute withdrawal effect after recovering from acute mania a few weeks earlier with olanzapine.   
  • Even if taken at face value, half of olanzapine-treated patients relapsed in 6 months, a pace which is consistent with the natural history of recurrence of mood episodes.  
  • The FDA sought to correct some of the research design problems involving rapid acute withdrawal, but it was resisted.  

PL Reflection

Getting acquainted with ourselves is unsettling.  There are forbidden thoughts but also commonplace ones.  I have often remarked that when the psychotherapist opens us up, he finds what the surgeon finds, all the usual organs.  The unique contour of our being only shapes a universal content.  It is hard to see ourselves because the individuality we may prize is hardly there.  Looking in the mirror we see everyone.  

Leston Havens, Coming to Life
