This month's interview features Karl Bilimoria, MD, MS, Director of the Surgical Outcomes and Quality Improvement Center of Northwestern University. He is the principal investigator of the Flexibility in Duty Hour Requirements for Surgical Trainees (FIRST) trial and a Faculty Scholar at the American College of Surgeons. We spoke with him about the FIRST trial, which examined how less restrictive duty hours affected patient outcomes and resident satisfaction. Its results informed recent changes to duty hour policies.

  Published August 2017

Editor's note: Dr. Bilimoria directs the Surgical Outcomes and Quality Improvement Center of Northwestern University, a center focused on national, regional, and local quality improvement research and practical initiatives.
Dr. Robert M. Wachter: How did the FIRST trial happen?
Dr. Karl Bilimoria: Concerns about the duty hour restrictions imposed in 2003 and 2011 have been prevalent in the surgical community. There has been concern about the time available for taking care of patients, and about breaks in continuity of care. The leaders of the American Board of Surgery (ABS) and the American College of Surgeons (ACS) had been talking with the leaders of the Accreditation Council for Graduate Medical Education (ACGME) about what it would take to potentially have evidence-based recommendations around duty hour policies. The ACGME became very willing to entertain the idea that evidence could help them draft better policy recommendations for duty hours. So they approached our research team and asked us what we could do. We proposed the idea of a national cluster randomized trial, a pragmatic trial that leveraged the existing National Surgical Quality Improvement Program (NSQIP) infrastructure in most of the academic hospitals in the country as our data collection platform. With funding from the ACS, the ABS, and the ACGME, we were able to get moving quickly.
RW: What were the major questions you were trying to answer?
KB: We focused on whether we could waive certain duty hour restrictions that interfered with day-to-day continuity of care. Certainly, we got requests to eliminate the 80-hour cap, but that wasn't really where we were focused. We wanted to make sure that residents could stay in operations that they had started, care for patients who were unstable, and also do cases on patients whom they had worked up the night before and not have to leave simply because their clock was up. We focused on eliminating the 24-hour cap for residents, the 16-hour cap for interns, and the required 8 to 10 hours off between shifts. We would frequently be operating late into the night and would have to be back the next morning, and that 8 to 10 hour rule could get violated. We wanted to make sure that if we did waive those rules, there was no adverse effect on patient safety.
Our primary outcome measure is a standard NSQIP hallmark outcome measure, the death or serious morbidity measure. It's NQF (National Quality Forum) endorsed, and it's publicly reported on Hospital Compare. We also measured a number of secondary outcomes: wound infections, renal failure, sepsis, and readmissions. But the other major piece was understanding the resident perspective. We randomized programs to either flexible policy, where these duty hour requirements were waived, or standard policy. We wanted to make sure that the residents under the flexible policy arm didn't have a degradation in their well-being, and we wanted to know what they thought about how those flexible hours impacted continuity of care.
RW: You said that the 80-hour limit was not on the table. How did that come about?
KB: Our underlying principle was that we wanted to make sure that we had good continuity of care. Most people did not think that [the weekly limit] was interfering with continuity of care. It was the daily limits and the daily restrictions that were causing an issue. To be quite honest, most of the surgical education community believes that 80 hours is a reasonable number. We should be able to train surgeons within 80 hours of clinical contact time. Most of the focus was on the daily limits, and I think the 80-hour limit is here to stay. We did hear that the ACGME was feeling pressure from other groups to decrease that 80-hour limit even further. In the European Union, they've gone to the European Working Time Directive, where there's a 48-hour weekly cap for residents (and essentially all workers). So increasing it from 80 was just not tenable.
RW: Understood. Take us into the sausage factory of conducting the trial. What were the surprising obstacles? It's not like you can snap your fingers and all of a sudden a program that's randomized to flexibility can just do it easily. They're changing schedules; some internal political issues must arise. How did that all play out?
KB: It's why we really worked hard to get hospitals enrolled and randomized very quickly after we conceived of the trial. We wanted to give them enough time to change their schedules once they were randomized to the flexible arm. Moreover, we realized if we just did a 1-year trial or a 1-year waiver of these duty hour restrictions, people probably wouldn't change much. Instead, we ended up offering a 2-year waiver from the very beginning, saying if you're in the flexible arm you can keep that flexibility for at least 2 years.
The biggest hurdles that we had early on were fairly interesting. We wanted to get New York into the trial but their duty hours are regulated by state law. We tried to get the New York State government to waive the Bell Commission rules if a hospital decides to participate in a national, ACGME-sanctioned trial. Initially, it had good traction and got through the Senate fairly quickly. But then it was killed by pressure from the resident unions, who didn't want anything to do with the trial. There was a very interesting dichotomy. In surgery, the residents were very much in favor of flexibility and testing flexibility, but we heard a lot of pushback from the nonsurgical residents, and that's where the pushback came from in the union. Their board didn't even include a surgeon at that time.
RW: So even though they weren't part of the trial, they worried about the trial's impact on how other fields were treated?
KB: Correct.
RW: And was it "intention to treat" or did you have high confidence that every place relaxed the rules and people ended up staying beyond the normal hours?
KB: That's a really interesting question. We ran it both ways, of course. But we looked at this as a policy trial, and we gave programs the opportunity to relax certain rules. They didn't have to relax all of them. They didn't have to relax any of them. But it turned out in the flexible arm, 75% of the programs relaxed all 4 of the rules that they could, such as the 16-hour rule for interns. In this sort of a policy trial though, we gave them that flexibility. What they did with it was real-world implementation, and the outcomes were real-world outcomes. Not everybody will in fact use all of the flexibility. Intention to treat was the policy-relevant analysis in our mind.
RW: I have heard concerns about ACGME rules and regulations over the years, particularly when it comes to surgery. You also hear about not just hours but supervision and autonomy, including the question of whether residents are really ready to practice at the end of their training. Was there anything about what you were doing that was testing any of those concerns or hypotheses?
KB: We did not test any of those ideas necessarily. This was purely about duty hour policies. We didn't want to contaminate that with other issues. Those are certainly important and complicated issues, and we look forward to taking them on in the future.
RW: So give us the CliffsNotes of what you found.
KB: We found that with respect to our primary outcome, death or serious morbidity, there was no difference between the two trial arms. In fact, for any of the other patient outcomes, there were no differences. The story got more interesting as we looked into the resident outcomes. We were able to attach a survey to the end of the annual exam that surgery residents take. Residents are more likely to complete a survey delivered that way, so we had a greater than 95% response rate. They told us that they thought duty hour flexibility was really important for continuity of care and patient safety.
One of the striking findings was that in the standard policy arm, 13% of residents reported having to leave in the middle of an operation in the last month. To me, that was outrageous. In the flexible policy arm in the first year of the trial, it was 7%, roughly a twofold difference. Similarly, there was about a twofold difference in how often residents reported having to leave an unstable patient because their duty hour clock was up. Residents reported that they could operate on patients well known to them more frequently under flexible policies than under standard policies, where they would otherwise have to go home. So we were able to achieve better continuity of care. The residents thought it was better for patient safety. There were no adverse effects with respect to resident safety, which was a lot of the concern of the resident unions and the watchdog groups. There was no difference with respect to car accidents or needle sticks.
One place we did see that flexibility had some potential adverse effects was related to resident well-being. Residents told us that they perceived less time for hobbies, time with family and friends, and rest. Overall though, when you asked whether they were dissatisfied about this, there was no significant difference between the study arms. I think what we were seeing is something we can all relate to. Yes, I do not have enough time to get to the gym every day because of the job I do. But I accept that tradeoff and I'm happy to have that tradeoff because I like my work. It was really interesting. I gave a talk at an ACGME board meeting and one of their lawyers stood up and said "I was a trial lawyer for years. You'd get the same response from us. We work a lot and we don't make it to the gym or see our doctors probably as much as we should but we love our work."
The other interesting piece was that those who noted that they had less time for family, friends, and well-being were in the intern group. By the time you got to junior and senior residents, there really wasn't any difference. So when the interns use their flexible hours and flex up, typically they are doing sort of more mundane work—scut work if you will. When a chief resident flexes up, they're doing a great case. So some of what we saw may be reflected in that.
Overall, we found no difference in 80-hour violations. Residents weren't necessarily working more hours. They were just reorganizing within the 80 hours to provide better continuity of care, and they had the flexibility to do that. The one place where there were slightly more 80-hour workweek violations was among the interns. As we go forward, we'll need to monitor intern hours a little more closely for programs that take advantage of this flexible policy.
RW: Any parts of those results truly surprise you?
KB: No, I think we expected a lot of that. I didn't quite expect what an impact we could have with flexible policies on continuity of care. That was a little surprising. That 13% number for how often a resident left an operation was totally striking, as was the fact that we were able to cut that in half. In subsequent years it has dropped even further, which is a big advantage for patient care and resident education.
RW: Why doesn't it go to zero in the flexible group?
KB: There are still times where the case was just too long or the resident wasn't as involved and people need to sub out. Liver transplantation is a great example. You can have really long cases, and sometimes the resident isn't the most critical portion of that operation. It's the attending and the fellow or two attendings. So seeing a resident leave that case is not entirely surprising.
RW: What was the reaction to the publication of the trial both within the profession and from outside groups? Then let's talk about what happened in terms of its impact on policy.
KB: Even before the trial came out, the resident unions and Public Citizen filed a complaint with the Department of Health and Human Services about the conduct of the trial. It was odd because we were already nearly 18 months into the trial. And it was because a similar trial in internal medicine started about a year after us. It turned out there were a couple of residents who were unhappy with that trial and got the attention of a watchdog group and the resident union. The pushback started just before the trial results were published.
Once the trial came out, the evidence was clear there was no harm to patients, and in fact the residents really liked it. One of the most striking numbers in the trial is that only 14% of residents, if given the option, would stay with standard policy. The vast majority preferred flexibility. You preferred the option to have flexibility even more if you were older in your residency or you were in the flexible arm. So if you experienced it, you appreciated it even more. Once the trial results came out, the pushback decreased. The surgical community thought it confirmed what they believed all along. But we put the ACGME in a tough spot. We had our trial results out a few years before the internal medicine results would come out. I think the ACGME handled it very well. They put together a group that worked over the course of 18 months to take testimony from all stakeholders, including all those who opposed the trial, all the specialty organizations, all the subspecialties, and health care organizations in the country. The ACGME then came up with a series of recommendations that largely reflect the flexibility tested in the FIRST trial. As most people know, those were approved and take effect in July 2017.
RW: And across all specialties.
KB: That is across all specialties. The ACGME wants there to be true common program requirements. It is certainly up to the Residency Review Committees (RRCs) if they want to impose stricter limits. If Medicine wants to continue to have a 16-hour cap for interns, their RRC is more than welcome to do that. But this was the most flexibility that the ACGME was willing to afford programs, and I think their revisions were thoughtful. They did one thing that was slightly different than what we tested. In the trial, we eliminated the 24 plus 4 daily cap for residents. The ACGME said it wants to keep 24 plus 4 as the general cap. But if a resident needs to stay and wants to stay, they can for the care of one patient. But it has to be of their own accord. So essentially, if you're in an operation, stabilizing a patient, or doing a case that you want to do, you can stay longer than that cap. I think that was a common sense revision that turned out to be better than what we tested in the trial.
RW: It sounds like you're impressed with ACGME in deciding to do this and the way that they responded to your results. First, is that accurate? And second, what do you think they've learned more cosmically about policy changes based on your work?
KB: It's fair to say that I was impressed. I think the whole country is impressed with how the ACGME handled this. They had been in a phase of imposing more and more duty hour restrictions, and in the face of evidence they took a year and a half to critically evaluate the evidence and all prior evidence and take expert testimony and put together a series of thoughtful recommendations that reflected the sentiment of the medical community—plus it was in line with the evidence.
I think the ACGME now realizes that you can inform policy with high-level evidence, particularly if they help to support it, and they would be receptive to using the results of that trial to inform policy. Going forward, the ACGME would like us to continue to monitor duty hour effects now that flexibility will be the rule. We'll check for any degradation in patient outcomes or resident well-being over the next several years. The next time they want to test the policy, they should have a mechanism. We've essentially created a trials group of these 119 programs across the country to test a new duty hour policy or a new educational intervention, and then let that evidence guide a subsequent policy change.
RW: Let's say I'm [ACGME CEO] Thomas Nasca and I say the next big question in the training world is the balance between oversight and autonomy. Say I'm worried that we're producing graduates who are not ready to be fully autonomous. I want to test whether we've gotten that calibration right. If ACGME called and asked, "Would you be willing to do this, and structurally how might you test that," what would you say?
KB: I'd say yes. Making sure that we can train residents and give them the right autonomy with strong supervision is the next frontier. In the operating room, certainly they can do certain portions of the operation with the attending surgeon watching closely but not guiding them along. That's going to take training attendings even more in how to teach, how to break the operation up into components, and how to graduate residents along. A lot of interesting initiatives are focusing on this issue in the surgical community to try to advance supervision and autonomy. We would welcome the chance to test these approaches using this large group of programs we've assembled.

