Nov 1, 2018 5:00 AM

Apple's Heart Study Is the Biggest Ever, But With a Catch

The study enrolled a whopping 400,000 people, but it still won't answer researchers' biggest question: whether mass screening does more good than harm.

Last November, Apple Watch owners began receiving recruitment emails from Apple. The company was looking for owners of its smartwatch to participate in the Apple Heart Study—a Stanford-led investigation into the wearable's ability to sense irregular heart rhythms.

Joining was simple: Install an app and wear your watch. If the watch's optical sensors detected an arrhythmia, you might be shipped a dedicated heart monitor—a benchmark to compare against readings from your Apple Watch—to wear for seven days. In true Apple fashion, enrollment and participation were designed to be as user friendly as possible: "Apple and Stanford Medicine are committed to making it easy for people to participate in medical research," the research partners wrote, "because more data can lead to discoveries that save lives."

Now, not for the first time, Apple's attention to user experience has been rewarded: According to a paper outlining the study's design in this week's issue of the American Heart Journal, Apple and Stanford have managed to enroll a staggering 419,093 participants. That makes it the largest screening study on atrial fibrillation ever performed. A study of that size is a big deal for researchers. But even if the results (which should be published next year) are positive, Apple will still have much to prove about the public benefits of its popular wearable.

First, let's talk study size. More than 400,000 research subjects is huge. By comparison, the Strokestop study, a Swedish investigation into mass arrhythmia screening, has around 25,000 participants. To be fair, the Strokestop study has things going for it that Apple's study doesn't, which we'll get to. But the fact that Apple was able to round up a research population of this size in under a year is impressive.

It's also a major selling point of the study's design. Larger sample sizes make for smaller error margins and a greater degree of certainty in one's results, both of which are important when studying the accuracy of a device designed to flag heart problems. Some five million people in the US are affected by atrial fibrillation and atrial flutter (collectively known as AF), irregular heart rhythms that significantly increase one's risk of stroke and heart failure. An estimated 700,000 of those people don't even know they have AF.
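To put rough numbers on the sample-size point, here is a minimal sketch using the textbook normal-approximation margin of error for a proportion. It is an illustration, not the study's own statistical method: the function name `margin_of_error`, the 95 percent confidence level, and the worst-case proportion of 0.5 are all assumptions, and only the two sample sizes come from the article. Any accuracy estimate in the actual study would rest on the smaller subset of participants who receive an alert and wear the comparison monitor, so its margins would be wider than these.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Half-width of a normal-approximation confidence interval for a proportion.

    p: assumed true proportion (0.5 is the worst case, an assumption here)
    n: sample size
    z: critical value (1.96 corresponds to roughly 95% confidence)
    """
    return z * math.sqrt(p * (1 - p) / n)


# Sample sizes quoted in the article: Strokestop (about 25,000) and the
# Apple Heart Study (419,093). Everything else is illustrative.
for label, n in (("Strokestop", 25_000), ("Apple Heart Study", 419_093)):
    moe = margin_of_error(0.5, n)
    print(f"{label}: n = {n:,}, margin of error = +/- {moe * 100:.2f} percentage points")
```

Because the margin shrinks with the square root of the sample size, enrolling about 17 times as many people narrows it by a factor of roughly four, not 17.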

Cardiologists are particularly interested in that second group. If a product like Apple's could detect undiagnosed arrhythmias across large populations and compel flagged users to take appropriate preemptive action? It could save lives.

But here's the thing: Even if the Apple Watch excels at detecting undiagnosed AF (a big if), using it to screen large numbers of asymptomatic people isn't necessarily a good idea.

Screening comes with risks: Misdiagnosis. Unnecessary tests. Overtreatment. "Those are real problems that need to be sorted out," says cardiologist Mintu Turakhia, the study's lead author and director of Stanford's Center for Digital Health. That's why he and his team will also observe what happens after Apple Watch users receive an alert: Whether they follow up with a healthcare provider, whether a diagnosis is made, and what treatment they receive. "We're interested in the patient journey, but we also want to see whether an alert from the watch helps lead to appropriate care," Turakhia says.

The biggest unknown surrounding AF screening is a simple one: Do its benefits outweigh its costs? "The current evidence is insufficient to say one way or the other," says Seth Landefeld, chair of the department of medicine at the University of Alabama at Birmingham and a member of the US Preventive Services Task Force, an independent, volunteer panel of national experts in disease prevention. That lack of evidence is why the USPSTF does not currently recommend screening asymptomatic adults.

Apple's study doesn't directly address USPSTF's biggest concern. "The million-dollar question, the one we really need to have answered, is whether people who screen for arrhythmias have fewer strokes, long-term, than those who don't," Landefeld says. The present study is designed primarily to see how Apple’s AF detection compares with a dedicated heart monitor—not how the people who receive AF alerts fare in the long haul. For that, you need repeated observations of your participants over a long period of time (a major advantage of the Strokestop study). Randomized control groups wouldn't hurt, either.

Turakhia doesn't disagree. But he notes that the thinking on AF has expanded beyond concerns about stroke. "It's still true that, if you have AF, the worst thing that can happen is stroke," he says. "But more and more, we're learning that, like hypertension, it's a general marker of cardiovascular risk, and associated with a lot of other conditions, from tiredness and shortness of breath to heart failure and cardiomyopathy."

Of course, Apple and Stanford could address these questions in follow-up studies. But there's a glaring inconsistency in Apple's plan to study heart-screening technology while making it available in the most popular watch on Earth. In their paper describing the design of the Apple Heart Study, Turakhia and his colleagues write that the ultimate goal of studying the Apple Watch's arrhythmia-sensing potential is "learning to responsibly release such a technology at scale." But Apple plans to release the irregular rhythm notification and ECG features to the public by the end of 2018. The results of the Stanford study won't be finalized until 2019 at the earliest. So is Apple putting the cart before the horse?

Hard to say. On one hand, an investigation like the Apple Heart Study would be all but impossible were it not for the company's decision to incorporate medical technology into the Apple Watch. On the other, the technology is, by definition, unproven. I asked Turakhia whether he thinks Apple is getting ahead of itself by releasing the features before the results of his study are in. "All I can say is that we at Stanford were not involved in the regulatory submission process," Turakhia says¹. "So I can't comment on that."

1. Correction: While neither Turakhia nor his co-PI Marco Perez was involved in the regulatory submission process, a Stanford spokesperson clarifies that university faculty "actually were involved in and supported the FDA filing, but were blinded to the findings and the data that was submitted to the agency."