The UK’s medical device regulator has admitted it has concerns about VC-backed AI chatbot maker Babylon Health. It made the admission in a letter sent to a clinician who’s been raising the alarm about Babylon’s approach toward patient safety and corporate governance since 2017.
The HSJ reported on the MHRA’s letter to Dr David Watkins yesterday. We have reviewed the letter (see below), which is dated December 4, 2020. We’ve also seen additional context about what was discussed in a meeting referenced in the letter, and have reviewed other correspondence between Watkins and the regulator in which he details a number of wide-ranging concerns.
In an interview, Watkins emphasized that the concerns the regulator shares are “far broader” than the (important but) single issue of chatbot safety.
“The issues relate to the corporate governance of the company — how they approach safety concerns. How they approach people who raise safety concerns,” Watkins told us. “That’s the concern. And some of the ethics around the mis-promoting of medical devices.
“The overall story is they did promote something that was dangerously flawed. They made misleading claims with regards to how [the chatbot] should be used — its intended use — with [Babylon CEO] Ali Parsa promoting it as a ‘diagnostic’ system — which was never the case. The chatbot was never approved for ‘diagnosis’.”
“In my opinion, in 2018 the MHRA should have taken a much firmer stance with Babylon and made it clear to the public that the claims that were being made were false — and that the technology was not approved for use in the way that Babylon were promoting it,” he went on. “That should have happened and it didn’t happen because the regulations at that time were not fit for purpose.”
“In reality there is no regulatory ‘approval’ process for these technologies and the legislation doesn’t require a company to act ethically,” Watkins also told us. “We’re reliant on the healthtech sector behaving responsibly.”
The consultant oncologist began raising red flags about Babylon with UK healthcare regulators (CQC/MHRA) as early as February 2017 — initially over the “apparent absence of any robust clinical testing or validation”, as he puts it in correspondence to regulators. However with Babylon opting to deny problems and go on the attack against critics his concerns mounted.
An admission by the medical devices regulator that all Watkins’ concerns are “valid” and are “ones that we share” blows Babylon’s deflective PR tactics out of the water.
“Babylon cannot say that they have always adhered to the regulatory requirements — at times they have not adhered to the regulatory requirements. At different points throughout the development of their system,” Watkins also told us, adding: “Babylon never took the safety concerns as seriously as they should have. Hence this issue has dragged on over a more than three year period.”
During this time the company has been steaming ahead inking wide-ranging ‘digitization’ deals with healthcare providers around the world — including a 10-year deal agreed with the UK city of Wolverhampton last year to provide an integrated app that’s intended to have a reach of 300,000 people.
It also has a 10-year agreement with the government of Rwanda to support digitization of its health system, including via digitally enabled triage. Other markets it’s rolled into include the US, Canada and Saudi Arabia.
Babylon says it now covers more than 20 million patients and has done 8 million consultations and “AI interactions” globally. But is it operating to the high standards people would expect of a medical device company?
Safety, ethical and governance concerns
In a written summary, dated October 22, of a video call which took place between Watkins and the UK medical devices regulator on September 24 last year, he describes what was discussed in the following way: “I talked through and expanded on each of the points outlined in the document, specifically; the misleading claims, the dangerous flaws and Babylon’s attempts to deny/suppress the safety issues.”
In his account of this meeting, Watkins goes on to report: “There appeared to be general agreement that Babylon’s corporate behaviour and governance fell below the standards expected of a medical device/healthcare provider.”
“I was informed that Babylon Health would not be shown leniency (given their relationship with [UK health secretary] Matt Hancock),” he also notes in the summary — a reference to Hancock being a publicly enthusiastic user of Babylon’s ‘GP at hand’ app (for which he was accused in 2018 of breaking the ministerial code).
In a separate document, which Watkins compiled and sent to the regulator last year, he details 14 areas of concern — covering issues including the safety of the Babylon chatbot’s triage; “misleading and conflicting” T&Cs — which he says contradict promotional claims it has made to hype the product; as well as what he describes as a “multitude of ethical and governance concerns” — including its aggressive response to anyone who raises concerns about the safety and efficacy of its technology.
This has included a public attack campaign against Watkins himself, which we reported on last year; as well as what he lists in the document as “legal threats to avoid scrutiny & adverse media coverage”.
Here he notes that Babylon’s response to safety concerns he had raised back in 2018 — which had been reported on by the HSJ — was also to go on the attack, with the company claiming then that “vested interests” were spreading “false allegations” in an attempt to “see us fail”.
“The allegations were not false and it is clear that Babylon chose to mislead the HSJ readership, opting to place patients at risk of harm, in order to protect their own reputation,” writes Watkins in associated commentary to the regulator.
He goes on to point out that, in May 2018, the MHRA had itself independently notified Babylon Health of two incidents related to the safety of its chatbot (one involving missed symptoms of a heart attack, another missed symptoms of DVT) — yet the company still went on to publicly rubbish the HSJ’s report the following month (which was entitled: “Safety regulators investigating concerns about Babylon’s ‘chatbot’”).
Wider governance and operational concerns Watkins raises in the document include Babylon’s use of staff NDAs — which he argues leads to a culture inside the company where staff feel unable to speak out about any safety concerns they may have; and what he calls “inadequate medical device vigilance” (whereby he says the Babylon bot doesn’t routinely request feedback on the patient outcome post triage, arguing that: “The absence of any robust feedback system significantly impairs the ability to identify adverse outcomes”).
Re: unvarnished staff opinions, it’s interesting to note that Babylon’s Glassdoor rating at the time of writing is just 2.9 stars — with only a minority of reviewers saying they would recommend the company to a friend and where Parsa’s approval rating as CEO is also only 45% on aggregate. (“The technology is outdated and flawed,” writes one Glassdoor reviewer who is listed as a current Babylon Health employee working as a clinical ops associate in Vancouver, Canada — where privacy regulators have an open investigation into its app. Among the listed cons in the one-star review is the claim that: “The well-being of patients is not seen as a priority. A real joke to healthcare. Best to avoid.”)
Per Watkins’ report of his online meeting with the MHRA, he says the regulator agreed NDAs are “problematic” and impact on the ability of employees to speak up on safety issues.
He also writes that it was acknowledged that Babylon employees may fear speaking up because of legal threats. His minutes further record that: “Comment was made that the MHRA are able to look into concerns that are raised anonymously.”
In the summary of his concerns about Babylon, Watkins also flags an event in 2018 which the company held in London to promote its chatbot — during which he writes that it made a number of “misleading claims”, such as that its AI generates health advice that is “on-par with top-rated practicing clinicians”.
The flashy claims led to a blitz of hyperbolic headlines about the bot’s capabilities — helping Babylon to generate hype at a time when it was likely to have been pitching investors to raise more funding.
The London-based startup was valued at $2BN+ in 2019 when it raised a massive $550M Series C round, from investors including Saudi Arabia’s Public Investment Fund and a large (unnamed) U.S.-based health insurance company, as well as insurance giant Munich Re’s ERGO Fund — trumpeting the raise at the time as the largest-ever in Europe or U.S. for digital health delivery.
“It should be noted that Babylon Health have never withdrawn or attempted to correct the misleading claims made at the AI Test Event [which generated press coverage it’s still using as a promotional tool on its website in certain jurisdictions],” Watkins writes to the regulator. “Hence, there remains an ongoing risk that the public will put undue faith in Babylon’s unvalidated medical device.”
In his summary he also includes several pieces of anonymous correspondence from a number of people claiming to work (or have worked) at Babylon — which make a number of additional claims. “There is huge pressure from investors to demonstrate a return,” writes one of these. “Anything that slows that down is seen [a]s avoidable.”
“The allegations made against Babylon Health are not false and were raised in good faith in the interests of patient safety,” Watkins goes on to assert in his summary to the regulator. “Babylon’s ‘repeated’ attempts to actively discredit me as an individual raises serious questions regarding their corporate culture and trustworthiness as a healthcare provider.”
In its letter to Watkins (screengrabbed below), the MHRA tells him: “Your concerns are all valid and ones that we share”.
It goes on to thank him for personally and publicly raising issues “at considerable risk to yourself”.
Babylon has been contacted for a response to the MHRA’s validation of Watkins’ concerns. At the time of writing it had not responded to our request for comment.
The startup told the HSJ that it meets all the local requirements of regulatory bodies for the countries it operates in, adding: “Babylon is committed to upholding the highest of standards when it comes to patient safety.”
In one aforementioned aggressive incident last year, Babylon put out a press release attacking Watkins as a ‘troll’ and seeking to discredit the work he was doing to highlight safety issues with the triage performed by its chatbot.
It also claimed its technology had been “NHS validated” as a “safe service 10 times”.
It’s not clear what validation process Babylon was referring to there — and Watkins also flags and queries that claim in his correspondence with the MHRA, writing: “As far as I am aware, the Babylon chatbot has not been validated — in which case, their press release is misleading.”
The MHRA’s letter, meanwhile, makes it clear that the current regulatory regime in the UK for software-based medical device products does not adequately cover software-powered ‘healthtech’ devices, such as Babylon’s chatbot.
Per Watkins there is no approval process, currently. Such devices are merely registered with the MHRA — but there’s no legal requirement that the regulator assess them or even receive documentation related to their development. He says they exist independently — with the MHRA holding a register.
“You have raised a complex set of issues and there are several aspects that fall outside of our existing remit,” the regulator concedes in the letter. “This highlights some issues which we are exploring further, and which may be important as we develop a new regulatory framework for medical devices in the UK.”
An update to pan-EU medical devices regulation — which will bring in new requirements for software-based medical devices, and had been originally intended to be implemented in the UK in May last year — will no longer take place, given the country has left the bloc.
The UK is instead in the process of formulating its own regulatory update for medical device rules. This means there’s still a gap around software-based ‘healthtech’ — which isn’t expected to be fully plugged for several years. (Although Watkins notes there have been some tweaks to the regime, such as a partial lifting of confidentiality requirements last year.)
In a speech last year, health secretary Hancock told parliament that the government aimed to formulate a regulatory system for medical devices that is “nimble enough” to keep up with tech-fuelled developments such as health wearables and AI while “maintaining and enhancing patient safety”. It will include giving the MHRA “a new power to disclose to members of the public any safety concerns about a device”, he said then.
In the meantime the existing (outdated) regulatory regime appears to be continuing to tie the regulator’s hands, at least as regards what it can say in public about safety concerns. It took Watkins making the MHRA’s letter to him public for those shared concerns to come to light.
In the letter the MHRA writes that “confidentiality unfortunately binds us from saying more on any specific investigation”, although it also tells him: “Please be assured that your concerns are being taken seriously and if there is action to be taken, then we will.”
“Based on the wording of the letter, I think it was clear that they wanted to provide me with a message that we do hear you, that we understand what you’re saying, we acknowledge the concerns which you’ve raised, but we are limited by what we can do,” Watkins told us.
He also said he believes the regulator has engaged with Babylon over concerns he’s raised these past three years — noting the company has made a number of changes after he raised specific queries (such as to its T&Cs, which had initially said the chatbot is not a medical device but were subsequently changed to acknowledge it is; or claims it had made that the chatbot is “100% safe”, which were withdrawn after an intervention by the Advertising Standards Authority in that case).
The chatbot itself has also been tweaked to put less emphasis on the diagnosis as an outcome and more emphasis on the triage outcome, per Watkins.
“They’ve taken a piecemeal approach [to addressing safety issues with chatbot triage]. So I would flag an issue [publicly via Twitter] and they would only look at that very specific issue. Patients of that age, undertaking that exact triage assessment — ‘okay, we’ll fix that, we’ll fix that’ — and they would put in place a [specific fix]. But sadly, they never spent time addressing the broader fundamental issues within the system. Hence, safety issues would repeatedly crop up,” he said, citing examples of multiple issues with cardiac triages that he also raised with the regulator.
“When I spoke to the people who work at Babylon they used to have to do these hard fixes… All they’d have to do is just kind of ‘dumb it down’ a bit. So, for example, for anyone with chest pain it would immediately say go to A&E. They would take away any thought process to it,” he added. (It also of course risks wasting healthcare resources — as he also points out in remarks to the regulators.)
“That’s how they over time got around these issues. But it highlights the challenges and difficulties in developing these tools. It’s not easy. And if you try and do it quickly and don’t give it enough attention then you just end up with something which is useless.”
Watkins also suspects the MHRA has been involved in getting Babylon to remove certain pieces of hyperbolic promotional material related to the 2018 AI event from its website.
In one curious episode, also related to the 2018 event, Babylon’s CEO demoed an AI-powered interface that appeared to show real-time transcription of a patient’s words combined with an ‘emotion-scanning’ AI — which he said scanned facial expressions in real-time to generate an assessment of how the person was feeling — with Parsa going on to tell the audience: “That’s what we’ve done. That’s what we’ve built. None of this is for show. All of this will be either in the market or already in the market.”
However neither feature has actually been brought to market by Babylon as yet. Asked about this last month, the startup told us: “The emotion detection functionality, seen in old versions of our clinical portal demo, was developed and built by Babylon’s AI team. Babylon conducts extensive user testing, which is why our technology is continually evolving to meet the needs of our patients and clinicians. After undergoing pre-market user-testing with our clinicians, we prioritised other AI-driven features in our clinical portal over the emotion recognition function, with a focus on improving the operational aspects of our service.”
“I certainly found [the MHRA’s letter] very reassuring and I strongly suspect that the MHRA have been engaging with Babylon to address concerns which have been identified over the past three year period,” Watkins also told us today. “The MHRA don’t appear to have been ignoring the issues but Babylon simply deny any problems and can sit behind the confidentiality clauses.”
In a statement on the current regulatory situation for software-based medical devices in the UK, the MHRA told us:
The MHRA ensures that manufacturers of medical devices comply with the Medical Devices Regulations 2002 (as amended). Please refer to existing guidance.
The Medicines and Medical Devices Act 2021 provides the foundation for a new improved regulatory framework that is currently being developed. It will consider all aspects of medical device regulation, including the risk classification rules that apply to Software as a Medical Device (SaMD).
The UK will continue to recognise CE marked devices until 1 July 2023. After this time, requirements for the UKCA Mark must be met. This will include the revised requirements of the new framework that is currently being developed.
The Medicines and Medical Devices Act 2021 allows the MHRA to undertake its regulatory activities with a greater level of transparency and share information where that is in the interests of patient safety.
The regulator declined to be interviewed or to respond to questions about the concerns it says in the letter to Watkins that it shares about Babylon — telling us: “The MHRA investigates all concerns but does not comment on individual cases.”
“Patient safety is paramount and we will always investigate where there are concerns about safety, including discussing those concerns with individuals that report them,” it added.
Watkins raised one more salient point on the issue of patient safety for ‘cutting edge’ tech tools — asking where the “real life clinical data” is. So far, he says, the studies patients have to go on are limited assessments — often made by the chatbot makers themselves.
“One quite telling thing about this sector is the fact that there’s very little real life data out there,” he said. “These chatbots have been around for a good few years now… And there’s been enough time to get real life clinical data and yet it hasn’t appeared and you just wonder if, is that because in the real-life setting they are actually not quite as useful as we think they are?”