OpenAI released ChatGPT about five months ago and followed it up with GPT-4 in March. Since then, tech giants and startups alike have begun exploring the potential of LLMs and generative AI tools in healthcare, medicine, clinical settings, and research.
The healthcare industry is increasingly integrating LLMs due to their ability to enhance patient experiences and automate various management tasks, leading to improved efficiency and productivity. These models can simplify interactions between patients and healthcare organisations, providing medical advice, answering health questions, and automating certain tasks such as collecting clinical data and assessing diagnoses. The use of such chatbots in healthcare is expected to continue to grow due to ongoing investments in artificial intelligence and the benefits they provide.
Here’s a list of LLMs in healthcare.
MedPaLM is a large medical language model created by Google Research and DeepMind to answer medical questions. It was introduced in December 2022 and was benchmarked on MultiMedQA, which evaluates its ability to answer a variety of medical questions. The benchmark comprises datasets of multiple-choice questions as well as questions requiring longer-form answers, aimed at both medical professionals and lay users.
Additionally, a new dataset called HealthSearchQA was added to MultiMedQA. In March 2023, at its annual event “The Check Up”, Google announced Med-PaLM 2, the latest version of the model. It scored 85% on USMLE-style MedQA questions, performing at an “expert” doctor level and surpassing comparable models such as GPT-4. The team also evaluated the model against 14 criteria, including scientific factuality, precision, conformity with medical consensus, reasoning, bias, and potential for harm.
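Scoring a model on a multiple-choice benchmark like MedQA ultimately reduces to exact-match accuracy over the answer options. The sketch below is illustrative only; the record format and option letters are assumptions, not the actual MultiMedQA schema.

```python
# Minimal sketch of scoring a model on a multiple-choice medical QA
# benchmark such as MedQA. The data format here is illustrative and
# not the actual MultiMedQA schema.

def score_mcq(predictions, gold_answers):
    """Return the percentage of questions where the predicted option
    matches the gold answer."""
    if len(predictions) != len(gold_answers):
        raise ValueError("prediction/gold length mismatch")
    correct = sum(p == g for p, g in zip(predictions, gold_answers))
    return 100.0 * correct / len(gold_answers)

# Toy example: the model picks an option letter per USMLE-style question.
preds = ["C", "A", "B", "D", "C"]
gold  = ["C", "A", "D", "D", "C"]
print(f"accuracy: {score_mcq(preds, gold):.0f}%")  # 4 of 5 correct -> 80%
```

Reported figures like Med-PaLM 2’s 85% on MedQA are this kind of accuracy computed over the full test set, alongside the separate human-rater evaluation on the 14 qualitative criteria.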
Google identified significant gaps between the model’s answers and those of clinicians, and pledged to work with researchers and healthcare professionals to narrow them and improve healthcare services. In addition to MedPaLM, Google launched the PaLM API, allowing developers to build applications on top of Google’s large language model.
Doximity, a digital platform for medical professionals, has launched a beta version of a tool called DocsGPT.com that uses ChatGPT to help doctors with administrative tasks such as drafting and faxing pre-authorisation and appeal letters to insurers.
The tool was developed with the help of physicians and works with Doximity’s free fax service. It features a growing library of medical prompts that the AI-based writing assistant has been trained on, including drafts of letters to insurance companies, post-procedure instructions for patients, and treatment instructions for children with asthma. Doximity aims to help physicians be more productive and reduce administrative burden to prevent burnout.
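DocsGPT’s actual prompt library is proprietary, but the general shape of such a tool is straightforward: assemble a chat-style prompt from structured case details and send it to a chat-completion endpoint. The sketch below shows only the prompt-assembly step; all field names and wording are hypothetical.

```python
# Illustrative sketch only: DocsGPT's real prompts are not public.
# This shows the general shape of a chat-completion request that drafts
# a pre-authorisation letter; every name below is hypothetical.

def build_preauth_messages(patient_condition, procedure, insurer):
    """Assemble a chat-style prompt asking an LLM to draft a letter."""
    system = (
        "You are an assistant that drafts insurance pre-authorisation "
        "letters for physicians. Be concise and state the clinical "
        "rationale clearly."
    )
    user = (
        f"Draft a pre-authorisation letter to {insurer} requesting "
        f"{procedure} for a patient with {patient_condition}."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]

messages = build_preauth_messages(
    "severe persistent asthma", "a biologic therapy", "Acme Health"
)
# This payload would then be sent to a chat-completion API (e.g. OpenAI's);
# the network call is omitted here, and a physician would review and edit
# the draft before faxing it via Doximity.
print(messages[1]["content"])
```

The physician-in-the-loop step matters: the tool produces drafts, not final documents, which is why Doximity frames it as a writing assistant rather than an autonomous agent.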
Doximity says more than 80% of US doctors are on its network, and the platform is integrating with the scheduling-automation company Calendly to help physicians book appointments with colleagues.
The academic health centre at the University of Florida, UF Health, partnered with researchers from NVIDIA to develop GatorTron, an artificial intelligence transformer natural language processing model. The aim of this model is to help speed up research and medical decision-making by quickly and accurately extracting insights from large amounts of clinical data.
GatorTron, a clinical language model, was trained on a large dataset of clinical notes, PubMed articles, and Wikipedia, totalling over 90 billion words of text. The model was evaluated on five clinical NLP tasks and was found to outperform existing transformer models trained on biomedical and clinical narratives. Larger models with more parameters performed better than smaller ones, and increasing the amount of training data further improved performance. The corpus used to train GatorTron consisted of over 290 million clinical notes from over 2 million patients, covering a wide range of healthcare settings.
There are two versions of GatorTron: GatorTron-OG and GatorTron-S. GatorTron-OG is a pre-trained language model with 345 million parameters and a customised clinical vocabulary of 50,000 tokens. It aims to improve language understanding for downstream clinical tasks. GatorTron-S is a similar model that has been pre-trained on a larger dataset of 22 billion words and can generate synthetic, de-identified discharge summaries using text sampled from MIMIC-III.
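Like BERT, GatorTron is pre-trained with a masked-language-modelling objective: a fraction of the tokens in each note is hidden and the model learns to predict them from context. The toy sketch below illustrates just the masking step; the whitespace tokenisation, 15% mask rate, and the sample note are simplifications (real pipelines use a subword tokeniser with a custom clinical vocabulary over de-identified text).

```python
import random

# Toy illustration of the masked-language-modelling objective used to
# pre-train BERT-style clinical models such as GatorTron: roughly 15%
# of tokens in each note are hidden and become prediction targets.
# Whitespace tokenisation and the note text are simplified for clarity.

def mask_tokens(text, mask_rate=0.15, seed=0):
    tokens = text.split()
    rng = random.Random(seed)
    n_mask = max(1, round(mask_rate * len(tokens)))
    targets = {}  # position -> original token (what the model must predict)
    for i in rng.sample(range(len(tokens)), n_mask):
        targets[i] = tokens[i]
        tokens[i] = "[MASK]"
    return tokens, targets

note = ("Patient admitted with shortness of breath and chest pain ; "
        "started on heparin drip and scheduled for echo .")
masked, targets = mask_tokens(note)
print(" ".join(masked))
```

During pre-training the model sees billions of such masked notes; the custom 50,000-token clinical vocabulary in GatorTron-OG exists so that terms like “heparin” are not fragmented into uninformative subword pieces.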
Martin Shkreli, the former pharmaceutical executive notorious for being convicted of securities fraud, is now involved in a new venture focused on medical AI. His new chatbot called “Dr. Gupta” claims to be able to answer a wide range of medical questions and could one day become a “replacement for all health care information.” The chatbot offers five free questions and then requires a $20 monthly subscription fee for continued access. Users can also input personal medical information to receive more personalised and informative suggestions.
However, experts have criticised the chatbot as potentially dangerous and inaccurate, as it lacks the necessary medical training and credentials to make accurate diagnoses or legal recommendations. Additionally, the use of AI in medicine and law raises ethical concerns regarding patient privacy and the potential for bias.
Glass AI 2.0
Glass Health, a San Francisco-based medical knowledge management platform, has recently launched Glass AI, an LLM-based tool that generates a diagnosis or clinical plan based on symptoms. The tool is aimed at aiding clinicians in developing better diagnoses and clinical plans. Glass Health was founded in 2021 by Dereck Paul and Graham Ramsey, and it helps physicians learn medicine faster and leverage their knowledge to provide better patient care.
Glass AI is an experimental feature that helps clinicians generate a differential diagnosis (DDx) and draft a clinical plan. The tool is not meant for a general audience and is being developed for clinical use. In just two days of its beta launch, over 14,000 people used Glass AI to submit 25,700 queries. User feedback rated close to 84% of DDx outputs and 78% of clinical plan outputs as helpful. Accuracy ratings were lower, at 71% for DDx outputs and 68% for clinical plans. Even imperfect outputs were often useful, however, for example by suggesting a diagnosis the clinician had not considered or producing a draft plan the clinician could easily edit.
To prevent the AI from perpetuating harmful biases or stigma, Glass Health has deployed additional safeguards so that outputs do not reproduce biased human decisions or reflect historical and social inequities tied to sensitive variables such as gender, race, or sexual orientation.