AI as a Quick Medical Reference

One of the inevitabilities in medicine is the integration of artificial intelligence (AI) into clinical practice. As of this writing, full integration has not yet happened (although it is in progress), but the accessibility and ease of use of ChatGPT and other large language models have given many of us a taste of what’s to come. Recent studies have highlighted the potential of ChatGPT as a resource for medical knowledge, demonstrating that it can match or even surpass the average trainee on medical exams. This opens up a vast array of possibilities for using AI as an assistive tool for clinicians.

ChatGPT in Clinical Decision-Making

In one recent study, investigators examined ChatGPT’s performance in answering clinical questions across a full clinical workflow (1). ChatGPT achieved an overall accuracy of 72%. This is notable considering the complexity and variability of clinical scenarios. The ability of ChatGPT to process and provide information relevant to a wide range of medical queries underscores its potential as a valuable resource for healthcare professionals.

Utilization in Clinical Workflows

ChatGPT can be integrated into various stages of the clinical workflow. Physicians and other healthcare providers can pose medical questions to the AI, ranging from queries about disease management to drug interactions and diagnostic approaches. This could be particularly beneficial in situations where quick reference is needed and you don’t have the time and energy to look at ten different references.

Here are some example prompts showing how you can get the chatbot to help you in your clinical workflow:

Prompt:

How do you diagnose nephrogenic insipidus? I’m a physician so your response should be at my level.

Output:

Nephrogenic diabetes insipidus (NDI) is diagnosed through a combination of clinical assessment, laboratory tests, and sometimes imaging studies. The approach includes differentiating NDI from other forms of polyuria, particularly central diabetes insipidus and primary polydipsia. Here’s a structured approach:
  1. Clinical Assessment: Begin with a detailed patient history and physical examination. Key points include:
     - Symptoms: Polyuria, polydipsia, nocturia, and a history of dehydration.
     - Medical history: Look for factors like medication use (especially lithium or demeclocycline), electrolyte disorders, renal disease, or a history of obstructive uropathy.
  2. Laboratory Tests: Initial tests generally include:
     - Urine osmolality: NDI typically shows a persistently low urine osmolality (<300 mOsm/kg) despite dehydration.
     - Serum osmolality: Elevated (>295 mOsm/kg) indicating dehydration.
     - Electrolytes: Hypernatremia may be present if the water loss is not adequately compensated by intake.
     - Urine specific gravity: Generally low (<1.005) in NDI.
  3. Water Deprivation Test: This is the cornerstone of diagnosing diabetes insipidus. It helps differentiate between NDI, central diabetes insipidus, and primary polydipsia.
     - The patient is deprived of water for a period (usually overnight), with monitoring of urine output, urine osmolality, and serum osmolality.
     - In NDI, despite dehydration, the kidneys are unable to concentrate urine.
  4. Desmopressin (DDAVP) Trial: After water deprivation, administration of desmopressin helps differentiate central from nephrogenic diabetes insipidus.
     - In NDI, desmopressin does not increase urine osmolality or decrease urine output significantly, as the kidneys do not respond to ADH.
  5. Imaging Studies: While not routinely required, renal imaging (like ultrasound) may be useful in certain cases to rule out obstructive uropathy or chronic kidney disease.
  6. Genetic Testing: In cases where a hereditary form of NDI is suspected, particularly in children, genetic testing may be indicated.
  7. Response to Therapy: Sometimes empirical therapy with a thiazide diuretic, amiloride, or a low-salt diet can be both diagnostic and therapeutic, especially in mild or partial NDI.
Asking the chatbot for help with a workup.

Prompt:

can you generate differential diagnoses for the case below delimited by triple quotations. Please list them, and provide a few words why they are on your list. I’m a physician so your tone should be professional, and it is OK to use medical jargon.

“””37m w htn, etoh use d/o, pw shortness of breath x 3 wks. sx started a week after he viral urti. sob progressed w exertional dyspnea and orthopnea. w/u shows bilateral retic infiltrates on cxr. bnp 5000. trop neg. ekg +tachycardia. wbc normal. blood and urine cx ngtd. echo shows reduced ef 30%.”””

Output:

  1. Acute Heart Failure/Pulmonary Edema: Elevated BNP and reduced EF suggest heart failure. The history of hypertension and alcohol use disorder can contribute to cardiomyopathy leading to heart failure.
  2. Alcoholic Cardiomyopathy: History of alcohol use disorder and reduced EF. Alcoholic cardiomyopathy is a type of dilated cardiomyopathy caused by excessive alcohol consumption.
  3. Viral Myocarditis: Presentation following a viral upper respiratory tract infection (URTI), with reduced EF and normal troponin, could be consistent with myocarditis. This condition can lead to dilated cardiomyopathy and heart failure.
  4. Non-Ischemic Cardiomyopathy: Reduced EF with negative troponin and a history of hypertension indicates a non-ischemic cause of cardiomyopathy, which could be exacerbated by alcohol use.
  5. Acute Respiratory Distress Syndrome (ARDS): Bilateral infiltrates and hypoxemia following a viral URTI could be indicative of ARDS, although this is less likely given the reduced EF and elevated BNP.
  6. Pulmonary Embolism: While the normal troponin and tachycardia might suggest this, the presence of reduced EF and high BNP leans the diagnosis away from PE. However, PE can sometimes present with heart strain and should be considered.
  7. Hypertensive Heart Disease with Acute Decompensation: Chronic hypertension can lead to left ventricular hypertrophy and heart failure. The acute decompensation could present with symptoms as described.
  8. Toxic Metabolic Cardiomyopathy: Considering the alcohol use disorder, a toxic metabolic effect on the myocardium is possible, leading to reduced EF and heart failure symptoms.
  9. Infective Endocarditis: Less likely given the normal WBC count and negative blood cultures, but it can present with heart failure symptoms and needs to be considered, especially if there is a history of intravenous drug use.
  10. Pneumonia with Secondary Heart Failure: Pneumonia, particularly viral, could lead to increased cardiac stress, resulting in heart failure in a patient with underlying heart disease, although the clinical picture leans more towards a primary cardiac issue.
Asking the chatbot to generate differential diagnoses.
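
If you reuse the differential-diagnosis prompt often, the delimiter pattern shown above can be assembled programmatically. The following is a minimal Python sketch under stated assumptions: the helper name `build_ddx_prompt` is hypothetical, the instruction wording is adapted from the prompt above, and the function only builds the prompt text (it does not call any chatbot).

```python
DELIMITER = '"""'  # triple quotation marks, as in the prompt above


def build_ddx_prompt(case_text: str) -> str:
    """Wrap a free-text case in triple quotes and prepend the instructions.

    Hypothetical helper for illustration; it does not contact any chatbot.
    """
    case_text = case_text.strip()
    if not case_text:
        raise ValueError("case_text must not be empty")
    if DELIMITER in case_text:
        # A delimiter inside the case would make its boundaries ambiguous.
        raise ValueError("case_text must not contain the delimiter")
    instructions = (
        "Can you generate differential diagnoses for the case below, "
        "delimited by triple quotations? Please list them and give a few "
        "words on why each is on your list. I'm a physician, so your tone "
        "should be professional, and it is OK to use medical jargon."
    )
    return f"{instructions}\n\n{DELIMITER}{case_text}{DELIMITER}"


# Shortened version of the case above, used only as sample input.
prompt = build_ddx_prompt(
    "37m w htn, etoh use d/o, pw shortness of breath x 3 wks. "
    "echo shows reduced ef 30%."
)
print(prompt)
```

Keeping the instructions and the case text separate makes it easy to swap in a new case without retyping the request, and the delimiter check guards against a case note that would break the boundary.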

However, it is crucial to approach this tool with a critical eye. Despite its advanced capabilities, ChatGPT is not infallible. It can generate responses that are inaccurate or misleading, a phenomenon known in AI parlance as “hallucination.” Fortunately, such hallucinations are less frequent with GPT-4, but they still occur.

Real-time Access to the Internet

One of the new advanced features added to ChatGPT Plus is the “browsing feature”, which refers to the ability of the chatbot to access and interact with the internet in real time. Remember, GPT-4 has a knowledge cutoff, but this browsing feature gives it access to current information on the internet. This allows the chatbot to:

  1. Retrieve Current Information: Unlike previous versions, ChatGPT with browsing can pull up-to-date information from the web. This means it can access recent news, articles, and data that are not part of its pre-existing training data set.
  2. Search for Specific Details: It can perform targeted searches to find specific information, answer queries more accurately, and provide recent sources.
  3. Summarize and Cite Web Content: The AI can read and summarize content from web pages, providing concise overviews and citing sources, which adds credibility and depth to its responses.
  4. Interact with Web Interfaces: It can interact with web interfaces, fill out forms, or perform similar actions, although this functionality may vary.

Ensuring Accuracy and Ways to Mitigate Hallucinations

To mitigate the risk of hallucinations, providers are advised to employ some prompt engineering techniques. This involves framing questions to the AI in a manner that encourages the most accurate and reliable responses. For instance, instructing ChatGPT to “only provide responses if 100% certain” may reduce the likelihood of receiving incorrect information. We recommend employing the CO-STAR framework that we introduced in our Prompt Engineering article to ensure that you get appropriate and deliberate responses. Avoid asking overly complex questions, as these could confuse the chatbot. If the question has multiple parts, break it down into steps. Refine your prompt until you are satisfied with the output (iterative refinement).

Worse: recommend workup for neutropenic fever
Better: recommend workup for neutropenic fever in a bone marrow transplant patient. Initial blood cultures are already negative, and pt is not responding to antibiotics. cxr is negative.

Worse: generate differential diagnoses for fever in a returned traveler
Better: generate differential diagnoses for fever in a 45m who recently returned from travel to Nigeria 2 wks ago. He currently lives in the midwest with his family. No one else is sick. No other symptoms. He has 2 cats that he recently acquired.

Tip: provide as many details as possible to make sure that you get the best response for your needs.
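
The prompt-engineering advice above (a structured framework, detailed context, and breaking a multi-part question into steps) can be sketched in a few lines of Python. This is an illustrative sketch only: the class `CoStarPrompt` and the function `split_into_steps` are hypothetical names, and the six fields assume the commonly described CO-STAR breakdown (Context, Objective, Style, Tone, Audience, Response), which may differ in detail from the version in the Prompt Engineering article.

```python
from dataclasses import dataclass


@dataclass
class CoStarPrompt:
    """Hypothetical container for the six assumed CO-STAR sections."""

    context: str
    objective: str
    style: str
    tone: str
    audience: str
    response: str

    def render(self) -> str:
        """Join the sections into one prompt, each under a labeled header."""
        sections = [
            ("CONTEXT", self.context),
            ("OBJECTIVE", self.objective),
            ("STYLE", self.style),
            ("TONE", self.tone),
            ("AUDIENCE", self.audience),
            ("RESPONSE", self.response),
        ]
        return "\n\n".join(f"# {name} #\n{text.strip()}" for name, text in sections)


def split_into_steps(steps: list[str]) -> list[str]:
    """Turn a multi-part question into one numbered prompt per step."""
    total = len(steps)
    return [f"Step {i} of {total}: {s.strip()}" for i, s in enumerate(steps, start=1)]


# Sample content based on the "Better" neutropenic fever prompt above.
example = CoStarPrompt(
    context=(
        "Bone marrow transplant patient with neutropenic fever. Initial blood "
        "cultures are negative, the patient is not responding to antibiotics, "
        "and the chest X-ray is negative."
    ),
    objective="Recommend the next steps in the workup.",
    style="Concise and structured.",
    tone="Professional; medical jargon is OK.",
    audience="A physician.",
    response="A numbered list. Say so explicitly if you are not certain.",
)
print(example.render())
print(split_into_steps(["List likely causes.", "Recommend tests for each cause."]))
```

Filling in each field forces you to supply the details that separate the “Better” prompts from the “Worse” ones, and sending the steps one at a time supports iterative refinement.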

Hallucinations also tend to be less frequent when the AI model is trained on more data, so newer LLMs are expected to hallucinate less.

Nonetheless, the most crucial step in employing ChatGPT in a clinical setting is verification. Every piece of information provided by the AI should be cross-checked with reliable medical sources or discussed with experienced colleagues. In fact, you should really only be using an AI chatbot as a medical reference if you have the capacity to verify information. Otherwise, using it would carry some risks. This step is indispensable in ensuring patient safety and maintaining high standards of care.

You can also customize ChatGPT yourself by creating your own GPT. This way you can steer the output toward the files that you upload to it. We will explore this feature in a later article.

The Future of AI in Medicine

The integration of AI like ChatGPT in medicine is still in its infancy. As technology evolves, so will the capabilities and accuracy of these tools. It is essential for the medical community to stay abreast of these advancements and to continually assess the utility and limitations of AI in clinical practice.

Generative AI holds significant promise as a support tool in the medical field. Its ability to provide quick, relevant information can be a valuable asset to clinicians. However, it is imperative to remember that AI should supplement, not replace, the clinical judgment of healthcare professionals. Ensuring the accuracy of information through vigilant verification remains a cornerstone of utilizing AI in medicine. As we move forward, the integration of AI tools like ChatGPT in healthcare will undoubtedly evolve, offering new opportunities and challenges in enhancing patient care.

Reference:

  1. Rao A, Pang M, Kim J, et al. Assessing the utility of ChatGPT throughout the entire clinical workflow: development and usability study. J Med Internet Res 2023;25:e48659.
