Google opens access to AI models for medical imaging and speech, unveils MedGemma 1.5 and MedASR: All you need to know

Google has launched MedGemma 1.5 and MedASR, two open AI models for healthcare research. The tools focus on analysing medical images and transcribing clinical speech, offering researchers and developers flexible, community-driven medical AI solutions.

Google has launched two new artificial intelligence models focused on healthcare, MedGemma 1.5 and MedASR, as part of its expanding efforts in medical AI. Unlike some rivals that offer healthcare AI tools primarily as paid enterprise services, Google has opted for a more open approach by releasing both models publicly for the wider research and developer community.

MedGemma 1.5 targets medical images and text

MedGemma 1.5 is the latest version of Google’s medical vision-language model, built to analyse medical images alongside written information. The model can interpret scans, respond to questions related to visual medical data, and assist with a range of research-oriented tasks.

According to Google Research, the updated version brings improved multimodal reasoning and better performance when dealing with complex medical imagery. It is also designed to be more flexible, allowing researchers to fine-tune it for specialised datasets and specific study requirements.

The model supports multiple forms of medical imaging, including radiology scans and other clinically relevant visuals. Google said MedGemma 1.5 is intended for uses such as image-based question answering, report drafting, and structured data extraction. The company stressed that it is not meant to provide diagnoses or treatment advice and should only be used as a support tool in research and development settings.

MedASR focuses on clinical speech recognition

Alongside MedGemma 1.5, Google introduced MedASR, an automatic speech recognition model designed specifically for healthcare environments. MedASR is built to transcribe spoken clinical conversations into text, with particular attention to medical terminology, diverse accents, and the challenges of real-world clinical audio.

Google said the model aims to reduce transcription errors that often occur when general-purpose speech recognition systems are used in medical contexts. Potential use cases include transcribing doctor-patient discussions, creating clinical notes, and converting dictated reports into text.

The company added that MedASR can be adapted for different healthcare settings and fine-tuned to match specific clinical workflows or documentation standards.

Open access for developers and researchers

Google said all versions of MedGemma and MedASR are available through Hugging Face and the Vertex AI platform. Developers can also access documentation and tutorials via the MedGemma GitHub repository.

Key Takeaways

MedGemma 1.5 enhances analysis of medical images by integrating visual data with written information.
MedASR focuses on transcribing clinical conversations accurately, addressing the unique challenges of medical terminology.
Google has released both models openly, through Hugging Face and Vertex AI, so researchers and developers can adapt and build on them.
