Model card for keras-dots-ocr-finetuned-v1

This model is a fine-tuned version of the Dots OCR model, adapted specifically to extract drug names from images of prescriptions. It was trained on 6k prescription images whose drug names were annotated and verified by pharmacists.

Model Details

  • Model Type: Vision-Language Model
  • Base Model: Dots OCR
  • Fine-tuning Method: Supervised Fine-Tuning (SFT)
  • Training Data: 6k images of prescriptions with annotated drug names
  • Intended Use Case: Extracting drug names from prescription images for pharmacy and healthcare applications

Requirements

  • transformers == 4.51.3
  • torch == 2.7.0
  • torchvision < 0.23.0
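
The pins above can be checked before loading the model. The following is a minimal sketch, not part of the original card: the helper names (release_tuple, installed_version, check_requirements) are hypothetical, and it assumes the packages are installed under their usual distribution names. The usage code also imports qwen_vl_utils, which is distributed on PyPI as qwen-vl-utils; no version is listed for it above, so none is checked here.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def release_tuple(version_string):
    """Turn a version such as '2.7.0+cu126' into (2, 7, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    return tuple(int(part) for part in match.group(0).split(".")) if match else ()


def installed_version(name):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def check_requirements(found):
    """Compare a {package: version-or-None} mapping with the pins listed above."""
    problems = []
    # Exact pins: transformers == 4.51.3 and torch == 2.7.0.
    for name, pinned in (("transformers", (4, 51, 3)), ("torch", (2, 7, 0))):
        current = found.get(name)
        if current is None:
            problems.append(f"{name} is not installed")
        elif release_tuple(current) != pinned:
            problems.append(f"{name} {current} differs from the pinned version")
    # Upper bound: torchvision < 0.23.0.
    torchvision = found.get("torchvision")
    if torchvision is None:
        problems.append("torchvision is not installed")
    elif not release_tuple(torchvision) < (0, 23, 0):
        problems.append(f"torchvision {torchvision} is not below 0.23.0")
    return problems


if __name__ == "__main__":
    names = ("transformers", "torch", "torchvision")
    print(check_requirements({name: installed_version(name) for name in names}))
```

An empty list means the environment matches the requirements; otherwise each entry names the package to fix. Local build suffixes such as "+cu126" are ignored on purpose, since GPU builds of torch usually carry one.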

Usage

This model relies on a very specific prompt to extract drug names. Please do not change the prompt structure, even slightly, as doing so may lead to suboptimal results.

from transformers import AutoModelForCausalLM, AutoProcessor
import torch
from qwen_vl_utils import process_vision_info  # provided by the qwen-vl-utils package


model = AutoModelForCausalLM.from_pretrained(
    "KeraCare/keras-dots-ocr-finetuned-v1",
    trust_remote_code=True,
    attn_implementation="sdpa", # If the GPU supports flash attention, use "flash_attention_2"
    torch_dtype=torch.bfloat16,
).to("cuda")

processor = AutoProcessor.from_pretrained(
    "KeraCare/keras-dots-ocr-finetuned-v1",
    trust_remote_code=True,
)

# Prepare inputs
prompt = """
You are an assistant that extracts drug names from prescription images (primarily French, sometimes English), even if noisy, blurry, or with background clutter.

Rules: return only drug names. Normalize spelling to the closest valid INN/brand as written (e.g., preserve brand vs. generic identity and combination names), deduplicate, and sort the drug names in lexical order. Do not invent or map to equivalents; if none are found, return an empty list.
Strip accent marks and special characters, and convert to lowercase.

Output strict JSON only in the following format:

{
    "drug_names": [
        "<drug_name_1>",
        "<drug_name_2>",
    ]
}
"""
image_path = "path_to_your_image.jpg"

messages = [
    {"role": "system", "content": prompt},
    {"role": "user", "content": [{"type": "image", "image": image_path}]},
]

text = processor.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=False  # training on full dialogue
)

images, _ = process_vision_info(messages)

inputs = processor(
    text=[text],
    images=images,
    padding=True,
    return_tensors="pt",
).to("cuda")

# Generate the JSON answer; gradients are not needed at inference time.
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.0,
        top_p=0.9,
        repetition_penalty=1.1,
        eos_token_id=processor.tokenizer.eos_token_id,
    )
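
The snippet above stops at generate and does not show how to read the answer. The following is a minimal sketch of one way to turn the decoded text into a Python list and to check it against the rules stated in the prompt. It is not part of the original card: the helper names (parse_drug_names, strip_accents, follows_prompt_rules) are hypothetical, and it assumes the generated tokens have already been decoded to a string.

```python
import json
import re
import unicodedata


def parse_drug_names(generated_text):
    """Extract the "drug_names" list from decoded model text, or [] on failure."""
    # Keep only the outermost {...} span in case extra text surrounds the JSON.
    match = re.search(r"\{.*\}", generated_text, flags=re.DOTALL)
    if match is None:
        return []
    candidate = match.group(0)
    # The prompt's template shows a trailing comma, which json.loads rejects,
    # so drop any comma that directly precedes a closing bracket or brace.
    # (Assumption: drug names themselves never contain ",]" or ",}".)
    candidate = re.sub(r",\s*([\]}])", r"\1", candidate)
    try:
        payload = json.loads(candidate)
    except json.JSONDecodeError:
        return []
    names = payload.get("drug_names", [])
    if not isinstance(names, list):
        return []
    return [name for name in names if isinstance(name, str)]


def strip_accents(text):
    """Remove combining accent marks, e.g. 'é' becomes 'e'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def follows_prompt_rules(names):
    """Check that names are lowercase, accent-free, deduplicated and sorted."""
    return names == sorted(set(names)) and all(
        name == strip_accents(name).lower() for name in names
    )


sample = '{\n    "drug_names": [\n        "amoxicilline",\n        "doliprane",\n    ]\n}'
names = parse_drug_names(sample)
print(names, follows_prompt_rules(names))
```

With the variables from the usage code, the input text would typically come from decoding only the newly generated tokens, for example processor.batch_decode(outputs[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True)[0]. This is the usual transformers pattern rather than something the card specifies. The rule check does not test for "special characters", since the prompt does not define which characters count as special.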

Model Card Authors

Model Card Contact

For questions or issues regarding this model, please contact Mitiku Yohannes at kmitiku@kera.health.
