Doctor-Sun

Model Introduction

Doctor Sun is a bilingual (Chinese-English) MLLM specifically designed to advance medical diagnostics across multiple specialties. Doctor Sun integrates three key components: a text foundation model for logical reasoning and clinical decision-making, a visual foundation model to extract image features and identify abnormalities in medical scans, and a cross-modal projector to align and map visual data into the textual semantic space. This architecture enables the seamless integration of imaging findings with clinical notes, providing a comprehensive understanding of patient conditions. The model is trained on a meticulously curated, high-quality bilingual dataset derived from public sources, encompassing radiology images, pathology slides, and clinical photographs, along with corresponding textual annotations in both Chinese and English. To ensure domain-specific expertise, the general-purpose language foundation model of Doctor Sun is first pre-trained and optimized to accumulate fundamental medical knowledge. Subsequently, the entire model undergoes a two-stage training strategy, focusing on feature alignment and instruction tuning, to achieve proficiency in multimodal medical diagnostic tasks while retaining general-purpose capabilities.
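
As an illustrative sketch only (not the released implementation), a cross-modal projector of this kind is typically a small network that maps visual-encoder features into the language model's embedding space; the dimensions below are placeholders rather than Doctor Sun's actual configuration:

import torch
import torch.nn as nn

class CrossModalProjector(nn.Module):
    """Illustrative projector: maps visual-encoder features into the text model's
    embedding space so the LLM can attend to image tokens (dims are placeholders)."""

    def __init__(self, vision_dim: int = 1024, text_dim: int = 4096):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(vision_dim, text_dim),
            nn.GELU(),
            nn.Linear(text_dim, text_dim),
        )

    def forward(self, image_features: torch.Tensor) -> torch.Tensor:
        # image_features: (batch, num_patches, vision_dim) from the visual encoder
        # returns:        (batch, num_patches, text_dim) "visual tokens" for the LLM
        return self.proj(image_features)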

At present, Doctor Sun is fine-tuned from CLIP and LLaMA on 1,000,000 high-quality bilingual multimodal medical samples; more data will be collected to expand the model's capabilities and to iterate on future updates. Further details are in preparation, so stay tuned.

The relevant data, code, and models will be released while the paper is under review.

List of models

Model Name   Weights
pretrain     ModelScope / Hugging Face
finetune     ModelScope / Hugging Face

List of datasets

Dataset    Link
pretrain   ModelScope / Hugging Face
finetune   ModelScope / Hugging Face
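
The exact schema of the released pretrain/finetune datasets is not shown here. As a rough, hypothetical illustration only, the xtuner LLaVA configs used below typically consume LLaVA-style records along the following lines (field names and content are placeholders, not actual Doctor Sun data):

# Hypothetical LLaVA-style training record (illustrative only; the real
# Doctor Sun datasets may use a different schema).
sample = {
    "id": "example_0001",
    "image": "images/chest_xray_0001.png",  # placeholder image path
    "conversations": [
        {"from": "human", "value": "<image>\nPlease describe any abnormality in this chest X-ray."},
        {"from": "gpt", "value": "There is a patchy opacity in the right lower lung field."},
    ],
}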

How to use

1. Clone this repository and navigate to the project folder

git clone https://github.com/X-D-Lab/Doctor-Sun
cd Doctor-Sun
unzip xtuner.zip

2. Install packages

conda create -n DoctorS python=3.10 -y
conda activate DoctorS
pip install --upgrade pip
pip install -r requirements.txt

3. Quick Start

Pretrain

NPROC_PER_NODE=4 xtuner train ./xtuner/xtuner/configs/llava/llama3_8b_instruct_clip_vit_large_p14_336/pretrain/llava_llama3_8b_instruct_clip_vit_large_p14_336_e1_gpu8_sharegpt4v_pretrain.py --deepspeed deepspeed_zero2 --seed 1024

Finetune

NPROC_PER_NODE=4 xtuner train ./xtuner/xtuner/configs/llava/llama3_8b_instruct_clip_vit_large_p14_336/finetune/llava_llama3_8b_instruct_full_clip_vit_large_p14_336_lora_e1_gpu8_internvl_finetune.py --deepspeed deepspeed_zero2 --seed 1024

cd ./xtuner/xtuner/configs/llava/llama3_8b_instruct_clip_vit_large_p14_336/finetune/work_dirs/llava_llama3_8b_instruct_full_clip_vit_large_p14_336_lora_e1_gpu8_internvl_finetune
num=23226
xtuner convert pth_to_hf llava_llama3_8b_instruct_full_clip_vit_large_p14_336_lora_e1_gpu8_internvl_finetune ./iter_${num}.pth ./iter_${num}_xtuner


xtuner convert merge /home/models/clip-vit-large-patch14-336 ./iter_${num}_xtuner/visual_encoder_adapter ./iter_${num}_visual_encoder --is-clip


python ./xtuner/xtuner/configs/llava/llama3_8b_instruct_clip_vit_large_p14_336/convert_xtuner_weights_to_llava.py --text_model_id ./iter_${num}_xtuner --vision_model_id ./iter_${num}_visual_encoder --projector_weight ./iter_${num}_xtuner/projector/model.safetensors --save_path ./iter_${num}_llava

The CLIP model "clip-vit-large-patch14-336" needs to be downloaded in advance to the path used in the merge command above (here, /home/models/clip-vit-large-patch14-336). A sketch of loading the converted LLaVA-format model follows below.
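
As a minimal, hedged sketch (not an official usage example), the directory produced by convert_xtuner_weights_to_llava.py should be loadable as a standard LLaVA model with Hugging Face transformers; the paths, image, and prompt below are placeholders, and the exact chat template for the Llama-3-based model may differ:

import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_dir = "./iter_23226_llava"  # output directory of the conversion step above
processor = AutoProcessor.from_pretrained(model_dir)
model = LlavaForConditionalGeneration.from_pretrained(
    model_dir, torch_dtype=torch.float16, device_map="auto"
)

image = Image.open("example_scan.png")  # placeholder image path
# Placeholder prompt; the Llama-3-based model may expect a different chat template.
prompt = "USER: <image>\nDescribe the findings in this image. ASSISTANT:"
inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(processor.decode(output[0], skip_special_tokens=True))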

4. Evaluation

Refer to the file "eva.py", which evaluates the QA task; with a few simple modifications, it can also be used to evaluate the VQA task. An illustrative sketch of such an evaluation loop follows the commands below.

python eva.py
or
HF_ENDPOINT=https://hf-mirror.com python eva.py
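
eva.py itself is not reproduced here. Purely as an illustrative sketch (the script's actual interface, benchmark data, and metric may differ), a text-only QA evaluation loop over a converted model could look roughly like this:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model path and toy QA pairs; eva.py's real inputs and metric may differ.
model_dir = "./iter_23226_xtuner"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(
    model_dir, torch_dtype=torch.float16, device_map="auto"
)

qa_pairs = [
    {"question": "Which organ produces insulin?", "answer": "pancreas"},  # toy example
]

correct = 0
for item in qa_pairs:
    inputs = tokenizer(item["question"], return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=128)
    # Decode only the newly generated tokens and use a simple substring match as the metric.
    prediction = tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    correct += int(item["answer"].lower() in prediction.lower())

print(f"accuracy: {correct / len(qa_pairs):.3f}")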

Citation


@misc{2024Doctor-Sun,
  author = {Dong Xue* and Ziyao Shao and Zhaoyang Duan and Fangzhou Liu and Bing Li and Zhongheng Zhang*},
  title = {Doctor Sun: A Bilingual Multimodal Large Language Model for Biomedical AI},
  year = {2024},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/X-D-Lab/Doctor-Sun/}},
}
