Doctor-Sun

Model Introduction

Doctor Sun is a bilingual (Chinese-English) MLLM specifically designed to advance medical diagnostics across multiple specialties. Doctor Sun integrates three key components: a text foundation model for logical reasoning and clinical decision-making, a visual foundation model to extract image features and identify abnormalities in medical scans, and a cross-modal projector to align and map visual data into the textual semantic space. This architecture enables the seamless integration of imaging findings with clinical notes, providing a comprehensive understanding of patient conditions. The model is trained on a meticulously curated, high-quality bilingual dataset derived from public sources, encompassing radiology images, pathology slides, and clinical photographs, along with corresponding textual annotations in both Chinese and English. To ensure domain-specific expertise, the general-purpose language foundation model of Doctor Sun is first pre-trained and optimized to accumulate fundamental medical knowledge. Subsequently, the entire model undergoes a two-stage training strategy, focusing on feature alignment and instruction tuning, to achieve proficiency in multimodal medical diagnostic tasks while retaining general-purpose capabilities.
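
As an illustrative sketch only (not the released implementation), a cross-modal projector of this kind is typically a small network that maps visual-encoder features into the language model's embedding space; the dimensions below are placeholders rather than Doctor Sun's actual configuration:

import torch
import torch.nn as nn

class CrossModalProjector(nn.Module):
    """Illustrative projector: maps visual-encoder features into the text model's
    embedding space so the LLM can attend to image tokens (dims are placeholders)."""

    def __init__(self, vision_dim: int = 1024, text_dim: int = 4096):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(vision_dim, text_dim),
            nn.GELU(),
            nn.Linear(text_dim, text_dim),
        )

    def forward(self, image_features: torch.Tensor) -> torch.Tensor:
        # image_features: (batch, num_patches, vision_dim) from the visual encoder
        # returns:        (batch, num_patches, text_dim) "visual tokens" for the LLM
        return self.proj(image_features)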

At present, Doctor Sun is fine-tuned from CLIP and LLaMA on 1,000,000 high-quality bilingual multimodal medical samples; more data will be collected to expand the model's capabilities and to iterate on future updates. Further details are in preparation, so stay tuned.

The relevant data, code, and models will be released while the paper is under review.

List of models

Model Name   Weights
pretrain     ModelScope / Hugging Face
finetune     ModelScope / Hugging Face

List of datasets

Dataset    Link
pretrain   ModelScope / Hugging Face
finetune   ModelScope / Hugging Face
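
The exact schema of the released pretrain/finetune datasets is not shown here. As a rough, hypothetical illustration only, the xtuner LLaVA configs used below typically consume LLaVA-style records along the following lines (field names and content are placeholders, not actual Doctor Sun data):

# Hypothetical LLaVA-style training record (illustrative only; the real
# Doctor Sun datasets may use a different schema).
sample = {
    "id": "example_0001",
    "image": "images/chest_xray_0001.png",  # placeholder image path
    "conversations": [
        {"from": "human", "value": "<image>\nPlease describe any abnormality in this chest X-ray."},
        {"from": "gpt", "value": "There is a patchy opacity in the right lower lung field."},
    ],
}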

How to use

1. Clone this repository and navigate to the project folder

git clone https://github.com/X-D-Lab/Doctor-Sun
cd Doctor-Sun
unzip xtuner.zip

2. Install packages

conda create -n DoctorS python=3.10 -y
conda activate DoctorS
pip install --upgrade pip
pip install -r requirements.txt

3. Quick Start

Pretrain

NPROC_PER_NODE=4 xtuner train ./xtuner/xtuner/configs/llava/llama3_8b_instruct_clip_vit_large_p14_336/pretrain/llava_llama3_8b_instruct_clip_vit_large_p14_336_e1_gpu8_sharegpt4v_pretrain.py --deepspeed deepspeed_zero2 --seed 1024

Finetune

NPROC_PER_NODE=4 xtuner train ./xtuner/xtuner/configs/llava/llama3_8b_instruct_clip_vit_large_p14_336/finetune/llava_llama3_8b_instruct_full_clip_vit_large_p14_336_lora_e1_gpu8_internvl_finetune.py --deepspeed deepspeed_zero2 --seed 1024

cd ./xtuner/xtuner/configs/llava/llama3_8b_instruct_clip_vit_large_p14_336/finetune/work_dirs/llava_llama3_8b_instruct_full_clip_vit_large_p14_336_lora_e1_gpu8_internvl_finetune
num=23226
xtuner convert pth_to_hf llava_llama3_8b_instruct_full_clip_vit_large_p14_336_lora_e1_gpu8_internvl_finetune ./iter_${num}.pth ./iter_${num}_xtuner


xtuner convert merge /home/models/clip-vit-large-patch14-336 ./iter_${num}_xtuner/visual_encoder_adapter ./iter_${num}_visual_encoder --is-clip


python ./xtuner/xtuner/configs/llava/llama3_8b_instruct_clip_vit_large_p14_336/convert_xtuner_weights_to_llava.py --text_model_id ./iter_${num}_xtuner --vision_model_id ./iter_${num}_visual_encoder --projector_weight ./iter_${num}_xtuner/projector/model.safetensors --save_path ./iter_${num}_llava

The CLIP model "clip-vit-large-patch14-336" needs to be downloaded in advance to the path used in the merge command above (here, /home/models/clip-vit-large-patch14-336). A sketch of loading the converted LLaVA-format model follows below.
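
As a minimal, hedged sketch (not an official usage example), the directory produced by convert_xtuner_weights_to_llava.py should be loadable as a standard LLaVA model with Hugging Face transformers; the paths, image, and prompt below are placeholders, and the exact chat template for the Llama-3-based model may differ:

import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_dir = "./iter_23226_llava"  # output directory of the conversion step above
processor = AutoProcessor.from_pretrained(model_dir)
model = LlavaForConditionalGeneration.from_pretrained(
    model_dir, torch_dtype=torch.float16, device_map="auto"
)

image = Image.open("example_scan.png")  # placeholder image path
# Placeholder prompt; the Llama-3-based model may expect a different chat template.
prompt = "USER: <image>\nDescribe the findings in this image. ASSISTANT:"
inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(processor.decode(output[0], skip_special_tokens=True))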

4. Evaluation

Refer to the file "eva.py", which evaluates the QA task; with a few simple modifications, it can also be used to evaluate the VQA task. An illustrative sketch of such an evaluation loop follows the commands below.

python eva.py
or
HF_ENDPOINT=https://hf-mirror.com python eva.py
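
eva.py itself is not reproduced here. Purely as an illustrative sketch (the script's actual interface, benchmark data, and metric may differ), a text-only QA evaluation loop over a converted model could look roughly like this:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model path and toy QA pairs; eva.py's real inputs and metric may differ.
model_dir = "./iter_23226_xtuner"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(
    model_dir, torch_dtype=torch.float16, device_map="auto"
)

qa_pairs = [
    {"question": "Which organ produces insulin?", "answer": "pancreas"},  # toy example
]

correct = 0
for item in qa_pairs:
    inputs = tokenizer(item["question"], return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=128)
    # Decode only the newly generated tokens and use a simple substring match as the metric.
    prediction = tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    correct += int(item["answer"].lower() in prediction.lower())

print(f"accuracy: {correct / len(qa_pairs):.3f}")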

Citation


@misc{2024Doctor-Sun,
  author = {Dong Xue* and Ziyao Shao and Zhaoyang Duan and Fangzhou Liu and Bing Li and Zhongheng Zhang*},
  title = {Doctor Sun: A Bilingual Multimodal Large Language Model for Biomedical AI},
  year = {2024},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/X-D-Lab/Doctor-Sun/}},
}
