HoneyBee has been officially published in npj Digital Medicine!
Tripathi, A., Waqas, A., Schabath, M.B. et al. HONeYBEE: enabling scalable multimodal AI in oncology through foundation model-driven embeddings. npj Digit. Med. 8, 622 (2025). https://doi.org/10.1038/s41746-025-02003-4
HoneyBee is a multimodal AI framework for oncology research and clinical applications. It integrates and processes diverse medical data types (clinical text, radiology images, pathology slides, and molecular data) through a unified, modular architecture. Built with scalability and extensibility in mind, HoneyBee helps researchers develop AI models for cancer diagnosis, prognosis, and treatment planning.
Warning
Alpha Release: This framework is currently in alpha. APIs may change, and some features are still under development.
- Multimodal data support: clinical text, radiology (DICOM/NIFTI), pathology (WSI), and molecular data
- 3-layer modular architecture: clean separation between loaders, processors, and embedding models
- Clinical NLP pipeline: OCR, cancer entity extraction, temporal parsing, and medical ontology mapping
- Whole Slide Image processing: tissue detection, patch extraction, stain normalization, and quality filtering
- State-of-the-art embedding models: GatorTron, BioBERT, PubMedBERT, UNI, REMEDIS, RadImageNet, and more
- Cross-modal integration: unified patient-level representations from multiple data modalities
- Survival analysis: Cox PH, Random Survival Forest, and DeepSurv
- Similar patient retrieval: find patients with matching clinical profiles
- Interactive visualization: t-SNE dashboards for embedding exploration
- GPU-accelerated: CuCIM backend for WSI processing with OpenSlide fallback
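To make the cross-modal integration and similar patient retrieval features more concrete, here is a minimal, library-independent sketch. It is not HoneyBee's API: the function names (`fuse_modalities`, `most_similar`), the choice of concatenation as the fusion step, and the toy vectors are all assumptions made for illustration. It uses only the Python standard library; real embeddings would normally be NumPy arrays or tensors.

```python
import math
from typing import Dict, List, Sequence, Tuple


def fuse_modalities(embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Concatenate per-modality vectors into one patient-level vector.

    Concatenation is an assumed fusion strategy, chosen for simplicity.
    """
    fused: List[float] = []
    for vec in embeddings:
        fused.extend(vec)
    return fused


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(
    query: Sequence[float],
    cohort: Dict[str, Sequence[float]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k cohort entries most similar to the query, best first."""
    scored = [(pid, cosine_similarity(query, vec)) for pid, vec in cohort.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy data: each patient has a 2-d "clinical" and a 2-d "pathology" embedding.
cohort = {
    "P1": fuse_modalities([[1.0, 0.0], [0.0, 1.0]]),
    "P2": fuse_modalities([[0.0, 1.0], [1.0, 0.0]]),
    "P3": fuse_modalities([[1.0, 0.0], [0.0, 0.5]]),
}
query = fuse_modalities([[1.0, 0.0], [0.0, 1.0]])
top_matches = most_similar(query, cohort, k=2)
print(top_matches)
```

With these toy vectors, `P1` ranks first (it is identical to the query) and `P3` second. Because cosine similarity ignores vector magnitude, a modality with larger embedding norms would dominate a concatenated vector unless each modality is normalized first.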
# Ubuntu/Debian
sudo apt-get install -y openslide-tools tesseract-ocr
# macOS
brew install openslide tesseract

pip install honeybee-ml
python -c "import nltk; nltk.download('punkt'); nltk.download('punkt_tab')"

| Extra | Command | Includes |
|---|---|---|
| Clinical | `pip install honeybee-ml[clinical]` | NLP, OCR, and text processing dependencies |
| Pathology | `pip install honeybee-ml[pathology]` | WSI loading and image processing |
| Molecular | `pip install honeybee-ml[molecular]` | Genomics and expression data support |
| All | `pip install honeybee-ml[all]` | Everything above |
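After installing, it can help to confirm that the system tools and Python packages are visible to your interpreter. The following is a small sketch using only the standard library. The helper name `check_environment` is hypothetical, and the import name `honeybee` for the `honeybee-ml` package is an assumption; adjust the names to match your installation.

```python
import importlib.util
import shutil
from typing import Dict, Iterable


def check_environment(binaries: Iterable[str], modules: Iterable[str]) -> Dict[str, bool]:
    """Report which executables are on PATH and which top-level modules are importable."""
    report: Dict[str, bool] = {}
    for name in binaries:
        # shutil.which returns None when the executable is not found on PATH.
        report[f"bin:{name}"] = shutil.which(name) is not None
    for name in modules:
        # find_spec returns None for a missing top-level module without importing it.
        report[f"module:{name}"] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    # "tesseract" is the executable provided by the tesseract packages above.
    # "honeybee" is an assumed import name for honeybee-ml.
    status = check_environment(binaries=["tesseract"], modules=["nltk", "honeybee"])
    for key, found in status.items():
        print(f"{key}: {'found' if found else 'MISSING'}")
```

A `MISSING` entry only means the name was not found in the current environment. It does not distinguish between a failed install and an inactive virtual environment.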
HoneyBee has been successfully applied to:
- Cancer Subtype Classification: Automated identification of cancer subtypes from multimodal data
- Survival Prediction: Risk stratification and outcome prediction for treatment planning
- Similar Patient Retrieval: Finding patients with similar clinical profiles for precision medicine
- Biomarker Discovery: Identifying multimodal patterns associated with treatment response
See the LICENSE file for details.
If you use HoneyBee in your research, please cite our paper:
Tripathi, A., Waqas, A., Schabath, M.B. et al. HONeYBEE: enabling scalable multimodal AI in
oncology through foundation model-driven embeddings. npj Digit. Med. 8, 622 (2025).
https://doi.org/10.1038/s41746-025-02003-4
