David Zheng dzhengAP

vllm-project/vllm-omni Public

A framework for efficient model inference with omni-modality models

Python, 4.3k stars, 757 forks
vllm-project/llm-compressor Public

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

Python, 3.1k stars, 483 forks
On-Device-Agent-for-adaptive-display-optimization Public

We present a novel on-device hybrid agent combining LLMs with retrieval-augmented generation for real-time display optimization. The system achieves 92% accuracy with CoreML acceleration delivering…

Swift 1
ARS-Adaptive-Reasoning-Suppression-for-Efficient-Large-Reasoning-Language-Models Public

Adaptive Reasoning Suppression for Efficient Large Reasoning Language Models
distributed-inference-engine-nano-vLLM Public

Python
distributed-training-infra-demo-megatron Public

Python