
FastLM

We develop fast, lightweight language models for large-scale, distributed, parallel, and sparse scenarios.

Popular repositories

  1. tinyserve-vllm

    [ACM MM 2025 Oral] TinyServe: Query-Aware Page Allocation Optimization

    Python · 10 stars · 2 forks

  2. CSV-Decode

    CSV-Decode: Certifiable Sub-Vocabulary Decoding for Efficient Large Language Model Inference

    Python · 8 stars

  3. HSGM

    [ICPADS 2025 Oral, *SEM 2025 Oral] HSGM: Hierarchical Segment-Graph Memory for Scalable Long-Text Semantics

    Python · 7 stars

  4. SPI_VecDB

    [ICPADS 2025 Oral] Distributed Parallel Multi-Resolution Vector Search

    Go · 7 stars

  5. FastCache

    Forked from NoakLiu/FastCache-xDiT

    FastCache: Fast Caching for Diffusion Transformer Through Learnable Linear Approximation [Efficient ML Model]

    Python · 6 stars

  6. CXL-SpecKV

    [FPGA'26 Oral] CXL-SpecKV: A Disaggregated FPGA Speculative KV-Cache for Datacenter LLM Serving

    C++ · 6 stars · 1 fork
