
vllm-ultimate-dgx-spark
AEON-7
AEON vLLM Ultimate — vLLM 0.24.0 built from source for DGX Spark / Blackwell (sm_121a/GB10). One image serves the whole AEON fleet (Gemma-4-26B-A4B, Qwen3.6-27B, Qwen3.6-35B-A3B) with DFlash speculative decoding, NVFP4 weights + Triton NVFP4/FP8 KV cache, high-concurrency DFlash fix, UMA cudagraph clamp, FlashInfer 0.6.12.
Language: Python
Stars: 54
Forks: 6
Watchers: 52
Issues: 3
Star growth
Today: 0
Last 7 days: 0
Last 30 days: 0
Overall score: 35.54
Default branch: main