AEON-7

vllm-ultimate-dgx-spark

AEON vLLM Ultimate — vLLM 0.24.0 built from source for DGX Spark / Blackwell (sm_121a/GB10). One image serves the whole AEON fleet (Gemma-4-26B-A4B, Qwen3.6-27B, Qwen3.6-35B-A3B) with DFlash speculative decoding, NVFP4 weights + Triton NVFP4/FP8 KV cache, high-concurrency DFlash fix, UMA cudagraph clamp, FlashInfer 0.6.12.

Language: Python
Stars: 54
Forks: 6
Watchers: 52
Issues: 3

Star growth

Today: 0
Last 7 days: 0
Last 30 days: 0
Overall score: 35.54
Default branch: main

No README content is available yet.

The project may not have finished syncing; please check back later.