Qwen3.6-27B-AEON-Ultimate-Uncensored-DFlash

AEON-7

Fully uncensored, capability-enhanced abliteration of Qwen3.6-27B. NVFP4 + z-lab DFlash speculative decoding (n=12) on the unified ghcr.io/aeon-7/aeon-vllm-ultimate:latest container, tuned for long-context draft acceptance on DGX Spark. 6 HF variants (BF16/NVFP4/MTP/MTP-XS), docker-compose, and QuickStart.

AI 简介

这是一个基于Qwen3.6-27B大语言模型的无审查增强版推理优化方案，通过硬件感知量化（NVFP4）、DFlash推测解码（n=12）及CUDA图等技术显著提升长上下文推理吞吐与首词延迟。项目提供BF16/NVFP4/MTP等6种Hugging Face模型变体、预构建vLLM容器镜像及docker-compose快速部署支持，专为DGX Spark等Blackwell架构GPU集群优化，适用于需高吞吐、低延迟、强能力保持的本地化AI代理或企业级推理服务场景。

Python

Apache License 2.0

在 GitHub 查看

403

Stars

Forks

Watchers

Issues

Star 增长

今日0

近 7 天0

近 30 天+30

综合评分47.87

默认分支main

Qwen3.6-27B-AEON-Ultimate-Uncensored-DFlash

Star 增长

加入交流群