OFQ-LLM: An Outlier-Flexing Quantization Scheme for Efficient Low-Bit Large Language Model Acceleration
OFQ-LLM: Outlier-Flexing Quantization for Efficient Low-Bit Large Language Model Acceleration (Reading Summary)
Chinese title: OFQ-LLM:面向高效低比特大语言模型加速的离群值弹性量化方案
Authors: Gang Wang, Siqi Cai, Wenjie Li, Dongxu Lyu, Guang…
2026/7/5 5:30:01