Google Gemma 4 (Unsloth Edition)

Author: AIHGF
Published: April 6, 2026
Category: Large Language Models

https://.ai/docs/models/gemma-4

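Gemma 4, from Google DeepMind, is a family of state-of-the-art open-weight models. The family is multimodal (text, image, and audio) and includes both Mixture-of-Experts (MoE) and dense variants.

Unsloth now fully supports Gemma 4 for both inference and fine-tuning, with significantly lower VRAM usage.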

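1. Model Overview

Gemma 4 comes in several sizes to balance capability against hardware requirements:

E2B & E4B (Edge series): designed for mobile and edge devices; compact, with support for image and audio input.
26B A4B & 31B (MoE/Dense series): suited to workstations and consumer GPUs (such as the RTX series); strong at reasoning, coding, and complex agentic workflows.

Core features:

Built-in Thinking Mode: the model reasons step by step before answering (similar to OpenAI's o1 model).
Long Context: the Edge models support 128K tokens; the larger models support up to 256K tokens.
Multilingual support: native support for more than 140 languages.
Multimodal capability: handles text, image, and audio (E2B/E4B versions).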

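2. Hardware Requirements (Estimated Memory by Precision)

With the Unsloth-optimized builds, these models can run on relatively modest hardware:

Gemma-4-E2B / E4B:
- 4-bit quantized: about 5 GB RAM/VRAM (the E4B figure; E2B needs less, see the table below)
- 16-bit full precision: about 15 GB (again the E4B figure)
Gemma-4-26B-A4B:
- 4-bit quantized: about 18 GB VRAM
- 8-bit quantized: about 28 GB
Gemma-4-31B:
- 4-bit quantized: about 20 GB VRAM
- 8-bit quantized: about 34 GB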

Model	4-bit quantized (recommended)	8-bit quantized	16-bit full precision
Gemma-4-E2B	~3.5 GB	~5.5 GB	~9 GB
Gemma-4-E4B	~5 GB	~8 GB	~15 GB
Gemma-4-26B-A4B	~18 GB	~28 GB	~52 GB
Gemma-4-31B	~20 GB	~34 GB	~62 GB
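
To make the table easier to apply, here is a minimal sketch that lists which model and precision combinations fit a given memory budget. The figures are copied from the table above; the function names and the `headroom_gb` safety margin are hypothetical helpers introduced for illustration, not part of Unsloth. The second function is a rough weights-only estimate that assumes the number in a model's name (for example 31B) is its total parameter count.

```python
# Estimated memory in GB, copied from the table above (bits -> GB).
MEMORY_GB = {
    "Gemma-4-E2B":     {4: 3.5, 8: 5.5, 16: 9.0},
    "Gemma-4-E4B":     {4: 5.0, 8: 8.0, 16: 15.0},
    "Gemma-4-26B-A4B": {4: 18.0, 8: 28.0, 16: 52.0},
    "Gemma-4-31B":     {4: 20.0, 8: 34.0, 16: 62.0},
}


def options_that_fit(budget_gb, headroom_gb=0.0):
    """Return (model, bits, estimate_gb) tuples that fit within the budget.

    headroom_gb is a hypothetical margin reserved for anything the table
    estimate might not cover. Results are sorted largest estimate first.
    """
    usable = budget_gb - headroom_gb
    fits = [
        (model, bits, gb)
        for model, by_bits in MEMORY_GB.items()
        for bits, gb in by_bits.items()
        if gb <= usable
    ]
    return sorted(fits, key=lambda item: item[2], reverse=True)


def weights_only_gb(params_billion, bits):
    """Raw weight storage only: parameters x bits / 8 bytes, in GB (1e9 bytes).

    Assumes the model name states the total parameter count.
    """
    return params_billion * bits / 8


if __name__ == "__main__":
    # Everything that fits on a 24 GB card, largest first.
    for model, bits, gb in options_that_fit(24):
        print(f"{model} at {bits}-bit: ~{gb} GB")
    print(weights_only_gb(31, 16), weights_only_gb(31, 4))
```

Under that assumption, `weights_only_gb(31, 16)` and `weights_only_gb(26, 16)` reproduce the 16-bit column for the two larger models, while the 4-bit and 8-bit table entries sit above the weights-only figure. The source does not say what the difference covers; runtime overhead is one plausible explanation, so treat the table, not the formula, as the planning number.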

3. Recommended Parameter Settings

Gemma 4's max context is 128K for E2B / E4B and 256K for 26B A4B / 31B.

Gemma 4 context length specifications

Model series	Model	Max context length
Edge series (lightweight)	Gemma-4-E2B / E4B	128K tokens
MoE / Dense series (high performance)	Gemma-4-26B-A4B / 31B	256K tokens
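
As a small illustration of how these limits constrain a request, the sketch below computes how many tokens are left for generation after the prompt. It assumes "K" means 1024 tokens, which the source does not specify, and `max_new_tokens_allowed` is a hypothetical helper name rather than an Unsloth function.

```python
# Context limits from the table above, assuming 1K = 1024 tokens.
MAX_CONTEXT_TOKENS = {
    "Gemma-4-E2B": 128 * 1024,
    "Gemma-4-E4B": 128 * 1024,
    "Gemma-4-26B-A4B": 256 * 1024,
    "Gemma-4-31B": 256 * 1024,
}


def max_new_tokens_allowed(model, prompt_tokens):
    """Return how many tokens can still be generated for this prompt.

    Raises ValueError when the prompt alone fills or exceeds the window.
    """
    limit = MAX_CONTEXT_TOKENS[model]
    if prompt_tokens >= limit:
        raise ValueError(
            f"prompt of {prompt_tokens} tokens does not fit in {limit} for {model}"
        )
    return limit - prompt_tokens


if __name__ == "__main__":
    print(max_new_tokens_allowed("Gemma-4-E4B", 100_000))
    print(max_new_tokens_allowed("Gemma-4-31B", 200_000))
```

Thinking Mode output also consumes tokens from the same window, so in practice it is worth leaving generous room beyond the length of the final answer.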

Recommended parameters:

temperature = 1.0
top_p = 0.95
top_k = 64
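
To show what these three settings do, here is a minimal pure-Python sketch of one sampling step. It applies temperature, then top-k, then top-p, which is one common ordering; the actual inference engine may order or implement the filters differently, and `sample_token` is a hypothetical name used only for this illustration.

```python
import math
import random


def sample_token(logits, temperature=1.0, top_k=64, top_p=0.95, rng=random):
    """Pick one token index from raw logits using the recommended settings.

    temperature must be > 0. rng can be the random module or a
    random.Random instance (useful for reproducible runs).
    """
    # 1. Temperature: values below 1 sharpen the distribution, above 1 flatten it.
    scaled = [x / temperature for x in logits]

    # 2. Top-k: keep only the k highest-scoring candidates, best first.
    order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
    order = order[:top_k]

    # 3. Softmax over the survivors (subtract the max for numerical stability).
    peak = scaled[order[0]]
    weights = [math.exp(scaled[i] - peak) for i in order]
    total = sum(weights)
    probs = [w / total for w in weights]

    # 4. Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept_ids, kept_probs, cumulative = [], [], 0.0
    for idx, p in zip(order, probs):
        kept_ids.append(idx)
        kept_probs.append(p)
        cumulative += p
        if cumulative >= top_p:
            break

    # 5. Sample from what is left; random.choices normalizes the weights.
    return rng.choices(kept_ids, weights=kept_probs, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_token([2.0, 1.0, 0.5, -1.0], rng=rng))
```

With `temperature = 1.0` the logits are left unscaled, so in this sketch the shape of the output distribution is controlled entirely by the `top_k = 64` and `top_p = 0.95` cutoffs.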

Last modified: April 6, 2026

© Reposting permitted with proper attribution
