Gemma 3n Architectural Innovations – Speculation and poking around in the model

Original link: https://old.reddit.com/r/LocalLLaMA/comments/1kuy45r/gemma_3n_architectural_innovations_speculation/

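The Hacker News discussion thread focuses on the architectural innovations in Google's Gemma 3n model, particularly its per-layer embeddings. User "impossiblefork" found the multi-embedding approach interesting and suggested implementing it on a toy problem, for example by modifying NanoGPT: change the embedding layer into a list of embeddings and integrate them into the forward pass. Another user, "limoce", asked about the "4x gated residual streams" architecture and looked for a related paper or technical report. "3abiton" pointed out that Google released the model's APK on GitHub rather than only linking to the Play Store, and considered the direct APK release an interesting choice.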
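
The NanoGPT-style change mentioned in the thread (one embedding table becomes a list of tables that feed into the forward pass) can be sketched without any framework. This is a toy under stated assumptions, not Gemma 3n's actual implementation: the class name `PerLayerEmbeddingToy`, the attribute `layer_embs`, and the choice to add each layer's token embedding to the hidden state are all hypothetical, and the transformer block is replaced by an identity stand-in.

```python
import random


class PerLayerEmbeddingToy:
    """Toy model: one input embedding table plus one extra table per layer."""

    def __init__(self, vocab_size, d_model, n_layers, seed=0):
        rng = random.Random(seed)

        def make_table():
            # vocab_size rows, each a d_model-dimensional vector
            return [[rng.gauss(0.0, 0.02) for _ in range(d_model)]
                    for _ in range(vocab_size)]

        self.tok_emb = make_table()
        # The "list of embeddings": one table per layer (hypothetical name).
        self.layer_embs = [make_table() for _ in range(n_layers)]

    def block(self, h):
        # Stand-in for a real transformer block (attention + MLP).
        # Identity here so the effect of the embeddings is easy to inspect.
        return h

    def forward(self, token_ids):
        # h[t] is the hidden vector at position t.
        h = [list(self.tok_emb[t]) for t in token_ids]
        for emb in self.layer_embs:
            # Assumed integration: add this layer's embedding of each token
            # to the hidden state before running the block.
            h = [[x + e for x, e in zip(vec, emb[t])]
                 for vec, t in zip(h, token_ids)]
            h = self.block(h)
        return h


if __name__ == "__main__":
    model = PerLayerEmbeddingToy(vocab_size=10, d_model=4, n_layers=3)
    out = model.forward([1, 2, 1])
    print(len(out), len(out[0]))  # 3 positions, 4 dimensions each
```

In a real NanoGPT experiment the tables would be trainable embedding modules and `block` would be the existing transformer block. Whether the per-layer vectors are added, gated, or projected before being mixed in is a design choice the thread does not specify.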
