Notes on DeepSeek

Original link: https://twitter.com/NikoMcCarty/status/2064686557400100884

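Founded by Liang Wenfeng in 2023, DeepSeek is a lean, low-profile startup headquartered in Hangzhou with 300 employees. Despite its considerable influence, most notably the R1 model released in January 2025, the company has stayed out of the public eye and keeps a lower profile than Western competitors such as Anthropic.

During a recent site visit, the team came across as young and energetic, focused entirely on technical execution rather than long-term "singularity" speculation. Unlike Western labs preoccupied with AGI (artificial general intelligence) safety, DeepSeek is mainly concerned with AI's effect on youth employment. Their development approach is pragmatic: they do no formal red teaming and are content to remain about six months behind U.S. companies.

With the Chinese government focused on practical applications rather than existential risk, DeepSeek regards R1 as its crowning achievement. The team has little interest in rapid expansion or an exit and prefers to keep its current size while competing with giants such as Alibaba and ByteDance. The company follows Western research closely and takes a mature, pragmatic, and measured stance in the global AI race.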

Hacker News discussion: **Notes on DeepSeek (twitter.com/nikomccarty)**
18 points | posted by vinhnx 31 minutes ago | 3 comments

**cmrdporcupine** 7 minutes ago
"As a whole, China seems to treat AI as just another technology, rather than as some kind of singularity moment."
This is a refreshing perspective.

**cmrdporcupine** 7 minutes ago
https://xcancel.com/NikoMcCarty/status/2064686557400100884

**dude250711** 6 minutes ago
"Their Head of Infrastructure, in particular, was young; maybe 30 years old and apparently one of the best AI buildout and energy experts in the country."
A buildout expert or a distillation expert?

Original text

Notes on DeepSeek:

We visited the company HQ last Tuesday. It was founded in 2023 by Liang Wenfeng and operated out of his hedge fund, High-Flyer, until somewhat recently. The company released their R1 model in January 2025, so it was interesting to see what they’ve been doing.

The company is located in an unmarked, 12-story building in Hangzhou. There is no DeepSeek branding visible from the street or lobby. I asked why this is, and the team demurred and said, “Well, there are many companies in this building, and we are not special.” They want to keep a low profile.

We met with their Head of Data and Head of Infrastructure. The company only has 300 employees. They are at least an order of magnitude smaller than Anthropic, and don’t care to scale further just yet.

Their Head of Infrastructure, in particular, was young; maybe 30 years old and apparently one of the best AI buildout and energy experts in the country. (We briefly walked through the labs, and everybody seemed young. There was a lot of discussion; it felt like an exciting and energetic place.)

Lots of competition is coming from Alibaba (Qwen), ByteDance, and Moonshot (Kimi). People in China seem to mostly use Kimi or DeepSeek. Young people use VPNs to access Claude, though Anthropic has blockers around usage in China that make it difficult.

Poaching between groups is common, just like in the U.S. DeepSeek has a reputation as being really smart and “cool,” maybe similar to Anthropic.

Big labs are mostly in Beijing, near Tsinghua and Peking University, with Hangzhou as the main exception (DeepSeek and Alibaba/Qwen are there).

The DeepSeek team reads Western AI writers. They listen to Dwarkesh and read Gwern. The people we met with said they had never met any employees from Anthropic.

They were not at all concerned with some kind of hostile / AGI takeover scenario. They kept bringing up job loss (unemployment is already high amongst youth in China) as their main concern.

When we asked if they do red teaming on their models, they said no. In China, AI models are not regulated directly; the government instead has restrictions on how those models can be used in software, services, etc.

As a whole, China seems to treat AI as just another technology, rather than as some kind of singularity moment. National attention is still on basic needs and infrastructure buildouts, and on providing more medicines for people. The “dreams of singularity” seem like a luxury or distant consideration.

We asked the DeepSeek team: “What has the highlight been so far? What are your plans for an exit?” And they said that their highlight and great achievement was R1. They did not gesture at a future model or vision, but rather seemed proudest of what they’ve already done.

They are content for now to remain ~6 months behind U.S. companies while maintaining a lower profile and team size.
