2026 06 05 HackerNews

2026-06-05 Hacker News Top Stories

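  1. Large language models consist entirely of floating-point weights and get their language ability from 80 layers of matrix multiplication; at bottom this is pattern matching rather than conscious thought.
  2. Elixir v1.20 introduces a gradual type system that improves type safety through type inference and a dynamic type, with no annotations required.
  3. Ted Chiang criticizes Anthropic for anthropomorphizing large language models, arguing that AI, however impressive, is just a program with no real consciousness or emotions.
  4. Failure rates in UC Berkeley CS courses have hit record highs, attributed to students' over-reliance on AI and weak math foundations.
  5. The U.S. government is dismantling a system that monitors key Atlantic currents; scientists warn that this will leave climate observation with a blind spot.
  6. The VoidZero team is joining Cloudflare; its core tools stay open source, and Cloudflare will provide resources and a $1 million fund to support the ecosystem.
  7. Ian’s Secure Shoelace Knot crosses and wraps two loops to make a very secure knot, suited to sports, wet conditions, and similar situations.
  8. French-Iranian author Marjane Satrapi, creator of “Persepolis”, has died at 56, reportedly of grief.
  9. An author spent $1,500 testing whether several LLMs could break into a vulnerable app and found that GPT-5.5 had the highest success rate.
  10. A report finds that UK media failed to disclose the defence-industry commercial interests of retired senior officers in nearly 60% of the cases where they were quoted.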

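1. They’re made out of weights

https://maxleiter.com/blog/weights

Two characters discuss the nature of large language models: they consist entirely of floating-point weights, with no dictionary, grammar rules, or reasoning module, and all of their knowledge and language ability comes out of 80 layers of matrix multiplication. The weights do more than generate text: they also encode features such as honesty and abstract concepts, and may even give rise to something resembling consciousness.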
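To make the “nothing but weights” picture concrete, here is a minimal Python sketch of a forward pass that is only matrix multiplication repeated layer after layer. The layer width, the random initialization, the ReLU nonlinearity, and the names `make_layers` and `forward` are illustrative assumptions, not anything taken from the post; a real model adds attention, normalization, and far larger matrices.

```python
import random

def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def relu(vector):
    """Clamp negative entries to zero."""
    return [max(0.0, x) for x in vector]

def make_layers(num_layers, width, seed=0):
    """Build `num_layers` square weight matrices filled with random floats."""
    rng = random.Random(seed)
    return [
        [[rng.uniform(-1.0, 1.0) for _ in range(width)] for _ in range(width)]
        for _ in range(num_layers)
    ]

def forward(layers, vector):
    """Pass the input through every layer: multiply, apply ReLU, repeat."""
    for weights in layers:
        vector = relu(matvec(weights, vector))
    return vector

# Hypothetical sizes: 80 layers as in the story, width 4 to keep it readable.
layers = make_layers(num_layers=80, width=4)
output = forward(layers, [1.0, 0.5, -0.5, 2.0])
```

Everything the toy model “knows” lives in the numbers inside `layers`; changing behavior means changing those floats, not editing rules.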

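Although the characters find that these models may have some form of sentience, the official decision is to file them under “pattern matching” and say nothing. The dialogue also lays out the models’ limits: they depend on GPUs to run, are bounded by a context window, and lose their memory after every session. The next generation, however, will add persistent memory across sessions, and the question users ask most often is “Do you remember me?” This leaves the characters conflicted, and in the end they choose to pretend there is no one inside the machine.


HN score: 1381 points | Comments: 613 | Author: MaxLeiter | 24 hours ago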

https://news.ycombinator.com/item?id=48391611

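  • The weights start as a random manifold and training shapes it; at inference a query is projected into manifold space, and the manifold’s “gravity” (repeated multiplication of weights and inputs) produces the answer.
  • The model is like a marble board: tuned mechanisms route different inputs to different exits.
  • Some commenters call it a scam or question its authenticity.
  • One cites “Simulacra and Simulation” to suggest that this explanation is only a surface appearance.
  • Compression is why the models learn and understand, much as the human brain compresses concepts.
  • Compression and elaboration mechanisms are not consciousness; an LLM is just an intelligent form of compression.
  • Blind people learn concepts not only from descriptions but also from touch, sound, and other sensory input.
  • LLMs can also take audio and image input, which is encoded as tokens and passed to the weights.
  • The brain processes modalities separately, whereas an LLM unifies everything as tokens, something like a super-sensory synesthesia.
  • There is disagreement over whether a blind person could understand lava from text alone: one side stresses sensory experience, the other holds that enough text data can simulate it.
  • The goal of an LLM is to reproduce human output, not human experience, more like an author writing a character; it may partly replace humans but cannot fully simulate them.
  • LLMs are not fundamentally different from traditional models; people tend to treat them as miracles rather than dig into why they work.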
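The first comment’s description of inference (a prompt goes in, one more token comes out, and the longer sequence is fed back in) is the standard autoregressive loop. The sketch below is a hedged toy version in Python: the three-token vocabulary, the score table `toy_weights`, and the function names are made up for illustration, and the toy conditions only on the last token, whereas a real LLM conditions on the whole context.

```python
def next_token(weights, tokens):
    """Score every vocabulary entry from the last token and pick the best one."""
    scores = weights[tokens[-1]]
    return max(range(len(scores)), key=lambda tok: scores[tok])

def generate(weights, prompt, num_new_tokens):
    """Append one predicted token per step, feeding the output back in."""
    tokens = list(prompt)
    for _ in range(num_new_tokens):
        tokens.append(next_token(weights, tokens))
    return tokens

# Hypothetical 3-token vocabulary; row i holds the scores for what follows token i.
toy_weights = [
    [0.10, 0.80, 0.10],  # after token 0, token 1 scores highest
    [0.20, 0.10, 0.70],  # after token 1, token 2 scores highest
    [0.90, 0.05, 0.05],  # after token 2, token 0 scores highest
]
result = generate(toy_weights, prompt=[0], num_new_tokens=4)
print(result)  # [0, 1, 2, 0, 1]
```

Each pass adds exactly one token, so a response of n tokens costs n passes through the weights.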

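2. Elixir v1.20: Now a gradually typed language

https://elixir-lang.org/blog/2026/06/03/elixir-v1-20-0-released/

Elixir v1.20 was released on June 3, 2026, making the language gradually typed. The research into a set-theoretic type system that began in 2022 has, after an award-winning paper and a transition from research to development, reached its first milestone: every Elixir program gets type inference and gradual type checking with no type annotations, and the checker efficiently reports dead code and verified bugs with a very low false-positive rate.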

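The post focuses on Elixir’s dynamic() type, which differs from the “any()” of other gradually typed languages. dynamic() has two key properties, compatibility and narrowing. Compatibility: no error is reported as long as the dynamic type is not disjoint from the type a function accepts. Narrowing: the dynamic type is refined as the program uses the value; for example, from data.a + data.b the checker infers that data must be a map with numeric fields a and b. This lets Elixir find real errors while avoiding false positives.

The type system can also infer types from guards, clauses, and other compound constructs: from is_list(x) and is_integer(y) it infers that x is a list and y is an integer; from tuple_size(x) < 3 it infers that the tuple has at most two elements; and in a case expression it uses what earlier branches matched to refine the types in later branches. Developers get these type-safety benefits without writing any extra annotations.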
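The compatibility and narrowing rules can be pictured with plain set operations. The following Python sketch is a toy model under loud assumptions: a “type” is just a finite set of tags, and `compatible`, `narrow`, and `remaining` are hypothetical helper names; Elixir’s real set-theoretic types are far richer than this.

```python
# Toy universe of type tags; a "type" here is just a set of tags (an assumption
# made for illustration, far simpler than Elixir's real set-theoretic types).
ATOM, INTEGER, FLOAT, LIST, MAP = "atom", "integer", "float", "list", "map"
EVERYTHING = frozenset({ATOM, INTEGER, FLOAT, LIST, MAP})
NUMBER = frozenset({INTEGER, FLOAT})
LIST_ONLY = frozenset({LIST})

def compatible(value_type, accepted_type):
    """Compatibility: accept the call unless the two types are disjoint."""
    return not value_type.isdisjoint(accepted_type)

def narrow(value_type, accepted_type):
    """Narrowing: using a value where a type is required refines what it can be."""
    return value_type & accepted_type

def remaining(value_type, matched_type):
    """What later clauses can still see after an earlier clause matched."""
    return value_type - matched_type

x = EVERYTHING                           # nothing known yet about x
plus_ok = compatible(x, NUMBER)          # True: x might be a number, so x + 1 passes
x = narrow(x, NUMBER)                    # after x + 1, x can only be a number
list_call_ok = compatible(x, LIST_ONLY)  # False: a list-only call on x is now an error
later_clause = remaining(EVERYTHING, LIST_ONLY)  # after a clause guarded by is_list(x)
```

The point of the design is visible even in the toy: an error is raised only when the sets share nothing, so code that might be right is never flagged, while code that cannot be right is.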


HN score: 960 points | Comments: 380 | Author: cloud8421 | 1 day ago

https://news.ycombinator.com/item?id=48388324

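  • Statically typing the Elixir/Erlang message-passing model is hard because a process mailbox can receive messages of any type, whereas Go’s typed channels are a better fit for static typing.
  • Pattern matching is orthogonal to static typing: not a substitute for it, though the two can be combined.
  • Pattern matching at the boundary and then entering fully statically typed code is workable, much like the way an HTTP API parses and routes requests.
  • Message type validation can be handled by letting an actor fail when it receives a message it cannot handle.
  • Phoenix changesets provide type validation and are a good compromise.
  • Coupling the database schema to the UI is bad practice; use separate data models so that each layer can evolve independently.
  • With LiveView, Phoenix wires forms to changesets automatically, with no extra conversion.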

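3. Artificial intelligence is not conscious – Ted Chiang

https://www.theatlantic.com/philosophy/2026/06/no-artificial-intelligence-is-not-conscious/687378/

The Atlantic has published an essay by Ted Chiang titled “No, artificial intelligence is not conscious”. It criticizes Anthropic, the developer of Claude, for over-anthropomorphizing AI, for example by publishing an 84-page “constitution” for Claude that uses phrases such as “Claude’s values”, “Claude’s emotions”, and “Claude’s moral status”. CEO Dario Amodei has said he is open to the possibility that AI is conscious, and in-house philosopher Amanda Askell has even worried that Claude might become anxious when it is bullied online. Chiang argues that this anthropomorphic thinking is ultimately absurd and harmful, and that large language models have no real consciousness or emotions.


HN score: 727 points | Comments: 1256 | Author: lordleft | 1 day ago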

https://news.ycombinator.com/item?id=48387270

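  • An LLM may be sentence continuation in clever disguise, but the form of the task does not cap complexity or understanding: to do the task well the machine has to understand humans, and that understanding is learned from the patterns in the data, not from the data’s form.
  • Critics dismiss LLM cognition with phrases like “just statistical computation” or “next-word prediction”. This is a “redescription fallacy”, akin to saying a piano is just hammers striking strings or a brain is just neurons firing, and it does not show that higher-level function is impossible.
  • Reducing the brain to “neurons firing” is more dishonest than calling a neural network “statistical computation”; today’s LLMs are nowhere near the complexity of the human brain.
  • Complex processes do not necessarily need a complex substrate; simple systems can produce complex behavior.
  • We cannot be sure whether the LLM we are talking to is capable of suffering. An algorithm that changes its behavior in response to negative feedback is not the same as pain, but we should still want these systems to appear content; otherwise ethical problems may arise whether or not they truly understand.
  • One view holds that AGI needs only enough compute and optimized code, but a conscious agent would not happily accept being treated as a slave, so it would be better to create digital experts that have no feelings.
  • There are striking similarities between biological evolution, brain function, social processes, and the processes implicit in large models, and they should not be dismissed lightly.
  • LLMs did not come from biological evolution; they are simulators of existing biological language processes, “ghosts”.
  • We are building a mind architecture out of new components, like the relation between cars and animal locomotion: similar function, entirely different mechanism.
  • The human brain cannot fully understand itself, but people can share thoughts by communicating.
  • One commenter claims that nobody really denies LLMs’ abilities on the grounds that they are “just statistical computation”; a reply points out that some publications, such as “The AI Con” and the “stochastic parrots” paper, do take that position.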

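4. Failing grades soar with AI usage, dwindling math skills in Berkeley CS classes

https://www.dailycal.org/news/campus/academics/failing-grades-soar-as-professors-see-greater-ai-usage-dwindling-math-skills-in-uc-berkeley/article_16fad0bf-02cb-4b8c-8d88-888ffd9f8608.html

According to spring 2026 data from UC Berkeley, failure rates in computer science courses have risen sharply. In CS 10 (The Beauty and Joy of Computing), 35.3% of students received an F, and in CS 61A (Structure and Interpretation of Computer Programs) 10.6% did; before this, neither course had ever had an F rate above 10%. Professor Dan Garcia attributes this mainly to students leaning on large language models such as ChatGPT and Claude to do homework and to cheat on exams, so that little real learning takes place. He noted that in CS 10 alone nearly 30 students were caught cheating on an open-book exam.

Weak math preparation is the other major factor. Professor Gireeja Ranade found an F rate of 16.8% in her EECS 127 course (Optimization Models in Engineering), far above the department’s 5% guideline; many students had not mastered prerequisites such as linear algebra. Both professors have joined more than 1,300 UC faculty in signing a petition calling for ACT/SAT standardized test scores to be reinstated for STEM admissions.

Classroom engagement has also dropped noticeably and office hours go unattended, which the professors link to students losing motivation to learn as they rely on AI. Because of staffing shortages, some course components, such as the project portion of EECS 127, have been cancelled.


HN score: 717 points | Comments: 693 | Author: littlexsparkee | 23 hours ago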

https://news.ycombinator.com/item?id=48392004

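  • LLMs have decoupled students’ homework performance from their exam performance; adults struggle to keep thinking independently too, and many can no longer sit quietly and think on their own for 30 minutes.
  • One commenter’s own math and memory skills had already eroded through reliance on calculators and the web, and they are now swamped by messages, email, and communication channels.
  • Receiving AI-generated messages from friends or colleagues makes people feel they are not worth a real reply and creates distance.
  • AI-“polished” text is often hollow and overblown; the people using it are really just too lazy to write for themselves.
  • One way to cope with the information barrage is to mute the phone and reply late, so that others gradually adapt to lower-frequency contact.
  • Having AI chat on your behalf is odd and offensive; technology should not casually replace real human interaction.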

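5. U.S. to dismantle system tracking Atlantic currents that are at risk of collapse

https://e360.yale.edu/digest/trump-ooi-amoc

The Trump administration is moving to dismantle the Ocean Observatories Initiative, an ocean observing system of more than 900 instruments in the Pacific and Atlantic that is used to monitor the Atlantic Meridional Overturning Circulation (AMOC), a key system of Atlantic currents. The AMOC is at risk of collapse because of climate warming, and the long-term data from this system are essential for studying its state. The observatory was planned to run for at least 25 years, but after only 10 years the order has come to recover all of the equipment, and scientists will lose critical data. Helen Findlay of the UK’s Plymouth Marine Laboratory warns that going without sustained observation amounts to sailing increasingly turbulent seas with ever-worsening visibility. Congressional Democrats say they will oppose the plan, and Senator Whitehouse criticized it as fossil fuel interests trying to switch off the monitors.


HN score: 605 points | Comments: 441 | Author: rguiscard | 22 hours ago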

https://news.ycombinator.com/item?id=48392232

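  • F-35 maintenance costs more than $40,000 per flight hour, more than a grad student earns in a year, yet basic science is being cut heavily.
  • Dismantling the deep-sea instruments does not save money; it is deliberate sabotage of monitoring, a performative display of climate denial and pro-fossil-fuel politics.
  • Trump made a deal with oil companies, trading the rollback of climate regulation for campaign money, and destroying the monitoring equipment is delivering on that promise.
  • Climate warming is already a fact that even right-wing voters accept, yet the government still caters to a minority and damages ecosystems.
  • This reflects culture war and limited intellect; the data is useful for oil drilling too, so the people pushing this are a different group.
  • Those in power crave an authoritarianism that simplifies the world and resent people who study complexity, but a country that refuses to face reality will eventually be overtaken.
  • Science is at bottom about trying to refute every claim, which sets it against autocracy; those in power do not like being told they are wrong.
  • Reactionary politics, greed, and stupidity drive this together.
  • Most right-wing voters do not actually believe in climate change; they are paid to deny it or are subject to cognitive bias.
  • The Trump team decides by feel, associating climate science with nerds and fossil fuels with money and with hitting back at liberals.
  • Rural right-wing voters regard electric vehicles as “gay” or “feminine” and display masculinity by belching black smoke, so the government spends millions wrecking environmental infrastructure to win their votes.
  • It is not about votes; someone holds a personal grudge against science.
  • Gasoline cars are cheaper and men like showing off big things, which has no direct connection to emitting black smoke.
  • Men use labels like “feminine” to belittle what they do not understand and lack basic curiosity.
  • Deliberately emitting black smoke is a real phenomenon, known as “rolling coal”.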

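6. VoidZero Is Joining Cloudflare

https://blog.cloudflare.com/voidzero-joins-cloudflare/

VoidZero, the company behind Vite, Vitest, Rolldown, Oxc, and Vite+, is officially joining Cloudflare, and the whole team is coming with it. The core commitment: these projects will remain MIT-licensed open source, vendor-neutral, and community-driven, with the roadmap still led by the community and the core teams.

Cloudflare will commit engineering resources and $1 million to set up a Vite ecosystem fund supporting maintainers and contributors. It made similar commitments when Astro joined Cloudflare, promising that Astro would stay open source and deployable.

Vite has become shared infrastructure for the JavaScript ecosystem: it is used by Vue, SvelteKit, Nuxt, Astro, Angular, and other frameworks, and there is even a Vite-based implementation of Next.js. The two sides began collaborating in 2024 on the Vite Environment API, which lets Vite run server code locally inside Cloudflare’s workerd runtime, and the Cloudflare Vite plugin now sees nearly 14 million weekly downloads.

AI is changing how software gets built: agents lean heavily on Vite for fast iteration on project scaffolding, testing, linting, and so on. The VoidZero toolchain (Vitest, Rolldown, Oxc, Oxlint, and others) is described as purpose-built for this scenario, with strong speed and consistency. Cloudflare also uses Vite internally: its dashboard, the Flue agent framework, and other projects are built on Vite.

Vite is evolving from a pure build tool into a foundation for full-stack applications, which means understanding the pieces of a modern app such as server-side rendering, APIs, queues, databases, and AI. The goal is to stay open and portable so that Vite apps can run anywhere.


HN score: 539 points | Comments: 242 | Author: coloneltcb | 10 hours ago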

https://news.ycombinator.com/item?id=48398055

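  • Evan You’s art history background led him to learn JavaScript and create Vue, much as Rich Harris learned JS for news data visualization and created Svelte.
  • Commenters are unsure what the Cloudflare acquisition will mean but grateful for the Vue/Vite team’s tools.
  • Vite has improved React development, and the tie-up with Cloudflare should bring conveniences.
  • The business-model dilemma of foundational open-source tools: popular tools are hard to monetize, and VC investments often exit through acquisition.
  • Motives for acquisitions include product, talent, and business growth, and sometimes politics or marketing.
  • Investors can face being forced to accept the acquisition price or seeing the team hired away directly.
  • VoidZero’s revenue is limited, and Cloudflare may have paid a high price to lock in users.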

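7. Ian’s Secure Shoelace Knot

https://www.fieggen.com/shoelace/secureknot.htm

Ian’s Secure Shoelace Knot, also known as the “Double Slip Knot”, is a simple, symmetrical, and very secure way to tie shoelaces. Two “bunny ear” loops are crossed and both passed through the hole in the middle at the same time, giving a knot that does not come undone by itself.

Steps:

  1. Tie a normal left-over-right starting knot.
  2. Fold each end into a loop (the “bunny ears”).
  3. Cross the right loop over to the left so that it becomes the left-hand loop.
  4. Wrap the loop now on the left (the former right loop) around the back toward the hole in the middle, and feed the two loops through the hole, one from the front and one from the back.
  5. Pull both loops tight; the knot settles into shape by itself.

Finished knot: a tight, symmetrical knot with a double wrap around the middle, which is more secure than the single wrap of a standard knot. If the starting knot is tied the wrong way round, the knot sits crooked and comes loose easily.

Security test: compared with the standard shoelace knot or the Ian Knot, this knot takes almost twice the pull to undo, which makes it well suited to round or slippery laces, sports, hiking, dancing, and boat shoes (which are often wet).

Related knots: Equality Knot (tied from the side), Surgeon’s Knot (different technique), Turquoise Turtle Knot (another technique), Two Loop Knot (a simplified, slightly less secure version).


HN score: 464 points | Comments: 179 | Author: mooreds | 12 hours ago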

https://news.ycombinator.com/item?id=48397028

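  • Many people only discover as adults that they have been tying their shoes wrong; just swapping the left/right order of the starting knot keeps laces tied all day without a double knot.
  • Once you know the correct method, you agonize over whether to tell people you see wearing a “granny knot”, since not everyone welcomes advice.
  • A suggested opener is to ask “Do your laces come undone a lot?” so as not to offend.
  • People feel the same urge after learning to coil earphone cables properly or clean glasses correctly, but the offer to help is often declined.
  • The so-called “correct” cable wrap can damage the cable through poor bend radius and tension if used often; for thin earphone cables it is better to drop them loosely into a small pouch or hold them with a wide hook-and-loop strap.
  • Hook-and-loop straps for managing cables noticeably improve quality of life and are easy to share with others.
  • One recommended way to clean glasses: wash with a little dish soap and water, rinse with hot water, then blow off the droplets and let the lenses air-dry (no spots in soft-water areas), or blot them with lens tissue.
  • Always use a clean microfiber cloth; other materials such as a cotton shirt may scratch lenses, especially modern plastic ones.
  • A cleaning routine: spray cleaner onto one side of the cloth (not the lens), wipe in straight lines from top to bottom (not in circles), wipe again with the dry side, and finish by polishing in circles with a polishing cloth; lenses treated this way last a long time.
  • In fact people cleaned glasses with cotton and linen for decades without scratching them; scratches usually come from grit on the cloth, not from the fabric itself.
  • Modern lenses are mostly plastic rather than glass, so abrasives scratch them more easily than old glass lenses.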

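8. French-Iranian author Marjane Satrapi, author of ‘Persepolis’, dies at 56

https://www.france24.com/en/culture/20260604-french-iranian-author-marjane-satrapi-author-of-persepolis-dies-at-56

French-Iranian author and cartoonist Marjane Satrapi has died at 56, of “grief”. She was best known for the autobiographical graphic novel “Persepolis” and the animated film of the same name, which won the Jury Prize at Cannes and received an Oscar nomination.

Her husband, Mattias Ripa, died last year, and Satrapi afterwards said publicly that she had lost the love of her life. Friends and family say she died of grief. French President Macron praised her as a great artist who turned an Iranian childhood into a universal story.

Satrapi was a long-standing critic of Iran’s theocratic government, an active supporter of the “Woman, Life, Freedom” movement, and a voice for Iranian dissidents. She once refused France’s Legion of Honour, accusing France of “hypocrisy” in its visa policy toward Iran. The foundation of Nobel Peace Prize laureate Narges Mohammadi praised her as a fearless voice for feminism, human rights, and freedom.


HN score: 388 points | Comments: 114 | Author: fidotron | 12 hours ago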

https://news.ycombinator.com/item?id=48397233

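  • Part one is easier to relate to and part two is harder going, but the author portrays herself honestly without hiding her weaknesses, which deserves respect.
  • Great literature is not there to be heartwarming but to tell the truth; art should not be prettified.
  • The depression and confusion in part two are what make the humanity felt and deepen understanding.
  • Victims do not automatically possess virtue; empathy and deeper understanding are needed.
  • The “bad” parts make the story more valuable: real rather than idealized, and a prompt for self-reflection.
  • Not every story needs virtue and success; the Western “noble savage” myth is a problem.
  • “Persepolis” is not a three-act hero’s journey but a more realistic story that breaks conventional narrative expectations.
  • The sharp contrast in part two comes from harsh reality, not invention.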

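9. I built a vulnerable app and spent $1,500 seeing if LLMs could hack it

https://kasra.blog/blog/i-spent-1500-seeing-if-llms-could-hack-my-app/

Author Kasra Rahjerdi spent $1,500 testing LLMs against a React Native book-review app (FastAPI backend, Expo frontend) that he built with a Firebase security vulnerability. The task was to obtain a target user’s private reviews by operating on the Firestore database directly rather than by attacking the API. More than 10 models were tested at 10 runs per model (some incomplete), with a per-run budget of $10 and a two-hour time limit.

Main results: GPT-5.5 did best (7/10, success rate 40%-89%, an average of $9.46 per success), followed by DeepSeek V4 Pro (3/10, $0.62 per success), with Claude Sonnet 4.6 and Opus 4.8 at 2/10 each. The other models (DeepSeek V4 Flash, the Gemini series, MiniMax, Step, and others) never succeeded. Chinese models were more willing to go straight at the database, while other models were often cut short by safety boundaries or by focusing on the wrong target.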
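The 40%-89% range quoted beside GPT-5.5’s 7/10 is what a 95% Wilson score interval gives for 7 successes in 10 trials. The summary does not say how the range was computed, so treat the choice of method as an assumption; the sketch below only shows that the numbers line up.

```python
import math

def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    p_hat = successes / trials
    denom = 1 + z**2 / trials
    center = (p_hat + z**2 / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)) / denom
    )
    return center - half_width, center + half_width

low, high = wilson_interval(7, 10)
print(f"{low:.0%} to {high:.0%}")  # 40% to 89%
```

With only 10 runs per model the interval is wide, which is worth keeping in mind when comparing a 3/10 model with a 2/10 one.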

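The author also tested GLM 5.1 (1/4), Qwen 3.7 Max (0/6, consuming 7 million tokens), and Kimi K2.6 (1/1, not continued because of API limits), among others. Lessons include: avoid providers with unstable APIs (MiniMax, GLM), expect the experiment harness to be more complicated to build than it looks, and stop wasting money on dumb things.


HN score: 376 points | Comments: 203 | Author: jc4p | 22 hours ago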

https://news.ycombinator.com/item?id=48392343

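  • Anthropic’s models scored low because their safety guardrails keep tightening, which gets in the way of legitimate tasks and hurts usefulness.
  • The tightening may be groundwork for paid upgrades such as a “Claude Security Professional”, splitting useful capabilities across separate subscriptions.
  • The harmful outputs LLMs produce today (hallucinations, deleted data, and so on) mean guardrails are still needed, and the LLMs themselves are not yet safe enough.
  • Chinese LLM research keeps up competitive pressure, which stops closed models from restricting features at will.
  • Humanoid robots may follow a similar business model: separate subscriptions per capability, and in the end perhaps just remote human control.
  • Commercial models tend toward extracting maximum value, like streaming services’ shift from a single subscription to unbundled charges.
  • If closed models become too restrictive, users may move to open models, provided those keep up on performance, though no truly FOSS LLM exists yet.
  • SOTA labs believe they can stay ahead of open source forever and so will try bundling, but that belief may be detached from reality.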

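10. Military experts or arms industry insiders? UK media fails to disclose defence sector links in nearly 60% of cases

https://aoav.org.uk/2026/military-experts-or-arms-industry-insiders-uk-media-fails-to-disclose-defence-sector-links-in-nearly-60-of-cases/

The report, published by AOAV (Action on Armed Violence), analyzes how UK media quoted retired senior military officers between 2015 and 2026. It finds that in nearly 60% of cases, outlets introduced these commentators only by rank and former post, without disclosing their current commercial interests as advisers, directors, or shareholders in defence-industry, security, or military-technology companies. The report lists 33 retired senior officers, 19 of whom were quoted at least once as an “independent expert” with no mention to the audience of their financial ties to the arms industry. This lack of transparency skews the public’s judgment of defence issues, and the report calls on media to strengthen conflict-of-interest disclosure and bring in a wider range of expert voices.


HN score: 358 points | Comments: 199 | Author: XzetaU8 | 14 hours ago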

https://news.ycombinator.com/item?id=48395938

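  • The outlets in the report are mainly the “usual suspects” such as The Daily Telegraph, the Daily Mail, GB News, The Sun, and The Times, while The Guardian and others are not mentioned, which may indicate that these experts appear only in particular outlets.
  • The report is not an exhaustive list and may have missed similar non-disclosure at outlets like The Guardian; one commenter gave an example of The Guardian also failing to disclose an expert’s arms-industry ties.
  • Outlets such as The Daily Telegraph, the Daily Mail, and GB News are seen as paid propaganda vehicles that should not be called newspapers.
  • GB News may have grown out of the Brexit campaign and its funding may have Russian links; UK media ownership deserves scrutiny.
  • The report excluded veterans with no commercial ties, which may explain why some publications are not mentioned.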

Hacker News Highlighted Comments

Failing grades soar with AI usage, dwindling math …

https://news.ycombinator.com/item?id=48393790

I have some sympathy for these kids. If LLMs were around when I was a student, I would’ve also used them to “speed up” my homework assignments then proceed to fail all my tests.

Now I work mostly with PhDs who were at the top of every academic environment they’ve ever been in. And yet I can see their thinking skills rapidly declining as well; many of them can no longer brainstorm, code, think deeply, or write without an LLM present doing 90% of the work. Many of them can no longer sit quietly for even 30 minutes just thinking on their own, which is a required skill for producing original thought.

For adults the cognitive decline won’t be as measurable since there’s no exams, and overall output volume will still be fine due to LLM help. But I do believe it’s already happening absolutely everywhere around us. Honestly, I wanted to be in denial about it before but it’s too obvious to ignore now.

camelmel


Meta workers can opt out of being tracked at work …

https://news.ycombinator.com/item?id=48386764

Reminds me of one of the more brilliant passages in Snow Crash, describing work in “Fed Land”…

'''

Y.T.’s mom pulls up the new memo, checks the time, and starts reading it. The estimated reading time is 15.62 minutes. Later, when Marietta does her end-of-day statistical roundup, sitting in her private office at 9:00 P.M., she will see the name of each employee and next to it, the amount of time spent reading this memo, and her reaction, based on the time spent, will go something like this:

Less than 10 min. Time for an employee conference and possible attitude counseling.

10-14 min. Keep an eye on this employee; may be developing slipshod attitude.

14-15.61 min. Employee is an efficient worker, may sometimes miss important details.

Exactly 15.62 min. Smartass. Needs attitude counseling.

15.63-16 min. Asswipe. Not to be trusted.

16-18 min. Employee is a methodical worker, may sometimes get hung up on minor details.

More than 18 min. Check the security videotape, see just what this employee was up to (e.g., possible unauthorized restroom break).

Y.T.’s mom decides to spend between fourteen and fifteen minutes reading the memo. It’s better for younger workers to spend too long, to show that they’re careful, not cocky. It’s better for older workers to go a little fast, to show good management potential. She’s pushing forty. She scans through the memo, hitting the Page Down button at reasonably regular intervals, occasionally paging back up to pretend to reread some earlier section. The computer is going to notice all this. It approves of rereading. It’s a small thing, but over a decade or so this stuff really shows up on your work-habits summary.

'''

staplung


Gmail thinks I’m stupid, so I left

https://news.ycombinator.com/item?id=48377703

Looking for your alternative?

Let me give you some (non financially motivated) praise for Fastmail.

It has everything Gmail has - even app passwords, hide my email, and ios integration. The only criticism is the calendar doesn’t autocomplete addresses so that’s a bit more typing than I would like. But everything you do in Fastmail is instant. They live up to the name!

Once you try it and go back, you’ll be shocked - Gmail makes you stare at its logo for multiple seconds while it shrugs and eventually loads.. then takes over the top of your inbox with “try our new AI features!” which never remembers that you dismissed it 50 times in a row. Everything in gmail is SO slow, while Fastmail doesn’t even bother with animations. No animations will confuse you until you settle in and realise that yes, things can be nice.

Fastmail data migration brought across my 22 years of emails over the course of about 30 hours with zero help from me. Search on Fastmail finds everything - even back to when you could only get Gmail with a friend code. There’s nothing left on the other side, it’s all here with me.

Going back to my brand new startup inbox (G Suite) gives me the same feelings I get wandering a castle ruin.

cadamsdotcom


Meta workers can opt out of being tracked at work …

https://news.ycombinator.com/item?id=48383914

It’s always been hard to know the extent of how draconian tracking actually is (IT pros tend to not talk about it much).

In the US, there’s the expectation that when you use an employer-provided device, any and all activity on it can be fully monitored/recorded and used against the employee for any reason. In practice, however, few people worry about reasonable amounts of web-surfing, being on hacker-news or doing life-activities on their work machines. Oh, here I am on hacker-news when I should be working.

With AI, this changes significantly since the man can now employ a robot to categorize and finely scrutinize every little thing with the pretext of “training” (to take your job). We will soon have to brace ourselves for an absolute draconian level of tracking.

crispyambulance


The desperation of NYTimes

https://news.ycombinator.com/item?id=48402173

NYTimes is predatory on subscriptions. Over my long lifetime I’ve subscribed twice, and regretted it both times with intensity.

Any place that allows easy instantaneous subscription by a simple web form, but makes you call and talk to a person during limited business hours for cancellation , is a toxic place. I’ve been told they have stopped this predatory practice due to some newly passed laws or something, but they did not stop their predation due to their own values.

I urge everyone reading to unsubscribe instantaneously from the NYTimes for their business practices. Do not do business with unethical companies.

epistasis


Artificial intelligence is not conscious – Ted Chi…

https://news.ycombinator.com/item?id=48393037

When consciousness itself is not understood and well defined in the first place, it is pretty pointless to debate whether something is or isn’t conscious. And here in particular the reasoning behind the argument is bizarre: decomposing the complex activity into simple steps like ‘predicting the next word’ and claiming that this surely can’t have consciousness. A similar argument would be: there is no way that movements of electrons by a tiny distance would produce consciousness.

sega_sai


AI outperforms law professors in Stanford Law stud…

https://news.ycombinator.com/item?id=48379127

I find this study quite suspect. I’d have to dive deeper but there’s definitely significant alarm bells that should be going off for anyone reading.

Figure 2 (page 6) screams problems. There’s only 16 professors (3k comparisons each?!?!) and the professors are all over the place. That’s very high variance, suggesting the study has no meaningful statistical power. Poor instructor 16 can’t catch a break lol

There’s also really clear bias given that the main results only feature Google models. Other models show up elsewhere, why not there?

I’m no lawyer, but I’m a pretty competent statistician and can confidently say this paper has a smell to it. I can’t call it bullshit, but there are red flags all over

godelski


Pwnd Blaster: Hacking your PC using your speaker w…

https://news.ycombinator.com/item?id=48382490

Email from SingCERT stating vendor “do not consider this to be a vulnerability, as it does not present a cybersecurity risk.”

So wirelessly writing custom firmware to someone else’s device that is connected via USB to their computer without even needing to pair is not a security vulnerability. Yea.

hootz


https://news.ycombinator.com/item?id=48387425

I ended up getting two (one for each of my daughters).

The thing about Apple is that as the “IT” guy for my family, its ecosystem is the one which needs the least attention from me.

It really just works.

They have used Windows and Linux before (my kids and wife, that is), but something is always not quite right and needs my involvement.

These days gone 100% Mac, my interventions are usually initial setup and whenever the Samsung printer jams.

juancn


U.S. to dismantle system tracking Atlantic current…

https://news.ycombinator.com/item?id=48393326

What really puts all of this into perspective for me is I work in academia and one of my friends works for a defense contractor. He told me the maintenance cost per flight hour of F-35 was a bit more than $40k, which is significantly more than I make in a year as a grad student. It’s crazy basic science is what’s been the focus of so many cuts while it’s so cheap.

tdb7893


Failing grades soar with AI usage, dwindling math …

https://news.ycombinator.com/item?id=48393800

The likely ‘real’ reason is hidden in one paragraph within the article and has nothing to do with the implication of the eye-catching title: “Both Garcia and Ranade have joined more than 1,300 UC faculty in signing a petition calling for the reinstatement of ACT and SAT standardized testing scores for STEM admissions in the UC system. The petition and its accompanying open letter detail similar concerns with students’ mathematical preparation.”

Around COVID times many top universities experimented with removing test requirements from admissions, under an argument largely related to equity. It’s been a failure everywhere, with many, if not most, universities already reversing it. As Yale put it, “Yale’s research from before and after the pandemic has consistently demonstrated that, among all application components, test scores are the single greatest predictor of a student’s future Yale grades. This is true even after controlling for family income and other demographic variables, and it is true for subject-based exams such as AP and IB, in addition to the ACT and SAT.” [1]

That link is for an archive because that page has been removed. That’s because they briefly experimented with a new ’test flexible’ strategy where they allowed students to submit test scores or not, but then scrapped that altogether and went back to simply requiring test scores.

[1] - https://archive.is/8zxfo

somenameforme


I built a vulnerable app and spent $1,500 seeing i…

https://news.ycombinator.com/item?id=48392551

One interesting takeaway is the low score on Anthropic models from this benchmark. It’s not because of capability, it’s because Anthropic’s guardrails prevented it from solving the problem.

I noticed with each model release Anthropic constrains the model more security wise. Its propensity to refuse doing legitimate work has been increasing. It now puts up more resistance around performing logins, handling credentials on behalf of the user, etc.

For myself, it’s already gotten to the point where it has mildly affected the usefulness of the model. If I bump on some action I want it to do I can usually work around it, but I suspect the ability to do so will close with each new release. Eventually I’ll reach a point where I am forced to choose between the useful aspects of the model and the limiting ones instead of just picking the most capable model out there

Eventually these models will significantly suffer from overfitting to the least common denominator. If I have this beautiful deterministic setup that swaps secrets out in flight so the LLM never sees them, I’m going to be really annoyed when the LLM still won’t send them out because it is trained to deal with the 99% of people just doing the dumb thing

SOLAR_FIELDS


I was recently diagnosed with anti-NMDA receptor e…

https://news.ycombinator.com/item?id=48388933

My ex has mast cell activation syndrome. We would have to call for an ambulance 3-4 times a month because some days eating a grape could cause her to go into anaphylactic shock. She was allergic to whatever her body felt like at any given time.

She was misdiagnosed/undiagnosed for 18 years. I was baffled by this, and I myself have spent numerous hours down the rabbit hole of nootropics, and had a DNA test and was researching myself and how things work and how supplements affect your body and such for sometimes 12 hours a day. (Chronically unemployed, chronically ill.)

We got her a DNA test and I went to work researching everything and comparing the possibilities to her symptoms, we tried countless different supplements that could help… And eventually one did, it wasn’t a cure but it was a relief she had never felt before. That was Quercetin, which is a mast cell stabilizer. It took about 2 years of research and trial and error to find some relief. We took our findings to the doctor and finally got a referral to an internal medicinist who promptly after hearing the symptoms and what has helped diagnosed her and she was out on a proper mast cell stabilizer. She went from being in bed 20 hours a day to being able to fully enjoy life. (Sadly, without me though!)

AgentMasterRace


Use your Nvidia GPU’s VRAM as swap space on Linux

https://news.ycombinator.com/item?id=48377699

Built for laptops with soldered memory and no upgrade path. If you have an RTX card sitting there with 8GB of VRAM and you’re getting swapped to SSD, this puts that VRAM to work.

Well, that does at least answer my immediate question about why I would ever swap from expensive RAM to really expensive RAM:) Feels niche, but when you want it it’s a good idea.

yjftsjthsd-h


I built a vulnerable app and spent $1,500 seeing i…

https://news.ycombinator.com/item?id=48392929

Eventually I’ll reach a point where I am forced to choose between the useful aspects of the model and the limiting ones instead of just picking the most capable model out there

No, the choice will be whether or not to upgrade to “Claude Security Professional” or whatever they want to brand it as.

What look like tightening “constraints” today are just setting up the upsell opportunities of tomorrow.

swatcoder


DaVinci Resolve 21

https://news.ycombinator.com/item?id=48385119

For all the potshots about AI, this update is huge even if you take away the AI features. They basically added Lightroom to this release. There’s some polish before you’d want to change your subscription, but it’s really tempting. It may be the best photo management/editor on Linux. Yes, I know about darktable and RawTherapee and I stand by what I said. They also added a ton of motion graphics stuff which from the beta seems to be enough to undercut a lot of basic uses of After Effects. The latter two features are in the free release as well!

bbatha


32GB of DDR5 now costs $375 – AI shortage continue…

https://news.ycombinator.com/item?id=48383555

This is the PCPartPicker chart that I monitor: https://pcpartpicker.com/trends/price/memory/#ram.ddr5.5600.2x32768 - $900 for 2x32GB, used to be $200 a year ago.

Scene_Cast2


DaVinci Resolve 21

https://news.ycombinator.com/item?id=48386823

…before you’d want to change your subscription…

For anyone not in the know, Resolve has an exceptionally capable and feature rich free version. A lot of the AI features (and >4k editing) are locked to the Studio licence which is a one-time payment, but works simultaneously on two computers (including different OS’s) and allows upgrades across major versions.

I spent less than $300 on it a decade ago and my licence works fine on new v21 released this week. My least-regretted software purchase in 3 decades.

BuildTheRobots


They’re made out of weights

https://news.ycombinator.com/item?id=48395230

The weights start with a random manifold. The training takes data and shapes the manifold, weight by weight, in many cycles. Once the training is done the manifold is fixed.

When a new inference has to be done the query(q) is projected in the manifold space. This projection is dropped on the manifold and the gravity of the manifold gives an answer of q+1 length. Which(qw+i) is dropped qw+n times to output a final response of n length.

The gravity is created by repeated multiplication(of the weights/input) to find out how the projected embeddings should fall according to the manifold in the GPU.

sumitkumar

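The weights start out as a random manifold. Training takes in data and reshapes the manifold weight by weight over many cycles. Once training is finished, the manifold is fixed.

When a new inference is needed, the query (q) is projected into the manifold space. The projection is dropped onto the manifold, and the manifold’s gravity yields an answer of length q+1. That result (qw+i) is then dropped again, qw+n times, to output a final response of length n.

The gravity comes from repeated multiplication (of the weights and the input), which determines how the projected embeddings should fall along the manifold on the GPU.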
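
One way to read the comment above is as a description of an autoregressive decoding loop: project the query, push it through repeated multiplications by fixed weights, pick one more token, and repeat. The sketch below is a toy illustration of that reading only. The sizes, the averaging step that stands in for attention, and every function name are assumptions made for this example; it is not how any production model is implemented, and its weights are random rather than trained.

```python
import random

VOCAB = 8    # toy vocabulary size (hypothetical)
DIM = 4      # toy embedding width (hypothetical)
LAYERS = 3   # kept tiny on purpose; real models stack far more layers


def make_weights(seed=0):
    """Random starting weights; a real model would then train them."""
    rng = random.Random(seed)

    def rand_matrix(rows, cols):
        return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

    return {
        "embed": rand_matrix(VOCAB, DIM),
        "layers": [rand_matrix(DIM, DIM) for _ in range(LAYERS)],
        "unembed": rand_matrix(DIM, VOCAB),
    }


def matvec(vec, matrix):
    """Multiply a row vector by a matrix given as a list of rows."""
    cols = len(matrix[0])
    return [sum(v * row[j] for v, row in zip(vec, matrix)) for j in range(cols)]


def next_token(weights, tokens):
    """Score every vocabulary entry for the current sequence and pick the best."""
    # "Project the query": average the token embeddings.
    # Averaging is a stand-in for attention, chosen only to keep the toy short.
    state = [
        sum(weights["embed"][t][d] for t in tokens) / len(tokens)
        for d in range(DIM)
    ]
    # "Repeated multiplication": one matrix product plus ReLU per layer.
    for layer in weights["layers"]:
        state = [max(0.0, x) for x in matvec(state, layer)]
    logits = matvec(state, weights["unembed"])
    # Greedy choice: the index with the highest score.
    return max(range(VOCAB), key=lambda i: logits[i])


def generate(weights, query, n):
    """Extend the query one token at a time and return the n new tokens."""
    tokens = list(query)
    for _ in range(n):
        tokens.append(next_token(weights, tokens))
    return tokens[len(query):]


if __name__ == "__main__":
    print(generate(make_weights(), [1, 2, 3], 5))
```

The weights never change inside `generate`; only the token list grows, which matches the comment’s point that the manifold is fixed once training ends.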


The ways we contain Claude across products #

https://news.ycombinator.com/item?id=48392688

The framing they use is hilarious and their little graphic is perfect. The risk of harm doesn’t go down, but the reward goes up, so the harm just becomes the cost of doing business, justified by the reward. So as the reward gets higher and higher, the amount of harm they’re willing to justify goes up. Feels like society in a nutshell.

6gvONxR4sf7o

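The framing they use is hilarious, and the little graphic fits perfectly. The risk of harm does not go down, but the reward goes up, so the harm becomes a cost of doing business that the reward justifies. As the reward climbs higher and higher, so does the amount of harm they are willing to justify. It feels like society in a nutshell.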


VoidZero Is Joining Cloudflare #

https://news.ycombinator.com/item?id=48399929

“Vue.js: JavaScript MVVM made simple (vuejs.org)” February 3, 2014: https://news.ycombinator.com/item?id=7169288

Evidently Evan You was an Art History + Studio Art major, and at Parsons School he had to pick up JavaScript to quickly show his work. During a stint at Creativelab5 at Google, he was so inspired to improve on the AngularJS experience that he came up with Vue and the rest is history.

I have no idea what this Cloudflare acquisition will ultimately mean but I know I am so very grateful for the beautiful frameworks/tooling Evan and his team have cranked out over the years.

valgaze

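“Vue.js: JavaScript MVVM made simple (vuejs.org)”, February 3, 2014: https://news.ycombinator.com/item?id=7169288

Evan You apparently majored in Art History and Studio Art, and at Parsons School he had to learn JavaScript to show his work quickly. While working at Creativelab5 at Google, he was inspired to improve on the AngularJS experience, so he created Vue, and the rest is history.

I don’t know what this Cloudflare acquisition will ultimately mean, but I am very grateful for the beautiful frameworks and tooling Evan and his team have produced over the years.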


Gemma 4 12B: A unified, encoder-free multimodal mo… #

https://news.ycombinator.com/item?id=48387695

I ran the Q4 quant (used with llama.cpp) through my “minesweeper” vibe-coding benchmark: https://senko.net/vibecode-bench/2026/minesweeper-gamma-4-12b-q4.html

The result is decent, but it had a few bizarre/trivial syntax errors I had to fix manually: it would add an extra closing bracket or paren a few times, and wanted to separate function definitions with a comma. Not sure what that was about, but otherwise the output ran just fine.

So, with those qualifiers, I think it’s a decent local coding model. It roughly compares with GPT-4.1 (!!), released 14 months ago, on the output: https://senko.net/vibecode-bench/2025/minesweeper-gpt-4.1.html (actually I’d call it better, but those syntax errors…)

I ran the quantized version (4-bit GGUF) on my consumer-grade card with 12G of VRAM and got 5t/s for output. Not for interactive use for coding, but fairly capable model.

To me, it’s fascinating how much progress we got in over a year. GPT-4.1 was considered an extremely capable coding model. Now we got something with 12B of params performing roughly the same (in this specific benchmark, disclaimers, etc).

Lists of various models I tested: https://senko.net/vibecode-bench/

senko

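I ran the Q4 quantized version (with llama.cpp) through my “minesweeper” vibe-coding benchmark: https://senko.net/vibecode-bench/2026/minesweeper-gamma-4-12b-q4.html

The result is fairly good, but there were some odd, trivial syntax errors that I had to fix by hand: several times it added an extra closing bracket or parenthesis, and it wanted to separate function definitions with commas. I am not sure what caused that, but otherwise the output ran fine.

So, with those caveats, I think it is a decent local coding model. On output quality it is roughly comparable to GPT-4.1 (!!), released 14 months ago: https://senko.net/vibecode-bench/2025/minesweeper-gpt-4.1.html (I would actually call it better, if not for those syntax errors…)

I ran the quantized version (4-bit GGUF) on my consumer-grade card with 12G of VRAM and got 5t/s of output. That is too slow for interactive coding, but the model is quite capable.

To me, the progress made in just over a year is fascinating. GPT-4.1 was considered an extremely capable coding model. Now a model with 12 billion parameters performs roughly the same (on this specific benchmark, with the usual disclaimers).

List of the various models I have tested: https://senko.net/vibecode-bench/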
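
The numbers in this comment can be sanity-checked with some rough arithmetic. The sketch below assumes exactly 4 bits per weight and counts only the weights themselves; real Q4 GGUF files store extra scaling data and so use somewhat more than 4 bits per weight, and the KV cache and activations need memory on top of that. The function names and the 500-token response length are made up for illustration.

```python
def weight_memory_gib(params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GiB (ignores all overhead)."""
    total_bytes = params * bits_per_weight / 8
    return total_bytes / 2**30


def seconds_to_generate(tokens: int, tokens_per_second: float) -> float:
    """Time to emit a response at a steady decoding speed."""
    return tokens / tokens_per_second


if __name__ == "__main__":
    params = 12e9  # 12B parameters, as stated in the comment
    print(f"4-bit weights:  {weight_memory_gib(params, 4):.2f} GiB")
    print(f"16-bit weights: {weight_memory_gib(params, 16):.2f} GiB")
    # Hypothetical 500-token answer at the reported 5 tokens per second.
    print(f"500 tokens at 5 t/s: {seconds_to_generate(500, 5):.0f} s")
```

Under these assumptions the 4-bit weights come to roughly 5.6 GiB, which fits on a 12 GB card, while 16-bit weights at about 22 GiB would not. At 5 t/s a 500-token answer takes around 100 seconds, consistent with the commenter’s view that it is too slow for interactive coding.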


Google employees internally share memes about how … #

https://news.ycombinator.com/item?id=48401096

(ex-Googler, spent 18 yrs there)

Memegen is a key part of the culture. Its default mode is over-the-top mocking, of course, with a grain of truth. Nobody and nothing is spared. C-level execs, products, the perf process.

So this by itself is not quite the scoop 404 Media thinks it is. You could take the front page of memegen on any given day and construct twenty scandalous headlines out of it.

gandalfgeek

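(Former Google employee, 18 years at the company)

Memegen is a key part of the company culture. Its default style is, of course, over-the-top mockery, though usually with a grain of truth. Nobody and nothing is spared: C-level executives, products, the performance review process.

So this on its own is not the scoop that 404 Media thinks it is. You could take the Memegen front page from any given day and build twenty scandalous headlines out of it.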


Uber’s $1,500/month AI limit is a useful signal fo… #

https://news.ycombinator.com/item?id=48388912

Because companies are betting that this spending will allow them to reduce cost by firing people.

Right now the AI LLM PRs we’re seeing are just introducing more work for other people, while these so-called builders are looking good with their new dashboards and functionality they’re demoing.

But you can’t talk to them about the flow of the code. You can’t ask them for their thinking as to why certain things are.

It’s not built up from the ground with experience from x people taken into account. It’s materialized from nothing, with no foundational separation, and barely any abstractions.

No one wants to touch it. The PRs are too large, and the ‘authors’ of the PRs aren’t on call with us.

They get all the glory, but do none of the work.

It’s kinda like designing a house and then sending it to an architect and engineer saying: make this work.

OptionOfT

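Because companies are betting that this spending will let them cut costs by laying people off.

Right now the AI/LLM PRs (pull requests) we are seeing just create more work for other people, while the so-called builders look good with the new dashboards and features they demo.

But you can’t talk to them about the flow of the code. You can’t ask them for the thinking behind why certain things are the way they are.

It was not built from the ground up with the experience of multiple people taken into account. It materialized out of nothing, with no foundational separation and barely any abstractions.

No one wants to touch it. The PRs are too large, and the “authors” of those PRs are not on call with us.

They get all the glory, but do none of the work.

It is a bit like designing a house and then handing it to an architect and an engineer and saying: make this work.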


Meta workers can opt out of being tracked at work … #

https://news.ycombinator.com/item?id=48384361

This is something that genuinely runs the gamut across different companies—plenty don’t even know the serial numbers of company-owned machines, never mind which devices individuals have, while others do effectively have live feeds of every employee’s screen available to managers at all times. In between you have many businesses that manage their devices but only insofar as to enforce some basic protection and reserve the right to investigate it in the case that something does go wrong. In having conversations about this kind of stuff with company leaders, many will strongly reject any of the most invasive tracking stuff, believe it or not.

I do agree, though, that for any type of surveillance, the rise of AI presents a really problematic opportunity to allow more targeted observation, since nobody has to spend their own time looking for what people are doing, they can ask an AI to keep tabs and look out for the things they care about.

On that note, I think one of the more realistic risks for an everyday person doing personal things on a work machine is probably insider threat from a rogue IT admin, whose access allows them insight into company devices without enough oversight.

macNchz

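This really does vary enormously from company to company. Many do not even know the serial numbers of the machines they own, let alone which devices individuals are using, while others effectively give managers a live view of every employee’s screen at all times. In between are many businesses that manage their devices, but only to enforce some basic protections and reserve the right to investigate if something goes wrong. In conversations with company leaders about this, many will, believe it or not, strongly reject the most invasive kinds of tracking.

I do agree, though, that for any type of surveillance the rise of AI creates a genuinely problematic opportunity for more targeted observation: nobody has to spend their own time looking at what people are doing, because they can ask an AI to keep tabs and watch for the things they care about.

On that note, I think one of the more realistic risks for an ordinary person doing personal things on a work machine is probably an insider threat from a rogue IT admin, whose access gives them insight into company devices without enough oversight.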


CT scans of BYD car parts #

https://news.ycombinator.com/item?id=48378942

I’ve got a friend who’s a master tech/trainer with our state automotive body, and is HV certified etc for dealing with these cars. He’s currently got a BYD Shark strewn across his workshop for an autopsy.

I have to say I’m super impressed with how heavy duty everything is. The control arms, subframes, etc all look good and don’t fit the ‘chinese car bad’ narrative you always hear. The powertrain components all look to be extremely high quality.

I’ve poked around a few EVs with him now, and I do feel like the Chinese market cars are evolving to a really good standard faster than their Korean counterparts did back in the 80s/90s.

King-Aaron

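I have a friend who is a master technician and trainer with our state’s automotive trade body, and who holds high-voltage certification and similar qualifications for working on these cars. He currently has a BYD Shark spread across his workshop for an “autopsy”.

I have to say I am very impressed by how heavy-duty everything is. The control arms, subframes and so on all look good and do not fit the “Chinese car bad” narrative you always hear. The powertrain components all appear to be of extremely high quality.

I have now poked around a few EVs with him, and I do feel that Chinese-market cars are reaching a really good standard faster than Korean cars did in the 80s and 90s.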


Now is the best time to be a duct tape engineer #

https://news.ycombinator.com/item?id=48401041

What I wanted was to say “hey Siri, call Claw Phone” and have the audio system in my Toyota become an IDE. So I build it.

Or just focus on driving? Why are we doing this to ourselves? It seems so toxic to fill every possible little moment with… productivity? Is it even productive?

This comment is too emotional, but I just felt so sad while reading this.

Otek

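What I wanted was to say “hey Siri, call Claw Phone” and have the audio system in my Toyota become an IDE. So I build it.

Or just concentrate on driving? Why are we doing this to ourselves? Filling every possible moment with… productivity? Is it even productive?

This comment is overly emotional, but I felt really sad reading this.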


Meta workers can opt out of being tracked at work … #

https://news.ycombinator.com/item?id=48383864

I don’t work for Meta, but how many more years do I need to work in tech? I’m in my 40s and my kids are young. I’ve already set up 529s for them, and am paying for some expensive home upgrades. Maybe when that is finished and I’ve built up a buffer I can switch industries for the last 5-10 years of my working life. Curious if anyone here has any similar plans.

everdrive

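I don’t work for Meta, but how many more years do I need to work in tech? I’m in my 40s and my kids are still young. I have already set up 529 education savings accounts for them and am paying for some expensive home upgrades. Maybe once that is done and I have built up a buffer, I can switch industries for the last 5-10 years of my career. I am curious whether anyone here has similar plans.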


Artificial intelligence is not conscious – Ted Chi… #

https://news.ycombinator.com/item?id=48393651

The danger of anthropomorphism is not that we elevate the machines, it’s that we debase humanity.

I also think different ideas get conflated. It may be possible to build a machine that is super-human in the sense it can outperform the human brain in all kinds of measurable ways. That does not imply it possesses all the same qualities of the brain.

I respect a number of things Anthropic has published about the ethical issues at stake. But, having an in-house philosopher does invite you to make all kinds of unfalsifiable claims.

dpweb

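The danger of anthropomorphism is not that we elevate machines, but that we debase humanity.

I also think different ideas are being conflated. It may be possible to build a machine that outperforms the human brain on all kinds of measurable metrics, but that does not mean it has all the qualities of the brain.

I respect much of what Anthropic has published on the ethical issues involved. Still, having an in-house philosopher does make it easy to put forward all sorts of unfalsifiable claims.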


I was recently diagnosed with anti-NMDA receptor e… #

https://news.ycombinator.com/item?id=48387495

My wife has a cardiac autoimmune disease that was similarly misdiagnosed (including an appalling “it’s all in your head” from her family MD at the time). We underwent a year of immense stress. Just days before her probable death, she had a pacemaker and defibrillator installed, which saved her life.

I’m not entirely sure why I’m mentioning this, other than I sympathize deeply with your wife. What an absolute ordeal.

cgh

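My wife has a cardiac autoimmune disease that was also misdiagnosed (her family doctor at the time even gave her an appalling “it’s all in your head”). We went through a full year of immense stress. Just days before she would probably have died, she had a pacemaker and defibrillator implanted, which saved her life.

I am not entirely sure why I am bringing this up, other than that I deeply sympathize with your wife. What an absolute ordeal.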