Deciphering language processing in the human brain through LLM representations

Original link: https://research.google/blog/deciphering-language-processing-in-the-human-brain-through-llm-representations/

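Researchers found that the internal representations of a speech-to-text model (Whisper) align remarkably well with brain activity during natural conversation, even though the model was not designed to mimic how the brain processes language. During speech production, language processing in the inferior frontal gyrus (IFG, frontal lobe) precedes speech encoding in the sensorimotor area and the superior temporal gyrus (STG, temporal lobe). During comprehension the order reverses: speech processing in the STG precedes language encoding in the IFG. This alignment suggests that the model captures key elements of neural language processing.

The study also reveals a "soft hierarchy" across brain regions. The IFG mainly tracks semantic and syntactic information but also processes auditory features. Similarly, the STG mainly tracks acoustic and phonemic information but also captures word-level information. In other words, brain regions are specialized yet still take part in both higher- and lower-level aspects of language processing. The alignment supports using the model as a framework for understanding the neural basis of language.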

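A Hacker News thread discusses a Google research paper that uses large language model (LLM) representations to decipher language processing in the human brain. User "taosx" is enthusiastic about the research's potential to deepen our understanding of the brain and hopes it will lead to progress in "hacking" the brain to improve cognitive function. He mentions his own lack of motivation and his wish to improve abilities such as information absorption, spatial visualization, and intrinsic motivation. He says explicitly that he would like to edit his brain's "code" the way one does under version control. User "goatlover" questions the desire for a "hackable brain", which prompts "taosx" to explain. "eMPee584" simply replies "upgrade time 8D", an equally upbeat take on the prospect of brain enhancement.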

Original text

During speech production, it is evident that language embeddings (blue) in the IFG peaked before speech embeddings (red) peaked in the sensorimotor area, followed by the peak of speech encoding in the STG. In contrast, during speech comprehension, the peak encoding shifted to after the word onset, with speech embeddings (red) in the STG peaking significantly before language encoding (blue) in the IFG.
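The peak timing described above can be illustrated with a lag-wise encoding analysis. The sketch below is not the authors' pipeline: it assumes that "encoding" means a linear (ridge) map from per-word embeddings to a neural signal sampled at a fixed offset from each word onset, scored by held-out correlation. All names (`ridge_fit`, `encoding_by_lag`, `make_demo`), the synthetic data, and the lag units (arbitrary samples) are hypothetical.

```python
import numpy as np


def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression weights (assumed encoding model)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)


def encoding_by_lag(embeddings, signal, onsets, lags, alpha=1.0):
    """Held-out correlation between predicted and actual signal at each lag.

    For every lag, the signal is sampled at (word onset + lag), a ridge model
    is fit on the first half of the words and scored on the second half.
    """
    split = len(onsets) // 2
    scores = []
    for lag in lags:
        y = signal[onsets + lag]
        w = ridge_fit(embeddings[:split], y[:split], alpha)
        pred = embeddings[split:] @ w
        scores.append(np.corrcoef(pred, y[split:])[0, 1])
    return np.array(scores)


def make_demo(true_lag, seed=0):
    """Synthetic 'electrode': noise plus an embedding-driven response
    planted at a known lag relative to each word onset."""
    rng = np.random.default_rng(seed)
    n_words, dim = 400, 8
    emb = rng.standard_normal((n_words, dim))
    onsets = 50 + 20 * np.arange(n_words)      # word onsets, in samples
    signal = 0.5 * rng.standard_normal(onsets[-1] + 100)
    w_true = rng.standard_normal(dim)
    signal[onsets + true_lag] += emb @ w_true  # response at the planted lag
    return emb, signal, onsets


lags = np.arange(-8, 9)                        # negative = before word onset
emb, sig, onsets = make_demo(true_lag=3)
scores = encoding_by_lag(emb, sig, onsets, lags)
peak_lag = int(lags[np.argmax(scores)])
print("peak encoding lag:", peak_lag)
```

Under these assumptions, a peak at a negative lag corresponds to encoding that precedes word onset (as reported for production), and a peak at a positive lag to encoding that follows it (as reported for comprehension). Comparing peak lags between two embedding types or two electrodes would follow the same pattern.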

All in all, our findings suggest that the speech-to-text model embeddings provide a cohesive framework for understanding the neural basis of processing language during natural conversations. Surprisingly, while Whisper was developed solely for speech recognition, without considering how the brain processes language, we found that its internal representations align with neural activity during natural conversations. This alignment was not guaranteed — a negative result would have shown little to no correspondence between the embeddings and neural signals, indicating that the model's representations did not capture the brain's language processing mechanisms.

A particularly intriguing concept revealed by the alignment between LLMs and the human brain is the notion of a "soft hierarchy" in neural processing. Although regions of the brain involved in language, such as the IFG, tend to prioritize word-level semantic and syntactic information (as indicated by stronger alignment with language embeddings, blue), they also capture lower-level auditory features, which is evident from the lower yet significant alignment with speech embeddings (red). Conversely, although lower-order speech areas such as the STG tend to prioritize acoustic and phonemic processing (as indicated by stronger alignment with speech embeddings, red), they also capture word-level information, evident from the lower yet significant alignment with language embeddings (blue).
