(comments)

Original link: https://news.ycombinator.com/item?id=43784205

A Hacker News thread discusses a paper titled "Three things everyone should know about Vision Transformers." The title sparked debate over whether it is clickbait: some compared it to blog headlines, while others saw it as a playful pun. Users debated the effectiveness and motivation of academic paper titles, contrasting them with more straightforwardly technical ones. The discussion also highlighted the growing use of large language models (LLMs) to summarize research papers; some found LLM summaries more helpful than the original abstracts, since abstracts are optimized for active researchers scanning daily digests rather than for casual readers. One user shared an LLM-generated bullet list of the key points: Vision Transformers can be parallelized for efficiency; fine-tuning only the attention layers is often sufficient; and MLP-based patch preprocessing improves masked self-supervised learning. Others suggested simply reading the abstract or the introduction/conclusion sections.


Original text
Hacker News
Three things everyone should know about Vision Transformers (arxiv.org)
67 points by reqo 1 day ago | 16 comments

There's something that tickles me about this paper's title. The thought that everyone should know these three things. The idea of going to my neighbor who's a retired K-12 teacher and telling her about how adding MLP-based patch pre-processing layers improves BERT-like self-supervised training based on patch masking.


Clickbait titles are something of a tradition in this field by now. Some important paper titles include "One weird trick for parallelizing convolutional neural networks", "Attention is all you need", and "A picture is worth 16x16 words". Personally I still find it kind of irritating, but to each their own I guess.


Only the first one is clickbait in the style of blogs that incentivize you to click on the headline (i.e. the information gap), the last two are just fun puns.


Honestly I took the first one as making fun of that trope. Usually the “one weird trick to” ends in some tabloid-style thing like lose 15 pounds or find out if your husband is loyal. So “parallelizing CNNs” is a joke, as if that’s something you’d see in a checkout aisle.


In what sense is "Attention is all you need" a pun?


It's a reference to the lyric "love is all you need" from the song "All You Need Is Love" by the Beatles, and it uses a faux-synonym with a different meaning.


"Attention is all you need" is an outlier. They backed up their bold claim with breakthrough results.

For modest incremental improvements, I greatly prefer boring technical titles. Not everything needs to be a stochastic parrot. We see this dynamic with building luxury condos: on any individual project, making that pick will help juice profit, but when the whole city follows suit, it leads to a less desirable outcome.



Hey, when the AI-powered T-rex is chasing you down, you'll wish you'd paid attention to the fact that the vision transformer's perception is based on movement!

Had to throw some Jurassic Park humor in here.



Yeah, I guess today was the day that I learned I am not part of "everyone". I feel so left out now.


I put this paper into 4o to check whether it is relevant. So that you don't have to do the same, here are the bullet points:

- Vision Transformers can be parallelized to reduce latency and improve optimization without sacrificing accuracy.

- Fine-tuning only the attention layers is often sufficient for adapting ViTs to new tasks or resolutions, saving compute and memory.

- Using MLP-based patch preprocessing improves performance in masked self-supervised learning by preserving patch independence.
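As a rough illustration of the second bullet, attention-only fine-tuning comes down to leaving only the attention parameters trainable and freezing everything else. A minimal sketch of the selection logic, assuming parameter names follow the `blocks.N.attn.*` convention common in ViT implementations (the names below are illustrative, not from the paper):

```python
def attention_only_params(param_names):
    """Return the subset of parameter names to keep trainable
    when fine-tuning only the attention layers of a ViT."""
    return [n for n in param_names if ".attn." in n]

# Example parameter names, modeled on typical ViT implementations.
names = [
    "patch_embed.proj.weight",
    "blocks.0.attn.qkv.weight",
    "blocks.0.mlp.fc1.weight",
    "blocks.1.attn.proj.weight",
]
print(attention_only_params(names))
# → ['blocks.0.attn.qkv.weight', 'blocks.1.attn.proj.weight']
```

In a real framework you would then disable gradients on every parameter not in that subset (e.g. setting `requires_grad = False` in PyTorch), which is where the compute and memory savings come from.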



just read the abstract


You would think. I don't know about this paper in particular, but I'm continually surprised about how much more I get out of LLM summaries of papers than the abstracts of papers written by the authors.


Paper abstracts are not optimized for drive-by readers like you and me. They are optimized for active researchers in the field reading their daily arXiv digest that lists all the new papers across the categories they work in, who need to make the read/don't-read decision for each entry as efficiently as possible.

If you’ve already decided you’re interested in the paper, then the Introduction and/or Conclusion sections are what you’re looking for.



Wouldn't a more comprehensive, digestible bullet point summary be even more helpful to actual researchers choosing which papers to read?


This would be an interesting metric to track: how different an LLM-generated abstract (given the paper as source) is from the actual abstract, and whether that difference has any correlation with the overall quality of the paper.


Same. I don't think GP deserves the downvotes.

