Block Diffusion: Interpolating Autoregressive and Diffusion Language Models

macawfish · 2025-05-09T00:36:33 1746750993

Wow.

I can't wait to see ideas from the diffusion image generation world (like controlnet) work their way into language models.

soulofmischief · 2025-05-09T02:38:50 1746758330

I've built diffusion-based text models; it's old hat and not necessarily the most performant way to generate text. However, it does produce interesting results, and I'd love to test some ideas at scale.

joejoo · 2025-05-09T02:29:50 1746757790

There are already a few models that are diffusion-based.

gitroom · 2025-05-09T04:02:21 1746763341

Yeah, I always end up lost in papers like this too, even with my CS degree. The research keeps leveling up nonstop.

notrealyme123 · 2025-05-08T20:50:03 1746737403

This was posted here already a few weeks ago.

holoduke · 2025-05-08T21:11:25 1746738685

Whenever I try to read and understand this paper, I feel extremely dumb. I have my degree in CS, but this is just too complex for me to understand.

evertedsphere · 2025-05-08T22:14:00 1746742440

An undergraduate degree in a field is not enough to understand recent research in a specialised subfield of a subfield, and you shouldn't beat yourself up over that.

There's nothing wrong with you. You just need the right background, and you can go get that. See e.g. the fast.ai course.

smrtinsert · 2025-05-09T03:42:59 1746762179

Do you mean the fast.ai stable diffusion lectures? The initial series doesn't get too deep at all from what I remember.

tippytippytango · 2025-05-09T06:11:16 1746771076

I wouldn’t beat yourself up over it. Very few papers can be understood without reading a significant amount of the neighboring literature and the history of how that work came to be. There are norms and customs and a kind of academic language in every community that you won’t be able to see unless you’ve read a lot from that community. Even if you have the right math level it’s tricky.

A single paper is part of a conversation, not something that stands alone. Trying to read one random paper is like finding a 1000 page thread on an obscure topic that has been running for 10+ years and reading only the last page. It won’t make any sense without reading back a ways.

AlexCoventry · 2025-05-08T21:29:33 1746739773

Ask ChatGPT o3 about anything you don't understand, and ask it about anything in its responses you don't understand. Keep drilling down until you do understand. It takes patience, but you can learn a lot very fast this way.

echelon · 2025-05-09T03:01:27 1746759687

ChatGPT o3 understands the latest literature and isn't going to hallucinate weird details or make incorrect analogies or math?

I'd worry about learning the wrong things.

Ey7NFZ3P0nzAe · 2025-05-09T05:46:11 1746769571

I disagree. It's all about rephrasing information that is in the paper. Possibly a few other papers too.

vessenes · 2025-05-09T05:46:30 1746769590

o3 with a PDF or in deep research mode is excellent, especially if you’re disciplined about sticking to what’s actually research. But really, it’s excellent, better than benchmarks indicate, I’d say.

IncreasePosts · 2025-05-09T01:23:11 1746753791

Might want to study some stats or other math.
