Show HN: Tiny Diffusion – A character-level text diffusion model from scratch

Original link: https://github.com/nathan-barry/tiny-diffusion

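## Tiny Diffusion: Shakespeare Text Generation

Tiny Diffusion is a character-level diffusion model for text generation, modified from the nanochat GPT implementation and trained on the Tiny Shakespeare dataset. With only 10.7 million parameters, it is small enough to run locally.

The project ships pre-trained weights, and the model can be retrained with the `training.py` script (about 30 minutes for 20,000 steps on 4xA100s). `sample.py` generates a continuous stream of output, currently 30 context lengths long.

Beyond basic text generation, the repository includes visualization tools: `diffusion-process.py` animates the denoising steps, and `game-of-life.py` is an experimental Game of Life-inspired sampling method. Key parameters are 6 layers, 6 attention heads, an embedding dimension of 384, a sequence length of 256 characters, and 128 diffusion steps. The code and data are available via git clone.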

Original text

A character-level language diffusion model for text generation. The model is a modified version of the nanochat GPT implementation and is trained on Tiny Shakespeare! It has only 10.7 million parameters, so you can try it out locally!

Demo

# Clone the repository
git clone <repository-url>
cd tiny-diffusion

# Install dependencies (Python 3.10+)
uv sync

The file training.py puts the weights in weights/diffusion_model.pt. The sample and animation files load the model from this file.
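
Since the sample and animation scripts depend on that file, it can be handy to check for it before running them. The snippet below is a minimal sketch using only the standard library; the path comes from the README, while the helper name `weights_available` is hypothetical and not part of the repository.

```python
from pathlib import Path

# Path taken from the README; the helper below is a hypothetical convenience.
WEIGHTS_PATH = Path("weights/diffusion_model.pt")


def weights_available(path: Path = WEIGHTS_PATH) -> bool:
    """Return True if the checkpoint file exists and is non-empty."""
    return path.is_file() and path.stat().st_size > 0


if __name__ == "__main__":
    if weights_available():
        print(f"Found weights at {WEIGHTS_PATH}")
    else:
        print(f"No weights at {WEIGHTS_PATH}; run `uv run training.py` first")
```

Run it from the repository root so the relative path resolves.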

The weights are already provided for you! It took me around half an hour to train this model for 20,000 steps on 4xA100s. If you want to retrain the model, run:

# Train from scratch on Shakespeare
uv run training.py

# Training will save checkpoints to weights/diffusion_model.pt

To generate a continuous stream of output (currently 30 context lengths), run:

# Generate samples using the pre-trained model
uv run sample.py
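
As a rough sense of how much text that is: assuming each "context length" equals the 256-character sequence length listed in the model parameters (the README does not say so explicitly), the arithmetic works out as follows. The function name is hypothetical.

```python
SEQUENCE_LENGTH = 256  # characters per context window (from the README)
NUM_CONTEXTS = 30      # "currently 30 context lengths" (from the README)


def total_characters(num_contexts: int = NUM_CONTEXTS,
                     seq_len: int = SEQUENCE_LENGTH) -> int:
    """Characters produced if every context window is filled completely."""
    return num_contexts * seq_len


print(total_characters())  # 7680
```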

Visualize the Diffusion Process

To see the diffusion process as a nice animation, run:

# Watch the denoising process step-by-step
uv run animations/diffusion-process.py

# See Game of Life-inspired sampling (fun little experiment)
uv run animations/game-of-life.py
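
The README does not describe the sampling algorithm, so the following is only a toy illustration of what step-by-step denoising of a character sequence can look like, not the repository's method. There is no model here: the sketch starts from a fully masked string and reveals a share of the characters of a fixed target at each step. The `MASK` symbol and the `toy_denoise` function are hypothetical.

```python
import random

MASK = "_"  # hypothetical placeholder for a not-yet-denoised character


def toy_denoise(target: str, steps: int, seed: int = 0) -> list[str]:
    """Return the frames of a toy denoising run that ends at `target`.

    No model is involved: each step reveals a share of the remaining
    masked positions, chosen at random.
    """
    rng = random.Random(seed)
    hidden = list(range(len(target)))
    rng.shuffle(hidden)
    chars = [MASK] * len(target)
    frames = ["".join(chars)]
    for step in range(steps):
        # Reveal enough positions that nothing is left masked after the last step.
        remaining_steps = steps - step
        n_reveal = -(-len(hidden) // remaining_steps)  # ceiling division
        for _ in range(n_reveal):
            pos = hidden.pop()
            chars[pos] = target[pos]
        frames.append("".join(chars))
    return frames


for frame in toy_denoise("To be, or not to be", steps=4):
    print(frame)
```

Each printed frame has fewer masked characters than the one before, and the last frame is the target string.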

Key parameters:

  • Parameters: 10.7 million
  • Layers: 6
  • Attention Heads: 6
  • Embedding Dim: 384
  • Sequence Length: 256 characters
  • Diffusion Steps: 128

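
These numbers can be sanity-checked with a back-of-the-envelope estimate. The sketch below assumes a standard transformer block (bias-free query, key, value, and output projections plus an MLP with a 4x hidden width), which the README does not confirm; it also ignores embeddings, normalization, and any diffusion-specific parameters. Under those assumptions the blocks alone account for roughly 10.6 million weights, in line with the stated 10.7 million.

```python
N_LAYERS = 6     # from the README
N_HEADS = 6      # from the README
EMBED_DIM = 384  # from the README


def block_params(d: int) -> int:
    """Weights in one assumed standard block: 4*d*d attention + 8*d*d MLP."""
    attention = 4 * d * d  # query, key, value, and output projections
    mlp = 2 * (d * 4 * d)  # up- and down-projection with a 4x hidden width
    return attention + mlp


head_dim = EMBED_DIM // N_HEADS
total = N_LAYERS * block_params(EMBED_DIM)

print(head_dim)  # 64
print(total)     # 10616832
```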
tiny-diffusion/
├── model.py                    # Core diffusion transformer
├── training.py                 # Training script
├── sample.py                   # Text generation
├── data/
│   └── tiny_shakespeare.txt    # Training data
├── weights/
│   └── diffusion_model.pt      # Pre-trained weights
└── animations/
    ├── diffusion-process.py    # Denoising visualization
    └── game-of-life.py         # Game of Life sampling