Rethinking Losses for Diffusion Bridge Samplers

Original link: https://arxiv.org/abs/2506.10982

The paper "Rethinking Losses for Diffusion Bridge Samplers" by Sebastian Sanokowski et al. challenges the prevailing preference for the Log Variance (LV) loss over the reverse Kullback-Leibler (rKL) loss when training diffusion bridge samplers. While the LV loss performs better when rKL gradients are computed with the reparametrization trick, the authors argue that this advantage does not carry over to diffusion bridges, particularly when the diffusion coefficients are learned. In that setting, the LV loss lacks the theoretical grounding that the data processing inequality provides for the rKL loss. The paper instead advocates the rKL loss combined with the log-derivative trick (rKL-LD). The authors' analysis shows that rKL-LD sidesteps the conceptual problems associated with the LV loss for diffusion bridges, and their experiments demonstrate its advantage: across different types of diffusion bridges and challenging benchmarks, samplers trained with rKL-LD achieve better performance. They also highlight practical benefits of rKL-LD, including less hyperparameter tuning and more stable training.
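For context, the two losses being compared can be written as follows. This is a sketch in my own notation (the paper's exact definitions for bridges are more involved): $q_\theta$ is the learned sampler, $p$ the unnormalized target, and "on-policy" means the samples used in the LV loss are drawn from $q_\theta$ itself rather than a fixed reference distribution.

$$
\mathcal{L}_{\mathrm{rKL}}(\theta) \;=\; \mathbb{E}_{x \sim q_\theta}\!\left[\log \frac{q_\theta(x)}{p(x)}\right],
\qquad
\mathcal{L}_{\mathrm{LV}}(\theta) \;=\; \mathrm{Var}_{x \sim q_\theta}\!\left[\log \frac{q_\theta(x)}{p(x)}\right].
$$

The rKL loss is a Kullback-Leibler divergence up to the unknown normalizing constant of $p$, which drops out of its gradient; the LV loss penalizes the spread of the log importance weights instead.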

Hacker News: 8 points, posted by badmonster 1 day ago | 1 comment. semiinfinitely (23 hours ago): "Too much mathematical notation."
Related Articles

Original Article

[Submitted on 12 Jun 2025]

Abstract: Diffusion bridges are a promising class of deep-learning methods for sampling from unnormalized distributions. Recent works show that the Log Variance (LV) loss consistently outperforms the reverse Kullback-Leibler (rKL) loss when using the reparametrization trick to compute rKL-gradients. While the on-policy LV loss yields identical gradients to the rKL loss when combined with the log-derivative trick for diffusion samplers with non-learnable forward processes, this equivalence does not hold for diffusion bridges or when diffusion coefficients are learned. Based on this insight we argue that for diffusion bridges the LV loss does not represent an optimization objective that can be motivated like the rKL loss via the data processing inequality. Our analysis shows that employing the rKL loss with the log-derivative trick (rKL-LD) not only avoids these conceptual problems but also consistently outperforms the LV loss. Experimental results with different types of diffusion bridges on challenging benchmarks show that samplers trained with the rKL-LD loss achieve better performance. From a practical perspective we find that rKL-LD requires significantly less hyperparameter optimization and yields more stable training behavior.
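As a toy illustration of the log-derivative (score-function) gradient estimator that rKL-LD relies on, the sketch below estimates the gradient of KL(q_theta || p) for a one-dimensional Gaussian sampler q_theta = N(theta, 1) and a standard-normal target. Everything here (the distributions, the function names, the baseline choice) is my own minimal setup for exposition, not code from the paper, where q_theta is a full diffusion bridge rather than a single Gaussian.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_q(x, theta):
    # log density of q_theta = N(theta, 1)
    return -0.5 * (x - theta) ** 2 - 0.5 * np.log(2 * np.pi)

def log_p(x):
    # unnormalized log density of the target (standard normal);
    # the normalizing constant is deliberately omitted, since
    # the rKL gradient does not depend on it
    return -0.5 * x ** 2

def rkl_ld_grad(theta, n=100_000):
    """Log-derivative (score-function) estimate of d/dtheta KL(q_theta || p)."""
    x = theta + rng.standard_normal(n)        # samples from q_theta
    f = log_q(x, theta) - log_p(x)            # per-sample log importance weight
    score = x - theta                         # d/dtheta log q_theta(x) for a unit-variance Gaussian
    b = f.mean()                              # batch-mean baseline for variance reduction
    return np.mean((f - b) * score)

# Closed form: KL(N(theta, 1) || N(0, 1)) = theta^2 / 2, so the true
# gradient at theta is simply theta; the estimator should recover it.
print(rkl_ld_grad(1.5))
```

Subtracting the batch-mean baseline `b` is the standard variance-reduction step for score-function estimators; it is valid because the score has zero mean under q_theta. This is also the point of contact with the abstract's claim: for diffusion samplers with non-learnable forward processes, the on-policy LV loss gradient coincides with exactly this baseline-corrected rKL-LD gradient, while for diffusion bridges or learned diffusion coefficients the two diverge.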
From: Sebastian Sanokowski
[v1] Thu, 12 Jun 2025 17:59:58 UTC (1,739 KB)