Moebius: 0.2B image inpainting model with 10B-level performance

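To address the high computational cost of large-scale foundation models in image inpainting, the researchers introduce **Moebius**, an efficient, lightweight framework. Aggressive structural compression usually degrades model quality, but Moebius overcomes this "representation bottleneck" with two key innovations. First, it introduces the **Local-λ Mix Interaction (LλMI) block**, which rebuilds the diffusion backbone and summarizes spatial context and global semantic priors into fixed-size linear matrices, preserving complex latent interactions while sharply reducing the parameter count. Second, it uses an **adaptive multi-granularity distillation strategy** that operates entirely in the latent space, avoiding expensive pixel-space decoding while aligning the compact model with high-fidelity outputs. Experiments show that Moebius rivals or even surpasses the 11.9B-parameter industrial model FLUX.1-Fill-Dev. It does so with less than 2% of the parameters (0.22B) and a more than 15× speedup in total inference time, setting a new efficiency standard for high-fidelity image inpainting.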
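
The abstract does not spell out how the LλMI block is formulated, so the following is only a minimal sketch of the general idea it describes: compressing a context of any length into a fixed-size linear matrix, in the style of lambda layers or linear attention. The function names, the softmax normalization over positions, and the toy dimensions are all assumptions made for illustration, not the authors' implementation.

```python
import math


def softmax_over_positions(keys):
    """Normalize each key channel across the n context positions."""
    n, k = len(keys), len(keys[0])
    out = [[0.0] * k for _ in range(n)]
    for j in range(k):
        col = [keys[i][j] for i in range(n)]
        m = max(col)  # subtract the max for numerical stability
        exps = [math.exp(c - m) for c in col]
        total = sum(exps)
        for i in range(n):
            out[i][j] = exps[i] / total
    return out


def summarize_context(keys, values):
    """Return a k x v matrix: the sum over positions of outer(weight, value).

    The result has the same shape no matter how many positions are given.
    """
    weights = softmax_over_positions(keys)
    k, v = len(keys[0]), len(values[0])
    summary = [[0.0] * v for _ in range(k)]
    for w_row, val_row in zip(weights, values):
        for a in range(k):
            for b in range(v):
                summary[a][b] += w_row[a] * val_row[b]
    return summary


def apply_summary(query, summary):
    """Linear read-out: query (length k) times summary (k x v) -> length v."""
    v = len(summary[0])
    return [sum(q * row[b] for q, row in zip(query, summary)) for b in range(v)]


# Toy data (hypothetical): 3 positions, key dim k=2, value dim v=3.
keys_short = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5]]
values_short = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0], [2.0, 2.0, 0.0]]

# A context four times longer still yields a 2 x 3 summary.
keys_long = keys_short * 4
values_long = values_short * 4

s_short = summarize_context(keys_short, values_short)
s_long = summarize_context(keys_long, values_long)
print((len(s_short), len(s_short[0])))  # (2, 3)
print((len(s_long), len(s_long[0])))    # (2, 3)
```

Because the summary has k × v entries regardless of how many positions it was built from, the cost of applying it to a query does not grow with the context length, which is the property the summary above attributes to the fixed-size linear matrices.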

Original abstract

While 10B-level industrial foundation models have pushed the boundaries of image inpainting, their prohibitive computational costs severely hinder practical deployment. Constructing a highly optimized task-specific specialist offers a promising solution; however, extreme structural compression inevitably triggers a severe representation bottleneck. To conquer this, we propose Moebius, a highly efficient lightweight inpainting framework. We systematically reconstruct the diffusion backbone by introducing the Local-λ Mix Interaction (LλMI) block. Comprising Local-λ and Interactive-λ modules, it elegantly summarizes spatial contexts and global semantic priors into fixed-size linear matrices, preserving complex latent interactions while drastically shedding parameters. Furthermore, to unlock the full representational capacity of this highly compact architecture, we synergistically pair it with an adaptive multi-granularity distillation strategy. Operating strictly within the latent space to avoid expensive pixel-space decoding, this strategy dynamically balances multiple gradient-based losses to achieve high-fidelity alignment. Extensive experiments across natural and portrait benchmarks demonstrate that this optimal synergy enables Moebius to rival or even surpass the generation quality of the 10B-level industrial generalist FLUX.1-Fill-Dev. Remarkably, Moebius achieves this using less than 2% of the parameters (0.22B vs. 11.9B) while delivering a >15× acceleration in total inference time, setting a new efficiency standard for high-fidelity inpainting.
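
The "less than 2%" figure can be checked directly from the two parameter counts quoted in the abstract. The short calculation below uses only those two numbers; the variable names are illustrative.

```python
# Parameter counts in billions, as quoted in the abstract.
moebius_params_b = 0.22
flux_fill_params_b = 11.9

ratio = moebius_params_b / flux_fill_params_b
compression = flux_fill_params_b / moebius_params_b

print(f"parameter ratio: {ratio:.2%}")          # parameter ratio: 1.85%
print(f"size reduction:  {compression:.1f}x")   # size reduction:  54.1x
```

So Moebius carries roughly 1.85% of the parameters of FLUX.1-Fill-Dev, a reduction of about 54×, which is consistent with the claim of using less than 2%.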
