NotaGen: Symbolic Music Generation

Original link: https://electricalexis.github.io/notagen-demo/

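NotaGen is a music generation model that learns fundamental musical structures from a pre-training corpus of 1.6 million pieces spanning diverse genres. Fine-tuning uses 8,948 high-quality classical scores labeled by period (Baroque, Classical, Romantic) and instrumentation (Keyboard, Chamber, Orchestral, Art Song, Choral, Vocal-Orchestral), which enables conditional generation from "period-composer-instrumentation" prompts. To further improve musicality and prompt controllability, the model applies CLaMP-DPO, a reinforcement learning method based on Direct Preference Optimization (DPO). CLaMP 2, a multimodal symbolic music information retrieval model, acts as the evaluator, distinguishing chosen from rejected musical outputs to guide NotaGen's optimization. Experiments show that CLaMP-DPO effectively improves controllability and musicality across symbolic music generation models, regardless of data modality, encoding scheme, or model architecture, which highlights its broad applicability.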


Pre-training

NotaGen is pre-trained on 1.6M pieces of music. This corpus covers a wide range of genres and periods, enabling NotaGen to capture fundamental musical structures and patterns through next-token prediction.
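As a rough illustration of the next-token prediction objective mentioned above, the sketch below builds (context, target) pairs from a token sequence and averages the negative log-likelihood of each target. The token strings, the `predict` callable, and `uniform_predict` are hypothetical stand-ins for illustration; they are not NotaGen's tokenizer or model.

```python
import math

def next_token_pairs(tokens):
    """Return (context, target) pairs: each token is predicted from its prefix."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

def average_nll(tokens, predict):
    """Mean negative log-likelihood of each token given the tokens before it.

    `predict(context)` is a hypothetical callable that returns a dict mapping
    candidate next tokens to probabilities.
    """
    pairs = next_token_pairs(tokens)
    total = 0.0
    for context, target in pairs:
        # Floor the probability so an unseen token does not produce log(0).
        prob = max(predict(context).get(target, 0.0), 1e-12)
        total += -math.log(prob)
    return total / len(pairs)

def uniform_predict(context, vocab=("C", "D", "E", "F")):
    """Toy model: ignores the context and spreads probability evenly."""
    return {tok: 1.0 / len(vocab) for tok in vocab}

tokens = ["C", "D", "E", "C"]  # placeholder symbols, not a real encoding
pairs = next_token_pairs(tokens)
loss = average_nll(tokens, uniform_predict)
```

A model that has learned the structure of its corpus assigns higher probability to the true next token than this uniform baseline, which lowers the loss.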

Fine-tuning

NotaGen is fine-tuned on high-quality classical sheet music to further enhance the musicality of its output. We curated a fine-tuning dataset of 8,948 classical scores covering 152 composers, drawn from the DCML corpora, OpenScore String Quartet Corpus, OpenScore Lieder Corpus, ATEPP, KernScores, and internal resources. All pieces are labeled by period (Baroque, Classical, or Romantic) and by instrumentation (Keyboard, Chamber, Orchestral, Art Song, Choral, or Vocal-Orchestral). Each piece is prepended with a "period-composer-instrumentation" prompt for conditional generation.
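A minimal sketch of how such a prompt might be assembled and prepended to a score. The period and instrumentation labels come from the text above; the layout of one `%`-prefixed line per field, the function names `build_prompt` and `prepend_prompt`, and the example composer string and score text are assumptions made for illustration, since the source does not specify the serialization.

```python
PERIODS = ("Baroque", "Classical", "Romantic")
INSTRUMENTATIONS = (
    "Keyboard", "Chamber", "Orchestral",
    "Art Song", "Choral", "Vocal-Orchestral",
)

def build_prompt(period, composer, instrumentation):
    """Build a 'period-composer-instrumentation' prompt.

    The '%'-prefixed, one-field-per-line layout is a hypothetical format.
    """
    if period not in PERIODS:
        raise ValueError(f"unknown period: {period!r}")
    if instrumentation not in INSTRUMENTATIONS:
        raise ValueError(f"unknown instrumentation: {instrumentation!r}")
    return f"%{period}\n%{composer}\n%{instrumentation}\n"

def prepend_prompt(prompt, score_text):
    """Place the prompt in front of the score so the model conditions on it."""
    return prompt + score_text

# Placeholder composer name and score fragment, for illustration only.
prompt = build_prompt("Baroque", "Bach, Johann Sebastian", "Keyboard")
example = prepend_prompt(prompt, "X:1\nK:C\nCDEF|\n")
```

At generation time, supplying only the prompt and letting the model continue from it would yield a piece conditioned on those three fields.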

Reinforcement Learning

To refine both the musicality and the prompt controllability of the fine-tuned NotaGen, we present CLaMP-DPO. This method builds upon the principles of Reinforcement Learning from AI Feedback (RLAIF) and implements Direct Preference Optimization (DPO). In CLaMP-DPO, CLaMP 2, a multimodal symbolic music information retrieval model, serves as the evaluator within the DPO framework, distinguishing between chosen and rejected musical outputs to optimize NotaGen. Our experiments demonstrated that CLaMP-DPO efficiently enhanced both the controllability and the musicality across different symbolic music generation models, irrespective of their data modalities, encoding schemes, or model architectures. This underscores CLaMP-DPO's broad applicability and potential for auto-regressively trained symbolic music generation models.
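The sketch below illustrates the two ingredients described above under stated assumptions: selecting chosen and rejected outputs with an evaluator score, and computing the standard DPO loss for one pair. The `score` callable stands in for an evaluator such as CLaMP 2, and `select_pairs`, the `fraction` cutoff, and the `beta` default are hypothetical. The loss is the standard DPO objective, which may differ from the exact variant used in CLaMP-DPO.

```python
import math

def select_pairs(samples, score, fraction=0.25):
    """Pair the highest-scoring samples (chosen) with the lowest (rejected).

    `score` is a hypothetical evaluator callable; higher means preferred.
    """
    if not 0 < fraction <= 0.5:
        raise ValueError("fraction must be in (0, 0.5] so the sets do not overlap")
    ranked = sorted(samples, key=score, reverse=True)
    k = max(1, int(len(ranked) * fraction))
    chosen = ranked[:k]
    rejected = ranked[-k:]
    return list(zip(chosen, rejected))

def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard DPO loss for one preference pair.

    Arguments are sequence log-probabilities under the policy being trained
    and under the frozen reference model.
    """
    margin = beta * ((policy_chosen - ref_chosen) - (policy_rejected - ref_rejected))
    # -log(sigmoid(margin)), written in a numerically stable form.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

# Toy usage: string length stands in for an evaluator score.
toy_pairs = select_pairs(["a", "bb", "ccc", "dddd"], score=len, fraction=0.25)
neutral = dpo_loss(0.0, 0.0, 0.0, 0.0)
improved = dpo_loss(-1.0, -5.0, -3.0, -3.0)
```

The loss falls as the policy raises the likelihood of the chosen output relative to the reference model while lowering that of the rejected one, which is how the evaluator's preferences steer the generator.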