Pre-training
NotaGen is pre-trained on 1.6M pieces of music. This corpus covers a wide range of genres and periods, enabling NotaGen to capture fundamental musical structures and patterns through next-token prediction.
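The pre-training objective above is standard next-token prediction. As a minimal sketch (pure Python, not NotaGen's actual training code), the loss is the average cross-entropy of the model's score distribution against the true next token at each position:

```python
import math

def next_token_loss(logits, targets):
    """Average cross-entropy of predicting each next token.

    logits:  list of per-position score lists over a toy vocabulary
    targets: list of the true next-token ids, one per position
    """
    total = 0.0
    for scores, target in zip(logits, targets):
        # log-softmax over the vocabulary scores at this position
        m = max(scores)
        log_sum = math.log(sum(math.exp(s - m) for s in scores))
        log_prob = (scores[target] - m) - log_sum
        total += -log_prob
    return total / len(targets)

# Toy 3-token vocabulary: the model assigns high scores to the true tokens,
# so the loss is small.
logits = [[2.0, 0.1, 0.1], [0.1, 3.0, 0.1]]
targets = [0, 1]
loss = next_token_loss(logits, targets)
```

Minimizing this loss over a large corpus is what lets the model internalize recurring musical structures and patterns.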
Fine-tuning
NotaGen is fine-tuned on high-quality classical sheet-music data to further enhance the musicality of its generations. We curated a fine-tuning dataset of 8,948 classical music sheets by 152 composers, drawn from the DCML corpora, the OpenScore String Quartet Corpus, the OpenScore Lieder Corpus, ATEPP, KernScores, and internal resources. We labeled every piece with one of 3 periods---Baroque, Classical, and Romantic---and one of 6 instrumentations---Keyboard, Chamber, Orchestral, Art Song, Choral, and Vocal-Orchestral. Each piece is prepended with a "period-composer-instrumentation" prompt for conditional generation.
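The conditioning prompt described above can be sketched as a small helper that validates the period and instrumentation labels and joins the three fields. The exact token format is an assumption for illustration, not NotaGen's actual vocabulary:

```python
# Label sets taken from the fine-tuning dataset description.
PERIODS = {"Baroque", "Classical", "Romantic"}
INSTRUMENTATIONS = {
    "Keyboard", "Chamber", "Orchestral",
    "Art Song", "Choral", "Vocal-Orchestral",
}

def make_prompt(period: str, composer: str, instrumentation: str) -> str:
    """Build a "period-composer-instrumentation" prompt.

    The hyphen-joined string is an illustrative format; the real model
    may use dedicated special tokens instead.
    """
    if period not in PERIODS:
        raise ValueError(f"unknown period: {period}")
    if instrumentation not in INSTRUMENTATIONS:
        raise ValueError(f"unknown instrumentation: {instrumentation}")
    return f"{period}-{composer}-{instrumentation}"

prompt = make_prompt("Romantic", "Schubert", "Art Song")
```

At generation time, such a prompt is fed to the model first, so all sampled tokens are conditioned on the desired period, composer, and instrumentation.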
Reinforcement Learning
To refine both the musicality and the prompt controllability of the fine-tuned NotaGen, we present CLaMP-DPO. This method builds on Reinforcement Learning from AI Feedback (RLAIF) and implements Direct Preference Optimization (DPO). In CLaMP-DPO, CLaMP 2, a multimodal symbolic music information retrieval model, serves as the evaluator within the DPO framework, distinguishing chosen from rejected musical outputs to optimize NotaGen. Our experiments demonstrate that CLaMP-DPO enhances both controllability and musicality across different symbolic music generation models, irrespective of their data modalities, encoding schemes, or model architectures. This underscores CLaMP-DPO's broad applicability to autoregressively trained symbolic music generation models.
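The optimization step underlying CLaMP-DPO is the standard DPO loss: for each prompt, sampled pieces are ranked by the evaluator into a chosen and a rejected output, and the policy is pushed to increase the likelihood margin of chosen over rejected relative to a frozen reference model. A minimal sketch of the per-pair loss (toy log-probabilities, not actual model outputs):

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one (chosen, rejected) pair.

    Arguments are sequence log-likelihoods under the policy being trained
    and under a frozen reference (here, the fine-tuned model before DPO).
    beta scales the implicit reward; 0.1 is an illustrative value.
    """
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # -log(sigmoid(margin)): small when the policy already prefers
    # the chosen piece more strongly than the reference does.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# In CLaMP-DPO, the evaluator's scores decide which sampled piece is
# "chosen" and which is "rejected"; here we plug in toy numbers.
loss = dpo_loss(logp_chosen=-10.0, logp_rejected=-12.0,
                ref_logp_chosen=-11.0, ref_logp_rejected=-11.0)
```

Because the evaluator supplies the preference labels automatically, this loop needs no human annotation, which is what makes the RLAIF framing practical at scale.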