State of the Field
Current research in music AI focuses on improving the quality and interpretability of generated music while addressing practical challenges in the industry. Recent work on automatic drum transcription introduces semi-supervised methods that build high-quality datasets from unlabeled audio, improving performance over traditional approaches that rely on paired data. ConceptCaps provides a structured dataset for interpretability, with explicit attribute labels that support concept-based analysis of music models. In piano accompaniment generation, a new discrete diffusion model shows improved coherence and fidelity to melody and chord constraints, which could streamline composition for musicians. Efforts to reduce the size of foundation models for music information retrieval are making these technologies more accessible and cost-effective. Finally, work on music plagiarism detection formulates the problem explicitly and proposes a segment-based solution, addressing legal and ethical concerns in the music industry.
Papers
Towards Realistic Synthetic Data for Automatic Drum Transcription
Deep learning models define the state-of-the-art in Automatic Drum Transcription (ADT), yet their performance is contingent upon large-scale, paired audio-MIDI datasets, which are scarce. Existing wor...
ConceptCaps -- a Distilled Concept Dataset for Interpretability in Music Models
Concept-based interpretability methods like TCAV require clean, well-separated positive and negative examples for each concept. Existing music datasets lack this structure: tags are sparse, noisy, or ...
D3PIA: A Discrete Denoising Diffusion Model for Piano Accompaniment Generation From Lead sheet
Generating piano accompaniments in the symbolic music domain is a challenging task that requires producing a complete piece of piano music from given melody and chord constraints, such as those provid...
Linear Complexity Self-Supervised Learning for Music Understanding with Random Quantizer
In recent years, foundation models have become very popular due to their exceptional performance, mainly in natural language (NLP) tasks where they were first introduced. These models usually consist ...
Music Plagiarism Detection: Problem Formulation and a Segment-based Solution
Recently, the problem of music plagiarism has emerged as an even more pressing social issue. As music information retrieval research advances, there is a growing effort to address issues related to mu...