AI that Transcribes Drums—No Paired Audio Required
Most drum-transcription models need large datasets of matched audio and MIDI, and those are scarce. Synthetic stand‑ins often sound unrealistic, which creates a domain gap between training data and real recordings.
We flip the script. With a semi‑supervised pipeline, we automatically curate a large, diverse library of high‑quality one‑shot drum samples from unlabeled audio. Then we render realistic drum tracks from MIDI only and train a sequence‑to‑sequence model on this data.
- High‑fidelity, diverse drum timbres—no manual labeling
- Trained from MIDI + curated one‑shots (no paired audio)
- New state of the art on ENST and MDB, beating fully supervised and prior synthetic‑data methods
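To make the "render from MIDI only" idea concrete, here is a minimal sketch of how a drum track could be assembled from note events and a bank of one-shots. It is an illustration under assumptions, not the code from the linked repository: the names `make_one_shot`, `render`, and `bank`, the 16 kHz sample rate, and the synthetic decaying-sine one-shots are all hypothetical stand-ins for the curated samples described above.

```python
import numpy as np

SR = 16000  # sample rate in Hz (assumed for this sketch)

def make_one_shot(freq_hz, dur_s, seed):
    """Hypothetical stand-in for a curated one-shot: decaying sine plus noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(dur_s * SR)) / SR
    tone = np.sin(2 * np.pi * freq_hz * t)
    noise = 0.3 * rng.standard_normal(t.size)
    return (tone + noise) * np.exp(-t * 30.0)

def render(events, bank, total_s):
    """Mix one-shots into a buffer at each (onset_s, drum, velocity) event."""
    out = np.zeros(int(total_s * SR))
    for onset_s, drum, velocity in events:
        sample = bank[drum]
        start = int(round(onset_s * SR))
        if start >= out.size:
            continue  # event falls outside the rendered window
        end = min(start + sample.size, out.size)
        # Scale by MIDI velocity (0-127) and add into the mix.
        out[start:end] += (velocity / 127.0) * sample[: end - start]
    peak = np.max(np.abs(out))
    return out / peak if peak > 0 else out  # peak-normalize

# Placeholder sample bank; a real one would hold curated recordings.
bank = {
    "kick": make_one_shot(60, 0.25, seed=0),
    "snare": make_one_shot(200, 0.20, seed=1),
    "hihat": make_one_shot(6000, 0.05, seed=2),
}

# MIDI-like events: (onset in seconds, drum name, velocity).
events = [(0.0, "kick", 110), (0.5, "hihat", 80),
          (1.0, "snare", 100), (1.5, "hihat", 80)]

audio = render(events, bank, total_s=2.0)
```

Each rendered `audio` buffer comes with its `events` list as a label for free, so a training pair needs no human annotation. Swapping different one-shots into `bank` for the same events is one simple way to vary timbre across examples.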
Why it matters: more accurate drum transcription for music search, practice apps, and production tools—at a fraction of the data cost.
Paper: https://arxiv.org/abs/2601.09520 • Code: https://github.com/pier-maker92/ADT_STR
#AI #Audio #MusicTech #Drums #MachineLearning #DeepLearning #MIDI #OpenSource