AES AIMLA

Latent rhythm transformation of drum recordings

Audio examples and figures accompanying the paper.

Jason Hockman · Jake Drysdale

Abstract

A method is proposed for rhythm style transfer of multitimbral drum recordings via conditioning a VAE on rhythm and timbral features. Modulation and estimation of latent parameters and a novel resequencing process for reconstruction loss result in an end-to-end transformation circumventing manual segmentation and alignment.
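As a rough illustration of what conditioning a VAE on extra features involves, the sketch below shows the generic reparameterisation step, the KL term, and conditioning by concatenation. It is a minimal, standard-library sketch of the general technique, not the paper's model: the function names, the vector sizes, and the use of plain concatenation are all assumptions made for illustration, and the paper's resequencing process and attention layers are not reproduced.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps, with eps drawn from a standard normal."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, sigma^2) to N(0, I), summed over dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


def condition(z, rhythm, timbre):
    """Append hypothetical rhythm and timbre feature vectors to the latent code."""
    return list(z) + list(rhythm) + list(timbre)


rng = random.Random(0)

# Placeholder encoder outputs for an 8-dimensional latent space (assumed size).
mu = [0.0] * 8
log_var = [0.0] * 8

# Placeholder conditioning features; real features would come from the audio.
rhythm = [1.0, 0.0, 0.0, 1.0]
timbre = [0.5, 0.5]

z = reparameterize(mu, log_var, rng)
decoder_input = condition(z, rhythm, timbre)
kl = kl_to_standard_normal(mu, log_var)
```

In a setup like this, swapping the `rhythm` vector for one taken from a different recording while keeping `timbre` fixed is the basic idea behind a rhythm transfer; how the paper realises this in practice is described in the paper itself.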

Audio Examples

Each example below contains a source recording (the original, which supplies the timbre), a target recording (which supplies the rhythm), and a transformation (the model output). Figures show the waveforms, ADT conditioning*, and attention head activations for each example. More examples are available for download via the link above.

Example 1

Example 2

Example 3

Example 4

Example 5

Example 6

Conference Information

Organization: Audio Engineering Society

Event: AES International Conference on Artificial Intelligence and Machine Learning for Audio

Date: September 8–10, 2025

Location: London, UK

BibTeX

@inproceedings{hockman2025latent,
  title     = {Latent rhythm transformation of drum recordings},
  author    = {Hockman, Jason and Drysdale, Jake},
  booktitle = {Late-Breaking Demo at the AES International Conference on Artificial Intelligence and Machine Learning for Audio},
  year      = {2025},
  address   = {London, UK},
  month     = {September 8--10},
  organization = {Audio Engineering Society}
}