Dr Brendan O’Connor
AI Researcher · Software Engineer · Signal Processing
I’m a machine learning researcher and engineer. Most of my work involves building systems that learn from signals and sequences: classification, generative modelling, embedding spaces, anomaly detection, and time-series problems. I completed my PhD at the Centre for Digital Music, Queen Mary University of London (QMUL), and most of my experience has been in the audio and music domain. The methods carry over to other areas, though, and I’m keen to apply them somewhere new. I tend to get up to speed on an unfamiliar field fairly quickly.
If you’ve already seen my CV, this page covers the same ground with a bit more room for the things a CV leaves out.
📄 CV · 🔗 LinkedIn · 💻 GitHub · 🎧 SoundCloud · 🎬 Vimeo
A note on the gap in contributions after early 2024
There hasn’t been much public code here since early 2024. That’s because I moved into industry roles, working on commercial codebases and proprietary datasets under NDA, so the work doesn’t end up in a public repo. I’m happy to talk through the details, and references are available on request.
Experience
- Lead Researcher of AI Voice, Sound Patrol, USA (2025–present) — voice-biometrics research for a music-infringement detection platform used by major labels.
- Researcher of AI Music, Jammable Ltd, London (2024–2025) — shipped ML features across transcription, TTS, voice cloning, lyrics conversion, and audio enhancement.
- Teaching, School of EECS, QMUL (2019–2024) — taught ML, computer vision, AI search, and digital systems to undergraduate and postgraduate students.
- Research Scientist Intern, TikTok, London (2022) — GAN/VAE architectures for controllable symbolic music generation.
- Research Intern, MXX, London (2019) — built a singing-voice-detection model and the dataset pipeline behind it.
What I work with
Languages & tools: Python, C++, PHP, Bash/Linux, Git/GitHub
ML: PyTorch, NumPy, SciPy, Pandas, scikit-learn, Hugging Face Transformers
Architectures: Transformers, Diffusion, GANs, VAEs, CNNs, RNNs, Attention (generative, discriminative, and self-supervised)
Applied AI: time-series analysis, embedding-space modelling, anomaly detection, classification, sequence-to-sequence, multimodal fusion
MLOps: GCP, AWS, Docker, GitHub Actions, CI/CD, deployment
Signal & Audio: feature engineering (MFCCs, mel-spectrograms, learned embeddings), source separation, speaker diarisation, voice conversion, TTS
I use AI coding agents day to day for prototyping, code review, and documentation.
Education
- PhD, Media & Arts Technology, Queen Mary University of London, 2024. Thesis: Analysis, Disentanglement and Conversion of Singing Voice Attributes (supervisor: Simon Dixon)
- MA, Music Technology & Composition, University of West London, 2015
- BMus, Music, MTU Cork School of Music, 2011
Selected Repositories
My open-source work is primarily from my PhD research on the singing voice. The thread running through it is voice attribute conversion: taking a sung recording and changing one property of it — the technique or the singer’s identity — while leaving everything else untouched. There’s also a set of repositories from music information retrieval coursework.
Singing voice research
- Perceptual Spaces of the Singing Voice — I ran a listening test where people rated how different pairs of vocal sounds were from one another, then checked whether a model’s latent space lined up with those ratings. The setup, analysis, and conclusions are in the repo and in the AIMC 2020 paper, which was the conference’s top-reviewed paper that year.
- Singing Technique Classification — a classifier that identifies the technique used in a sung passage. Its embeddings feed the conversion model below.
- Singing Technique Conversion — an adaptation of the AutoVC framework, conditioned on the technique classifier’s embeddings, that changes the perceived singing technique of a recording without affecting other vocal attributes. Described in the CMMR 2021 paper.
- Singer Identity Conversion — converts the singer identity of a recording. The SMC 2023 paper compares latent regressor losses for this task.
- Singer Identity Encoder — a timbre encoder trained on singing rather than speech, producing embeddings that capture vocal identity for the conversion models.
- WORLD Vocoder adaptation — an adaptation of a Python WORLD vocoder wrapper, used for its pitch-invariant voice features in the work above.
Music information retrieval and DSP
- Audio Fingerprinting — a Shazam-style system that takes a noisy recording of a song as a query and returns the three most likely matches from a database. Evaluated on a subset of the GTZAN dataset. The method follows Müller’s Fundamentals of Music Processing (2015).
- Beat Tracker — estimates tempo and follows the beat, combining techniques from several researchers. Evaluated on the Ballroom dataset.
- Tempo Estimation — related tempo-tracking experiments.
- Custom MIR Toolkit — reusable audio and ML utility code I rely on across the projects above.
Selected Publications
- Analysis, Disentanglement and Conversion of Singing Voice Attributes, PhD Thesis, QMUL 2024
- A Comparative Analysis of Latent Regressor Losses for Singing Voice Conversion, SMC 2023
- Zero-shot Singing Technique Conversion, CMMR 2021
- An Exploratory Study on Perceptual Spaces of the Singing Voice, AIMC 2020 (top-reviewed paper)
- Invited to present on AI Voice Conversion at the ISMIR 2024 tutorial.
Background in music
Before I got into machine learning, I spent years as a musician. I’m a classically trained guitarist and conductor, and I’ve worked as a composer, orchestrator, and producer. I do a fair amount of creative coding too, mostly in Max/MSP and Pure Data, and I’m as comfortable programming a live show as playing in one. I also taught performance, theory, and production for over a decade, from one-to-one lessons to full classrooms.
A lot of that fed into sound art and interactive installations. My work has been shown at Tate Modern and Ars Electronica (Soundstitcher, 2019), and at a few London festivals: Defence to Forbid (Everything Must Go, 2018), Igniting the Universe (We Are Robots, 2018), Painting Music (Heart & Soul, 2018), and Penillion, a piece for live graphic coding and orchestra (2017). I also co-founded a touring live act and ran it for a while, building automated shows that networked laptops, lights, and instruments together, with fail-safes throughout.
🎧 Hear it on SoundCloud · 🎬 Watch the showreel on Vimeo
References available on request.