How Sound Recognition Works — Identify Tracks by Audio

The problem

Every DJ library accumulates files that have no useful identity. Tag readers see them as `track01.mp3` or `unknown.flac`. Filename searches return nothing. Manual identification means listening to each one and hoping you recognise it, a process that scales to hundreds of files only on the patience of someone who has nothing else to do on a Sunday. The audio waveform itself contains the song's identity; the metadata around it just lost the memo somewhere between the original encoder and the last USB transfer. The right tool restores the connection.

How Music Library Doctor does it

1. When you trigger Sound Recognition on a file (right-click → Identify by Sound, or the 🪄 toolbar button), MLD reads a short window of audio, long enough to be unique and short enough that its fingerprint sends quickly.
2. A compact fingerprint of that window is sent to a music recognition service through MLD's own server-side proxy. Your machine never talks to the recognition provider directly: the proxy authenticates on your behalf and forwards just the fingerprint. Quota tracking happens at the proxy, so misuse can't silently blow past the daily limit.
3. The recognition service compares the fingerprint against a database of commercially released music and returns the most likely matches with confidence scores. For tracks that have been released anywhere, accuracy is typically 95%+; for promos, mashups, or unreleased material, expect lower confidence or no match.
4. MLD presents the top candidate side-by-side with the file's current name and tag, plus the confidence score. You confirm by listening in the built-in player and clicking Accept, or reject if the confidence is borderline and you want to investigate.
5. On Accept, MLD writes the identified metadata everywhere it needs to live: ID3/Vorbis tags inside the audio file, filename and parent folder on the filesystem, and your Rekordbox / Serato / VirtualDJ database entry so the change shows up the next time you open the DJ app. All of this happens in a single coordinated transaction, so nothing falls out of sync.
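
The "single coordinated transaction" in step 5 can be pictured as an all-or-nothing sequence of writes. The sketch below is not MLD's implementation; it only illustrates the general pattern, assuming each write (tags, filename, DJ database) can be expressed as a hypothetical do/undo pair.

```python
from typing import Callable, List, Tuple

# Each step is a (do, undo) pair. These callables are hypothetical stand-ins
# for "write tags", "rename file", "update DJ database"; they are not MLD's API.
Step = Tuple[Callable[[], None], Callable[[], None]]


def apply_all_or_nothing(steps: List[Step]) -> bool:
    """Run every step in order; if one raises, undo the completed ones in reverse."""
    completed: List[Callable[[], None]] = []
    for do, undo in steps:
        try:
            do()
        except Exception:
            # Roll back whatever already succeeded, newest first.
            for rollback in reversed(completed):
                rollback()
            return False
        completed.append(undo)
    return True
```

If the rename fails after the tags were written, the tag write is undone, so the file, the filesystem, and the database never disagree about what the track is called.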

Supported today

Rekordbox · Serato DJ · VirtualDJ (incl. Favorite Folders) on Windows 10+ and macOS (Apple Silicon + Intel).

Why native integration matters

The architectural choice that makes Sound Recognition work cleanly is treating the audio as the source of truth. The filename, the tag, and the DJ database are downstream consequences of "what song is this"; once that question is answered, propagating the answer to every layer is straightforward. Tools that try to fix mis-named files via metadata heuristics (fuzzy-matching the broken filename against a track list, say) cap out at "better than nothing" because they can't recognise a file whose filename is completely meaningless. Sound Recognition recognises what's in the file, not what's around it. And the combination of a server-side proxy and a per-license quota means the recognition service stays a sustainable cost rather than an open spigot.

Frequently asked questions

Why is Sound Recognition a Pro feature?

The recognition service is a paid third-party API. MLD covers the per-recognition cost from the Pro license fee, with a daily quota that's generous for normal cleanup use but bounded enough that the cost stays predictable. Putting it behind the Pro gate keeps the quota meaningful for users who actually need it.
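
As a rough illustration of how a per-license daily quota could be enforced at a proxy, here is a minimal in-memory sketch. The class name, the limit value, and the use of calendar days are all assumptions made for the example; the actual quota size and how MLD tracks it are not described here.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Optional, Tuple


class DailyQuota:
    """Counts recognitions per license per calendar day (in-memory sketch)."""

    def __init__(self, limit: int) -> None:
        self.limit = limit  # hypothetical daily allowance per license
        self._used: Dict[Tuple[str, date], int] = defaultdict(int)

    def try_consume(self, license_key: str, today: Optional[date] = None) -> bool:
        """Return True and count the request if the license is under its limit."""
        key = (license_key, today or date.today())
        if self._used[key] >= self.limit:
            return False  # over quota: the proxy would refuse to forward
        self._used[key] += 1
        return True
```

Because the counter lives on the proxy side rather than in the desktop app, a modified client cannot simply skip the check.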

Is my audio uploaded anywhere?

No. A short audio fingerprint (a compact hash, not the audio file) is sent through MLD's server-side proxy to the recognition provider. The audio file stays on your machine. The proxy is MLD's own infrastructure, not the third party's.

What happens if the confidence is borderline?

MLD shows you the score and lets you decide. Matches below 70% confidence are flagged for review. You can listen to the file in the built-in player and confirm by ear, or reject the match and retry recognition on a different segment of the audio.
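
The review rule can be written down in a few lines. This is only a sketch of the 70% cut-off mentioned above; the `Candidate` type and its field names are hypothetical, and confidence is assumed to be expressed as a fraction between 0 and 1.

```python
from dataclasses import dataclass

REVIEW_THRESHOLD = 0.70  # from the text: below 70% is flagged for review


@dataclass
class Candidate:
    title: str
    artist: str
    confidence: float  # assumed scale: 0.0 to 1.0


def needs_review(candidate: Candidate, threshold: float = REVIEW_THRESHOLD) -> bool:
    """True when the match is weak enough that it should be checked by ear."""
    return candidate.confidence < threshold
```

A candidate at exactly 0.70 is not flagged here, since the rule is "below 70%".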

Will it work on unreleased music — promos, edits, mashups?

Recognition is only as good as the reference database. Released music covered by the major catalogs (which is most commercially distributed music) gets identified reliably. Promos that have never been released, custom edits, and mashups may not be recognisable — those are the edge cases where the technology can't help.

How is this different from the acoustic-fingerprint duplicate scan?

Both use audio fingerprinting, but for different jobs. The duplicate scan compares fingerprints locally to find files in your library that contain the same recording. Sound Recognition matches a fingerprint against the world's music database to identify what the song actually is. Same underlying technique, opposite directions.
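
To make the "local" direction concrete, the sketch below compares every pair of fingerprints in a library and reports the close ones. It is a simplification under stated assumptions: each fingerprint is represented as a single integer bit pattern and similarity is a Hamming distance, whereas real acoustic fingerprints are typically longer sequences compared with more tolerance. The function names and the `max_distance` default are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def find_local_duplicates(
    fingerprints: Dict[str, int], max_distance: int = 2
) -> List[Tuple[str, str]]:
    """Return pairs of paths whose fingerprints differ by at most max_distance bits."""
    pairs = []
    items = sorted(fingerprints.items())  # stable order for predictable output
    for (path_a, fp_a), (path_b, fp_b) in combinations(items, 2):
        if hamming(fp_a, fp_b) <= max_distance:
            pairs.append((path_a, path_b))
    return pairs
```

Sound Recognition runs the same kind of comparison in the other direction: one fingerprint from your file against a reference database held by the recognition service, rather than your files against each other.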

Does it modify my DJ database safely?

Yes. MLD takes a timestamped backup of your Rekordbox master.db / Serato crates / VirtualDJ database before any rename operation. Restore is one click if the rename ever looks wrong.
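
A timestamped backup of a single database file might look like the sketch below. It assumes the database is one file (as with a `master.db`); a folder-based layout such as a set of crate files would need a directory copy instead. The function names and the backup naming scheme are illustrative, not MLD's.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_with_timestamp(db_path: Path, backup_dir: Path) -> Path:
    """Copy db_path into backup_dir under a name that includes the current time."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = backup_dir / f"{db_path.stem}-{stamp}{db_path.suffix}"
    shutil.copy2(db_path, target)  # copy2 also preserves file metadata
    return target


def restore(backup_path: Path, db_path: Path) -> None:
    """Overwrite the live database with a previously saved backup."""
    shutil.copy2(backup_path, db_path)
```

Taking the copy before any rename means a wrong match can always be undone by restoring the file that existed a moment earlier.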

Is there a batch mode?

Yes. Point Sound Recognition at a folder of unknown files and it recognises each one, queues the rename proposals, and lets you approve them in bulk with per-file accept/reject for borderline matches.
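
Batch mode is essentially a loop that collects proposals instead of applying them straight away. The sketch below shows that shape only; `recognize` is a hypothetical callable standing in for the recognition round trip, and the extension list is an assumption for the example.

```python
from pathlib import Path
from typing import Callable, List, NamedTuple, Optional, Tuple


class Proposal(NamedTuple):
    path: Path
    proposed_name: str
    confidence: float


# Hypothetical recognizer: returns (proposed name, confidence) or None for no match.
Recognizer = Callable[[Path], Optional[Tuple[str, float]]]


def queue_proposals(
    folder: Path,
    recognize: Recognizer,
    extensions: Tuple[str, ...] = (".mp3", ".flac"),
) -> List[Proposal]:
    """Recognise each audio file in folder and queue a rename proposal per match."""
    queue: List[Proposal] = []
    for path in sorted(folder.iterdir()):
        if path.suffix.lower() not in extensions:
            continue  # skip non-audio files
        match = recognize(path)
        if match is None:
            continue  # nothing to propose for unrecognised files
        name, confidence = match
        queue.append(Proposal(path, name, confidence))
    return queue
```

Nothing is renamed while the queue is being built, which is what makes bulk approval with per-file accept/reject possible afterwards.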