A Rust CLI implementation of Shazam-style audio identification

rugbedbugg/ResonanceID-cli

ResonanceID-cli

A Rust-based audio fingerprinting CLI inspired by Shazam-style matching.

This project is being built for a Design and Analysis of Algorithms course, with a focus on:

  • fingerprint pipeline design
  • matching quality vs false positives
  • practical CLI workflows
  • measurable runtime behavior

Features

  • Store songs into a local SQLite fingerprint DB
  • Recognize unknown clips against stored references
  • 🔊 Live system audio recognition - identify songs playing on your computer (YouTube, Spotify, etc.)
  • Show ranked candidates (top matches)
  • Manage DB from CLI (list-songs, remove-song, db-stats)
  • Config layering (/etc, user config, local config)
  • CLI overrides for all key tuning params
  • Optional clipping for reference indexing (--clip-start, --clip-duration, --auto-clip)

Tech Stack

  • Rust
  • SQLite (rusqlite)
  • FFT (rustfft)
  • WAV I/O (hound)
  • TOML config (serde, toml)
  • Audio capture (libpulse-binding for Linux/PulseAudio)

Pipeline (High-Level)

Store / Remember

  1. Read WAV samples
  2. (Optional) clip audio range
  3. STFT spectrogram
  4. Peak extraction (constellation points)
  5. Fingerprint generation (hash, anchor_time_ms)
  6. Insert song metadata + fingerprints into SQLite
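
The fingerprint generation step (5) can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes the common Shazam-style scheme in which each anchor peak is paired with later peaks and the pair is packed into a 32-bit hash. The `Peak` type, the `fingerprints` function, the bit layout, and the reading of `anchor_window` as a maximum frame distance are all assumptions made for this example.

```rust
/// A spectrogram peak: STFT frame index and frequency bin (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Peak {
    frame: u32,
    bin: u32,
}

/// Pair each anchor peak with later peaks at most `anchor_window` frames ahead
/// and pack (anchor bin, target bin, frame delta) into a 32-bit hash.
/// `peaks` must be sorted by `frame`. Returns (hash, anchor_time_ms) pairs.
fn fingerprints(
    peaks: &[Peak],
    anchor_window: u32,
    hop_size: u32,
    sample_rate: u32,
) -> Vec<(u32, u32)> {
    let mut out = Vec::new();
    for (i, anchor) in peaks.iter().enumerate() {
        // Anchor time in milliseconds, derived from the frame index.
        let anchor_time_ms =
            (anchor.frame as u64 * hop_size as u64 * 1000 / sample_rate as u64) as u32;
        for target in &peaks[i + 1..] {
            let dt = target.frame - anchor.frame;
            if dt > anchor_window {
                break; // sorted input: every later peak is even farther away
            }
            if dt == 0 {
                continue; // same frame: no time delta to encode
            }
            // Assumed layout: 10 bits anchor bin | 10 bits target bin | 12 bits delta.
            let hash =
                ((anchor.bin & 0x3FF) << 22) | ((target.bin & 0x3FF) << 12) | (dt & 0xFFF);
            out.push((hash, anchor_time_ms));
        }
    }
    out
}
```

Because the hash encodes only the two frequencies and their time difference, it does not depend on where in the track the pair occurs. The absolute position is carried separately in `anchor_time_ms`.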

Recognize

  1. Read WAV samples
  2. STFT spectrogram
  3. Peak extraction
  4. Fingerprint generation
  5. Hash lookup in DB + offset voting
  6. Rank songs by strongest offset consistency
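
Steps 5 and 6 can be sketched with an in-memory map standing in for the SQLite lookup. The function name `rank_by_offset`, the tuple layouts, and the `bucket_ms` tolerance are assumptions for illustration; the project's real query and scoring logic may differ.

```rust
use std::collections::HashMap;

/// For each query fingerprint, look up its hash and vote for
/// (song_id, db_time - query_time), quantized into `bucket_ms`-wide buckets.
/// A song's score is the size of its largest offset bucket.
/// Returns (song_id, score) sorted by score descending, then by song_id.
fn rank_by_offset(
    index: &HashMap<u32, Vec<(i64, i64)>>, // hash -> [(song_id, anchor_time_ms)]
    query: &[(u32, i64)],                  // (hash, anchor_time_ms)
    bucket_ms: i64,
) -> Vec<(i64, u32)> {
    let mut votes: HashMap<(i64, i64), u32> = HashMap::new();
    for &(hash, q_time) in query {
        if let Some(entries) = index.get(&hash) {
            for &(song_id, db_time) in entries {
                // div_euclid keeps negative offsets in consistent buckets.
                let bucket = (db_time - q_time).div_euclid(bucket_ms);
                *votes.entry((song_id, bucket)).or_insert(0) += 1;
            }
        }
    }
    // Keep only the strongest bucket per song.
    let mut best: HashMap<i64, u32> = HashMap::new();
    for (&(song_id, _), &count) in &votes {
        let entry = best.entry(song_id).or_insert(0);
        *entry = (*entry).max(count);
    }
    let mut ranked: Vec<(i64, u32)> = best.into_iter().collect();
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(&b.0)));
    ranked
}
```

A true match produces many hash hits that share one time offset, while accidental hash collisions scatter across unrelated offsets. Scoring by the largest bucket rather than by total hits is what suppresses false positives.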

Installation / Run

cargo build
cargo run -- --help

Note: when using cargo run, pass the application's arguments after --.

To diagnose issues, run the test suite:

cargo test

CLI Commands

Store a reference track

cargo run -- store <wav_path> "<Title>" "<Artist>" [options]

Alias:

cargo run -- remember <wav_path> "<Title>" "<Artist>" [options]

Recognize a clip

cargo run -- recognize <wav_path> [options]

🔊 Live system audio recognition (NEW!)

cargo run -- listen [duration] [options]

Capture and identify audio playing on your computer in real time.

Examples:

# List available audio devices (shows monitors and microphones)
cargo run -- list-devices

# Capture system audio for 10 seconds (default)
cargo run -- listen --monitor

# Capture system audio for 5 seconds
cargo run -- listen 5 --monitor

# Use specific device by index
cargo run -- listen --device 0

Key features:

  • 🎧 Works with headphones (captures before audio output)
  • 🔇 Works at any volume level (even muted)
  • 📻 Recognizes music from YouTube, Spotify, web browsers, etc.
  • 🎯 Uses PulseAudio monitor sources (Linux)

Show ranked candidates

cargo run -- list-top-matches <wav_path> [options]

Database management

cargo run -- list-songs [--db <db_path>]
cargo run -- remove-song <song_id> [--db <db_path>]
cargo run -- db-stats [--db <db_path>]

Common Options

  • --db <db_path>
  • --config <path>
  • --no-config

Fingerprint options:

  • --window-size <n>
  • --hop-size <n>
  • --anchor-window <n>
  • --threshold-db <f32>
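
To see how --window-size and --hop-size trade frequency resolution against time resolution, the standard STFT relations can be computed directly. The helper names below are hypothetical, and the frame count assumes no padding at the end of the signal, which may not match what the tool does.

```rust
/// Number of full STFT frames for `num_samples` samples (no padding assumed).
fn frame_count(num_samples: usize, window_size: usize, hop_size: usize) -> usize {
    if num_samples < window_size || hop_size == 0 {
        return 0;
    }
    (num_samples - window_size) / hop_size + 1
}

/// Width of one FFT bin in Hz.
fn bin_width_hz(sample_rate: u32, window_size: usize) -> f64 {
    sample_rate as f64 / window_size as f64
}

/// Time step between consecutive frames in milliseconds.
fn hop_ms(sample_rate: u32, hop_size: usize) -> f64 {
    hop_size as f64 * 1000.0 / sample_rate as f64
}
```

For one second of 44.1 kHz audio with a window of 1024 and a hop of 512, this gives 85 frames, bins about 43.07 Hz wide, and a frame step of about 11.6 ms. Doubling the window halves the bin width but blurs each frame over twice as much time.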

Recognition options:

  • --min-match-score <n>
  • --dynamic-gate-scale <f32>
  • --small-query-threshold <n>
  • --max-results <n>

Clip options (store/remember):

  • --clip-start <seconds>
  • --clip-duration <seconds>
  • --auto-clip (center clip; default 20s if duration not specified)
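
A centered clip like the one --auto-clip describes can be computed as below. The function name `center_clip` is hypothetical, and returning the whole track when it is shorter than the requested clip is an assumption, since the behavior for short inputs is not documented here.

```rust
/// Sample range [start, end) of a clip of `clip_secs` seconds centered in a
/// track of `total_samples` samples. If the track is shorter than the clip,
/// the whole track is returned (assumed behavior).
fn center_clip(total_samples: usize, sample_rate: u32, clip_secs: f64) -> (usize, usize) {
    let clip_len = (clip_secs * sample_rate as f64).round() as usize;
    if clip_len >= total_samples {
        return (0, total_samples);
    }
    let start = (total_samples - clip_len) / 2;
    (start, start + clip_len)
}
```

For a 60-second mono track at 44.1 kHz, a 20-second centered clip covers seconds 20 to 40 of the track.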

Config

Search order (when --config is not given):

  1. /etc/resonanceid-cli/config.toml
  2. ~/.config/resonanceid-cli/config.toml
  3. ./resonanceid-cli.toml

Precedence:

CLI flags > config file > defaults
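
One way to implement this precedence is to treat each source as a layer of optional overrides and fold the layers over the defaults, lowest priority first. The `Overrides` and `Settings` types, the `resolve` function, and the subset of fields shown are illustrative assumptions, not the project's actual types.

```rust
/// One layer of optional overrides (a config file or the CLI flags).
#[derive(Debug, Clone, Copy, Default, PartialEq)]
struct Overrides {
    window_size: Option<usize>,
    hop_size: Option<usize>,
    min_match_score: Option<u32>,
}

/// Fully resolved settings.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Settings {
    window_size: usize,
    hop_size: usize,
    min_match_score: u32,
}

/// Apply layers in order; a later layer wins wherever it sets a value.
fn resolve(defaults: Settings, layers: &[Overrides]) -> Settings {
    layers.iter().fold(defaults, |acc, layer| Settings {
        window_size: layer.window_size.unwrap_or(acc.window_size),
        hop_size: layer.hop_size.unwrap_or(acc.hop_size),
        min_match_score: layer.min_match_score.unwrap_or(acc.min_match_score),
    })
}
```

Calling `resolve(defaults, &[file_layer, cli_layer])` puts the CLI layer last, so a flag overrides the config file, and any field set by neither keeps its default.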

Example config:

[fingerprint]
window_size = 1024
hop_size = 512
anchor_window = 50
threshold_db = -20.0

[recognition]
min_match_score = 2
dynamic_gate_scale = 30.0
small_query_threshold = 1000
max_results = 5

You can start from a copy of resonanceid-cli.toml.example.

Quick Demo

File-based recognition

# 1) Convert audio to WAV (mono, 44.1 kHz)
ffmpeg -y -i input.mp3 -ac 1 -ar 44100 input.wav

# 2) Store reference
cargo run -- store input.wav "My Song" "My Artist"

# 3) Recognize clip
cargo run -- recognize clip.wav

System audio recognition (Shazam-style)

# 1) Store some reference songs
cargo run -- store song1.wav "Song 1" "Artist 1"
cargo run -- store song2.wav "Song 2" "Artist 2"

# 2) Play music on your computer (YouTube, Spotify, etc.)

# 3) Identify what's playing
cargo run -- listen --monitor

Notes

  • File-based commands (store, recognize) expect WAV input files
  • Use ffmpeg for mp3/flac conversion before running file-based commands
  • For stable matching quality, reference clips around 20–45 seconds are recommended
  • System audio capture (listen) works directly - no file conversion needed
  • The listen command is Linux-only (requires PulseAudio/PipeWire)
  • For other platforms, use file-based recognition with recognize
