diff options
Diffstat (limited to 'audio/SongRec/README')
-rw-r--r-- | audio/SongRec/README | 213 |
1 files changed, 213 insertions, 0 deletions
diff --git a/audio/SongRec/README b/audio/SongRec/README new file mode 100644 index 0000000000..e9b4ddb365 --- /dev/null +++ b/audio/SongRec/README @@ -0,0 +1,213 @@ +SongRec is an open-source Shazam client for Linux, written in Rust. + +Features: + +* Recognize audio from an arbitrary audio file. +* Recognize audio from the microphone. +* Usage from both GUI and command line (for the file recognition part). +* Provide an history of the recognized songs on the GUI, exportable to +CSV. +* Continuous song detection from the microphone, with the ability to +choose your input device. +* Ability to recognize songs from your speakers rather than your +microphone (on compatible PulseAudio setups). +* Generate a lure from a song that, when played, will fool Shazam into +thinking that it is the concerned song. + +A (command-line only) Python version, which I made before rewriting in +Rust for performance, is also available for demonstration purposes. It +supports file recognition only. + +## How it works + +For useful information about how audio fingerprinting works, you may +want to read [this article](http://coding-geek.com/how-shazam-works/). +To be put simply, Shazam generates a spectrogram (a time/frequency 2D +graph of the sound, with amplitude at intersections) of the sound, and +maps out the frequency peaks from it (which should match key points of +the harmonics of voice or of certains instruments). + +Shazam also downsamples the sound at 16 KHz before processing, and cuts +the sound in four bands of 250-520 Hz, 520-1450 Hz, 1450-3500 Hz, +3500-5500 Hz (so that if a band is too much scrambled by noise, +recognition from other bands may apply). The frequency peaks are then +sent to the servers, which subsequently look up the strongest peaks in +a database, in order look for the simultaneous presence of neighboring +peaks both in the associated reference fingerprints and in the +fingerprint we sent. + +Hence, the Shazam fingerprinting algorithm, as implemented by the +client, is fairly simple, as much of the processing is done +server-side. The general functionment of Shazam has been documented in +public [research +papers](https://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf) and +patents. + + +Note: It is not mandatory, but if you want to be able to recognize more +formats than WAV, OGG, FLAC and MP3, you should ensure that you have +the `ffmpeg` package installed. + +## Compilation + +(**WARNING**: Remind to compile the code in "--release" mode for +correct performance.) + +### Installing Rust + +First, you need to [install the Rust compiler and package +manager](https://www.rust-lang.org/tools/install). It has been observed +to work with `rustc` 1.43.0 to the current rustc 1.47.0. + +Install Rust and put it in path, for all distributions: + +```bash +curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh # Type +"1" +# Login and reconnect to add Rust to the $PATH, or run: +source $HOME/.cargo/env + +# If you already installed Rust, then update it: +rustup update +``` + +### Install dependent libraries (nothing exotic) + +Debian: + +```bash +sudo apt install build-essential libasound2-dev libgtk-3-dev libssl-dev +``` + +Void Linux (libressl): + +```shell +sudo xbps-install base-devel alsa-lib-devel gtk+3-devel libressl-devel +``` + +Void Linux (openssl): + +```shell +sudo xbps-install base-devel alsa-lib-devel gtk+3-devel openssl-devel +``` + +### Compiling the project + +This will compile and run the projet: + +```bash +# For the stable release: +cargo install songrec +songrec + +# For the Github tree: +git clone git@github.com:marin-m/songrec.git +cd songrec +cargo run --release +``` + +For the latter, you will then find the project's binary (that you will +be able to move or execute directly) at `target/release/songrec`. + +## Sample usage + +Passing no arguments or using the `gui` subcommand will launch the GUI, +and try to recognize audio real-time as soon as the application is +launched: + +``` +./songrec +./songrec gui +``` + +Using the `gui-norecording` subcommand will launch the GUI without +recognizing audio as soon as the software is started (you will need to +click the "Turn on microphone recognition" button to do so): + +``` +./songrec gui-norecording +``` + +The GUI allows you to recognize songs either from your microphone, +speakers (on compatible PulseAudio setups), or from an audio file. The +MP3, FLAC, WAV and OGG formats should be accepted for audio files if +FFMpeg is not installed, and any audio or video formats supported by +FFMpeg should be accepted if FFMpeg is installed. + +The following commands allow to recognize sound from your microphone or +from a file using the command line (`listen` runs while the microphone +is usable while `recognize` recognizes only one song), use the `-h` +flag in order to see all the available options: + +``` +./songrec listen -h +./songrec recognize -h +``` + +By default, only the artist and track name of the concerned song are +displayed to the standard output, and other information may be +displayed to the error output. The `--csv` and `--json` options allow +to display more programmatically usable information to the standard +output. + +The above decribes the newer CLI interface of SongRec, but an older +interface, operating only on audio files or raw audio fingerprints, is +also available and described below. + +The following subcommand will try to recognize audio from the middle of +an audio file, and print the JSON response from Shazam servers: + +``` +./songrec audio-file-to-recognized-song sound_file.mp3 +``` + +The following subcommands will do the same with an intermediary step, +manipulating data-URI audio fingerprints as used by Shazam internally: + +``` +./songrec audio-file-to-fingerprint sound_file.mp3 +./songrec fingerprint-to-recognized-song +'data:audio/vnd.shazam.sig;base64,...' +``` + +The following will produce back hearable tones from a given +fingerprint, that should be able to fool Shazam into thinking that this +is the original song (either to the default audio output device, or to +a .WAV file): + +``` +./songrec fingerprint-to-lure 'data:audio/vnd.shazam.sig;base64,...' +./songrec fingerprint-to-lure 'data:audio/vnd.shazam.sig;base64,...' +/tmp/output.wav +``` + +When using the application, you may notice that certain information +will be saved to `~/.local/share/SongRec` (or an equivalent directory +depending on your operating system), including the CSV-format list of +the last recognized songs and the last selected microphone input device +(so that it is chosen back when restarting the app). You may want to +delete this directory in case of persistent issues. + +## Privacy + +SongRec collects no data and contacts no other servers than Shazam's. +SongRec does not upload raw audio data anywhere: only fingerprints of +the audio are uploaded, which means sequences of frequency peaks +encoded in the form of "(frequency, amplitude, time)" tuples. + +This does not suffice to represent anything hearable alone (use the +"Play a Shazam lure" button to see how much this is different from full +sound); that means that no actually hearable sound (e.g voice +fragments) is sent to servers, only metadata derived on the +characteristics of the sound that may only suffice to recognize a song +already known by Shazam is being sent. + +## Legal + +This software is released under the [GNU GPL +v3](https://www.gnu.org/licenses/gpl-3.0.html) license. It was created +with the intent of providing interoperability between the remote Shazam +services and Linux-based deskop systems. + +Please note that in certain countries located outside of the European +Union, especially the United States, software patents may apply. |