Email the Author

You can use this page to email Andres Mariscal @serialdev about Build-Your-Own: Shazzam!.

Say you are a developer who has just become comfortable with a language. Now the question is: what next? We have all been there, and my intention with this series is to offer a less complex path to understanding everything it takes to build a fully functional proof-of-concept system.

We will build a fully functional API that we can populate with any music we like, and we will come to understand how sub-second retrieval takes place.

This will improve your skills in several currently "hot" technologies:

- We will use containers to deploy our database locally, which will also help emulate how real systems interact with each other.

- We will use SQL as the language for communicating with the database.

- We will use Python, relying heavily on the machine learning stack of pandas, NumPy, PyTorch and scikit-learn.

- We will use PyTorch to create the representations (embeddings) of the music, which we will then use for retrieval.

- We will take a cursory look at perceptual hashing, nearest neighbours, and appropriate approximations that speed up nearest-neighbour search at scale.

- Once our system is up and running, we will deploy it using battle-tested frameworks that expose our functionality over the internet.

Most importantly, we will deconstruct the problem and see how we arrive at each of these solutions as part of the creation process.
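
To give a flavour of the retrieval step mentioned above, here is a minimal sketch of brute-force nearest-neighbour search over embedding vectors using NumPy. It is not the system built in the series: the random vectors stand in for real audio embeddings, and the names `normalize` and `nearest_neighbours` are hypothetical helpers chosen for illustration.

```python
import numpy as np


def normalize(vectors: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so a dot product equals cosine similarity."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.clip(norms, 1e-12, None)


def nearest_neighbours(query: np.ndarray, database: np.ndarray, k: int = 3) -> np.ndarray:
    """Return the indices of the k database rows most similar to the query."""
    db = normalize(database)
    q = normalize(query.reshape(1, -1))[0]
    similarities = db @ q  # one cosine similarity per database row
    return np.argsort(-similarities)[:k]  # highest similarity first


# Assumption: random vectors stand in for embeddings of audio segments.
rng = np.random.default_rng(0)
database = rng.normal(size=(1000, 64))

# A slightly noisy copy of segment 42 plays the role of a recorded snippet.
query = database[42] + 0.01 * rng.normal(size=64)

top = nearest_neighbours(query, database, k=3)
print(top)  # the first index should be the segment the query was derived from
```

This exhaustive scan compares the query against every stored vector, so its cost grows linearly with the size of the database. Approximate methods trade a little accuracy for much faster lookups at scale.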

You will learn how to:

- Deconstruct audio data

- Analyse and visualize audio datasets

- Create your own Postgres instance using Docker

- Create a database of vector representations of audio segments

- Understand the similarity retrieval problem

- Compare perceptual hashing with auto-encoders

- Deploy this system as a RESTful API
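
As a small taste of the first item in the list above, the sketch below deconstructs an audio signal into a magnitude spectrogram with NumPy alone. It assumes a synthetic 440 Hz sine wave in place of a real recording, and the `spectrogram` helper, frame size and hop length are illustrative choices rather than the values used in the series.

```python
import numpy as np


def spectrogram(signal: np.ndarray, frame_size: int = 1024, hop: int = 512) -> np.ndarray:
    """Split the signal into overlapping windowed frames and take the FFT magnitude of each."""
    window = np.hanning(frame_size)
    n_frames = 1 + (len(signal) - frame_size) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_size] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))  # shape: (n_frames, frame_size // 2 + 1)


# Assumption: one second of a pure 440 Hz tone stands in for real audio.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 440 * t)

spec = spectrogram(signal)

# Map FFT bins back to frequencies and find the strongest one on average.
freqs = np.fft.rfftfreq(1024, d=1 / sample_rate)
dominant = freqs[spec.mean(axis=0).argmax()]
print(spec.shape, dominant)
```

Each row of `spec` describes which frequencies are present in one short slice of time. Representations of this kind are a common starting point for both perceptual hashing and learned embeddings.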