Email the Author
You can use this page to email Andres Mariscal @serialdev about Build-Your-Own: Shazzam!.
About the Book
Say you are a developer who has just become comfortable with a language.
The question is: what now?
We have all been there, and my intention with this series is to offer a less
complex path to understanding everything it takes to create a fully functional
proof-of-concept system.
We will build a fully functional API that we can populate with
any music we like, and we will see how sub-second retrieval takes place.
This will improve your skills in several currently "hot" technologies:
We will use containers to deploy our database locally, which
also helps emulate how real systems interact with each other.
We will use SQL to communicate with the database.
We will use Python, relying heavily on the machine learning stack
of pandas, NumPy, PyTorch and scikit-learn.
We will use PyTorch to create the representations (embeddings) of the
music, which we will then use as part of the retrieval.
We will take a brief look at perceptual hashing, nearest neighbours,
and the approximate methods that speed up the search at scale.
Once our system is up and running, we will deploy it using battle-tested
frameworks that expose our functionality over the internet.
Most importantly, we will deconstruct the problem and see how we arrive
at each of these solutions as part of the creation process.
You will learn how to:
- Deconstruct audio data
- Analyse and visualize audio datasets
- Create your own Postgres instance using Docker
- Create a database of vector representations of audio segments
- Understand the similarity retrieval problem
- Compare perceptual hashing with auto-encoders
- Deploy this system as a RESTful API
About the Author