Speaker Recognition/Speaker Separation

Speaker identification determines which registered speaker produced a given utterance, chosen from a set of known speakers.
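
The decision rule in this definition can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each speaker is assumed to be represented by a fixed-length embedding vector (how those vectors are extracted, for example with Kaldi, is not shown), cosine similarity is assumed as the scoring function, and the names `cosine`, `identify` and `registered` as well as the toy vectors are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify(utterance_embedding, registered):
    """Return the registered speaker whose embedding scores highest.

    `registered` maps speaker name -> reference embedding (hypothetical layout).
    """
    return max(registered, key=lambda name: cosine(utterance_embedding, registered[name]))


# Toy embeddings, invented for illustration only.
registered = {"alice": [1.0, 0.0, 0.2], "bob": [0.1, 1.0, 0.0]}
print(identify([0.9, 0.1, 0.1], registered))
```

Because the task is closed-set, the function always returns one of the registered names; rejecting unknown speakers would need an additional score threshold, which is outside this sketch.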

Stack (OS & PL & FW):

– OS: Linux

– PL: Python, C++

– Tool: Kaldi


– Speaker Identification is a voice recognition task that requires audio files from each speaker to be identified.

Hardware (Resources) (Storage & Compute Power & Time):

– Storage: 1TB

– Compute Power: RTX 3090

– Time: Based on Dataset Volume

Workflow (Processing):

– Dataset Preparation

– Model Selection

– Model Training

– Making Predictions

– Model Evaluation

– API Development
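
The training, prediction and evaluation steps above can be sketched as a toy pipeline. This is not the Kaldi recipe itself: it assumes embeddings have already been extracted from the prepared dataset, uses a simple centroid-per-speaker model with cosine scoring as a stand-in for a trained model, and reports plain identification accuracy. All function names and the toy data are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def train(labelled_embeddings):
    """'Training' here is just averaging each speaker's embeddings (assumption)."""
    sums, counts = {}, {}
    for speaker, emb in labelled_embeddings:
        if speaker not in sums:
            sums[speaker] = [0.0] * len(emb)
            counts[speaker] = 0
        sums[speaker] = [s + e for s, e in zip(sums[speaker], emb)]
        counts[speaker] += 1
    return {spk: [s / counts[spk] for s in vec] for spk, vec in sums.items()}


def predict(model, emb):
    """Pick the speaker whose centroid is most similar to the embedding."""
    return max(model, key=lambda spk: cosine(emb, model[spk]))


def accuracy(model, labelled_embeddings):
    """Fraction of test utterances assigned to the correct speaker."""
    correct = sum(1 for spk, emb in labelled_embeddings if predict(model, emb) == spk)
    return correct / len(labelled_embeddings)


# Toy data, invented for illustration only.
train_set = [("host", [1.0, 0.0]), ("host", [0.8, 0.2]),
             ("guest", [0.0, 1.0]), ("guest", [0.2, 0.8])]
test_set = [("host", [0.9, 0.1]), ("guest", [0.1, 0.9]), ("guest", [0.6, 0.4])]

model = train(train_set)
print(accuracy(model, test_set))
```

On this toy data the third test utterance lies closer to the "host" centroid, so two of the three predictions are correct. A real evaluation would use held-out recordings that did not take part in training.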

End To End (Development & Integration in a System):

– End-to-end system integration of speaker identification requires a trained speaker model and an API.
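
How the trained model and the API fit together can be sketched with a framework-independent request handler. Everything here is an assumption for illustration: the JSON request shape (`{"embedding": [...]}`), the response fields, the status codes, the in-memory `MODEL` dictionary and the name `handle_identify` are hypothetical, and a real service would accept audio and run feature extraction before scoring.

```python
import json
import math

# Hypothetical trained model: speaker name -> reference embedding.
MODEL = {"host": [0.9, 0.1], "guest": [0.1, 0.9]}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def handle_identify(body, model):
    """Turn a JSON request body into (status_code, JSON response body)."""
    try:
        payload = json.loads(body)
        emb = [float(x) for x in payload["embedding"]]
    except (ValueError, KeyError, TypeError):
        return 400, json.dumps({"error": "expected JSON with an 'embedding' list"})

    # Reject vectors whose size does not match the model's embeddings.
    dim = len(next(iter(model.values())))
    if len(emb) != dim:
        return 400, json.dumps({"error": "embedding must have %d values" % dim})

    scores = {spk: cosine(emb, ref) for spk, ref in model.items()}
    best = max(scores, key=scores.get)
    return 200, json.dumps({"speaker": best, "score": round(scores[best], 4)})


status, response = handle_identify('{"embedding": [0.8, 0.3]}', MODEL)
print(status, response)
```

Keeping the handler a pure function of the request body and the model makes it easy to mount behind whichever web framework is chosen for deployment, and to test without starting a server.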

Deployment (Server / API):

– Linux Servers

– Cloud Servers

Applications (General Real World Use):

– Speaker Identification for Forensics

– Speaker Identification for Access Control

Use Case (Our Specific):

– Speaker Identification in Talk Shows

