Viseme

Next-generation Voice Recognition

Using technology: voice recognition / lip reading

Live Video Feed

The live video feed is pulled from the device camera(s) using the JavaScript getUserMedia() API.
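
As a rough illustration of how such a feed can be requested, here is a minimal sketch built on navigator.mediaDevices.getUserMedia(). The function names, the facingMode option, and the audio: false setting are assumptions made for this example and are not taken from the Viseme source.

```javascript
// Build the constraints object passed to getUserMedia().
// facingMode selects the front ("user") or rear ("environment") camera on
// devices with more than one camera. This option is an assumption for the
// sketch; the page does not say how a camera is chosen.
function buildVideoConstraints(facingMode = "user") {
  return { video: { facingMode }, audio: false };
}

// Request a camera stream and show it in a <video> element.
// mediaDevices defaults to the browser's navigator.mediaDevices; it is a
// parameter only so that a stub can be passed in outside the browser.
async function startVideoFeed(
  videoElement,
  facingMode = "user",
  mediaDevices = navigator.mediaDevices
) {
  const stream = await mediaDevices.getUserMedia(
    buildVideoConstraints(facingMode)
  );
  videoElement.srcObject = stream; // attach the live stream to the element
  await videoElement.play();       // play() returns a promise
  return stream;
}
```

In a page, this might be called as startVideoFeed(document.querySelector("video")). The browser asks the user for camera permission before the promise resolves, and the call is rejected if permission is denied.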

Amplitude



This panel displays a live representation of the amplitude of the incoming audio from the device microphone. When the amplitude of ambient noise exceeds the preset threshold, our algorithm switches from voice-to-text recognition to lip reading.
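
The switching rule described above can be sketched as follows. This is only an illustration under stated assumptions: it uses the root-mean-square (RMS) level of the raw microphone signal as the amplitude measure and does not try to separate ambient noise from speech, since the page does not say how Viseme does that. The function names, mode labels, and polling interval are hypothetical.

```javascript
// Root-mean-square amplitude of one buffer of samples in the range [-1, 1].
function rmsAmplitude(samples) {
  if (samples.length === 0) return 0;
  let sumOfSquares = 0;
  for (const s of samples) sumOfSquares += s * s;
  return Math.sqrt(sumOfSquares / samples.length);
}

// Choose the recognition mode: lip reading only when the amplitude
// strictly exceeds the threshold, voice recognition otherwise.
function selectMode(amplitude, threshold) {
  return amplitude > threshold ? "lip-reading" : "voice-recognition";
}

// Browser-only wiring (assumed, not from the source): sample the microphone
// stream through a Web Audio AnalyserNode and report the amplitude and mode.
// Returns a function that stops the polling.
function watchAmplitude(stream, threshold, onUpdate, intervalMs = 100) {
  const audioContext = new AudioContext();
  const analyser = audioContext.createAnalyser();
  analyser.fftSize = 2048;
  audioContext.createMediaStreamSource(stream).connect(analyser);
  const buffer = new Float32Array(analyser.fftSize);
  const timer = setInterval(() => {
    analyser.getFloatTimeDomainData(buffer); // fills buffer with waveform samples
    const amplitude = rmsAmplitude(buffer);
    onUpdate(amplitude, selectMode(amplitude, threshold));
  }, intervalMs);
  return () => {
    clearInterval(timer);
    audioContext.close();
  };
}
```

A real implementation would probably smooth the amplitude over time so that a single loud moment does not flip the mode back and forth.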

Image being sent to server



An individual frame is extracted from the webcam feed twenty-four times a second. Each frame is then compressed, encoded, and sent over a WebSocket connection for processing on the server.
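
The capture-and-send loop described above might look something like the sketch below. It assumes JPEG compression through a canvas and sends each frame as a binary Blob; the page does not state the actual image format, quality setting, or encoding (a base64 string would be equally plausible), so these, along with the function names, are assumptions.

```javascript
// Delay between captures for a given frame rate (24 fps is about 41.7 ms).
function frameIntervalMs(framesPerSecond) {
  return 1000 / framesPerSecond;
}

// Browser-only: copy the current frame of a playing <video> onto a canvas,
// compress it as JPEG, and send it over an already created WebSocket.
// Returns a function that stops the capture loop.
function startFrameStream(video, socket, { fps = 24, quality = 0.6 } = {}) {
  const canvas = document.createElement("canvas");
  const context = canvas.getContext("2d");
  const timer = setInterval(() => {
    // Skip this tick if the socket is not open or the video has no frames yet.
    if (socket.readyState !== WebSocket.OPEN || video.videoWidth === 0) return;
    // Resize only when needed, because setting the size clears the canvas.
    if (canvas.width !== video.videoWidth) canvas.width = video.videoWidth;
    if (canvas.height !== video.videoHeight) canvas.height = video.videoHeight;
    context.drawImage(video, 0, 0, canvas.width, canvas.height);
    canvas.toBlob(
      (blob) => {
        // toBlob can yield null, and the socket may have closed meanwhile.
        if (blob && socket.readyState === WebSocket.OPEN) socket.send(blob);
      },
      "image/jpeg",
      quality
    );
  }, frameIntervalMs(fps));
  return () => clearInterval(timer);
}
```

Lowering the quality value or scaling the canvas down reduces bandwidth at the cost of image detail, which may matter for reading lip shapes on the server.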

Transcript