Introducing Llama Vision, a website that detects llamas through your camera
a.k.a. How I got started using Tensorflow.js
https://llama.vision (video shown in the background by Karen Ihrig)
A few months ago, I first tried out TensorFlow.js, the machine learning library for JavaScript. I took a pre-trained model I found, called MobileNet, and copied some of the example code. One of the lines said “replace this with your image”, but I thought I’d try it out on a video instead. I grabbed a Creative Commons video of some llamas and… I was astonished!
OMG, just trying out TensorFlow.js for the first time and blown away that with just a few lines of code I can detect llamas in a video in realtime in the browser! 🦙 pic.twitter.com/24i75RNhuc
— Peter O'Shaughnessy (@poshaughnessy) July 6, 2018
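For anyone who wants to start from the same place, the kind of example code I began with looks roughly like this. It’s a sketch rather than the exact snippet: it assumes the script-tag builds of TensorFlow.js and the MobileNet model are already loaded on the page, and the element id is just my own placeholder.

// Assumes <script> tags for @tensorflow/tfjs and @tensorflow-models/mobilenet
// have been included, so `mobilenet` is available as a global
const video = document.getElementById('llamaVideo'); // placeholder id

mobilenet.load()
  .then(model => model.classify(video))
  .then(predictions => {
    // Each prediction has a className and a probability between 0 and 1
    console.log(predictions);
  });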
It gave a probability of over 0.99999 that the video contained a llama. I figured that, to get that level of certainty, perhaps I had just happened to pick one of the exact videos used to train the model. I noticed that the model file was only 31KB. I opened it up and found that it contained a collection that looked like this:
Some of the text from the MobileNet JavaScript file. One of the collection’s entries is “llama”.
Even if I hadn’t been lucky enough to use one of the exact training videos, I was at least fortunate enough to be trying to detect one of the 1,000 object classes that this model covers (entry 355, counting from 0 to 999).
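To give an idea of the shape of that collection: it’s essentially an index-to-label map covering the model’s 1,000 ImageNet classes, something like this (a sketch, not a verbatim copy of the file, with the surrounding entries omitted):

// Rough shape of the class list inside the MobileNet JavaScript file
const IMAGENET_CLASSES = {
  // ...
  355: 'llama',
  // ...
};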
Next I wondered if I could swap the pre-recorded video for live camera input, via getUserMedia. At first it didn’t work, giving me the error: “Requested texture size [0x0] is invalid”. I briefly wondered whether this was an intentional restriction. Thankfully, the solution was simply to set the ‘width’ and ‘height’ attributes on the video element. I pointed the camera at a picture of a llama on my screen (as I didn’t have a real llama to hand 😞)… And it worked!
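Here’s a sketch of that camera hook-up. The element id and camera constraints are just my illustrative choices; the important part is giving the video element explicit dimensions.

const video = document.getElementById('llamaVideo');

// Without explicit dimensions, TensorFlow.js saw a 0x0 texture and threw the error above
video.setAttribute('width', 640);
video.setAttribute('height', 480);

navigator.mediaDevices.getUserMedia({ video: { facingMode: 'environment' }, audio: false })
  .then(stream => {
    video.srcObject = stream;
    return video.play();
  })
  .catch(err => console.error('Could not access the camera:', err));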
Now I’ve added a simple user interface and hosted it at https://llama.vision, so you can try it too:
The source code is here on GitHub. The key piece of code is small and straightforward, as TensorFlow.js and MobileNet do all the hard work for us:
// Load the prediction model
mobilenet.load().then(model => {
  // Now we can keep checking the camera feed at regular intervals, like this:
  model.classify(video).then(predictions => {
    const topResult = predictions[0];
    if (topResult.className === 'llama') {
      // Woo! Llama! NB. We can also get the confidence value from the `probability` property
      ...
    }
  });
});
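For the “at regular intervals” part, one simple way to wire it up is with requestAnimationFrame, kicking off the next check once the previous one has finished. This is just a sketch of one approach (and the element id is again my placeholder):

mobilenet.load().then(model => {
  const video = document.getElementById('llamaVideo');

  async function checkFrame() {
    const predictions = await model.classify(video);
    const topResult = predictions[0];
    if (topResult.className === 'llama') {
      // e.g. surface the confidence to the user
      console.log(`Llama! Confidence: ${topResult.probability.toFixed(3)}`);
    }
    // Schedule the next check after this one completes
    requestAnimationFrame(checkFrame);
  }

  requestAnimationFrame(checkFrame);
});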
One thing to note: in the Android browsers I’ve tried other than Chrome, TensorFlow.js gives the warning “Extension WEBGL_lose_context not supported on this browser”, and the model takes a long time to load and classify (I guess because it fails to use WebGL for better performance). Hopefully I’ll come across a solution for this soon.
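If you want to check what’s happening in your own browser, you can log which backend TensorFlow.js has picked (a quick diagnostic sketch, assuming the global tf object from the script-tag build):

// 'webgl' means hardware acceleration is in use; 'cpu' suggests the WebGL backend failed to initialise
console.log('Active TensorFlow.js backend:', tf.getBackend());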
And one day I hope to make it to a llama farm — so I can test it out in real life!
If you’re based near London and interested in machine learning with JavaScript, our friend Asim Hussain runs a regular AI JavaScript London meetup. Maybe I’ll see you there!
By Peter O’Shaughnessy on November 1, 2018.