First look at Mycroft

Fri 24 May 2019 in development

#Mycroft

Smart home technology. Great in theory, kind of scary in practice. In particular, smart speaker systems like Amazon Alexa and Google Home are quite scary. They're closed devices that virtually everyone knows are spying on them, yet people use them anyway. And honestly, I can't really blame them. Smart speakers are frankly a really neat idea, kind of like in Star Trek, where people could speak with the onboard computer to trigger actions and commands.

I thought about getting one of these things for fun, to see what I could do with it. I couldn't shake the thought, though, that I was essentially inviting BonziBuddy to sit in my bedroom. Then I remembered a certain Kickstarter project: Mycroft.

So a few weeks back, I purchased the Mycroft Mark 1. Mycroft is a free and open source voice assistant; it can be installed as a service on your local machine, run on a Raspberry Pi 3, or purchased online as a device. I have numerous Raspberry Pis around my home and I could have just reused one of those, but you can't deny... the shell of the Mark 1 is pretty adorable with its big LED eyes and text display.

Mark 1

What makes Mycroft different from the other voice assistants is that it's entirely under your control. You can actually SSH into it and change its behaviour. All of its skills, as its behaviours are called, can be freely removed, and you can go online and add new ones from the Mycroft Home service. Over SSH, you can also use the msm tool (Mycroft Skills Manager) to easily pull in skills from decentralized sources like GitHub.

The skills themselves are developed in Python, as is the core of the service. That means the entire thing is hackable. If you need to change how a particular skill behaves, or even how the core itself behaves, you can just go in with vim, make your changes, and that's it. No recompilation, and most of the time you don't even need to reboot the device. The central process automatically reimports skills you've changed.
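To give a feel for what "automatically reimports" can look like in Python, here is a minimal sketch. It is not Mycroft's actual loader; the `SkillLoader` class and the `hypothetical_skill` module name are made up for illustration, and it assumes change detection by file modification time.

```python
import importlib.util
import os


class SkillLoader:
    """Load a Python file as a module and re-execute it when the file changes.

    Hypothetical helper for illustration only, not part of Mycroft.
    """

    def __init__(self, path):
        self.path = path
        self.mtime = None   # modification time of the version we last loaded
        self.module = None  # the most recently loaded module object

    def get(self):
        """Return the module, reloading it first if the file was modified."""
        mtime = os.stat(self.path).st_mtime_ns
        if self.module is None or mtime != self.mtime:
            # Build a fresh module object from the source file and run it.
            spec = importlib.util.spec_from_file_location(
                "hypothetical_skill", self.path
            )
            module = importlib.util.module_from_spec(spec)
            spec.loader.exec_module(module)
            self.module, self.mtime = module, mtime
        return self.module
```

A long-running process could call `get()` each time it needs the skill, so an edit made in vim is picked up on the next call without restarting anything.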

As for how it works, you call out "Hey Mycroft", then give it a command. Your command gets sent to a server and turned into text. That text is then processed locally on the device through a series of regular expressions. If the regex for a particular skill matches, that skill gets invoked.
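The matching step described above can be sketched in a few lines of plain Python. This is a simplified picture and not Mycroft's real intent parser; the patterns, the handler functions, and the `dispatch` helper are all hypothetical names chosen for the example.

```python
import re


# Hypothetical handlers: each receives the regex match for its skill.
def handle_timer(match):
    return f"Setting a timer for {match.group('minutes')} minutes"


def handle_weather(match):
    return f"Looking up the weather in {match.group('city')}"


# Hypothetical skill table pairing a compiled regex with its handler.
SKILLS = [
    (re.compile(r"set a timer for (?P<minutes>\d+) minutes?", re.IGNORECASE),
     handle_timer),
    (re.compile(r"what(?:'s| is) the weather in (?P<city>.+)", re.IGNORECASE),
     handle_weather),
]


def dispatch(utterance):
    """Invoke the first skill whose regex matches the transcribed text."""
    for pattern, handler in SKILLS:
        match = pattern.search(utterance)
        if match:
            return handler(match)
    return None  # no skill recognised the utterance
```

With this table, `dispatch("Set a timer for 5 minutes")` reaches the timer handler, while an utterance that matches nothing returns `None`. The point is that the text-to-skill decision happens entirely on the device.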

This is the only data the Mycroft team gets: your voice, for the purposes of speech-to-text. They don't control what that speech gets used for, or even know what skill you're trying to use. They've also expressed interest in making it easy to host their service locally, so you don't need to send them recordings of your voice.

Now, as a product for your grandparents, I would hold off at least until the Mark 2. The Mark 1 is great, but it's definitely more geared towards developers and people who want to tinker with the platform. The Mark 2 is overall a more polished revision.

As a product for a hardware/software developer, I would wholeheartedly recommend it. Having a voice assistant you can freely control and extend is incredible. If something is missing, you can just build it yourself. Or chances are someone in the Python community has already gotten you most of the way there with a library or two. In the future, I'll talk a bit about the YouTube skill I developed and what was involved in getting Mycroft to play YouTube videos for me.

Spoiler: it's nowhere near as hard to develop skills as you'd think.

For more information: