cross-posted from: https://lemmit.online/post/225981

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/homeassistant by /u/janostrowka on 2023-07-19 12:49:02.

Hopefully this will come in handy for our Year of the Voice.

TL;DR: Justin Alvey replaces Google Nest Mini PCB with ESP32 custom PCB which he’s open-sourcing. Shows demo of running LLM voice assistant paired with Beeper to send and receive messages.

Tweet text thread (I would also highly recommend checking out the video demos on Twitter):

I “jailbroke” a Google Nest Mini so that you can run your own LLM’s, agents and voice models. Here’s a demo using it to manage all my messages (with help from @onbeeper) 📷 on, and wait for surprise guest! I thought hard about how to best tackle this and why

After looking into jailbreaking options, I opted to completely replace the PCB. This let’s you use a cheap ($2) but powerful & developer friendly WiFi chip with a highly capable audio framework. This allows a paradigm of multiple cheap edge devices for audio & voice detection…

& offloading large models to a more powerful local device (whether your M2 Mac, PC server w/ GPU or even “tinybox”!) In most cases this device is already trusted with your credentials and data so you don’t have to hand these off to some cloud & data need never leave your home

The custom PCB uses @EspressifSystem’s ESP32-S3 I went through 2 revisions from a module to a SoC package with extra flash, simplifying to single-sided SMT (< $10 BOM) All features such as LED’s, capacitive touch, mute switch are working, & even programmable from Arduino (/IDF)

For this demo I used a custom “Maubot” with my @onbeeper credentials (a messaging app which securely bridges your messaging clients using the Matrix protocol & e2e encryption) which runs locally serving an API

I’m then using GPT3.5 (for speed) with function calling to query this

Fro the prompt I added details such as family & friends, current date, notification preferences & a list additional character voices that GPT can respond in. The response is then parsed and sent to @elevenlabsio

I’ve been experimenting with multiple of these, announcing important messages as they come in, morning briefings, noting down ideas and memos, and browsing agents. I couldn’t resist - here’s a playful (unscripted!) video of two talking to each other prompted to be AI’s from "Her

I’m working on open sourcing the PCB design, build instructions, firmware, bot & server code - expect something in the next week or so. If you don’t want to source Nest Mini’s (or shells from AliExpress) it’s still a great dev platform for developing an assistant! Stay tuned!

  • RegalPotoo@lemmy.world
    link
    fedilink
    English
    arrow-up
    3
    ·
    1 year ago

    Same, but lack of an open source Cast receiver is going to make that a hard sell for me. I hate that it’s anticompetitive proprietary bullshit, but it works, and works really well.