For anyone with an existing Home Assistant setup, the Home Assistant Voice Preview is a pretty good alternative when it comes to voice control of HA. The setup is very easy. If you want conversational functionality, you could even hook it up to an LLM, cloud or local. It can also be used for media playback and it’s got an aux out port.
I used to use Google Home Mini for voice control of Home Assistant. The Voice Preview replaced that rather nicely.
That’s tempting, and not a hideous price either.
The difference between a pi and open voice?
Amazon employee with no piss breaks listening in on my echo:
“How many fucking cats does this guy have? Just choose one name and call it that!”
Edit: “I don’t know, Jeff, sell him a fucking Dr. Seuss book or something, the guy’s mental.”
The part that really gets me is that you have to opt out to not have everything you say saved. Bonkers that that isn’t the default! There’s no good user-based reason for this. Alexa doesn’t remember shit for users, like any AI there’s no recall feature. You can’t say remember what I told you last night - give the address for that place, I was drunk and don’t remember the name.
Publicly, that is. They have no doubt been doing it in secret since they launched it.
If you look at the article, it was only ever possible to do local processing with certain devices and only in English. I assume that those are the ones with enough compute capacity to do local processing, which probably made them cost more, and that the hardware probably isn’t capable of running whatever models Amazon’s running remotely.
I think that there’s a broader problem than Amazon and voice recognition for people who want self-hosted stuff. That is, throwing loads of parallel hardware at something isn’t cheap. It’s worse if you stick it on every device. Companies — even aside from not wanting someone to pirate their model running on the device — are going to have a hard time selling devices with big, costly, power-hungry parallel compute processors.
What they can take advantage of is that for a lot of tasks, the compute demand is only intermittent. So if you buy a parallel compute card, the cost can be spread over many users.
I have a fancy GPU that I got to run LLM stuff that ran about $1000. Say I’m doing AI image generation with it 3% of the time. It’d be possible to do that compute on a shared system off in the Internet, and my share of the hardware cost would be about $30. That’s a heckofa big improvement.
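To make the arithmetic explicit, here is a rough sketch in Python. It assumes perfect sharing (no overlapping demand, no operator overhead or margin), so it is a lower bound rather than a real price, and the function names are made up for illustration.

```python
def amortized_cost(hardware_cost: float, utilization: float) -> float:
    """Idealized per-user share of the hardware cost on a perfectly shared card.

    Assumes users never need the card at the same time and ignores power,
    hosting, and profit margin.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return hardware_cost * utilization


def users_per_card(utilization: float) -> float:
    """How many users with this utilization one card could serve back to back."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return 1 / utilization


share = amortized_cost(1000, 0.03)   # a $1000 card used 3% of the time
sharers = users_per_card(0.03)       # roughly 33 such users fit on one card
print(f"share of hardware cost: ${share:.2f}, users per card: {sharers:.1f}")
```

In practice demand is bursty and overlaps, so a real shared service would need headroom and would cost more per user than this idealized share.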
And the situation that they’re dealing with is even larger, since there might be multiple devices in a household that want to do parallel-compute-requiring tasks. So now you’re talking about maybe $1k in hardware for each of them, not to mention the supporting hardware like a beefy power supply.
This isn’t specific to Amazon. Like, this is true of all devices that want to take advantage of heavyweight parallel compute.
I think that one thing that it might be worth considering for the self-hosted world is the creation of a hardened network parallel compute node that exposes its services over the network. So, in a scenario like that, you would have one (well, or more, but could just have one) device that provides generic parallel compute services. Then your smaller, weaker, lower-power devices — phones, Alexa-type speakers, whatever — make use of it over your network, using a generic API. There are some issues that come with this. It needs to be hardened, can’t leak information from one device to another. Some tasks require storing a lot of state — like, AI image generation requires uploading a large model, and you want to cache that. If you have, say, two parallel compute cards/servers, you want to use them intelligently, keep the model loaded on one of them insofar as is reasonable, to avoid needing to reload it. Some devices are very latency-sensitive — like voice recognition — and some, like image generation, are amenable to batch use, so some kind of priority system is probably warranted. So there are some technical problems to solve.
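As a very rough sketch of the scheduling part only (not the hardening or isolation part), something like the following Python could work. Everything here is hypothetical: the `Node`, `Job`, and `Scheduler` names, the two priority levels, and the placement rule are illustrative assumptions, not an existing project or API.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count
from typing import Optional

# Hypothetical priority levels: lower number runs first.
INTERACTIVE = 0  # latency-sensitive, e.g. voice recognition
BATCH = 1        # can wait, e.g. image generation


@dataclass
class Node:
    name: str
    loaded_model: Optional[str] = None  # model currently cached on this card
    reloads: int = 0                    # how often a model had to be (re)loaded


@dataclass(order=True)
class Job:
    priority: int
    seq: int                            # tie-breaker: FIFO within a priority
    model: str = field(compare=False)
    client: str = field(compare=False)


class Scheduler:
    def __init__(self, nodes):
        self.nodes = list(nodes)
        self._queue = []
        self._seq = count()

    def submit(self, client, model, priority):
        heapq.heappush(self._queue, Job(priority, next(self._seq), model, client))

    def _pick_node(self, model):
        # Prefer a node that already has the model cached, to avoid a reload.
        for node in self.nodes:
            if node.loaded_model == model:
                return node
        # Otherwise prefer a node with nothing loaded, then fall back to the first.
        for node in self.nodes:
            if node.loaded_model is None:
                return node
        return self.nodes[0]

    def run_next(self):
        """Pop the most urgent job and place it; returns (client, model, node name)."""
        if not self._queue:
            return None
        job = heapq.heappop(self._queue)
        node = self._pick_node(job.model)
        if node.loaded_model != job.model:
            node.loaded_model = job.model
            node.reloads += 1
        return job.client, job.model, node.name


sched = Scheduler([Node("gpu0"), Node("gpu1")])
sched.submit("photo-frame", "image-gen", BATCH)
sched.submit("kitchen-speaker", "speech-to-text", INTERACTIVE)
sched.submit("phone", "image-gen", BATCH)
order = [sched.run_next() for _ in range(3)]
print(order)
```

The voice request jumps the queue even though it arrived second, and the second image job lands on the card that already has the image model loaded. A real server would also need preemption, memory accounting, and per-client isolation, none of which this sketch attempts.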
But otherwise, the only real option for heavy parallel compute is going to be sending your data out to the cloud. And even if you don’t care about the privacy implications or the possibility of a company going under, as I saw some home automation person once point out, you don’t want your light switches to stop working just because your Internet connection is out.
Having per-household self-hosted parallel compute on one node is still probably more-costly than sharing parallel compute among users. But it’s cheaper than putting parallel compute on every device.
Linux has some highly-isolated computing environments like seccomp that might be appropriate for implementing the compute portion of such a server, though I don’t know whether it’s too-restrictive to permit running parallel compute tasks.
In such a scenario, you’d have a “household parallel compute server”, in much the way that one might have a “household music player” hooked up to a house-wide speaker system running something like mpd or a “household media server” providing storage of media, or suchlike.
Off-device processing has been the default from day one. The only thing changing is the removal of local processing on certain devices, likely because the new backing AI model will no longer be able to run on that hardware.
With on-device processing, they don’t need to send audio. They can just send the text, which is infinitely smaller and easier to encrypt as “telemetry”. They’ve probably got logs of conversations in every Alexa household.
This has always blown my mind. Watching people willingly allow Big Brother-esque devices into their home for very, very minor conveniences like turning on some gimmicky multi-colored light bulbs. Now they’re literally using home “security” cameras that store everything on some random cloud server. I’ll truly never understand.
Why has no security researcher published evidence of these devices with microphones uploading random conversations? Nobody working on the inside has ever leaked anything regarding this potentially massive breach of privacy? A perfectly secret conspiracy by everyone involved?
We know more about top secret NSA programs than we do about this proposed Alexa spy mechanism. None of the people working on this at Amazon have wanted to leak anything?
I’m not saying it’s not possible, but it seems extremely improbable to me that everyone’s microphones are listening to their conversations, they’re being uploaded somewhere to serve them better ads, and absolutely nobody has leaked anything or found any evidence.
It’s better to be safe than sorry is all I’m saying.
Edit: There’s also this.
I’m not saying it’s not possible
There is no argument from ignorance fallacy in what I said. I am not claiming these devices never send audio without you wanting because there’s no evidence to the contrary.
However, the idea that everyone’s microphones are always listening, and that’s why you saw an ad for whatever after talking to your friend, yet not a single person has observed a device uploading this kind of data, nor has anyone ever leaked any kind of information on this supposed system, is extremely unlikely to be true in my opinion.
They don’t need microphones to do this. Regular tracking is plenty to do a good job at suggesting you a highly relevant ad, and frequency illusion does the rest. You’re not noticing the thousand times you see ads that are irrelevant to whatever you were talking about, but the one time you do notice really sticks out.
Frankly, there are plenty of more concerning ways of violating our privacy that are out in the open, and I believe those are a much higher priority than mics always recording, for which there is no evidence.
If no proof is offered (in either direction), then the proposition can be called unproven, undecided, inconclusive, an open problem or a conjecture.
Stating that you don’t think that it’s possible is irrelevant. It’s either happening or it isn’t. True or false. P or ¬P.
is extremely unlikely to be true in my opinion.
Is an argument from ignorance. Not trying to be rude, but this is basic logic.
Do you own a smartphone?
Yeah, but it’s rooted and running a custom ROM ;)
Nobody working on the inside has ever leaked anything regarding this potentially massive breach of privacy? A perfectly secret conspiracy by everyone involved?
Sure, but that’s not the commonly repeated conspiracy, even by non-technical normal people, that everyone’s mics are listening all the time and they’re being used to serve you ads or whatever. The scale of this is not at all comparable to what I’m talking about. Yeah, I’m sure sometimes devices are activated inadvertently, those recordings are uploaded, and people have listened to those recordings when they didn’t have permission. That is a far cry from all devices listening nearly all the time, using some surreptitious method to upload the data, and what was being recorded being used for some nefarious purpose.
Again, I’m not excusing these devices for being a privacy nightmare, but I just think it’s extremely implausible that Alexa, Siri, Google, etc. are always listening and nobody has discovered a device uploading.
The real privacy nightmare is that recording your conversations is completely unnecessary to build a richly detailed profile of you and your contacts. Regular old device / browser fingerprinting and a few people in your group sharing contacts with apps is enough for that, and it’s not a top secret conspiracy.
Per that article, it only happens when it thinks it’s been activated, and only when you opt in. Not much of a bombshell.
Emphasis on “when it thinks”. Not much point to a privacy control that the device can just ignore for unspecified reasons, and they had 150+ instances of that occurring in this data set.
Because if they would publish it, the other security experts would say “well, duh, that’s how it works”.
It is just the average people that are unaware of it, or don’t seem to care.
I mean… I 100% agree, and yet you and I and everyone reading this are carrying around a phone that can do the exact same shit
I am not, thank you very much. Even if I wasn’t, you can simply disable the wake word. And you can go into your account (if you have one) and see/listen to any recordings it has made to verify that it has stopped listening.
This is why jailbreaking/rooting your phone is so important.
My mom has one of those Google ones, I hate it.
My brother and a buddy both have Alexas. And yeah, I hate being anywhere near the thing.
I have always told people to avoid Amazon.
They have doorbells to watch who comes to your house and when.
Indoor and outdoor security cameras to monitor when you go outside, for how long, and why.
They acquired roomba, which not only maps out your house, but they have little cameras in them as well, another angle to monitor you through your house in more personal areas that indoor cameras might not see.
They have the Alexa products meant to record you at all times for their own use and intent.
Why do you think along with Amazon Prime subscriptions you get free cloud storage, free video streaming, free music? They are categorizing you in the most efficient and accurate way possible.
Boycott anything Amazon touches
I agree with your sentiment and despise Amazon, but they do not own Roomba; the deal fell through.
Christ, finally a win
That is actually good news to hear. Not completely good on my part for being incorrect about ownership, but once I saw the proposed deal back when it was announced, I immediately added them to the “no I don’t think I will.” list of products I won’t support.
Cheers for the clarification mate
They backed out of the Roomba deal. Now iRobot is going down the shitter.
Amazon really got people to pay to be spied on. Wild world we live in bois
Who pays for Alexa?
Everyone who didn’t get an echo as a gift, I’d imagine
Plenty of people I know have gotten the little echo dots or the bigger alternative with larger speakers for Christmas or birthdays. Technically they didn’t spend money, but their friends and family did.
I see. The initial purchase price is the “payment”. I thought the intimation was some sort of subscription to use Alexa. My bad.
They typed from their device that is also spying on them that they most likely also paid for…
Please, sir I have a pager
People are saying don’t get an Echo, but this is the tip of an iceberg. My coworkers’ cell phones are eavesdropping. My neighbors’ doorbells record every time I leave the house. Almost every new vehicle mines us for data. We can avoid some of the problem but we cannot avoid it all. We need a bigger, more aggressive solution if we are going to have a solution at all.
How about regulation? Let’s start with saying data about me belongs to me, not to whoever collected the data, as is currently the case
My clunky old bike ain’t listening to shit bro. Neither is my android phone using a custom rom.
Jam the mic? https://www.amazon.com/gp/aw/d/B08Y5GGP4D
Works on my phone…
the irony of posting an amazon link…
They create a problem, then sell the solution.
If you were using one, you were already okay with this.
Yeah. Hell, chances are they were already
Want to set up a more privacy-friendly solution?
Have a look at Home Assistant! It’s a great open source smart home platform that recently released a local (so not processing requests in the cloud) voice assistant. It’s pretty neat!
I have one big frustration with that: Your voice input has to be understood PERFECTLY by the speech-to-text (STT) system.
If you have a “To Do” list, and speak “Add cooking to my To Do list”, it will do it! But if the STT system understood:
- Todo
- To-do
- to do
- ToDo
- To-Do
- …
The system will say it couldn’t find that list. Same for the names of your lights, asking for the time,… and you have very little control over this.
HA Voice Assistant either needs to find a PERFECT match, or you need to be running a full-blown LLM as the backend, which honestly works even worse in many ways.
They recently added the option to use LLM as fallback only, but for most people’s hardware, that means that a big chunk of requests take a suuuuuuuper long time to get a response.
I do not understand why there’s no option to just use the most similar command upon an imperfect matching, through something like the Levenshtein Distance.
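For what it’s worth, here is a minimal sketch of what that kind of fuzzy matching could look like in Python. This is not how Home Assistant works internally and uses none of its APIs; `best_match`, `normalize`, and the `max_ratio` threshold are made-up names and values for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def normalize(text: str) -> str:
    """Lowercase and drop everything except letters and digits."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def best_match(heard: str, known_names: list[str], max_ratio: float = 0.34):
    """Return the known name closest to what was heard, or None if nothing is close.

    max_ratio is an assumed cutoff: edit distance divided by the longer length.
    """
    target = normalize(heard)
    best_name, best_score = None, None
    for name in known_names:
        candidate = normalize(name)
        distance = levenshtein(target, candidate)
        ratio = distance / max(len(target), len(candidate), 1)
        if best_score is None or ratio < best_score:
            best_name, best_score = name, ratio
    if best_score is not None and best_score <= max_ratio:
        return best_name
    return None


lists = ["To Do", "Shopping", "Groceries"]
for heard in ["Todo", "To-do", "ToDo", "too do", "vacuum"]:
    print(heard, "->", best_match(heard, lists))
```

Normalizing first already makes “Todo”, “To-do”, and “ToDo” identical to “To Do”; the edit distance then catches near misses like “too do”, while the cutoff keeps unrelated words from matching anything.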
Because it takes time to implement. It will come.
I’ve seen something about this pop up occasionally on my feed, but it’s usually a conversation I’m nowhere close to understanding lol
Could you recommend any resources for a complete noob?
Home Assistant is amazing, but it is not yet an alternative to Alexa. The assistant/voice is still in development and far from being usable. It’s impossible for me to remember the specific wording Assist demands, and voice-to-text is incorrect like nine out of ten times. And this includes giving up on terrible locally hosted models and trying out their cloud, which obviously is a huge privacy hole, but even then it was slow and inaccurate. It’s a mystery to me how the FOSS community is so behind on voice. Siri and Google Assistant started working offline years ago, and they work straight on a mobile device.
Today: “…they will be deleted after Alexa processes your requests.”
Some point in the not-so-distant future: “We are reaching out to let you know that your voice recordings will no longer be deleted. As we continue to expand Alexa’s capabilities, we have decided to no longer support this feature.”
“We lied and paid a $3M fine.”
And finally “We are reaching out to let you know Alexa key phrase based activation will no longer be supported. For better personalization, Alexa will always process audio in background. Don’t worry, your audio is safe with us, we highly care about your privacy.”
Or simply “…they will be deleted after Alexa processes your request and generates a token for AI training”.
They could also transcribe the recording and only save that. I mean they absolutely will and surely already did do that.
They literally could just leave the feature on the device, but then you can’t force your users to send you all their data, voices, thoughts and first borns
Fuck Amazon, fuck Bezos
Easy fix: don’t buy this garbage to begin with. It’s terrible for the environment, terrible for your privacy, of dubious value to begin with.
If every man is an onion, one of my deeper layers is curmudgeon. So take that into account when I say fuck all portable speakers. I’m so tired of hearing everyone’s shitty noise. Just fucking everywhere. It takes one person feeling entitled to blast the shittiest music available to ruin everyone in a 500yd radius’s day. If this is you, I hope you stub your toe on every coffee table, hit your head on every door jamb, miss every bus.
I have a Google home. The only reason I have it is because Spotify gave them away for free back in 2019. It sits unplugged somewhere.
spoiler
It still captures your voice
jk (I hope)
That’s covered by my phone.
I can’t believe people are still voluntarily wire tapping themselves in 2025
Does the device you wrote this on have a microphone?
Yes, but they’ll conveniently ignore that on devices they are addicted to.
None of my devices have one that’s lacking a physical switch to disable it.
Be aware: everything you say around Amazon, Apple, Alphabet, Meta, and any other corporate trash products is being sold, trained on, and sent to your local alphabet agency. It’s been this way for a while, but this is a nice reminder to know when to speak and when to listen.
Everyone literally carries a personal recording device.
not everyone. and some people build their own or have ways of mitigating surveillance capitalism
How the fuck does anyone even buy one of these
You can get them on Amazon.
You can also buy some on Facebook Marketplace.
Well played.
The same people who buy mobile phones; despite those being bugs/spy-devices.
True, but a mobile phone is basically a world brain, calculator, camera, flashlight, you can watch movies on it in hi def. Hate it all you want, it’s one of the most versatile tools on the planet. An Echo Dot is just spy garbage and nothing else.
I mean what better spot to siphon off each and every piece of information about you…
Phones are at least easier to justify since everyone kinda needs one now and there aren’t many great private options, especially for the lay person
If you give up your freedom for convenience, then you will lose both.
I mean, it’s not convenience. It’s outright necessary for most jobs.
I mean yeah, but for a lot of people if they ditch their phone they’ll also lose their job and possibly relationships they value.
Cell phones spying on people isn’t good, but most people are simply not informed about how invasive they are and couldn’t make an informed decision if they tried. Pair that with the fact that cell phones are essential for a lot of modern life, and it’s not difficult to see why the average person is generally more wary of smart speakers than cell phones.
The whole damn situation was a trap.
Lmao. Why complain about one and try to justify the other…
I meant they’re easier to justify in the sense that I see why people don’t put much thought into putting a spying device in their pocket, not that I agree with the disregard. Most people’s friends, family, employers, etc. all expect them to have a cell phone and be available by it. Additionally, the way most people interact with their phones, the spying is much less obvious. They joke about them “always listening”, but a lot of people don’t understand the privacy concerns of pretty typical internet use, so given that the device has more than just a microphone, it appears to be worth it to a more typical consumer than us.
Contrast that with an Alexa, Google Home, or Apple home thing, devices which nobody cares if someone else doesn’t own, which most people only see as a microphone and speaker, and whose primary functionality is to always be listening to you. Skepticism arises much more easily.
I’m not saying the level at which cell phones spy on their users is acceptable or even worth it, just that I see why the average user who isn’t conscious of their privacy doesn’t regard them with the same concern they do smart speakers.
At least, on mobile devices, it’s typically easier to install a privacy-focused firmware (like LineageOS or GrapheneOS). Those AI assistants are completely locked down.
I am sorry but the telephony system itself is fundamentally a privacy threat.
Wait till you find out about the internet and social media (including here).
So is the internet.
More directly comparable are the Ring cameras inside the house.
@richardisaguy @Tea sometimes they just come free with stuff. We got given two Google ones when my husband bought a Pixel phone. We were going to sell them on but we never got round to it. You can physically turn off the microphone part though (at least it tells you it’s turned off so fingers crossed) so we use the one with a screen as a digital photo frame (and a speaker) and the other one as just a speaker.
I have a bunch in my house. It’s a glorified radio; all I use it for is:
- Set timer for x minutes
- What time is it
- Ask CBC to play radio one Toronto
- What is the weather today
For the convenience I accept the mining they may do.
Lists are also very handy!