9 points

My own theory is that they tokenize key words and phrases with an AI so that they’re not sending the actual audio data. Then it’s stored in a form some AI can parse but isn’t technically user data so they can skirt legislation around that.

A tokenized collection of key phrases, stored as text with the delimiters omitted, is going to be much, much smaller than audio or even a transcript.
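To put rough numbers on that size difference, here is a back-of-envelope sketch. Every constant in it is an assumption picked for illustration (16 kHz 16-bit mono audio, about 150 spoken words per minute, ten keywords kept per minute, 2-byte token IDs), not a measurement of any real app.

```python
# Back-of-envelope payload sizes for one minute of speech.
# All constants below are illustrative assumptions, not measurements.

SAMPLE_RATE_HZ = 16_000      # assumed: common rate for speech audio
BYTES_PER_SAMPLE = 2         # assumed: 16-bit mono PCM, uncompressed
SECONDS = 60
WORDS_PER_MINUTE = 150       # assumed: typical conversational pace
AVG_BYTES_PER_WORD = 6       # assumed: ~5 letters plus a space
KEYWORDS_PER_MINUTE = 10     # assumed: only "interesting" words are kept
BYTES_PER_TOKEN_ID = 2       # assumed: index into a vocabulary of <= 65,536 entries


def payload_sizes():
    """Return the estimated size in bytes of each representation."""
    return {
        "raw_audio": SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * SECONDS,
        "transcript": WORDS_PER_MINUTE * AVG_BYTES_PER_WORD,
        "keyword_ids": KEYWORDS_PER_MINUTE * BYTES_PER_TOKEN_ID,
    }


if __name__ == "__main__":
    for name, size in payload_sizes().items():
        print(f"{name:12s} {size:>10,d} bytes")
```

Under these assumptions, a minute of raw audio is about 1.9 MB, a transcript is about 900 bytes, and the keyword IDs are about 20 bytes. Compressed audio would narrow the first gap, but it would still be orders of magnitude larger than a handful of token IDs.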

12 points

as someone who has played around with offline speech recognition before - there is a reason why ai assistants only use it for the wake word, and the rest is processed in the cloud: it sucks. it’s quite unreliable, you’d have to pronounce things exactly as expected. so you need to “train” it for different accents and ways to pronounce something if you want to capture it properly, so the info they could siphon this way is imho limited to a couple thousand words. which is considerable already, and would allow for proper profiling, but couldn’t capture your interest in something more specific like a mazda 323f.

but offline speech recognition also requires a fair amount of compute power. at least on our phones, it would inevitably drain the battery

2 points

That certainly would make the data smuggling easier. What about battery though? I assume that requires inference and at least rudimentary processing.

How would a background process do this in real time on a mobile device without leaving traceable evidence like cpu time?

3 points

Cox also sells home automation bundles that advertise “smart” features like voice recognition, and those devices are always plugged into the wall.

1 point
*
Deleted by creator
6 points

What if it’s not streaming? What if it’s just cached for future access, e.g. until the next time the user opens the app (when network traffic spikes anyway)?

3 points

Or plugs in their phone at night, bypassing energy use concerns?

3 points

That’s possible too, and in general I’d think a foreground application currently in use alleviates most of the technical restrictions mentioned (read: why we never install FB).

But again we must assume some uncommon device privileges and we still haven’t solved the problem of background energy usage required to record and/or process a real time feed.

2 points

Can it be implemented on a PC? They’re often turned on, and people speak around them too. CPU activity is much harder to trace when there are a lot of different processes running. Someone could blame their phone while it’s actually a nearby PC that’s listening.

4 points

Yeah outside mobile devices I imagine there’s a lot more leeway technically speaking. I’d be far more inclined to suspect a smart TV or a home assistant appliance like Amazon Echo, for example. And certainly there are plenty of PCs out there that are 100% compromised.

But it’s the phone that people often think of as eavesdropping on their conversations. The idea is stickier perhaps because it’s a more personal violation. And I wouldn’t put it past data brokers by any means. They would if they could. I’ve just yet to hear a feasible explanation of how they can without being caught. Hence my doubt.

