AI language models can exceed PNG and FLAC in lossless compression, says study (arstechnica.com)

posted 1 year ago

FlickOfTheBean@beehaw.org

technology@beehaw.org

29 comments

While LLMs have been used for… a lot, this might be one use where they’re not only reliable but also appear to outperform existing methods of image compression. Being able to cram more data into less space tends to lead to interesting developments, so I will be keeping my eye on this.

What do you guys think? Seem like it’s deserving of less hype than I’m giving it? What kind of security holes do you think this could open?


AutoTL;DR@lemmings.world

5 points

1 year ago

🤖 I’m a bot that provides automatic summaries for articles:


When an algorithm or model can accurately guess the next piece of data in a sequence, it shows it’s good at spotting these patterns.

The study’s results suggest that even though Chinchilla 70B was mainly trained to deal with text, it’s surprisingly effective at compressing other types of data as well, often better than algorithms specifically designed for those tasks.

This opens the door for thinking about machine learning models as not just tools for text prediction and writing but also as effective ways to shrink the size of various types of data.

Over the past two decades, some computer scientists have proposed that the ability to compress data effectively is akin to a form of general intelligence.

The idea is rooted in the notion that understanding the world often involves identifying patterns and making sense of complexity, which, as mentioned above, is similar to what good data compression does.

The relationship between compression and intelligence is a matter of ongoing debate and research, so we’ll likely see more papers on the topic emerge soon.

Saved 75% of original text.


kevincox@lemmy.ml

3 points

1 year ago

I think this is a legitimate use case. It shouldn’t have any security vulnerabilities beyond regular compression-related vulnerabilities.

The core of compression is prediction. Most compression algorithms work roughly like this:

1. Guess what the data is going to be.
2. Encode the difference from the guess.

If your guess is good it doesn’t take much data to encode the difference. So the data stream is smaller.
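To make the guess-then-encode-the-difference idea concrete, here is a minimal sketch. It assumes the simplest possible predictor ("the next byte equals the previous one") and uses zlib as a stand-in for the final entropy-coding stage; the drifting test signal is synthetic and made up for illustration, not data from the study.

```python
import random
import zlib


def delta_encode(samples: bytes) -> bytes:
    """Guess that each byte equals the previous one; store the error mod 256."""
    prev = 0
    out = bytearray()
    for value in samples:
        out.append((value - prev) % 256)
        prev = value
    return bytes(out)


def delta_decode(residuals: bytes) -> bytes:
    """Rebuild the signal by adding each stored error back onto the running guess."""
    prev = 0
    out = bytearray()
    for err in residuals:
        prev = (prev + err) % 256
        out.append(prev)
    return bytes(out)


def drifting_signal(length: int, seed: int = 0) -> bytes:
    """Synthetic stand-in for audio or pixel data: each sample moves by -1, 0 or +1."""
    rng = random.Random(seed)
    level = 128
    samples = bytearray()
    for _ in range(length):
        level = (level + rng.choice((-1, 0, 1))) % 256
        samples.append(level)
    return bytes(samples)


signal = drifting_signal(10_000)
residuals = delta_encode(signal)
assert delta_decode(residuals) == signal  # the round trip is lossless

raw_size = len(zlib.compress(signal, 9))
predicted_size = len(zlib.compress(residuals, 9))
print(f"zlib on raw samples:       {raw_size} bytes")
print(f"zlib on prediction errors: {predicted_size} bytes")
```

The better the guess, the more the errors cluster around zero and the cheaper they are to store. Swapping the previous-byte guess for a stronger predictor is the part a large model would take over.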

AI image generation can be used to guess the data quite effectively, and it can use context that is hard to encode in classic algorithms (such as what a car looks like). This is basically the next step of shared dictionary compression (like what makes Brotli quite effective): instead of shipping a static dictionary of common strings, you compress the dictionary into the model weights. Since the model can do a pretty good job at creating “Image of a girl with brown hair looking right”, you “just” need to encode the difference.

IIUC neither PNG nor FLAC uses pre-shared data, so sending a massive set of neural weights can be an advantage (and presumably you only need to send these weights occasionally).
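A small-scale version of the pre-shared-data effect can be sketched with Python's zlib, which accepts a preset dictionary through its `zdict` argument. The dictionary and message below are made up for illustration; in the LLM setting the model weights would play the dictionary's role, which is an analogy rather than how the study's coder works.

```python
import zlib

# Hypothetical pre-shared dictionary: bytes both sides hold before any message is sent.
SHARED_DICT = b"AI language models can exceed PNG and FLAC in lossless compression"


def compress(data: bytes, zdict: bytes = None) -> bytes:
    """Deflate `data`, optionally priming the compressor with a preset dictionary."""
    comp = zlib.compressobj(level=9, zdict=zdict) if zdict else zlib.compressobj(level=9)
    return comp.compress(data) + comp.flush()


def decompress(blob: bytes, zdict: bytes = None) -> bytes:
    """Inflate `blob`; the receiver must supply the same dictionary the sender used."""
    decomp = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return decomp.decompress(blob) + decomp.flush()


message = b"Study says AI language models can exceed PNG and FLAC in lossless compression."

plain = compress(message)
primed = compress(message, SHARED_DICT)

assert decompress(plain) == message
assert decompress(primed, SHARED_DICT) == message

print(f"original:          {len(message)} bytes")
print(f"without dictionary: {len(plain)} bytes")
print(f"with dictionary:    {len(primed)} bytes")
```

The saving only counts if the dictionary is already on the other side, which is exactly the catch with shipping billions of weights.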


brie@beehaw.org

1 point

1 year ago

An example of a compression algorithm that does support tuning parameters beforehand is zstd, which can train a dictionary on sample data ahead of time.

Even if something isn’t in a pre-shared dataset, I wonder if a sufficiently advanced LLM might be able to do well at compressing predictable but non-repeating data, such as “abc, bcd, cde, […]”.
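One way to put numbers on that: an ideal entropy coder spends about -log2(p) bits on a symbol the model gave probability p. The sketch below compares a model that knows nothing with a hypothetical `rule_model` that has picked up the "abc, bcd, cde" pattern (separators dropped for simplicity). The 99% confidence figure is an assumption chosen for illustration, not a measurement of any real LLM.

```python
import math
import string

ALPHABET = string.ascii_lowercase


def windows(count: int, width: int = 3) -> str:
    """Concatenate 'abc', 'bcd', 'cde', ... wrapping around the alphabet."""
    return "".join(
        ALPHABET[(start + k) % 26] for start in range(count) for k in range(width)
    )


def uniform_model(history: str) -> dict:
    """Knows nothing: every letter is equally likely."""
    return {c: 1 / 26 for c in ALPHABET}


def rule_model(history: str) -> dict:
    """Hypothetical predictor that has learned the sliding-window rule."""
    if not history:
        return uniform_model(history)
    # Inside a window the next letter follows the last one; at a window
    # boundary it follows the first letter of the window just finished.
    anchor = history[-1] if len(history) % 3 else history[-3]
    guess = ALPHABET[(ALPHABET.index(anchor) + 1) % 26]
    return {c: (0.99 if c == guess else 0.01 / 25) for c in ALPHABET}


def ideal_bits(model, text: str) -> float:
    """Shannon code length of `text` when coded with `model`'s predictions."""
    return sum(-math.log2(model(text[:i])[text[i]]) for i in range(len(text)))


text = windows(100)
uniform_bits = ideal_bits(uniform_model, text)
rule_bits = ideal_bits(rule_model, text)
print(f"{len(text)} letters, uniform model: {uniform_bits:.1f} bits")
print(f"{len(text)} letters, rule model:    {rule_bits:.1f} bits")
```

Nothing in the text has to repeat verbatim for this to work; the predictor only has to be right most of the time.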


skip0110@lemm.ee

23 points

1 year ago

I think this model has billions of weights, so the model itself is quite large. Since the receiver needs to already have this model, I’d suggest that rather than compressing the data, we have instead pre-encoded it and embedded it in the model weights, and thus the “compression” is basically just passing a primary key that points to the data in the model.

It’s like, if you already have a copy of a book, I can “compress” any text in that book into 2 numbers: a page offset, and a word offset on that page. But that’s cheating because, at some point, we had to transfer the book too!


Coffee Junky ❤️@beehaw.org

2 points

1 year ago

I feel it’s somewhere in the middle. Your book example only works if you already have the book. If this is a model that is a few gigabytes of data but works for every movie or audio file, it can still be usable. In that case it’s not that you have to send the book first, but you do need to have the same dictionary.
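The trade-off can be written as simple arithmetic: the model is a one-time cost, and it pays off once the per-file savings add up to more than its size. Here is a rough sketch; the function name and all byte counts are hypothetical placeholders, not figures from the study.

```python
def break_even_files(model_bytes: int, baseline_bytes: int, model_coded_bytes: int):
    """Smallest file count n with model_bytes + n * model_coded_bytes < n * baseline_bytes.

    Returns None when the model-based codec saves nothing per file.
    """
    saving_per_file = baseline_bytes - model_coded_bytes
    if saving_per_file <= 0:
        return None
    # n must be strictly greater than model_bytes / saving_per_file.
    return model_bytes // saving_per_file + 1


# Hypothetical numbers: a 4 GB model, 30 MB FLAC tracks, 24 MB after model-based coding.
files_needed = break_even_files(4_000_000_000, 30_000_000, 24_000_000)
print(f"model pays for itself after {files_needed} files")
```

With these made-up numbers the model is worth shipping only for a fairly large library, and the threshold grows in proportion to the model size.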


puttputt@beehaw.org

13 points

1 year ago

Yeah, it’s like saying I can “compress” a png of the Mona Lisa to just the string “Mona Lisa” because I have a database of art.


𝒍𝒆𝒎𝒂𝒏𝒏@lemmy.one

2 points

1 year ago

Once we figure out how to get data out consistently 1:1 without hallucinations, the floodgates will open IMO.

And I’ll be all over it personally. With FLAC files that range anywhere from 20MB to 70MB, I’d much appreciate any savings that rein these in closer to a typical MP3. I don’t mind long compression times, as 7zip and the other formats give us long waiting times already.

If AI accelerator hardware is able to speed up the data compression process, this is where I’d maybe start to get a bit suspicious, as these accelerators are currently included in various IoT and camera SoCs. A single exploit is all that would be needed to theoretically allow the user’s personal data to be siphoned off quickly, without a noticeable change in the volume of network traffic or a negative impact on the performance of the IoT device.


Schmeckinger@feddit.de

2 points

1 year ago

I haven’t really had issues with FLAC in recent years, since 128+ GB microSD cards got dirt cheap. But maybe my music taste isn’t general/wide enough to fill up such a card fast.


abhibeckert@beehaw.org

1 point

1 year ago

If someone wants to know what music I listen to… they could just ask. Or point a directional microphone at the window from a distance (technology that has existed for almost a century now, and modern systems can likely work from miles away).

Either would be a lot easier than exploiting a zero-day side-channel attack, which would surely be detected and closed.

Also, it can take years for a music track to go from an idea in the artist’s head to being available for download by a consumer. If it takes a week to compress the file… that’s no biggie at all. Only the decompression side needs to be power efficient.

