About as open source as a binary blob without the training data (slrpnk.net)

posted 10 days ago

Prunebutt@slrpnk.net

memes@lemmy.world

193 comments

Office space meme:

“If y’all could stop calling an LLM ‘open source’ just because they published the weights… that would be great.”


stinky@redlemmy.com

3 points

10 days ago


Dkarma@lemmy.world

9 points

9 days ago

I mean, that’s all a model is, so… Once again, someone who doesn’t understand anything about training or models is posting borderline misinformation about AI.

Shocker


Prunebutt@slrpnk.net (OP)

-4 points

9 days ago

Yet another so-called AI evangelist accusing others of not understanding computer science if they don’t want to worship their machine god.


surph_ninja@lemmy.world

1 point

9 days ago

Do you think your comments here are implying an understanding of the tech?


Prunebutt@slrpnk.net (OP)

-1 points

9 days ago

It’s not like you need specific knowledge of Transformer models and whatnot to counterargue LLM bandwagon simps. A basic knowledge of Machine Learning is fine.


4 points

9 days ago

Praise the Omnissiah! … I’ll see myself out.


FooBarrington@lemmy.world

21 points

9 days ago

A model is an artifact, not the source. We also don’t call binaries “open-source”, even though they are literally the code that’s executed. Why should these phrases suddenly get turned upside down for AI models?


intensely_human@lemm.ee

16 points

9 days ago

A model can be represented only by its weights in the same way that a codebase can be represented only by its binary.

Training data is a closer analogue of source code than weights.


SorryforSmelling@lemmy.blahaj.zone

-6 points

9 days ago

what a weird hill to die on


Knock_Knock_Lemmy_In@lemmy.world

1 point

9 days ago

You can do sneaky things with weights that are virtually undetectable.


Treczoks@lemmy.world

14 points

9 days ago

On the contrary. What they open sourced was just a small part of the project. What they did not open source is what makes the AI tick. Having less than one percent of a project open sourced does not make it an “Open Source” project.


randon31415@lemmy.world

2 points

10 days ago

If the Source is Open to copying, and I won’t get sued for doing it, well, then…


Prunebutt@slrpnk.net (OP)

8 points

9 days ago

You don’t have access to the source.


Fushuan [he/him]@lemm.ee

2 points

9 days ago

The source OP is referring to is the training data that they used to compute those weights. Meaning, petabytes of text. Without that, we don’t know which content they used for training the model.

The running/training engines might be open source, but the pretrained model isn’t, and claiming otherwise is wrong.

Nothing wrong with it being this way; most commercial models operate the same way, obviously. Just don’t claim that the model itself is open source, because a big part of open source is that people can reproduce your training to verify that there’s no foul play in the input data. We literally can’t. That’s it.
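
To make the reproducibility point concrete, here is a toy sketch in Python. Everything in it is hypothetical: the `train` and `fingerprint` helpers, the four-point dataset, and the linear model are stand-ins, not how any real LLM is trained. It assumes a fully deterministic training loop, which real large-scale runs often are not (they also depend on the training code, seeds, and hardware).

```python
import hashlib
import struct


def train(data, epochs=200, lr=0.05):
    """Fit y = w*x + b with plain gradient descent (deterministic toy 'training')."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def fingerprint(weights):
    """Hash the exact bit pattern of the two weights."""
    return hashlib.sha256(struct.pack("<2d", *weights)).hexdigest()


# Hypothetical "published" training data and the weights computed from it.
published_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
published_weights = train(published_data)

# With the data in hand, anyone can retrain and compare fingerprints.
reproduced = train(published_data)
matches = fingerprint(reproduced) == fingerprint(published_weights)

# One slipped-in sample changes the weights, but the weights alone
# do not tell you which sample was added.
tampered_data = published_data + [(4.0, -20.0)]
tampered_weights = train(tampered_data)
differs = fingerprint(tampered_weights) != fingerprint(published_weights)

print("reproduced from data:", matches)
print("tampered run differs:", differs)
```

With only `published_weights` and no `published_data`, the comparison in the middle can't be run at all, which is the situation with weights-only releases.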
