Office Space meme:
“If y’all could stop calling an LLM ‘open source’ just because they published the weights… that would be great.”
Seems kinda reductive about what makes it different from most other LLMs.
The other LLMs aren’t open source, either.
Isn’t that just trained from the other AI?
Most certainly not. If it were, it wouldn’t output coherent text, since LLM output degenerates if you human-centipede its outputs.
And the way it uses that data, afaik, is open and editable, and the license to use it is open.
From that standpoint, every binary blob should be considered “open source”, since the machine instructions are readable in RAM.
-
Well that’s the argument.
-
AI condensing AI is what is being talked about here. From my understanding, DeepSeek is two parts: they start with known datasets, and the two parts bounce ideas against each other and calculate fitness, roughly like the toy loop sketched below. So degrading recursive results is being directly tackled here. But training sets are tokenized gathered data. The gathering of data sets is a rights issue, but that is not part of the conversation here.
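As a toy sketch of what I mean (the function names and the random scorer are my own stand-ins, not DeepSeek’s actual pipeline): one part proposes answers, the other scores them, and only high-fitness samples are kept for further training.

```python
# Toy sketch only: generate() and score_fitness() are stand-ins I made up, not
# DeepSeek's actual components. The idea: one part proposes, the other scores,
# and only high-fitness samples survive to be trained on again.
import random

def generate(prompt):
    # part 1: the model proposes a candidate answer
    return f"{prompt} -> candidate answer {random.randint(0, 9)}"

def score_fitness(prompt, answer):
    # part 2: the checker rates the answer (random here; a real verifier in practice)
    return random.random()

prompts = [f"question {i}" for i in range(100)]
kept = []
for p in prompts:
    a = generate(p)                   # propose
    if score_fitness(p, a) > 0.8:     # score; low-fitness samples are discarded
        kept.append((p, a))           # survivors feed the next training round

print(f"kept {len(kept)} of {len(prompts)} generated samples for retraining")
```

That filtering step is what keeps the recursive AI-on-AI loop from degrading, at least in principle.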
-
It could be I don’t have a complete concept of what open source is, but from looking into it, all the boxes are checked. The data set is not what is different; it’s just data. DeepSeek says its weights are available and open to be changed (https://api-docs.deepseek.com/news/news250120), but the processes that handle that data at unprecedented efficiency are what make it special.
-
The point of open source is access to reproducibility. The weights are the end product (like a binary blob); you also need to supply the way that end product is created to be open source.
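To make the binary-blob analogy concrete, a toy sketch (a made-up linear model, nothing to do with DeepSeek’s actual release):

```python
# Toy illustration: open weights let you *run* the finished model, the way you
# can execute a shipped binary, but reproducing the weights also needs the
# training data and the recipe.

released_weights = [2.0, -1.0, 0.5]   # what an open-weights release gives you

def run_inference(x):
    # running the model works fine with just the weights
    return sum(w * xi for w, xi in zip(released_weights, x))

print(run_inference([1.0, 2.0, 3.0]))   # -> 1.5

# Reproducing released_weights would additionally require:
#   training_data = ...                      # the corpus
#   released_weights = fit(training_data)    # the actual "source" that builds the end product
```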
-
So it’s not how it tokenized the data you are looking for, it’s not how the weights are applied you want, and it’s not how it functions to structure the output you want, because these are all open… it’s the entirety of the bulk unfiltered data you want. Which DeepSeek was provided from other AI projects for initial training, which can be changed to fit user needs, and which doesn’t touch at all on how this LLM is different from other LLMs? This would be, as I understand it, like saying that an open source game emulator can’t be open source because Nintendo games are encapsulated? I don’t consider the training data to be the LLM. I consider the system that manipulates that data to be the LLM. Is that where the difference in opinion is?