DeepSeek is not stolen tech. It was trained using novel techniques that Western companies were not using.
I thought the innovative part was using more efficient code, not what it’s trained on.
That’s what they said basically.
Like. You can compile better or more diverse datasets to train a model on. But you can also have better code training on the same dataset.
The model is what the code poops out after it’s eaten the dataset. I haven’t read the paper, so I have no idea whether the better training came from some super unique spin on their dataset, but I’m assuming it’s better code.
https://arxiv.org/abs/2405.20304 They invented their own reinforcement learning framework called Group Relative Policy Optimization (GRPO).
EDIT: DeepSeek publicly released the model and published the methods to the global community, and there is now an open effort by researchers to reproduce them: https://github.com/huggingface/open-r1 It is pretty much the opposite of stealing.
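For anyone wondering what the "group relative" part of GRPO means in practice, here is a rough sketch of the core idea as I understand it: sample several answers to the same prompt, score each one, and judge every answer against the average of its own group instead of training a separate value model. The function name `group_relative_advantages`, the `eps` parameter, and the example rewards are made up for illustration, and using the population standard deviation is my assumption. This is not DeepSeek's actual code, and it leaves out the clipped policy update and the KL penalty entirely.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the other samples for the same prompt.

    Hypothetical helper for illustration only. Assumes one scalar reward
    per sampled answer and uses the population standard deviation.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps keeps the division safe when every answer got the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled answers to one prompt: 1.0 = correct, 0.0 = wrong (made-up scores)
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Answers that beat their group's average get a positive advantage and are reinforced, while the ones below average get pushed down. If every answer in the group scores the same, all advantages are zero and that prompt contributes no learning signal.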
@deranger @theunknownmuncher The US trying to stifle Chinese progress and stop chip exports has had exactly the effect anyone could have seen coming. China is making leaps and bounds in all sorts of tech areas, innovating around obstacles.