DeepSeek roundup: banned by governments, no guard rails, lied about its training costs (pivot-to-ai.com)

posted 24 days ago

David Gerard@awful.systemsM

techtakes@awful.systems

56 comments

[ - ]

flizzo@awful.systems

12 points

23 days ago

Not that it should be a regular thing but it is fun to watch this post get swarmed by slopfan reply guys

[ - ]

self@awful.systems

11 points

23 days ago

it’s turning out the most successful thing about deepseek was whatever they did to trick the worst fossbro reply guys you’ve ever met into going to bat for them

[ - ]

bjorney@lemmy.ca

32 points

24 days ago

I’m sorry but this says nothing about how they lied about the training cost - nor does their citation. Their argument boils down to “that number doesn’t include R&D and capital expenditures” but why would that need to be included - the $6m figure was based on the hourly rental costs of the hardware, not the cost to build a data center from scratch with the intention of burning it to the ground when you were done training.

It’s like telling someone they didn’t actually make $200 driving Uber on the side on a Friday night because they spent $20,000 on their car, but ignoring the fact that they had to buy the car either way to get to their 6-figure day job.

[ - ]

ebu@awful.systems

22 points

23 days ago

i think you’re missing the point that “Deepseek was made for only $6M” has been the trending headline for the past while, with the specific point of comparison being the massive costs of developing ChatGPT, Copilot, Gemini, et al.

to stretch your metaphor, it’s like someone rolling up with their car, claiming it only costs $20 (unlike all the other cars that cost $20,000), when come to find out that number is just how much it costs to fill the gas tank up once

[ - ]

Soyweiser@awful.systems

7 points

22 days ago

Now I’m imagining GPUs being traded like old cars.

*slaps GPU* This GPU? Perfectly fine. Second hand, yes, but only used to train one model, by an old lady. Will run the upcoming Monster Hunter Wilds perfectly fine.

[ - ]

bjorney@lemmy.ca

7 points

23 days ago

DeepSeek-V3 costs only 2.788M GPU hours for its full training. Assuming the rental price of the H800 GPU is $2 per GPU hour, our total training costs amount to only $5.576M. Note that the aforementioned costs include only the official training of DeepSeek-V3, excluding the costs associated with prior research and ablation experiments on architectures, algorithms, or data.

Emphasis mine. DeepSeek was very upfront that this $6M was training only. No other company includes R&D and salaries when they report model training costs, because those aren’t training costs.
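
For anyone who wants to check the quoted figure, the arithmetic can be sketched in a few lines of Python. The GPU-hour count and the $2 rental rate are taken from the quote above; the function name `training_cost_usd` is made up for illustration. The result covers only the rented-compute estimate for the final training run, not R&D, salaries, or hardware purchases.

```python
def training_cost_usd(gpu_hours: int, usd_per_gpu_hour: int) -> int:
    """Rental-cost estimate: GPU hours multiplied by an assumed hourly rate."""
    return gpu_hours * usd_per_gpu_hour


# Figures as stated in the quoted passage (2.788M H800 GPU hours at $2/hour).
GPU_HOURS = 2_788_000
USD_PER_GPU_HOUR = 2

cost = training_cost_usd(GPU_HOURS, USD_PER_GPU_HOUR)
print(f"${cost / 1e6:.3f}M")  # prints $5.576M
```

The "$6M" in the headlines is this number rounded up, so it inherits the same assumptions: a rental price rather than a purchase price, and one training run rather than the whole development effort.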

[ - ]

ebu@awful.systems

11 points

23 days ago

consider this paragraph from the Wall Street Journal:

DeepSeek said training one of its latest models cost $5.6 million, compared with the $100 million to $1 billion range cited last year by Dario Amodei, chief executive of the AI developer Anthropic, as the cost of building a model.

you’re arguing to me that they technically didn’t lie – but it’s pretty clear that some people walked away with a false impression of the cost of their product relative to their competitors’ products, and they financially benefitted from people believing in this false impression.

[ - ]

msage@programming.dev

0 points

23 days ago

No, it’s not. OpenAI doesn’t spend all that money on R&D, they spent the majority of it on the actual training (hardware, electricity).

And that’s (supposedly) only $6M for Deepseek.

So where is the lie?

[ - ]

froztbyte@awful.systems

6 points

23 days ago

shot:

majority of it on the actual training (hardware, …)

chaser:

And that’s (supposedly) only $6M for Deepseek.

citation:

After experimentation with models with clusters of thousands of GPUs, High Flyer made an investment in 10,000 A100 GPUs in 2021 before any export restrictions. That paid off. As High-Flyer improved, they realized that it was time to spin off “DeepSeek” in May 2023 with the goal of pursuing further AI capabilities with more focus.

So where is the lie?

your post is asking a lot of questions already answered by your posting

[ - ]

24 points

24 days ago

banned from use by government employees in Australia

So is every other AI except Copilot built into Microsoft products. Government employees can’t use ChatGPT directly. So this point is a bit disingenuous.

[ - ]

David Gerard@awful.systemsOPM

6 points

23 days ago

They specifically named this one; you don’t have to make up reasons that somehow it doesn’t count.

[ - ]

Empricorn@feddit.nl