10 points

Can you elaborate on the differences?

20 points

Base models are general purpose language models, mainly useful for AI researchers and people who want to build on top of them.

Instruct or chat models are chatbots. They are made by fine-tuning base models.

The V3 models linked by OP are DeepSeek’s non-reasoning models, similar to Claude or GPT-4o. These are the “normal” chatbots that reply with whatever comes to mind first. DeepSeek also has a reasoning model, R1. Such models take time to “think” before giving their final answer; they tend to do better on things like math problems, at the cost of taking longer to respond.
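
To make the “think first” part concrete: the open R1 weights typically emit their reasoning between `<think>` and `</think>` tags ahead of the final answer. Here is a minimal sketch of separating the two, assuming that tag convention holds for whatever you are running. The helper name `split_reasoning` and the sample string are made up for illustration, not taken from any DeepSeek tooling.

```python
def split_reasoning(output: str) -> tuple[str, str]:
    """Split a model reply into (reasoning, answer).

    Assumes the reasoning is wrapped in <think>...</think>. If the tags are
    missing or out of order, the whole reply is treated as the answer.
    """
    open_tag, close_tag = "<think>", "</think>"
    start = output.find(open_tag)
    end = output.find(close_tag)
    if start == -1 or end == -1 or end < start:
        return "", output.strip()
    reasoning = output[start + len(open_tag):end].strip()
    answer = output[end + len(close_tag):].strip()
    return reasoning, answer


# Hypothetical reply, not real model output.
sample = "<think>2 + 2 is 4, times 3 is 12.</think>The answer is 12."
reasoning, answer = split_reasoning(sample)
print("reasoning:", reasoning)
print("answer:", answer)
```

A non-reasoning model like V3 would just give you the answer part, which is why it feels faster.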

It should be mentioned that you probably won’t be able to run these models yourself unless you have a data-center-style rig with 4-5 GPUs. The DeepSeek V3 and R1 models are chonky beasts. There are smaller “distilled” versions of R1 that can be run locally, though.
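
For a rough sense of scale, here is a back-of-the-envelope sketch of how much memory the weights alone need at a few precisions. The 671B and 32B parameter counts are the ones quoted in this thread; the bit widths are just common quantization levels picked for illustration, and the estimate ignores the KV cache, activations, and runtime overhead, so real requirements are higher.

```python
def weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


# Parameter counts as quoted in the thread; bit widths are illustrative.
models = [("full R1 (671B)", 671), ("R1 distill (32B)", 32)]
for name, params in models:
    for bits in (16, 8, 4):
        gb = weight_memory_gb(params, bits)
        print(f"{name} at {bits}-bit: ~{gb:.1f} GB of weights")
```

Even at 4 bits per parameter the full model needs hundreds of GB just for weights, while a 32B distill at 4 bits lands in the range of a high-end consumer GPU or a machine with plenty of RAM.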

5 points

I’ve heard people say they can run the R1 32B model on moderate gaming hardware, albeit slowly.

1 point

My Legion Slim 5 14" runs it not too badly.

5 points

The 32B is still a distilled model. The full one is 671B.

-2 points

R1 is lightweight and optimized for local environments on a home PC. It’s supposed to be pretty good at programming and logic and kind of awkward at conversation.

V3 is powerful and meant to run on cloud servers. It’s supposed to make for some pretty convincing conversations.

5 points

R1 isn’t really runnable with a home rig. You might be able to run a distilled version of the model though!

1 point

You’re absolutely right. I wasn’t trying to get that in-depth, which is why I said “lightweight and optimized” instead of “when using a distilled version,” since that raises more questions than it answers. But I probably overgeneralized by making it a blanket statement like that.

5 points

Tell that to my home rig currently running the 671B model…

2 points

https://www.deepseekv3.com/en/download

I was assuming one was pre-trained and one wasn’t, but I don’t think that’s correct and I don’t care enough to investigate further.

17 points

Is that website legit? I’ve only ever seen https://www.deepseek.com/

And I would personally recommend downloading from Hugging Face or Ollama.


Technology

!technology@lemmy.world
