-11 points

The 70b model is a distillation of Llama3.3, that is to say it replicates the output of Llama3.3 while using the deepseekR1 architecture for better processing efficiency. So any criticism of the capability of the model is just criticism of Llama3.3 and not deepseekR1.

9 points

[to the tune of Fort Minor’s Remember The Name]

10% senseless, 20% post
15% concentrated spirit of boast
5% reading, 50% pain
and a 100% reason to not post here again
12 points

Thank you for shedding light on the matter. I never realized that 69b model is a pisstillation of Lligma peepee point poopoo, that is to say it complicates the outpoop of Lligma4.20 while using the creepbleakR1 house design for better processing deficiency. Now I finally realize that any criticism of Kraftwerk’s 1978 hit Das Model is just criticism of Sugma80085 and not deepthroatR1.


TechTakes

!techtakes@awful.systems

Big brain tech dude got yet another clueless take over at HackerNews etc? Here’s the place to vent. Orange site, VC foolishness, all welcome.

This is not debate club. Unless it’s amusing debate.

For actually-good tech, you want our NotAwfulTech community
