3 points

I suspect “reasoning” models are just taking advantage of the law of averages. You could get much better results from earlier LLMs if you provided plenty of context in your prompt: doing so constrains the range of possible outputs, which helps reduce “hallucinations”. You could even use LLMs to produce that context for you. To me it seems like reasoning models are just trained to do all of that in one go.
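To make that concrete, here’s a rough sketch of the two-stage pattern I mean. `complete` is a hypothetical stand-in for whatever completion API you use:

```python
def complete(prompt: str) -> str:
    # Stand-in for a call to any LLM completion API.
    raise NotImplementedError("wire up your LLM API of choice here")

def answer_with_generated_context(question: str) -> str:
    # Stage 1: have the model lay out relevant facts and constraints.
    context = complete(
        f"List the key facts and constraints relevant to: {question}"
    )
    # Stage 2: answer conditioned on that context, which narrows
    # the range of plausible outputs.
    return complete(
        f"Context:\n{context}\n\n"
        f"Using only the context above, answer: {question}"
    )
```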

8 points

Neurosymbolic models use symbolic logic to do the reasoning over data that a deep neural network has parsed and classified. If you’re interested in how this works in detail, this is a good paper: https://arxiv.org/abs/2305.00813
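As a toy illustration of that split (not the architecture from the paper, just the general shape, with made-up predicates):

```python
def neural_parse(raw_input) -> set[tuple]:
    # Stand-in for a deep network that parses/classifies raw data
    # into ground facts, e.g. ("cat", "obj1"), ("on", "obj1", "mat").
    return {("cat", "obj1"), ("on", "obj1", "mat")}

def apply_rules(facts: set[tuple]) -> set[tuple]:
    # Symbolic side: one illustrative forward-chaining rule,
    # cat(X) and on(X, Y) => sitting_on(X, Y).
    derived = set(facts)
    for (_, x) in [f for f in facts if f[0] == "cat"]:
        for (_, x2, y) in [f for f in facts if f[0] == "on"]:
            if x == x2:
                derived.add(("sitting_on", x, y))
    return derived

facts = apply_rules(neural_parse("photo.jpg"))
# includes the derived fact ("sitting_on", "obj1", "mat")
```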

4 points

I appreciate the link, but I stand by my point. As far as I’m aware, “reasoning” models like R1 and o3 are not architecturally very different from DeepSeek V3 or GPT-4, which had already integrated some of the features mentioned in that paper.

Also, as an aside, I really despise how compsci researchers and the tech sector borrow language from neuroscience. They take concepts they don’t fully understand and then use them in obscenely reductive ways, which ends up heavily obscuring how LLMs function and what their limitations are. Of course they can’t speak plainly about these things, otherwise the financial house of cards built up around LLMs would collapse. As such, I guess we’re just condemned to live in the fever dreams of tech entrepreneurs who are, at their core, used car salesmen with god complexes.

Don’t get me wrong, LLMs and other kinds of deep generative models are useful in some contexts. It’s just that their utility is not at all commensurate with the absurd amount of resources expended to create them.

1 point

The way to look at models like R1 is as layers on top of the LLM architecture. We’ve basically hit a limit of what generative models can do on their own, and now research is branching out in new directions to supplement what the GPT architecture is good at doing.
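To make the “layers on top” idea concrete, here’s a sketch of one such layer: sampling several reasoning chains and taking a majority vote (the “self-consistency” trick). `complete` is a stand-in for a base-model call, and this is just one example of an outer layer, not a claim about how R1 specifically works:

```python
from collections import Counter

def complete(prompt: str, temperature: float = 0.8) -> str:
    # Stand-in for a call to the underlying generative model.
    raise NotImplementedError("wire up your base LLM here")

def sample_and_vote(question: str, n: int = 5) -> str:
    # Outer layer: sample n reasoning chains, keep the majority answer.
    answers = []
    for _ in range(n):
        chain = complete(f"Think step by step, then answer: {question}")
        answers.append(chain.splitlines()[-1])  # treat the last line as the answer
    return Counter(answers).most_common(1)[0][0]
```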

The potential here is that these kinds of systems will be able to do tasks that fundamentally could not be automated previously. Given that, I think it’s odd to say the utility is not commensurate with the effort being invested in pursuing this goal. Making this work would effectively be a new industrial revolution. The reality is that we don’t actually know what’s possible, but the rate of progress so far has been absolutely stunning.

