-2 points

Curious why your perspective is that they are more of a scam when, by all metrics, they’ve only improved in accuracy?

2 points

One or two models have increased in accuracy. Meanwhile, all the grifters have caught on, and there are 1000x more AI companies out there that are just reselling ChatGPT with some new paint.

1 point

That’s definitely valid, but just because a tool is used for scams doesn’t inherently mean it’s a scam. I don’t call the cellphone a scam because most of my calls are.

3 points

Source?

-2 points

Compare the increase from their V2 GPT4o model to their reasoning o1 preview model. The jumps from last year’s GPT 3.5 -> GPT 4 were also quite large. Secondly, if you want to take OpenAI’s own research into account, that’s in the second image.

6 points

if you want to take OpenAI’s own research into account

No thank you.

OlympicArena validation set (text-only)

“Our extensive evaluations reveal that even advanced models like GPT-4o only achieve a 39.97% overall accuracy (28.67% for mathematics and 29.71% for physics)”

  • The OlympicArena analysis that you cited.

Technology

!technology@lemmy.ml
