You are viewing a single thread.
View all comments View context
35 points

I hardly see it changed to be honest. I work in the field too and I can imagine LLMs being good at producing decent boilerplate straight out of documentation, but nothing more complex than that.

I often use LLMs to work on my personal projects and - for example - often Claude or ChatGPT 4o spit out programs that don’t compile, use inexistent functions, are bloated etc. Possibly for languages with more training (like Python) they do better, but I can’t see it as a “radical change” and more like a well configured snippet plugin and auto complete feature.

LLMs can’t count, can’t analyze novel problems (by definition) and provide innovative solutions…why would they radically change programming?

permalink
report
parent
reply
-1 points

I hardly see it changed to be honest. I work in the field too and I can imagine LLMs being good at producing decent boilerplate straight out of documentation, but nothing more complex than that.

I think one of the top lists on advent of code this year is a cheater that fully automated the solutions using LLMs. Not sure which LLM though, I use LLMs quite a bit and ChatGPT 4o frequently tells me nonsense like “perhaps subtracting by zero is affecting your results” (issues I thought were already gone in GPT 4, but I guess not, Sonnet 3.5 does a bit better in this regard).

permalink
report
parent
reply
1 point

Maybe some postmortem analysis will be interesting. The AoC is also a context in which the domain is self-contained and there is probably a ton of training material on similar problems and tasks. I can imagine LLM might do decently there.

Also there is no big consequence if they don’t and it’s probably possible to bruteforce (which is how many programming tasks have been solved).

permalink
report
parent
reply
1 point

I think you’re spot on with LLMs being mostly trained on these kinds of tasks. Can’t say I’m an expert in how to build a training set, but I imagine it’s quite easy to do with these kinds of problems because it’s easy to classify a solution as correct or incorrect. This is in contrast to larger problems which are less guided by algorithmic efficiency and more by sound design/architecture.

Still, I think it’s quite impressive. You don’t have to go very far back in time to have top of the line LLMs unable to solve these kinds of problems.

Also there is no big consequence if they don’t and it’s probably possible to bruteforce (which is how many programming tasks have been solved).

Usually with AoC part 1 is brute-forceable, but part 2 is not. Very often part 1 is to find the 100th number, and part 2 is to find the 1 000 000 000 000th number or something. Last year, out of curiosity, I had a brute-force solution for one problem that successfully completed on ~90% of the input. Solution was multi-threaded and running on a 16 core CPU for about 20 days before I gave up. But the LLMs this year (not sure if this was a problem last year) are in the top list of fastest users to solve the problems.

permalink
report
parent
reply
-4 points

You’re missing it. Use Cursor or Windsurf. The autocomplete will help in so many tedious situations. It’s game changing.

permalink
report
parent
reply
-5 points

ChatGPT 4o isn’t even the most advanced model, yet I have seen it do things you say it can’t. Maybe work on your prompting.

permalink
report
parent
reply
16 points

That is my experience, it’s generally quite decent for small and simple stuff (as I said, distillation of documentation). I use it for rust, where I am sure the training material was much smaller than other languages. It’s not a matter a prompting though, it’s not my prompt that makes it hallucinate functions that don’t exist in libraries or make it write code that doesn’t compile, it’s a feature of the technology itself.

GPTs are statistical text generators after all, they don’t “understand” the problem.

permalink
report
parent
reply
-7 points

It’s also pretty young, human toddlers hallucinate and make things up. Adults too. Even experts are known to fall prey to bias and misconception.

I don’t think we know nearly enough about the actual architecture of human intelligence to start asserting an understanding of “understanding”. I think it’s a bit foolish to claim with certainty that LLMs in a MoE framework with self-review fundamentally can’t get there. Unless you can show me, materially, how human “understanding” functions, we’re just speculating on an immature technology.

permalink
report
parent
reply

Technology

!technology@lemmy.world

Create post

This is a most excellent place for technology news and articles.


Our Rules


  1. Follow the lemmy.world rules.
  2. Only tech related content.
  3. Be excellent to each another!
  4. Mod approved content bots can post up to 10 articles per day.
  5. Threads asking for personal tech support may be deleted.
  6. Politics threads may be removed.
  7. No memes allowed as posts, OK to post as comments.
  8. Only approved bots from the list below, to ask if your bot can be added please contact us.
  9. Check for duplicates before posting, duplicates may be removed

Approved Bots


Community stats

  • 15K

    Monthly active users

  • 13K

    Posts

  • 570K

    Comments