Did nobody really question the usability of language models in designing war strategies?

20 points

LLMs are just plagiarizing bullshitting machines. It’s how they are built: plagiarize if they have the specific training data, modify the answer if they must, and make it up from whole cloth as their base programming. And they are accidentally good enough to convince many people.

-1 points

I will read those, but I bet “accidentally good enough to convince many people” still applies.

A lot of output from LLMs looks good to nonexperts but is full of crap.

0 points

https://notes.aimodels.fyi/researchers-discover-emergent-linear-strucutres-llm-truth/

References a two-author paper. I am not an expert in the field, but it is important to read the papers that cite this one. Those papers will have criticisms that are thought out. In general, fewer authors means less debate between the authors, so it is easier to miss details.

1 point

https://notes.aimodels.fyi/self-rag-improving-the-factual-accuracy-of-large-language-models-through-self-reflection/

A cool paper. It uses the LLM to judge the value of new inputs.
I am always skeptical of summaries of journal articles. Even well-meaning people can accidentally distort the conclusions.

Still, an LLM is a bullshit generator that can check the bullshit level of inputs.

1 point

https://adamkarvonen.github.io/machine_learning/2024/01/03/chess-world-models.html

“However, this only worked for a model trained on a synthetic dataset of games uniformly sampled from the Othello game tree. They tried the same techniques on a model trained using games played by humans and had poor results. To me, this seemed like a major caveat to the findings of the paper which may limit its real world applicability. We cannot, for example, generate code by uniformly sampling from a code tree.”

The author later discusses training on your own data versus general datasets.

I am out of my depth, but this does not seem to provide strong evidence that the model is not just repeating information that shows up a lot for the given inputs.

1 point

https://poke-llm-on.github.io/

Reinforcement learning. Cool project. Still no need to “know” anything. I usually play this type of game with short rules and by monitoring the current state.

1 point

https://arxiv.org/abs/2310.02207

Two-author paper with interesting evidence. Again, evidence, not proof. Wait for the papers that cite this one.

1 point

Yes. There is self-organization, and the possibility of self-reflection, going on in something that wasn’t designed for it. That’s going to spawn a lot more research.

9 points

How is that structurally different from how a human answers a question? We repeat an answer we “know” if possible, assemble something from fragments of knowledge if not, and just make something up from basically nothing if needed. The main difference I see is a small degree of self reflection, the ability to estimate how ‘good or bad’ the answer likely is, and frankly plenty of humans are terrible at that too.

1 point

A human brain can do that on 20 watts of power. ChatGPT uses up to 20 megawatts.

6 points

Yeah, and a car uses more energy than me. It still goes faster. What’s your point? The debate isn’t input vs output. It’s only about output (the ability of the AI).

1 point

I dare say that if you ask a human “Why should I not stick my hand in a fire?” their process for answering the question is going to be very different from an LLM’s.

ETA: Also, working in software development, I’ll tell ya… Most of the time, when people ask me a question, it’s the wrong question and they just didn’t know to ask a different question instead. LLMs don’t handle that scenario.

I’ve tried asking ChatGPT “How do I get the relative path from a string that might be either an absolute URI or a relative path?” It spat out 15 lines of code for doing it manually. I ain’t gonna throw that maintenance burden into my codebase. So I clarified: “I want a library that does this in a single line.” And it found one.
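For illustration only, here is a minimal sketch of what a one-line, library-based answer to that question could look like. It assumes Python and its standard-library `urllib.parse.urlsplit`; the comment above names neither the language nor the library that was eventually found, and the helper name `relative_path` is hypothetical.

```python
from urllib.parse import urlsplit


def relative_path(value: str) -> str:
    """Return only the path part of an absolute URI or a relative path."""
    # urlsplit leaves the scheme and host empty for a plain relative path,
    # so the same single call covers both kinds of input.
    return urlsplit(value).path


print(relative_path("https://example.com/docs/readme.txt?x=1"))
print(relative_path("docs/readme.txt"))
```

Note that the path taken from an absolute URI keeps its leading slash while a relative input comes back unchanged, and that inputs such as Windows drive paths are not handled by this sketch.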

An LLM can be a handy tool, but you have to remember that it’s also a plagiarizing, shameless bullshitter of a monkey paw.

2 points

“Most of the time, when people ask me a question, it’s the wrong question and they just didn’t know to ask a different question instead.”

“I’ve tried asking ChatGPT “How do I get the relative path from a string that might be either an absolute URI or a relative path?” It spat out 15 lines of code for doing it manually. I ain’t gonna throw that maintenance burden into my codebase. So I clarified: “I want a library that does this in a single line.” And it found one.”

You see the irony, right? I genuinely can’t fathom your intent when telling this story, but it is an absolutely stellar example.

You can’t give a good answer when people don’t ask the right questions. ChatGPT’s answers are only as good as the prompts. As for being a “plagiarizing, shameless bullshitter of a monkey paw”, I still don’t think it’s all that different from the results you get from people. If you ask a coworker the same question you asked ChatGPT, you’re probably going to get a line copied from a Google search that may or may not work.

1 point

I would argue that a decent portion of humans are usually OK with admitting they don’t know something.

Unless they are in a situation where they will be punished for not knowing.

My favorite doctor admitted he didn’t know something, and at first I was thinking “Man, that’s weird,” but then I thought about all the times I’ve personally had, or heard stories of, doctors who bullshitted their way through something, like how I couldn’t possibly be diagnosed with ADHD at 18.

1 point

It kind of irks me how many people want to downplay this technology in this exact manner. Yes, you’re sort of right, but in no way does that really change how it will be used and abused.

“But people think it’s real AI tho!”

Okay, and? Most people don’t understand how most tech works, and that doesn’t stop it from doing a lot of good and bad things.

1 point

I’ve been through a few AI winters and hype cycles. They made me very cynical, and convinced that many overly enthusiastic people will run into a firewall face first.

4 points

To be fair they’re not accidentally good enough: they’re intentionally good enough.

That’s where all the salary money went: to find people who could make them intentionally.

6 points

GPT-2 was just a bullshit generator. It was like a politician trying to explain something they know nothing about.

GPT-3 was just a bigger version of GPT-2. It was the same architecture but with more nodes and data, as far as I followed the research. But that one could suddenly do a lot more than the previous version, so that happened by accident. And then the AI scene exploded.

2 points

“It was the same architecture but with more nodes and data”

So the architecture just needed more data to generate useful answers. I don’t think that was an accident.
