You can instantly get whatever you want, only it’s made from 100% technical debt
And then 12 hours spent debugging and pulling it apart.
And if you need anything else, you have to use a new prompt which will generate a brand new application, it’s fun!
But then, as now, it won’t understand what it’s supposed to do, and will merely attempt to apply stolen code - ahem - training data in random permutations until it roughly matches what it interprets the end goal to be.
We’ve moved beyond a thousand monkeys with typewriters and a thousand years to write Shakespeare, and into several million monkeys with copy and paste and only a few milliseconds to write “Hello, SEGFAULT”
If you know what you’re doing, AI is actually a massive help. You can make it do all the repetitive shit for you. You can also have it write the code and you either clean it up or take the pieces that work for you. It saves soooooo much time and I freaking love it.
That’s the thing, it’s a useful assistant for an expert who will be able to verify any answers.
It’s a disaster for anyone who’s ignorant of the domain.
Tell me about it. I teach a python class. Super basic, super easy. Students are sometimes idiots, but if they follow the steps, most of them should be fine. Sometimes I get one who thinks they can just do everything with chatgpt. They’ll be working on their final assignment and they’ll ask me what a for loop is for. Then I look at their code and it looks like Sanskrit. They probably haven’t written a single line of code in all those weeks.
I turned on copilot in VSCode for the first time this week. The results so far have been less than stellar. It’s batting about .100 in terms of completing code the way I intended. Now, people tell me it needs to learn your ways, so I’m going to give it a chance. But one thing it has done is replace the normal auto-completion, which showed you what sort of arguments a function takes, with something that is sometimes dead wrong. Like the code will not even compile with the suggested args.
It also has a knack for making me forget what I was trying to do. It will show me something like the left side picture with a nice rail stretching off into the distance when I had intended it to turn, and then I can’t remember whether I wanted to go left or right? I guess it’s just something you need to adjust to. Like you need to have a thought fairly firmly in your mind before you begin typing so that you can react to the AI code in a reasonable way? It may occasionally be better than what you have in mind, but you need to keep the original idea in your head for comparison purposes. I’m not good at that yet.
I don’t mess with any of those in-IDE assistants. I find them very intrusive and they make me less efficient. So many suggestions pop up and I don’t like that, and like you said, I get confused. The only time I thought one of them (codium) was somewhat useful is when I asked it to make tests for the file I was on. It did get all the positive tests correct, but all the negative ones wrong. Lol. So, I naturally default to the AI in the browser.
Thanks, it makes me feel relieved to hear I’m not the only one finding it a little overwhelming! Previously, I had been using chatgpt and the like where I would be hunting for the answer to a particularly esoteric programming question. I’ve had a fair amount of success with that, though occasionally I would catch it in the act of contradicting itself, so I’ve learned you have to follow up on it a bit.
I haven’t personally used it, but my coworker said using Cursor with the newest Claude model is a gamechanger and he can’t go back anymore 🤷♂️ he hasn’t really liked anything outside of cursor yet
If you’re having to do repetitive shit, you might reconsider your approach.
Depending on the situation, repetitive shit might be unavoidable
Usually you can solve the issue by using regex, but regex can be difficult to work with as well
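Something like this (a totally made-up example, just to show the kind of mechanical edit regex is good for):

```python
import re

# Hypothetical example: turn "getFoo()" style accessor calls into plain attribute access.
source = """
user.getName()
user.getEmail()
order.getTotal()
"""

# Capture the word after ".get", lowercase its first letter, and drop the parentheses.
converted = re.sub(
    r"\.get(\w+)\(\)",
    lambda m: "." + m.group(1)[0].lower() + m.group(1)[1:],
    source,
)

print(converted)
# user.name
# user.email
# order.total
```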
It’s taken me a while to learn how to use it and where it works best but I’m coming around to where it fits.
Just today I was doing a new project. I wrote a couple lines about what I needed and asked for a database schema. It looked about 80% right. Then asked for all the models for the ORM I wanted and it did that. Probably saved an hour of tedious typing.
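To give an idea of the kind of boilerplate I mean, here’s a made-up sketch (not my actual project, and using SQLAlchemy just as an example ORM):

```python
from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()

# Made-up tables, just to show the shape of the boilerplate.
class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    email = Column(String, unique=True, nullable=False)
    orders = relationship("Order", back_populates="user")

class Order(Base):
    __tablename__ = "orders"
    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, ForeignKey("users.id"), nullable=False)
    total_cents = Column(Integer, nullable=False)
    user = relationship("User", back_populates="orders")
```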
I’m telling you. It’s fantastic for the boring and repetitive garbage. Databases? Oh hell yeah, it does really well on that, too. You have no idea how much I hate working with SQL. The ONLY thing it still struggles with so far is negative tests. For some reason, every single AI I’ve ever tried did fine on positive tests, but was just plain bad at the negative ones.
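To be clear about what I mean by positive vs. negative tests, here’s a toy pytest sketch (the function is made up):

```python
import pytest

def divide(a, b):
    if b == 0:
        raise ValueError("cannot divide by zero")
    return a / b

# Positive test: valid input, expected output -- every AI I've tried nails these.
def test_divide_valid():
    assert divide(10, 2) == 5

# Negative test: bad input should raise -- this is where they fall over,
# e.g. expecting a return value of 0 or asserting the wrong exception type.
def test_divide_by_zero():
    with pytest.raises(ValueError):
        divide(10, 0)
```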
I knocked off an android app in Flutter/Dart/Supabase in about a week of evenings with Claude. I have never used Flutter before, but I know enough coding to fix things and give good instructions about what I want.
It would even debug my android test environment for me and wrote automated tests to debug the application, as well as spit out the compose files I needed to set up the Supabase docker container and SQL queries to prep the database and authentication backend.
That was using 3.5 Sonnet, and from what I’ve seen of 3.7, it’s way better. I think it cost me about $20 in tokens. I’ve never used AI to code anything before, this was my first attempt. Pretty cool.
I used 3.7 on a project yesterday (refactoring to use a different library). I provided the documentation and examples in the initial context and it refactored the code correctly. It took the agent about 20 minutes to complete the rewrite and it took me about 2 hours to review the changes. It would have taken me the entire day to do the changes manually. The cost was about $10.
It was less successful when I attempted to YOLO the rest of my API credits by giving it a large project (using langchain to create an input device that uses local AI to dictate as if it were a keyboard). Some parts of the code are correct, the langchain stuff is set up as I would expect. Other parts are simply incorrect and unworkable. It’s assuming that it can bind global hotkeys in Wayland, configuration required editing python files instead of pulling from a configuration file, it created install scripts instead of PKGBUILDs, etc., etc.
I liken it to having an eager newbie. It doesn’t know much, makes simple mistakes, but it can handle some busy work provided that it is supervised.
I’m less worried about AI taking my job than my job turning into being a middle-manager for AI teams.
I think the further you get out into esoteric or new things, the less they have to draw on. I’ve had a bit of the same issue building LoRa telemetry on ESP32 with specific radio modules, because there might only be a couple of real-world examples out there of using those libraries.
I’ve been trying to use aider for this, it seems really cool but my machine and wallet cannot handle the sheer volume of tokens it consumes.
I don’t even know what aider is. Lol. There are so many assistants out there. My company created a wrapper for chatgpt, gave us an unlimited number of tokens, and told us to go ham.
Aider is an LLM agent type app that has a programming assistant and an architect assistant.
You tell the architect what you want and it scans the structure of your code base to generate the boilerplate. Then the coder fills it in. It has command prompt access to then compile and run etc.
I haven’t really figured it out yet.
Not to be that guy, but the image with all the train tracks might just be doing its job perfectly.
It gives you the picture on the right when you asked for a single straight track in the prompt. Now you have to spend 10 hours debugging code and fixing hallucinations of functions that don’t exist in libraries it doesn’t even need to import.
Not a developer. I just wonder how AI hallucinations come about. Is it the ‘need’ to complete the task requested at the cost of being wrong?
Full disclosure - my background is in operations (think IT) not AI research. So some of this might be wrong.
What’s marketed as AI is something called a large language model. This distinction is important because AI implies intelligence, whereas an LLM is something else. At a high level, LLMs use something called “tokens” to break apart natural language into elements that a machine can understand, and then recombine those tokens to “create” something new. When an LLM is creating output it does not know what it is saying - it knows what token statistically comes after the token(s) it has generated already.
So to answer your question: an AI can hallucinate because it does not know the answer - it’s using advanced math to know that the period goes at the end of the sentence, and not in the middle.
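A toy sketch of the token idea (completely made up - real tokenizers use sub-word pieces and huge vocabularies, not a little dictionary of whole words):

```python
# Toy "tokenizer" -- nothing like the real thing, but the core idea is the same:
# text in, numbers out, and the model only ever works with the numbers.
vocab = {"the": 1, "period": 2, "goes": 3, "at": 4, "end": 5, "of": 6, "sentence": 7, ".": 8}

def tokenize(text):
    return [vocab[w] for w in text.lower().replace(".", " .").split()]

print(tokenize("The period goes at the end of the sentence."))
# [1, 2, 3, 4, 1, 5, 6, 1, 7, 8]
```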
No, it’s just that it doesn’t know if it’s right or wrong.
How “AI” learns is that it goes through a text - say a blog post - and turns it all into numbers. E.g. the word “blog” is 5383825526283. The word “post” is 5611004646463. Over a huge amount of text, a pattern emerges that the second number almost always follows the first number. Basically statistics. And it does that for all the words and word combinations it finds - immense amounts of text are needed to find all those patterns. (Fun fact: that’s why companies like OpenAI, which makes ChatGPT, need hundreds of millions of dollars to “train the model” - they need enough computing power, storage, and memory to read the whole damn internet.)
So now how do the LLMs “understand”? They don’t, it’s just a bunch of numbers and statistics of which word (turned into that number, or “token” to be more precise) follows which other word.
So now. Why do they hallucinate?
When they get your question, they turn all the words in your prompt into numbers again. And then they go find, in their huge databases, which words are likely to follow your words.
They add in a tiny bit of randomness: they sometimes replace a “closer” match with a synonym or a less likely match, so the output even seems real.
They add “weights” so that they would rather pick one phrase over another, or e.g. give some topics very very small likelihoods - think pornography or something. “Tweaking the model”.
But there’s no knowledge as such, mostly it is statistics and dice rolling.
So the hallucination is not “wrong”, it’s just statistically likely that those words would follow based on your words.
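If it helps, the “dice rolling” looks very roughly like this (toy numbers, nothing like a real model, which does this over probabilities for every token in its vocabulary):

```python
import random

# Made-up probabilities for the word that follows "blog", learned by counting.
candidates = {"post": 0.90, "spam": 0.07, "period": 0.03}

def sample_next(probs, temperature=1.0):
    # The "dice rolling": pick a word at random, weighted by how likely it is.
    # Temperature reshapes the weights -- low means almost always the top pick,
    # high means more variety, so the output seems less canned.
    weights = {w: p ** (1.0 / temperature) for w, p in probs.items()}
    total = sum(weights.values())
    return random.choices(list(weights), [v / total for v in weights.values()])[0]

print(sample_next(candidates, temperature=0.7))  # usually "post", occasionally something else
```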
Did that help?
That’s the problem. Maybe it is.
Maybe the code the AI wrote works perfectly. Maybe it just looks like how perfectly working code is supposed to look, but doesn’t actually do what it’s supposed to do.
To get to the train tracks on the right, you would normally have dozens of engineers working over probably decades, learning how the old system worked and adding to it. If you’re a new engineer and you have to work on it, you might be able to talk to the people who worked on it before you and find out how their design was supposed to work. There may be notes or designs generated as they worked on it. And so on.
It might take you months to fully understand the system, but whenever there’s something confusing you can find someone and ask questions like “Where did you…?” and “How does it…?” and “When does this…?”
Now, imagine you work at a railroad and show up to work one day and there’s this whole mess in front of you that was laid down overnight by some magic railroad-laying machine. Along with a certificate the machine printed that says that the design works. You can’t ask the machine any questions about what it did. Or, maybe you can ask questions, but those questions are pretty useless because the machine isn’t designed to remember what it did (although it might lie to you and claim that it remembers what it did).
So, what do you do, just start running trains through those tracks, assured that the machine probably got things right? Or, do you start trying to understand every possible path through those tracks from first principles?
I’m looking forward to the next 2 years, when AI apps are in the wild and I get to fix them lol.
As a senior dev, the wheel just keeps turning.
I’m being pretty resistant about AI code Gen. I assume we’re not too far away from “Our software product is a handcrafted bespoke solution to your B2B needs that will enable synergies without exposing your entire database to the open web”.
It has its uses. For templating and/or getting a small project off the ground it’s useful. It can get you 90% of the way there.
But the meme is SOOO correct. AI does not understand what it is doing, even with context. The things JR devs are giving me really make me laugh. I legit asked why they were throwing a very old version of react on the front end of a new project and they stated they “just did what chatgpt told them” and that it “works”. That’s just from the last month or so.
The AI that is out there is all based on old posts and isn’t keeping up with new stuff. So you get a lot of the same-ish looking projects that have some very strange/old decisions to get around limitations that no longer exist.
Yeah, I think personally LLMs are fine for like writing a single function, or to rubber duck with for debugging or thinking through some details of your implementation, but I’d never use one to write a whole file or project. They have their uses, and I do occasionally use something like ollama to talk through a problem and get some code snippets as a starting point for something. Trying to do too much more than that is asking for problems though. It makes it way harder to debug because it becomes reading code you haven’t written, it can make the code style inconsistent, and a not-insignificant amount of the time, even in short code segments, it will hallucinate a non-existent function or implement something incorrectly, so using it to write massive amounts of code makes that way more likely.
The AI also enables some very bad practices.
It does not refactor and it makes writing repetitive code so easy you miss opportunities to abstract. In a week when you go to refactor you’re going to spend twice as long on that task.
As long as you know what you’re doing and guide it accordingly, it’s a good tool.