Condorcet sobs “so close”.
If I’m going to use AI for something, I want it to be right more often than I am, not just as often!
It actually doesn’t have to be. For example, the way I use GitHub Copilot is that I give it a code snippet to generate, and if it’s wrong I just write a bit more code. Then it usually gets it right after 2-3 iterations, and it still saves me time.
The trick is that you need to be able to quickly determine whether the code is what you want, which means you need a bit of experience under your belt. So AI is pretty useless, if not actively harmful, for junior devs.
Overall it’s a good tool if you can get your company to shell out $20 a month for it, not sure if I’d pay it out of my own pocket tho.
GitHub Copilot is just intellisense that can complete longer code blocks.
I’ve found that it can somewhat regularly predict a couple lines of code that generally resemble what I was going to type, but it very rarely gives me correct completions. By a fairly wide margin, I end up needing to correct a piece or two. To your point, it can absolutely be detrimental to juniors or new learners by introducing bugs that are sometimes nastily subtle. I also find it getting in the way only a bit less frequently than it helps.
I do recommend that experienced developers give it a shot because it has been a helpful tool. But to be clear - it’s really only a tool that helps me type faster. By no means does it help me produce better code, and I don’t ever see it full on replacing developers like the doomsayers like to preach. That being said, I think it’s $20 well spent for a company in that it easily saves more than $20 worth of time from my salary each month.
I used ChatGPT once. It created non-functional code, but the general idea did help me get to where I wanted. Maybe it works better as a rubber duck substitute?
I did my first game jam with the help of ChatGPT. It didn’t write any code in the game, but I was able to ask it how to accomplish certain things in general terms. It would give me ideas, and it would be up to me to implement them.
There were other things I knew my engine could do but I couldn’t figure out using the documentation, so I would ask ChatGPT “how do you xyz in godot” and it would give me step-by-step instructions. This was especially useful for the things that get done in the engine UI and not in code.
Yeah, generating some ideas to get you going might be the best use for this kind of stuff.
That’s how I view AI generated art. It can come up with some really cool mash ups. But you have to do the rest. Anyone just using what it outputs like that’s the end of the story isn’t ‘using it right’ in my opinion.
Right, I expect stuff like Stable Diffusion will become a part of the toolkit actual artists use. The workflows with this stuff are already getting pretty intricate, where people use ControlNet for posing, inpainting for specific details, and so on. I would liken it to doing photography. You can’t just give a camera to anybody and get good results; it takes a person with skill and taste to produce an interesting image.
Wait a second here… I skimmed the paper and GitHub and didn’t find an answer to a very important question: is this GPT-3.5 or GPT-4? There’s a huge difference in code quality between the two, and either they made a giant accidental omission or they are being intentionally misleading. Please correct me if I missed where they specified that. I’m assuming they were using GPT-3.5, so yeah, those results would be as expected. On the HumanEval benchmark, GPT-4 gets 67%, and that goes up to 90% with Reflexion prompting. GPT-3.5 gets 48.1%, which is exactly what this paper is saying. (source).
I’m talking about the models and how they’re written about in the literature. I don’t care how OpenAI brands their products.
From the paper itself:
For the additional 2000 SO questions, ChatGPT 3.5 Turbo API is used.
Whatever GitHub Copilot uses (the version with the chat feature), I don’t find its code answers to be particularly accurate. Do we know which version that product uses?