You can ask it to write the code and some unit tests to check normal and edge cases.
Determining the edge cases is even harder. I have to look at the code and see where there might be error-prone operations, and then backtrack to figure out which inputs cause the control flow to reach those branches. Again, if I'm diving so deep, what's even the point?
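For a trivial, made-up illustration of what I mean: the error-prone operation below is the division, and I have to backtrack to find the input that drives control flow into the branch guarding it.

    import unittest

    # Hypothetical function: the "error-prone operation" is the division,
    # and the edge case is the input that makes the denominator zero.
    def normalize(values):
        total = sum(values)
        if total == 0:  # easy branch to miss if you only test "normal" inputs
            return [0.0 for _ in values]
        return [v / total for v in values]

    class TestNormalize(unittest.TestCase):
        def test_normal_case(self):
            self.assertEqual(normalize([1, 1, 2]), [0.25, 0.25, 0.5])

        def test_all_zeros_edge_case(self):
            # This is the input you have to backtrack to: it reaches the
            # branch that guards the division by zero.
            self.assertEqual(normalize([0, 0, 0]), [0.0, 0.0, 0.0])

    if __name__ == "__main__":
        unittest.main()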
AI has also made it so easy to jump into new languages. I don't have to dig through documentation that's full of jargon; I can just ask and get an answer with a code example that relates to my situation.
Another thing LLMs often do is call functions that don't exist. Hopefully you'll know enough about the new language to figure out if it's you doing something wrong or the LLM not wanting to admit that it doesn't know how to do it.
Well, one can ask it to write bash scripts; I'm not sure if it can do PowerShell.
Grok is one of the few big models with openly released weights. I tried to run it... but my computer wasn't powerful enough, no surprise. If I could run it locally, I'd write a program it could interact with through text to carry out actual requests on my computer (with limits, clearly).
Instead of asking it to do something, having it write a bash script, and then running that script myself, I could just ask and that would be the end of my job.
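Roughly what I have in mind is a sketch like this, assuming some local model is serving an HTTP endpoint (the URL and the JSON shape are placeholders I made up), with a hard allow-list of commands as the "limits":

    import shlex
    import subprocess
    import requests

    LOCAL_MODEL_URL = "http://localhost:8080/generate"  # hypothetical endpoint
    ALLOWED_COMMANDS = {"ls", "df", "du", "uptime"}      # the "limits"

    def ask_model(request_text):
        # Assumed request/response schema; a real local server will differ.
        prompt = ("Translate this request into a single shell command. "
                  "Reply with the command only:\n" + request_text)
        resp = requests.post(LOCAL_MODEL_URL, json={"prompt": prompt}, timeout=60)
        return resp.json()["text"].strip()

    def run_if_allowed(command_line):
        tokens = shlex.split(command_line)
        if not tokens or tokens[0] not in ALLOWED_COMMANDS:
            print("Refusing to run:", command_line)
            return
        result = subprocess.run(tokens, capture_output=True, text=True)
        print(result.stdout or result.stderr)

    if __name__ == "__main__":
        run_if_allowed(ask_model(input("What do you want done? ")))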
Determining the edge cases is even harder. I have to look at the code and see where there might be error-prone operations, and then backtrack to figure out which inputs cause the control flow to reach those branches. Again, if I'm diving so deep, what's even the point?
Well, it wouldn't be the edge cases of the code but the edge cases of the problem itself which you should know at least superficially. If it runs through the test cases and can solve the problem in various ways, it's probably fine.
Definitely look over the code, but I think reviewing it would probably take less time than trying to figure it out from scratch, depending on the problem.
Another thing LLMs often do is call functions that don't exist. Hopefully you'll know enough about the new language to figure out if it's you doing something wrong or the LLM not wanting to admit that it doesn't know how to do it.
I think that's much less of an issue these days; those hallucinations have largely been beaten out of AI responses. They can obviously still be wrong, but they're much less likely to just make things up, especially when a real answer exists.
AI still obviously has a long way to go before being reliable.
Well, it wouldn't be the edge cases of the code but the edge cases of the problem itself which you should know at least superficially. If it runs through the test cases and can solve the problem in various ways, it's probably fine.
Your test is going to run on the actual implementation, not on the problem in the abstract. You need to test the code's edge cases, not the problem's.
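A toy contrast, with everything made up: the problem's edge case for parsing "key=value" lines is a line with no "=" at all, but whether a value that itself contains "=" survives depends entirely on how the code was written (split on every "=" versus partition on the first one). That second case is only worth testing because of the implementation, which is exactly the kind of edge case I mean.

    import unittest

    def parse_config_line(line):
        # partition() splits on the first "=" only; a naive split("=") version
        # would mangle values that contain "=" and fail the second test below.
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError("missing '=' in line: " + line)
        return key.strip(), value.strip()

    class TestParseConfigLine(unittest.TestCase):
        def test_problem_edge_case_no_equals(self):
            with self.assertRaises(ValueError):
                parse_config_line("just some text")

        def test_implementation_edge_case_value_contains_equals(self):
            self.assertEqual(parse_config_line("url = http://x?a=b"),
                             ("url", "http://x?a=b"))

    if __name__ == "__main__":
        unittest.main()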
I think that's much less of an issue these days; those hallucinations have largely been beaten out of AI responses. They can obviously still be wrong, but they're much less likely to just make things up, especially when a real answer exists.
They will usually answer correctly if it's a common question they've seen lots of times. The more obscure your question, the more likely they are to produce nonsense.
You need to test the code's edge cases, not the problem's
Even so, in many cases it would be faster than writing the code yourself. Ask it to explain the logic of the code, then write plenty of helpful comments, etc. There are a lot of ways to make the code easier to go through.
I don't see much of a problem with it.
The more obscure your question, the more likely they are to produce nonsense
They're like us: they don't know everything about everything. The only difference is that AIs don't really know when to say they don't know. If an answer can be reached, why not give that answer?
To solve this, there should probably be confidence values integrated somehow, so the model knows when an answer is likely to be unhelpful.
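Something along these lines, assuming the model or serving stack exposes per-token probabilities (the numbers and the threshold here are invented purely to show the idea):

    import math

    CONFIDENCE_THRESHOLD = -1.0  # arbitrary cutoff on mean log-probability

    def mean_log_prob(token_probs):
        return sum(math.log(p) for p in token_probs) / len(token_probs)

    def answer_or_abstain(answer_text, token_probs):
        # Refuse to answer when the generation looks low-confidence overall.
        if mean_log_prob(token_probs) < CONFIDENCE_THRESHOLD:
            return "I'm not confident enough to answer that."
        return answer_text

    print(answer_or_abstain("Paris", [0.9, 0.95, 0.99]))  # confident: kept
    print(answer_or_abstain("Quxburg", [0.2, 0.1, 0.3]))  # shaky: abstains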
AI is still new; there's still a lot of progress to be made before we hit big diminishing returns.
Even so, in many cases it would be faster than writing the code yourself.
I don't think so. In my example I said it took me 20 minutes to interpret the code, but that was because I had already written a solution, so I knew what a correct implementation looked like. I very much doubt it would have been faster if I had been starting from zero. I also already had unit tests written. Worse still, imagine if I had gone with the first solution it produced. I would have introduced a subtle bug in my code.
Ask it to explain the logic of the code, then write plenty of helpful comments, etc.
So I'm asking a blind person to explain what they're stepping on?
AI is still new; there's still a lot of progress to be made before we hit big diminishing returns.
AI companies have already sucked up pretty much all the text data there is. That was basically the low-hanging fruit: just get a really, really big training corpus so that you can analyze statistically what humans say. To get better performance (i.e. effectiveness), all they can do now is massage the data more and more to squeeze something useful out of it. Whether that's actually possible or not is unknown. It could be that the problem is intrinsically a waterbed (https://en.wikipedia.org/wiki/Waterbed_theory), where making the model better at programming necessitates making it worse at both copywriting and summarizing (and so on fractally for everything).

And I will argue that it is, because making an LLM is a form of data compression. You're compressing a large corpus into a much smaller set of weights that a computer can process in real time. Putting two bits into one bit and then getting them back out without losing anything is mathematically impossible.
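The counting argument is easy to brute-force for the smallest case: of the 16 possible encoders from the four 2-bit messages down to the two 1-bit codes, none is injective, so every one of them loses information.

    from itertools import product

    messages = ["00", "01", "10", "11"]
    codes = ["0", "1"]

    # Enumerate every possible encoder (every function messages -> codes)
    # and check whether any of them keeps all four messages distinct.
    lossless_found = any(
        len(set(assignment)) == len(messages)
        for assignment in product(codes, repeat=len(messages))
    )
    print("Any lossless 2-bit -> 1-bit encoder?", lossless_found)  # False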
I don't believe LLMs will get much better than they already are; that's where I'm placing my chips. I think other forms of generative and non-generative AI may have more room to improve before running into practical limitations.
So I'm asking a blind person to explain what they're stepping on?
Well, it would explain the logic of the code, so you could see whether that logic is an adequate solution.
That's actually up for debate, even among researchers.
Maybe, but I think there's a lot of improvement to be had. Like I said, the data changes weights in the model, but I think the data itself should have weights.
It just seems to me that there are more ways to improve the quality of AI models than just ramming more data in.
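As a concrete (toy) version of what I mean by the data itself having weights: in a weighted least-squares fit, each example's weight scales its contribution to the loss, so a suspect point can be down-weighted instead of dragging the whole model around. The numbers and weights below are hand-picked just to show the mechanism.

    import numpy as np

    x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
    y = np.array([0.1, 1.1, 1.9, 3.2, 10.0])       # last point is an outlier
    weights = np.array([1.0, 1.0, 1.0, 1.0, 0.1])  # down-weight the suspect point

    X = np.column_stack([np.ones_like(x), x])      # intercept + slope
    W = np.diag(weights)

    # Weighted normal equations: (X^T W X) beta = X^T W y
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    print("weighted   intercept, slope:", beta)

    # The unweighted fit treats every point as equally important,
    # so the outlier drags the slope upward.
    beta_unweighted = np.linalg.solve(X.T @ X, X.T @ y)
    print("unweighted intercept, slope:", beta_unweighted)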
where making the model better at programming necessitates making it worse at both copywriting and summarizing
I believe that the final LLM may be trained by more specialized models, so you can improve the individual models separately.
I'm out of the AI game by many, many years, but unless I have missed a 'real' breakthrough (not minor stuff like the same thing done another way, or the obvious hardware we have now that allows for ginormous constructs), it's just a form of curve fitting, where you keep throwing data at it to refine the fit and reduce error. There is a lot of hoopajoo around it -- e.g. decision trees mixed with sums of products, or weighted graphs, and so on -- but even that is just step-function fitting of a sort. It always comes back to making math that generates the same output for similar input. Heated arguments aside about whether this or that sigmoid or other near-step function is the best one, the answer is not, as you said, throwing more data at it. That may train what you have better, but if you are trying to approximate a 10th-degree polynomial with a cubic, it's gonna suck no matter how much you move the coefficients around. Except the new stuff is trying to approximate the sum of all human knowledge with a few billion points; you may as well try to represent the universe with that 10th-degree polynomial. I don't have an answer, but I do agree that more data won't help past a certain point. Your masters surely know this, even as they encourage you to keep the data flow going.
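The polynomial point is easy to demonstrate (the target below is an arbitrary degree-10 polynomial I made up): the error comes from the model class being too simple, and piling on more data points does not make it go away.

    import numpy as np

    def target(x):  # an arbitrary 10th-degree polynomial
        return x**10 - 3 * x**6 + 2 * x**2

    for n_samples in (100, 100_000):
        x = np.linspace(-1.0, 1.0, n_samples)
        y = target(x)
        cubic = np.polyfit(x, y, deg=3)  # "throwing more data" at a cubic
        max_error = np.max(np.abs(np.polyval(cubic, x) - y))
        print(f"{n_samples:>7} samples -> max error {max_error:.3f}")

    # The error barely changes between 100 and 100,000 samples: the cubic
    # simply cannot represent the target, no matter how it is fitted.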
And yes, some types of data should have weights of a sort that govern their 'importance'. E.g. if you have a red sports car, it being red is FAR less important than it being a car. Discovery and elimination of coincidences and irrelevant features should be a major part of your training.
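A toy version of that red-sports-car point (the data and feature names are fabricated): measure how much each feature actually tells you about the label, and the coincidental one shows up as no better than chance, so it could be down-weighted or discarded.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 10_000
    is_car = rng.integers(0, 2, size=n)  # relevant feature
    is_red = rng.integers(0, 2, size=n)  # coincidental feature
    # Label: "is this a vehicle?", i.e. is_car with a little label noise.
    label = np.where(rng.random(n) < 0.05, 1 - is_car, is_car)

    def agreement(feature, label):
        """Fraction of examples where the binary feature matches the label."""
        return float(np.mean(feature == label))

    print("is_car agrees with label:", agreement(is_car, label))  # ~0.95
    print("is_red agrees with label:", agreement(is_red, label))  # ~0.50 (chance)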