GPT-4 solved it on the first try (it's much smarter and better at coding since the latest update).
Normally, I code all day at work. A year ago, a tough coding problem might have taken me a week to solve through trial and error, searching the internet, etc. Now, it takes only hours, or at most a day, with GPT-4's help.
I'm mostly interested in complex problems that span multiple platforms and libraries, the kind even GPT-4 can't solve on its own, and those are what I use this tool for. I take a problem I couldn't fix with GPT-4 alone, then pass the problem and GPT-4's solution to Mistral 8x7b for verification. GPT-4 then considers any correction suggestions and makes changes if needed, and the same process repeats with Llama2-70b and Claude-3. The final corrections are always made with GPT-4 if possible, and then I send the answer back to the chatbot user.
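In Python-ish pseudocode, the loop looks roughly like this. This is just a sketch of what I described above; `ask(model, prompt)` is a hypothetical helper that wraps whichever API each model is served through, and the prompts are simplified placeholders.

```python
# Hypothetical ask(model, prompt) -> str wraps each model's actual API.
REVIEWERS = ["mistral-8x7b", "llama2-70b", "claude-3"]

def refine(problem: str, ask) -> list[str]:
    """Return every version of the solution: GPT-4's original plus one per review round."""
    solution = ask("gpt-4", f"Solve this coding problem:\n{problem}")
    versions = [solution]

    for reviewer in REVIEWERS:
        # The reviewer model checks the current solution and lists mistakes or improvements.
        review = ask(reviewer,
                     f"Problem:\n{problem}\n\nProposed solution:\n{solution}\n\n"
                     "Point out anything wrong or anything that could be improved.")
        # GPT-4 decides which suggestions are worth applying and rewrites the solution.
        solution = ask("gpt-4",
                       f"Problem:\n{problem}\n\nCurrent solution:\n{solution}\n\n"
                       f"Reviewer feedback:\n{review}\n\n"
                       "Apply only the corrections that are actually needed.")
        versions.append(solution)

    return versions
```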
In the end, I might have four different solutions: the original from GPT-4 and three improved versions, each rated (1-5 stars) by the models on how good it is relative to the others. I then test all four solutions. If one works, I move on; if not, I send the solution back to the Coding Agents along with the error messages for another round, now with more information on what's not working.
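The rating/testing/retry step, again only as a sketch: `rate` is folded into a prompt to GPT-4, `run_tests` stands in for my actual test setup, and `refine` is the function from the sketch above. None of these names come from a real library.

```python
def solve(problem: str, ask, run_tests, max_rounds: int = 3) -> str | None:
    feedback = ""
    for _ in range(max_rounds):
        # Collect GPT-4's original solution plus the three reviewed versions.
        versions = refine(problem + feedback, ask)
        # Ask GPT-4 to score each version 1-5 so the most promising one is tested first.
        scored = sorted(versions,
                        key=lambda v: int(ask("gpt-4",
                                              f"Rate this solution 1-5, answer with one digit:\n{v}")),
                        reverse=True)
        for candidate in scored:
            ok, errors = run_tests(candidate)
            if ok:
                return candidate  # it works, move on
            # Otherwise feed the error messages into the next round.
            feedback = f"\n\nA previous attempt failed with:\n{errors}"
    return None
```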