Why ChatGPT Highlights Copilot's Fundamental Flaw
- Written by
- Anshul Ramachandran
With all the recent AI hype, you may have forgotten about GitHub Copilot's launch a year and a half ago. It was nuts: an AI product that kinda worked for code finally brought the power of generative ML to software development. It wasn't perfect, and it was definitely sometimes more annoying than helpful, but I could not put my finger on whether there was something fundamentally wrong with the product or whether I was just unlucky and getting bad or unwanted completions.
I was finally able to express what was off a couple of months ago, when two things happened:
- ChatGPT came out
- I read this paper
I'll start with the second. This study of how people interact with Copilot put words to an important distinction in how we code: there are two "modes" in which we code, and what we want from a coding assistant is slightly different in each:
- Acceleration: The programmer knows what they want to do next, and the assistant just helps them do it faster. Interactions need to be fast so as not to break the flow, and long suggestions were often seen as a hindrance that breaks the flow (even if correct!).
- Exploration: The programmer is trying to do something unfamiliar, and the assistant is used to explore options and get a starting point. Interactions are slow and deliberate, often involve explicit prompting that isn't present in the existing context, and produce results that require more validation.
The fundamental flaw with Copilot is that it tries to be the assistant for both acceleration and exploration, and it ends up being imperfect for both.
When I'm in acceleration mode, the long suggestions that Copilot provides are at best distracting. Slightly worse, my limited human pattern matching accepts a suggestion that seems fine, only for it to contain some minor bug that takes hours to track down. At absolute worst, it becomes much easier to accept code that contains a security vulnerability, or a long code block that is actually verbatim from some training data. So while the interaction UI is fine, a better solution for acceleration mode is to provide smaller chunks, which would be faster to generate, quicker to reject, and less prone to these human acceptance errors.
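To make the "smaller chunks" idea concrete, here is a minimal sketch of what trimming a suggestion might look like. Everything in it is hypothetical: the `Completion` type, the `confidence` score, the `max_lines` default, and the threshold are illustrative assumptions, not how Copilot or any other tool actually works.

```python
from dataclasses import dataclass


@dataclass
class Completion:
    """A hypothetical suggestion returned by some code model."""
    text: str          # raw, possibly multi-line suggestion
    confidence: float  # assumed score in [0, 1]; real tools may expose nothing like this


def chunk_for_acceleration(completion: Completion,
                           max_lines: int = 3,
                           confidence_threshold: float = 0.95) -> str:
    """Return only the first few lines unless the suggestion is high confidence."""
    lines = completion.text.splitlines()
    # Short suggestions, or ones we are very sure about, pass through untouched.
    if len(lines) <= max_lines or completion.confidence >= confidence_threshold:
        return completion.text
    # Otherwise show a small chunk that is quick to read and quick to reject.
    return "\n".join(lines[:max_lines])
```

With these assumed defaults, a ten-line suggestion at moderate confidence would be cut to its first three lines, which is much easier to sanity-check without leaving the flow.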
When I'm in exploration mode, I'm stuck trying to prompt Copilot with natural language comments to provide additional context, and I get back blocks of code with little to no explanation. That forces me to open up Google and search, which is antithetical to having a code assistant that lives in your IDE. So what would be a better solution? This is where ChatGPT comes in. And I'm not talking about the underlying models, just the UI. A conversational UI is ideal for exploration because it is great for prompting with both explicit and implicit context and goals, as well as for iteration, explanation, and answering questions.
I've talked to a lot of developers about what AI tools they use today. Many respond that they haven't disabled Copilot entirely, but have started using ChatGPT in some situations. With a little more prodding, it becomes clear that ChatGPT is used for exploration while Copilot is used for acceleration, since the latency and out-of-IDE experience of ChatGPT make it unusable in the latter mode.
Now, is ChatGPT as a model good for exploration? As many a Twitter thread will point out, ChatGPT can be incredibly confident with wrong answers, so there's a lot of work to be done to improve quality and add validation. But when it comes to UI? There's something there.
So, wrapping up, GitHub Copilot undoubtedly started moving the gears towards an AI-based revolution of software development, but unless it is fundamentally changed so that it stops trying to do it all under a single UI, it will be flawed from a product perspective.
Shameless plug: Here at Codeium, we are building our own AI-powered code acceleration toolkit. We have started with autocomplete, but unlike Copilot, we are focusing on just the "acceleration" mode, prioritizing speed and reasonably chunked completions (O(few) lines at a time) unless we are incredibly confident in a longer completion (e.g. very standard ways of doing something). We already have thousands of developers using Codeium, with an active community and timely support on Discord, something that has been lacking recently with Copilot.
We have also made this AI-powered autocomplete free forever, so you can get ideal AI-powered assistants for both acceleration and exploration (via ChatGPT) for free. Well, that is, until ChatGPT becomes monetized...