Why Your AI Code Completion Tool Needs to Fill in the Middle

Written by
Rahul Sridhar
An artist is about to fill in the middle of a mural.

Code Completion Models

Large language models have been trained on billions of bytes of data to perform exactly one task extremely well: given the preceding N characters, predict the next one. The driving force behind the AI revolution we're currently experiencing is that being able to predict the next character with high accuracy is an incredible superpower. It allows you to build chatbots like Bing and ChatGPT, copywriting assistants like Jasper, and code completion tools like Codeium and Copilot.

The models powering code completion tools know how to complete entire functions just from their signatures:

Codeium completes a merge sort function

They can see your imports and predict what task you're trying to complete:

Codeium uses the Tweepy library

But there's a problem: the model only sees the code before your cursor. What about everything after it? The code there can be incredibly useful when programming, providing information about functions to call, coding practices to emulate, and approaches to take.

So, what's the solution? Enter Fill in the Middle (FIM). Introduced in a paper last year by OpenAI, FIM is an under-discussed technique that allows language models to incorporate the context that comes after the cursor during training.

How Fill-in-the-Middle works

It's quite simple: let's say we have a training example that looks like this:

Training example

and we want the model to learn to predict the middle text jumps over from the prefix The quick brown fox and the suffix a lazy dog. First, we make two cuts to separate these sections, introducing four new special tokens: <PRE>, <SUF>, <MID>, and <EOM> (end of middle):

Training example before FIM transformation

Then we simply transpose the middle and suffix:

Training example after FIM transformation

Now, we train exactly like we did before, predicting the following text jumps over<EOM> from the earlier text <PRE>The quick brown fox <SUF> a lazy dog<MID>. The model automatically learns the meaning of the special tokens and learns that it is expected to generate text that makes sense after the prefix but before the suffix!
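Concretely, the training-time transformation can be sketched in a few lines of Python. The sentinel strings here stand in for the real special tokens (which would be dedicated token IDs in practice), and the cut offsets are the ones from the example above:

```python
PRE, SUF, MID, EOM = "<PRE>", "<SUF>", "<MID>", "<EOM>"

def fim_transform(document: str, i: int, j: int) -> str:
    """Cut `document` at character offsets i < j, then rearrange the
    three pieces as prefix + suffix + middle for next-token training."""
    prefix, middle, suffix = document[:i], document[i:j], document[j:]
    return f"{PRE}{prefix}{SUF}{suffix}{MID}{middle}{EOM}"

doc = "The quick brown fox jumps over a lazy dog"
print(fim_transform(doc, 20, 30))
# → <PRE>The quick brown fox <SUF> a lazy dog<MID>jumps over<EOM>
```

The rearranged string is then fed to the model as an ordinary left-to-right training example.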

At inference time, if we're trying to infill a document like the following:

Document before FIM inference

we can present it as

Document transformation when sent to FIM model

to the model and request characters until the model emits an <EOM> token, at which point it has successfully joined the prefix with the suffix.
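In code, the inference side is a mirror image of training. This is a minimal sketch, with generate standing in for the model's actual sampling loop (a hypothetical callable, not a real API):

```python
PRE, SUF, MID, EOM = "<PRE>", "<SUF>", "<MID>", "<EOM>"

def build_fim_prompt(before_cursor: str, after_cursor: str) -> str:
    """Arrange the document around the cursor so a FIM-trained model
    produces the missing middle immediately after <MID>."""
    return f"{PRE}{before_cursor}{SUF}{after_cursor}{MID}"

def infill(generate, before_cursor: str, after_cursor: str) -> str:
    """`generate` is a placeholder for the model's sampling loop; we
    keep only the text emitted before the first <EOM>."""
    completion = generate(build_fim_prompt(before_cursor, after_cursor))
    return completion.split(EOM, 1)[0]

# A toy "model" that always emits the right middle, for illustration:
fake_model = lambda prompt: "jumps over" + EOM
print(infill(fake_model, "The quick brown fox ", " a lazy dog"))
# → jumps over
```

In a real system, generation would stream token by token and stop as soon as the <EOM> token appears.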

FIM vs non-FIM models

With FIM, we can greatly improve the accuracy of code completion tools by providing context to the model that would otherwise be missing. Let's see some examples comparing two different code autocomplete tools, Codeium and Tabnine Pro.

Codeium is a free code completion product used by tens of thousands of developers around the world. Codeium's enterprise offering allows customers to self-host Codeium in their virtual private cloud or on-premises to ensure that no data is sent outside of the company. Tabnine is an AI code assistant that also offers self-hosting for enterprises.

Here are the suggestions each tool gives for the same prompt. Codeium, on the left, uses a FIM model: it can see the usage of the distance function below the cursor and infers that it is supposed to compute the edit distance between a and b. Tabnine Pro, on the right, likely doesn't use FIM, and gives a worse suggestion as a result.

Codeium uses FIM


Tabnine doesn't use FIM.

Tabnine Pro

In this Go code, Codeium understands that it needs to initialize the messages channel, while Tabnine just outputs Hello World:

Codeium uses FIM to initialize the messages channel.


Tabnine prints Hello World

Tabnine Pro

Codeium can even generate an accurate docstring for an already-implemented function:

Codeium generates an accurate docstring.


Tabnine generates an inaccurate docstring

Tabnine Pro


Software engineering is rarely a linear task: programs are usually not written in one shot from start to end. Most day-to-day programming involves adding functionality, refactoring code, and fixing bugs—all tasks that benefit greatly from context after the cursor.

It should be no surprise, then, that code completion models trained with FIM capabilities easily outperform simple left-to-right models. Indeed, when we deployed FIM for all Codeium users, we saw large increases in our acceptance rates and user satisfaction.

Off-the-shelf code completion models like Salesforce CodeGen (which powers FauxPilot) have not been trained with FIM, so code completion tools that want to use FIM need to train their own models. This is harder than it may seem: there are subtleties involved in choosing where to cut the document and in ensuring that your model's left-to-right performance does not suffer.
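To make the data-pipeline side of this concrete, here's a hedged sketch of how FIM examples might be mixed with plain left-to-right examples during training. The fim_rate knob and the character-level random cuts are illustrative assumptions, not the exact recipe from the paper; real pipelines often snap cuts to line or token boundaries:

```python
import random

PRE, SUF, MID, EOM = "<PRE>", "<SUF>", "<MID>", "<EOM>"

def maybe_fim(document: str, rng: random.Random, fim_rate: float = 0.5) -> str:
    """With probability `fim_rate`, rearrange a document for FIM training;
    otherwise leave it as an ordinary left-to-right example so that plain
    completion quality doesn't degrade."""
    if rng.random() >= fim_rate:
        return document  # unchanged: plain left-to-right example
    # Two random character offsets define the prefix/middle/suffix cuts.
    i, j = sorted(rng.sample(range(len(document) + 1), 2))
    return f"{PRE}{document[:i]}{SUF}{document[j:]}{MID}{document[i:j]}{EOM}"
```

Keeping a fraction of the corpus untransformed is one way to guard the left-to-right objective; the right mix is ultimately an empirical question.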

If you'd like to try out Codeium's FIM code completion model, head over to our playground or try us out in your IDE of choice.