Security
Last updated: March 11, 2025
We recognize that we handle important intellectual property for our customers, both individuals and enterprises, so we aim to be exceptionally comprehensive and transparent about how we approach security and privacy throughout our development and deployment.
Our prioritization of security & compliance has already instilled confidence in hundreds of thousands of developers and thousands of companies, including some of the world’s largest regulated enterprises. We plan to continue to maximize the value of our tools under any set of constraints that a customer may have.
If at any point you identify potential vulnerabilities or have security-related questions, please contact us at security@codeium.com.
Certifications and Third-Party Assessments
Codeium has SOC 2 Type II certification, and conducts annual third-party penetration testing (last completed on February 13, 2025). To receive copies of these documents, please fill out the form on our Trust Center.
Codeium also holds FedRAMP High accreditation. While this is a requirement for working with federal agencies and government-adjacent enterprises, it is an important vote of confidence for all of our customers, even those who do not use our FedRAMP deployment, because FedRAMP requires a number of secure development and company practices that are not required for SOC 2 Type II compliance. These include:
- A code review process that highlights the security impact of changes, enforces a minimum number of reviewers, and follows other compliant procedures
- Company-wide MDM with posture management and active EDR on all employee devices (S1)
- Zero trust VPN for access to remote resources
- OWASP ASVS Level 1 compliance (including tooling such as Snyk), with a path to Level 2 and Level 3 compliance over time
- Training relevant developers on disaster recovery and information security contingency planning
- Both tabletop and functional vulnerability testing
HIPAA compliance: In most cases, the data that a customer provides to us is not Protected Health Information (PHI) and does not need special compliance considerations in order to use our platform, even if you are a healthcare organization. This is particularly true for code, which does not itself carry PHI. That said, our platform is maintained as HIPAA compliant, and for significant implementations we will entertain a Business Associate Agreement (BAA) to confirm HIPAA compliance.
Deployment Options
Unlike most AI tools, Codeium provides a variety of deployment options to match the security needs of any organization.
On our cloud tiers (individual plans, teams plans, and Enterprise Cloud plans), AI requests are processed and routed on servers managed by Codeium and, depending on the operation, may be executed on servers managed by Codeium or by one of our subprocessors. For any teams or enterprise plan, all inputs and outputs to these requests follow zero-data retention policies by default. For any individual plan, users can opt in to zero-data retention mode from their profile page, and a large fraction of individual users have it enabled. Read more about zero-data retention mode below. Enterprises can opt in to functionalities that require data retention (ex. remote codebase indexing, memories, recipes, and web retrieval) or to functionalities that rely on subprocessors with whom we do not have zero-data retention guarantees (ex. web search and MCP servers in Cascade).
On our Enterprise Hybrid tier, all functionalities that require data retention occur on a CPU-and-storage-only tenant that is managed by the customer. This component is provided as a Docker Compose application that can be deployed on an instance (EC2, GCE, Azure VM, on-prem) within the customer's cloud or network. Communication between the customer's data-retaining instance and Codeium's compute layer requires only outbound connections and is handled through a Cloudflare Tunnel client that establishes a persistent, secure tunnel between the two. Cloudflare handles client requests and forwards them through this daemon, eliminating the need to open firewall ports and allowing the customer's origin to remain as secure and closed as possible.
For both our Cloud and Hybrid tiers for enterprises, we offer multiple underlying deployments for the pieces managed by Codeium in order to meet requirements of various sectors and countries around data processing and residency:
- Standard: Servers managed by Codeium, located in the United States
- FedRAMP High: Servers managed in an AWS GovCloud through Palantir’s FedStart Program
- EU: Servers managed by Codeium, located in Frankfurt, Germany
On our Enterprise Self-hosted tier, all compute and data retention happens within a GPU-enabled tenant that is managed by the customer. The application is provided as a Docker Compose application or via Helm chart (for a Kubernetes deployment), and can be deployed within a customer's private cloud (AWS, GCP, Azure) or on-prem datacenter. This tier supports connecting to a customer's private trusted LLM endpoint (ex. AWS Bedrock, Azure OpenAI, Google Vertex AI). No traffic is ever routed past the customer's firewalls except to this trusted endpoint. Even installation and updates can be performed without a direct connection by locally downloading the images from Codeium's container registry, uploading them to a private container registry, and deploying from that location. For full transparency: while the Self-hosted tier provides maximum security, it does not support a number of Codeium's cutting-edge products and capabilities, such as the Windsurf Editor or Cascade.
On all Enterprise tiers, Codeium supports Single Sign-On (SSO) via SAML, for example with Microsoft Entra, Okta, Google Workspace, or another SAML-supporting identity provider.
The most popular Enterprise deployment method is the Hybrid deployment, which balances data retention requirements with the ability to benefit from Codeium's latest-and-greatest capabilities such as the Windsurf Editor and Cascade. If your organization has more developers than the self-serve limit (200 developers), please [reach out](/contact/enterprise) to work with an account specialist to determine the proper deployment approach for your organization.
Data Flows
Note that most of the following details around our servers and infrastructure are relevant only to the Cloud and Hybrid deployments.
The following are all causes for requests to be made to our servers:
- Passive Experience: For Autocomplete, Supercomplete, and tab-to-jump (i.e. passive predictive AI suggestions), a request is made on every keystroke to the Codeium servers.
- Instructive Experience: For Command and Chat (i.e. experiences that require the user to manually write out a prompt for the AI), a request is made on every user instruction.
- Agentic Experience: For Cascade (i.e. agentic experience where the AI can take multiple “steps” independently), requests are made on every triggering user instruction, every reasoning step the agent makes, and on most tool calls. See further information about the Agentic experience below.
- Real-time Personalization: Even without a trigger such as a keystroke or user prompt input, requests are made in the background to build context, understand developer intent, or scan for potential next steps.
- Ahead-of-time Personalization: To build state on the existing codebases and other data sources, requests are made to perform embedding computations.
Within each of these requests, the client machine sends a combination of context, such as relevant snippets of code, recent actions taken within the editor, the conversation history (if relevant), and user-specified signals (ex. rules, memories, context pinning, etc). No single request contains an entire codebase or large contiguous pieces of code data. Even for ahead-of-time personalization, codebase parsing happens on the client machine and only individual code snippets are sent for embedding computation, so the server never receives the entire codebase in a single request.
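To make the shape of such a request concrete, here is a minimal illustrative sketch in Python. The structure and field names are hypothetical and do not reflect Codeium's actual wire format; it only shows that each request carries a bounded set of snippets and signals rather than a whole codebase.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical structures, for illustration only; Codeium's actual
# request schema is not public and will differ.

@dataclass
class CodeSnippet:
    file_path: str      # pointer to where the snippet came from
    line_range: tuple   # (start_line, end_line)
    text: str           # the snippet itself, never the whole repository

@dataclass
class InferenceRequest:
    context_snippets: List[CodeSnippet] = field(default_factory=list)
    recent_editor_actions: List[str] = field(default_factory=list)
    conversation_history: Optional[List[str]] = None   # only when relevant
    user_signals: dict = field(default_factory=dict)    # rules, memories, pinned context

# Each request carries a bounded set of snippets and signals;
# no single request ever contains an entire codebase.
request = InferenceRequest(
    context_snippets=[CodeSnippet("src/auth.py", (10, 42), "def login(...): ...")],
    recent_editor_actions=["opened src/auth.py", "edited line 17"],
    user_signals={"rules": ["prefer type hints"]},
)
```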
This request data is sent to our infrastructure on GCP, which pulls precomputed information from client-independent sources such as remote indexing and passes all of this to a model runner that may perform inference on our managed infrastructure or route the inputs to the appropriate inference provider. The result is then returned to the client machine to be displayed to the user, while usage analytics (no code data, only usage metadata) are logged to BigQuery within our GCP instance. For individual plans without zero-data retention mode enabled, logs that may contain code snippets and user trajectories may also be stored.
All data is encrypted via TLS between the client machine and our servers. We currently support only multitenant infrastructure and do not yet offer a single-tenant option.
Agentic Experience
Note that the agentic experience is only available on Cloud and Hybrid plans, currently only within the Windsurf Editor, not the IDE extensions.
Since the term “agentic” is relatively overused, we will define what it means for Codeium’s products. We define “agentic” as a system that is capable of multi-step reasoning and actions through a sequence of interspersed calls to large language models and invoked “tools” (ex. grep, ls, embedding search, web search, edit file, add file, etc). This differentiates it from the more “assistant” or “copilot” style of AI systems, where there is guaranteed to be at most a single large language model inference call before human intervention is required to accept a suggestion or continue the conversation.
Codeium’s current agent is named Cascade and can be classified as a “collaborative agent” as opposed to an “autonomous agent.” A collaborative agent operates on a surface that is visible and introspectable by the user, in our case the IDE surface, as opposed to an autonomous agent where the work happens asynchronously, perhaps on a remote machine. With a collaborative agent, a human is still entirely in the loop. The default behavior is that the collaborative agent can take multiple steps with safer tools (ex. grep, ls, embedding search, edit file, add file), but that the human has to explicitly approve actions such as terminal commands that could have side effects. Any state changes such as file edits are not immediately committed to the codebase, and require explicit review and acceptance by the user, maintaining the human-in-the-loop flow. This approach allows for much more capable AI systems while still maintaining the same levels of observability and human validation as “assistant” or “copilot” AI code assistants that have been widely adopted across companies of every size and industry.
With this understanding, in many ways the data being sent to the Codeium servers for agentic experiences is similar to that of the passive and instructive experiences. On each turn of the agent, this data is sent to the user-specified third-party inference provider to determine what action the agent should take (see the tool descriptions below). Once the action is taken, the results become part of the conversation history that is incorporated into the data sent as part of the next request to the Codeium servers for the next turn of the agent. This alternating pattern of reasoning and tool-based action creates the agentic experience, which ends when the reasoning step determines that no further actions need to be taken at the time. At periodic intervals, a request is made to summarize and checkpoint earlier parts of the conversation to prevent unbounded growth of the conversation history and to improve performance.
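To illustrate that loop, here is a minimal Python sketch of an alternating reasoning/tool-call cycle with periodic summarization. The function names (`call_llm`, `run_tool`, `summarize`) and the checkpoint threshold are hypothetical placeholders, not Codeium or provider APIs.

```python
# Illustrative sketch of the alternating reasoning / tool-call loop.
# call_llm, run_tool, and summarize are hypothetical placeholders.

MAX_HISTORY_STEPS = 20  # assumed checkpointing threshold, for illustration

def run_cascade_turn(user_instruction, history, call_llm, run_tool, summarize):
    history.append({"role": "user", "content": user_instruction})
    while True:
        # Reasoning step: the model decides the next action (or to stop).
        decision = call_llm(history)
        history.append({"role": "assistant", "content": decision["reasoning"]})

        if decision["action"] == "finish":
            # The reasoning step determined no further actions are needed.
            return history

        # Tool step: run the chosen tool and feed the result back into the
        # conversation history for the next reasoning step.
        result = run_tool(decision["tool"], decision["args"])
        history.append({"role": "tool", "content": result})

        # Periodically summarize and checkpoint older turns to keep the
        # history bounded and improve performance.
        if len(history) > MAX_HISTORY_STEPS:
            history = summarize(history)
```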
Depending on the tool being called within the agentic step, a variety of actions could be taken:
- Some tools, such as making code edits or performing an LLM-based search (Riptide), require additional model inferences and use similar data as the reasoning step.
- Many tools (ex. add file, grep, ls) will run a terminal command automatically using the client’s IDE’s native terminal. These are known, safe, constrained terminal commands with minimal, if any, side effects.
- Another tool suggests arbitrary terminal commands for the user to accept before they are executed, which could include actions such as compilation, binary execution, infrastructure inspection, and more. These also use the native terminal of the client’s IDE. There are various modes for this tool, including an opt-in mode that will auto-run every command regardless of risk (unavailable for any Teams or Enterprise user), as well as controls to whitelist or blacklist specific commands (a sketch of this gating logic follows this list). By default, no suggested terminal command auto-runs, for customer infrastructure security reasons.
- The web search tool is a Teams and Enterprise opt-in that constructs a search query that is sent to the Bing API to retrieve up-to-date website data. This query is derived from the user’s inputs, past conversation history, and potentially code data.
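As a rough illustration of the terminal-command gating described above, the following Python sketch shows one way an auto-run decision could combine the default (no auto-run), the opt-in auto-run mode, and whitelist/blacklist controls. The policy structure is an assumption for illustration, not Codeium's actual implementation.

```python
# Illustrative sketch of how suggested terminal commands might be gated.
# The policy names and defaults are hypothetical; they mirror the behavior
# described above (no auto-run by default, optional allow/deny lists).

def should_auto_run(command: str, *, auto_run_all: bool,
                    whitelist: list[str], blacklist: list[str]) -> bool:
    """Return True if the suggested command may run without user approval."""
    parts = command.split()
    executable = parts[0] if parts else ""

    if executable in blacklist:
        return False    # blacklisted commands always require review
    if executable in whitelist:
        return True     # explicitly trusted commands may auto-run
    if auto_run_all:
        return True     # opt-in mode, unavailable on Teams/Enterprise
    return False        # default: require explicit user approval

# Example: by default, nothing auto-runs.
print(should_auto_run("rm -rf build/", auto_run_all=False,
                      whitelist=["ls", "grep"], blacklist=["rm"]))  # False
```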
Contractors and Subcontractors
Depending on your choice of plan (and thus deployment), we may use some or all of the following subcontractors. In some cases we have listed contractors that form a part of our infrastructure but are not subcontractors with respect to our customers.
- Google Cloud Platform (GCP): Stores code data only if on Cloud and relevant features are opted in; sees code data. Usage analytics and logs are primarily hosted on GCP, located in the same region as the compute used for model inference. We also use GCP to host retained data under Enterprise Cloud plans if the opt-in has been selected for corresponding features (ex. remote indexing, organizational best practices, etc). For Enterprise Hybrid plans, this data retention happens in the customer tenant and therefore not within our instance of GCP.
- Crusoe: Sees code data for inference. We use Crusoe compute to train some of our custom models, as well as to host some of our custom models.
- Oracle Cloud: Sees code data for inference. We use Oracle Cloud compute to train some of our custom models, as well as to host some of our custom models. Our cluster in Frankfurt, Germany runs on Oracle Cloud.
- Palantir: Sees code data for inference. We have utilized Palantir's FedStart program to achieve FedRAMP High accreditation, and we serve our FedRAMP High customers through FedStart.
- AWS: Sees code data for inference. We utilize AWS GovCloud within Palantir's FedStart program to serve our custom models for our FedRAMP High customers. We also leverage AWS Bedrock to serve some of Anthropic's models.
- OpenAI: Sees code data for inference. We offer the optionality of using OpenAI's models for various AI requests, and may leverage OpenAI models independent of user selection for other processing tasks (e.g. summarization). We have a zero data retention agreement with OpenAI. Enterprise administrators can disable use of OpenAI models for their organization.
- Anthropic: Sees code data for inference. We offer the optionality of using Anthropic's models for various AI requests, and may leverage Anthropic models independent of user selection for other processing tasks (e.g. summarization). We have a zero data retention agreement with Anthropic. Team and Enterprise administrators can disable use of Anthropic models for their organization. Enterprise customers using our EU cluster are served Anthropic models from an AWS Bedrock instance in Zurich, Switzerland; Enterprise customers using our FedRAMP environment are served Anthropic models from an AWS Bedrock instance in an AWS GovCloud region.
- Google Cloud Vertex AI: Sees code data for inference. We offer the optionality of using Vertex AI models for various AI requests, and may leverage these models independent of user selection for other processing tasks (e.g. summarization). We have a zero data retention agreement with Google Cloud Vertex AI. Enterprise administrators can disable use of these models for their organization.
- xAI: Sees code data for inference. We offer the optionality of using xAI's models for various AI requests, and may leverage xAI's models independent of user selection for other processing tasks (e.g. summarization). We have a zero data retention agreement with xAI. Enterprise administrators can disable use of these models for their organization.
- Fireworks: Sees code data for inference. We offer the optionality of using DeepSeek models served by Fireworks for various AI requests. We have a zero data retention agreement with Fireworks. Enterprise administrators can enable use of these models for their organization.
- Bing API: Sees text potentially derived from code data. Used for web search functionality. The search query sent to the Bing API to retrieve website data is derived from the user's inputs, past conversation history, and potentially code data. We do not have a zero data retention agreement with Bing, so this must be explicitly enabled by Team and Enterprise administrators.
- Grafana: Sees no code data. We use Grafana for logging and monitoring; these logs do not contain any code data.
- PagerDuty: Sees no code data. We use PagerDuty for alerting and on-call; it has no access to customer data of any form.
- Slack: Sees no code data. We use Slack for internal communications. We may discuss logs for debugging purposes from users that are not in zero-data retention mode.
- Google Workspace: Sees no code data. We use Google Workspace for collaboration. We may discuss logs for debugging purposes from users that are not in zero-data retention mode.
- Firebase: Sees no code data. We use Firebase for customer authentication (without SSO). Firebase may contain some personal data (name, email address).
- Okta: Sees no code data. We use Okta for internal identity and access management to maintain the security of all internal systems. Okta does not have access to customer data of any form.
- Stripe: Sees no code data. We use Stripe to handle billing. Stripe may contain your personal data (name, credit card, address), but cannot access code data.
- Vercel: Sees no code data. We use Vercel to deploy our website. The website cannot access code data.
- Mintlify: Sees no code data. We use Mintlify to deploy our docs site. The docs site cannot access code data.
- Zendesk: Sees no code data unless provided by the user. We use Zendesk for customer support. Zendesk has no direct access to code data or logs, but may store logs provided by users for debugging purposes.
- Retool: May see code data if not on zero-data retention. We use Retool for dashboards to view usage analytics and aggregate statistics. Logs from users that are not in zero-data retention mode may be exposed for debugging purposes.
- Metabase: May see code data if not on zero-data retention. We use Metabase for dashboards to view usage analytics and aggregate statistics. Logs from users that are not in zero-data retention mode may be exposed for debugging purposes.
- Tableau: May see code data if not on zero-data retention. We use Tableau for dashboards to view usage analytics and aggregate statistics. Logs from users that are not in zero-data retention mode may be exposed for debugging purposes.
- Salesforce: Sees no code data. We use Salesforce for enterprise customer account management. Salesforce may contain personal data (ex. name, email), but cannot access code data.
- Hubspot: Sees no code data. We use Hubspot for marketing efforts. Hubspot may contain personal data (ex. name, email) for marketing campaign purposes, but cannot access code data.
- Brevo: Sees no code data. We use Brevo for email campaigns. Brevo cannot access code data.
Attribution and Compliance
You own all of the code generated by Codeium’s products, to the extent permitted by law.
We recognize deeply the contribution of public open-source software to the progress of generative AI and the software industry at large. Within public code, there are various levels of licensing. While permissively licensed code can be used in other works, including commercially licensed works, non-permissively licensed code is not as forgiving.
To the best of our ability, we have sanitized the public data that we use for training by removing any non-permissively licensed code, as well as code that is similar to non-permissively licensed code via Jaccardian edit distance. We recognize that for some of our models built on top of third-party large language models, we cannot make representations as to all the data that has been used to train the model overall, as we are subject to the practices of the model builders. We also cannot make representations as to the code generated by these models because of their intrinsic nondeterminism. This is why we have also built state-of-the-art attribution filtering that runs on every autocomplete, command, or chat generation.
Any generated code that is similar to non-permissively licensed code is intercepted and not shown to the user, minimizing the chance of non-permissive code being accepted by an unaware user. We compute similarity via a line-by-line fuzzy matching algorithm that compares hashes of the lines of generated code against precomputed hashes of the corpus of existing public code, a more robust detection algorithm than naive multi-line exact string matching. This is done automatically, for any user on any Codeium plan. For enterprises, we complement these technical solutions with industry-leading indemnity clauses to provide peace of mind from a compliance perspective.
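The following simplified Python sketch illustrates the idea of line-hash-based attribution matching. The normalization, hash choice, and threshold here are assumptions for illustration; the production system performs fuzzy matching against precomputed hashes of public code at a much larger scale.

```python
import hashlib

# Simplified sketch of line-hash attribution filtering; the normalization,
# hash function, and threshold are illustrative assumptions.

def normalize(line: str) -> str:
    # Collapse whitespace so trivial formatting differences still match.
    return " ".join(line.split())

def line_hash(line: str) -> str:
    return hashlib.sha256(normalize(line).encode()).hexdigest()

def attribution_match(generated: str, nonpermissive_hashes: set[str],
                      threshold: float = 0.8) -> bool:
    """Return True if enough generated lines match known non-permissive code."""
    lines = [l for l in generated.splitlines() if normalize(l)]
    if not lines:
        return False
    hits = sum(line_hash(l) in nonpermissive_hashes for l in lines)
    return hits / len(lines) >= threshold

# A generation flagged by a check like this would be intercepted and never shown.
```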
On our Enterprise Hybrid and Self-hosted deployments, we are able to further compliance by providing attribution logging of any generated code, even for matches to permissively licensed code. Having a log of such snippets can further an enterprise’s comfort with generative AI from a compliance standpoint. This log is stored entirely within the component of Codeium that is hosted in the customer’s private tenant for these deployment methods, an advantage of these non-Cloud deployment methods.
We also provide audit logs for the Enterprise Hybrid and Self-hosted deployments. Today, this means that every accepted autocomplete suggestion and every chat conversation is logged to a database so that the enterprise can have a trail of AI generations for potential audit purposes. Again, these logs are stored entirely within the component of Codeium that is hosted in the customer’s private tenant. With both attribution and audit logs, there is still zero data retention of code snippets or code-derived data within Codeium’s servers or subprocessors.
Client Security
The Codeium extensions are proprietary extensions into various existing IDE platforms, such as Visual Studio Code, the JetBrains Suite, Eclipse, and more. To see the full list of IDE platforms supported, please visit our download page.
The Windsurf Editor is a fork of the open-source Visual Studio Code (VS Code), maintained by Microsoft. We regularly merge the upstream `microsoft/vscode` codebase into the Windsurf Editor fork to incorporate general updates and upstream security patches. On top of this, we immediately cherry-pick any high-severity security-related patch from the upstream Visual Studio Code codebase and release a new version of the Windsurf Editor. You can check which version of VS Code your Windsurf Editor version is based on by clicking "Windsurf > About Windsurf" in the app, and refer to Visual Studio Code’s GitHub security page to be aware of any corresponding security advisories.
For both the Codeium extensions and the Windsurf Editor, we make requests to the following domains as part of our Cloud and Hybrid deployments. If you're behind a corporate proxy, please whitelist these domains to ensure that our products work correctly.
- `server.codeium.com`: Used for most API requests.
- `web-backend.codeium.com`: Used for requests from the Codeium website.
- `inference.codeium.com`: Used for certain inference requests.
- `codeiumdata.com`, `*.codeiumdata.com`: Used to host language server and Windsurf downloads.
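If you are verifying proxy or firewall rules, a quick reachability check along these lines can help. This is an illustrative Python snippet that opens a direct TLS connection on port 443; adapt it if your environment requires traffic to flow through an HTTP proxy.

```python
import socket

# Quick reachability check for the domains listed above. Illustrative only;
# a direct socket test does not route through an HTTP proxy.
DOMAINS = [
    "server.codeium.com",
    "web-backend.codeium.com",
    "inference.codeium.com",
    "codeiumdata.com",
]

for host in DOMAINS:
    try:
        socket.create_connection((host, 443), timeout=5).close()
        print(f"{host}: reachable on port 443")
    except OSError as err:
        print(f"{host}: blocked or unreachable ({err})")
```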
Codebase Indexing
Codeium allows for a personalized experience by offering indexing of private codebases, which is used at inference time to retrieve potentially relevant snippets of code from across those codebases; these snippets are then appended to the original request to further ground the LLM’s responses.
There are multiple forms of codebase indexing offered by Codeium. In general, codebase indexing is done on an abstract syntax tree (AST) representation of the codebase, which provides superior performance to file-level indexing or naive chunking, especially for the large files seen in enterprise work. This is because each indexed “entity” is a semantic block of code (ex. function, method, class, etc), as opposed to an entire file, which may contain multiple semantic blocks, or an arbitrary chunk of code, which could contain many semantic blocks or just a subset of one. This does not change much from a codebase security perspective, but it is important context for how we’ve architected the system.
The first method is local indexing, where the repository in the editor workspace is preprocessed (up to a fixed, configurable number of files to prevent memory issues). For this preprocessing, Codeium’s client on the user’s machine generates the AST representation of the codebase, chunks the code according to that representation, passes these chunks independently to our server to compute the embeddings, and then receives and stores each computed embedding with a pointer (file path, line range) to the code snippet within a custom vector store index on the user’s machine. Files and subdirectories specified by `.gitignore` or `.codeiumignore` are ignored by the embedding service. As code changes are made, a background process at regular intervals applies the corresponding changes to the AST and recomputes and updates the affected embeddings so that an accurate representation of the codebase is maintained.
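As a rough sketch of this local indexing flow, the Python below walks a workspace, chunks files along semantic (AST) boundaries, requests one embedding per snippet, and stores only the embedding plus a pointer locally. The helper callables (`parse_to_semantic_chunks`, `request_embedding`) and the file filter are hypothetical placeholders, not Codeium's client internals.

```python
from pathlib import Path

# Illustrative sketch of client-side local indexing.
# parse_to_semantic_chunks and request_embedding are hypothetical placeholders
# for the client-side parser and the server embedding call.

def build_local_index(workspace: Path, ignored, parse_to_semantic_chunks,
                      request_embedding, max_files: int = 5000):
    index = []  # local vector store entries: (embedding, file_path, line_range)
    # Simplification: only Python files; the real client handles many languages.
    files = [p for p in workspace.rglob("*.py") if not ignored(p)][:max_files]
    for path in files:
        source = path.read_text(errors="ignore")
        # Chunk along AST boundaries (functions, classes, methods), not raw lines.
        for chunk in parse_to_semantic_chunks(source):
            embedding = request_embedding(chunk.text)  # one snippet per request
            # Only the embedding and a pointer are stored locally;
            # the raw code stays in the working tree.
            index.append((embedding, str(path), chunk.line_range))
    return index
```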
The second is remote indexing. The benefits of remote indexing are (a) storing an index for a larger codebase, which might be too large for the user’s client machine, and (b) providing the user context from repositories other than the one currently active in their IDE. For remote indexing, a read-access token to the repository is provided to Codeium’s embedding service (hosted by Codeium for Cloud, and part of the customer’s private deployment for Hybrid and Self-hosted); otherwise the preprocessing is generally equivalent to that of a local index, except there is no limit to the number of files since the index is stored on the server side rather than the client side. The other difference is that the original code entity is stored with the corresponding embedding vector, since the raw code may not be available on the user’s machine. For Cloud deployments, because this requires retention of code snippets and code-derived information, it requires either code-snippet telemetry to be turned on or explicit admin enablement of this capability. When we do store this information, it is securely encrypted at rest. The remote index can be updated at a frequency specified by the administrator on the indexing control page. For Hybrid and Self-hosted deployments, this is not an issue, as these indexes are stored in the component of the deployment that lives in the customer’s private tenant (the “data plane”). This is a major advantage of the Hybrid deployment over hosted solutions, both Codeium’s Cloud deployment and virtually every other major AI code assistant, as it provides maximal personalization and value (and future-proofs for any other personalization features that require data retention) while not having any code snippets or code-derived information retained on Codeium servers or subprocessors.
At inference time, we compute an embedding of the query context and then use nearest neighbor search across both the local index and the remote index, capturing any local changes to the codebase and any relevant code in other repositories, respectively.
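A minimal sketch of this retrieval step, assuming each index entry is an (embedding, file_path, line_range) tuple as in the indexing sketch above; the cosine scoring and top-k cutoff are illustrative assumptions rather than Codeium's actual retrieval logic.

```python
import math

# Illustrative retrieval sketch: cosine-similarity nearest neighbors over
# the combined local and remote indexes.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_embedding, local_index, remote_index, k: int = 10):
    """Return the top-k entries across both indexes."""
    candidates = list(local_index) + list(remote_index)
    scored = sorted(candidates,
                    key=lambda item: cosine(query_embedding, item[0]),
                    reverse=True)
    # Each item is (embedding, file_path, line_range); the retrieved snippets
    # are appended to the original request to ground the model's response.
    return scored[:k]
```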
Zero Data Retention
Zero-data retention mode guarantees that code or code-derived data is never serialized and stored in plaintext on our servers or by our subprocessors. Zero-data retention mode is the default for any user on a team or enterprise plan and can be enabled by any individual from their profile page. This automated zero-data retention guarantee is what allows us to be trusted by the largest Fortune 500 organizations with enterprise-wide rollouts, even in highly regulated environments, so we naturally treat it as a critical promise to our users.
With zero-data retention mode enabled, code data is not persisted on our servers or by any of our subprocessors. The code data is still visible to our servers in memory for the lifetime of the request, and may exist for a slightly longer period (on the order of minutes to hours) for prompt caching. The code data submitted by zero-data retention mode users will never be trained on. Again, zero-data retention mode is on by default for teams and enterprise customers.
That said, for Cloud implementations only (not Hybrid or Self-hosted), we may store profile data for authentication and to operate the service for you, and we may store inputs if they are flagged as potentially violating our Acceptable Use Policy.
Account Deletion
You can delete your account at any point from your profile.
Vulnerability Disclosures
If you believe you have found a vulnerability in Codeium, please email us at security@codeium.com.
We commit to acknowledging legitimate vulnerability reports within 5 business days, and addressing them as soon as we are able to. Critical incidents will be communicated via email to all users.