LLM Vendor Lock-in: How OpenAI and Anthropic Trap Enterprise Customers
1,383 documented OpenAI outages. 15-hour service failures. Prompts that break across providers. How to architect AI systems that survive vendor chaos.
January 4, 2024. OpenAI retired 33 models in a single day, including GPT-3 and every fine-tuned model built on the deprecated base. Teams that had spent months tuning text-davinci-003 for their specific use cases woke up to migration notices.
August 28, 2025. Anthropic announced consumer Claude data would be used for training by default, with retention extended from 30 days to 5 years unless users opted out by September 28.
December 11, 2024. An OpenAI configuration error brought down the entire platform for 4.5 hours. Two weeks later, an Azure datacenter power failure caused a 9-hour outage.
These aren't edge cases. They're the operating reality of building on third-party AI infrastructure. The question for enterprise teams isn't whether to evaluate lock-in risk. The question is how to measure it systematically.
The Lock-in Scorecard
We evaluated five major LLM providers across six dimensions that directly affect enterprise AI operations. Each dimension is scored 1 to 5, where 5 means the lowest lock-in risk. Total possible score: 30.
| Provider | Data Policy | Model Stability | Fine-Tune Portability | API Compat | Operational Risk | Contractual Risk | Total |
|---|---|---|---|---|---|---|---|
| OpenAI | 3 | 2 | 1 | 4 | 2 | 4 | 16/30 |
| Anthropic | 3 | 4 | 1 | 3 | 3 | 4 | 18/30 |
| Google (Vertex) | 4 | 3 | 2 | 3 | 3 | 4 | 19/30 |
| Mistral | 4 | 4 | 4 | 4 | 3 | 3 | 22/30 |
| Cohere | 4 | 4 | 3 | 3 | 3 | 4 | 21/30 |
Lower scores indicate higher lock-in risk. Now let's break down what these numbers actually mean.
Dimension 1: Data Policies
Your prompts, outputs, and fine-tuning datasets sit on provider infrastructure. What happens to them matters.
Training usage:
- OpenAI: API data not used for training by default. Consumer ChatGPT data used unless opted out.
- Anthropic: API and commercial tiers don't train on data. Free/Pro/Max plans opt-in by default since September 2025.
- Google Vertex: Does not use customer data for training without explicit permission.
- Mistral: API data not used for training by default. Zero data retention available.
- Cohere: Opt-out toggle in dashboard. Enterprise customers can fully disable.
Retention periods:
- OpenAI: API logs retained 30 days (moving to 7 days in some tiers). Zero data retention available for enterprise.
- Anthropic: API logs 7-30 days. Consumer plans with training enabled: 5 years.
- Google Vertex: In-memory caching only (24-hour TTL). Grounding features retain 30 days.
- Mistral: 30 days for abuse monitoring. Zero data retention available.
- Cohere: 30 days by default. Zero data retention available.
Export capability: All providers allow you to download your uploaded datasets. None allow export of derived embeddings or model-internal representations. This matters less if you're building RAG applications where your data stays in your own vector store.
Score rationale: Google and Mistral score highest because zero data retention is straightforward to configure. OpenAI and Anthropic lose points for recent policy changes that expanded data usage in consumer tiers. Teams concerned about data privacy in RAG architectures should weight this dimension heavily.
Dimension 2: Model Stability
How often do models disappear, and how much notice do you get?
OpenAI deprecation history:
- January 4, 2024: 33 models retired in one day, including all fine-tuned models on deprecated bases
- June 2024: GPT-4-32k and GPT-4-vision-preview announced for deprecation
- November 2025: GPT-4o-realtime-preview, DALL-E snapshots, chatgpt-4o-latest all announced for retirement
- Assistants API deprecated August 2025 with one-year notice
Notice periods:
- OpenAI: GA models get 60-day minimum notice, often 6-12 months. Preview models: 30-90 days.
- Anthropic: No published deprecation policy. Historically conservative about model changes.
- Google: GA models guaranteed 12 months availability. Preview models 90-120 days.
- Mistral: No published deprecation policy. Open-source models remain available indefinitely.
- Cohere: No published deprecation policy.
Score rationale: OpenAI scores lowest because they've deprecated more models with less notice than any other provider. Anthropic and Mistral score highest due to conservative release practices and (in Mistral's case) open-source availability of key models. For teams building production systems, model reliability and evaluation become critical when a provider's model changes can break your application.
Dimension 3: Fine-Tuning Portability
You invest months building fine-tuned models. Can you take them with you?
Weight export:
- OpenAI: Fine-tuned weights cannot be downloaded. You get a model ID that only works through their API.
- Anthropic: Fine-tuning available in limited beta. No weight export.
- Google Vertex: Fine-tuned adapters stay on Google infrastructure. No export.
- Mistral: Fine-tuned models through La Plateforme are not exportable. However, open-source base models (Apache 2.0) can be fine-tuned locally with full weight ownership.
- Cohere: Fine-tuned models stay on Cohere infrastructure. Datasets can be exported.
Training data ownership: All providers confirm you retain ownership of your training data. But that data sits on their servers during fine-tuning, and in some cases (OpenAI's terms until recently) could theoretically be used to train competing models unless you opted out.
What actually transfers: When you leave a provider, you take your original training datasets. You don't take the fine-tuned weights. For OpenAI, Anthropic, and Google, this means starting fine-tuning from scratch on your new platform.
Score rationale: OpenAI and Anthropic score 1 because fine-tuning creates complete lock-in. Mistral scores 4 because you can fine-tune their open-source models locally with full control. Google scores 2 and Cohere 3 because, while weights aren't portable, their fine-tuning processes are more standardized. Teams pursuing enterprise fine-tuning strategies should consider whether weight portability matters for their use case.
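Since the training dataset is the only fine-tuning artifact that transfers between providers, one defensive pattern is to keep it in a neutral canonical form and render provider-specific formats at export time. A minimal sketch: the canonical schema below is an illustrative choice, not a standard, while the output format matches OpenAI's chat fine-tuning JSONL (one `{"messages": [...]}` object per line).

```python
# Keep fine-tuning data in a provider-neutral canonical form and render
# provider formats at export time. The canonical schema here is an
# illustrative choice; the JSONL output matches OpenAI's chat format.

import json

canonical = [
    {
        "instruction": "Summarize the support ticket in one sentence.",
        "input": "Customer reports duplicate charges on their invoice.",
        "output": "Customer was double-billed and requests a refund.",
    },
]

def to_openai_chat_jsonl(examples: list[dict]) -> str:
    """Render canonical examples as OpenAI chat fine-tuning JSONL."""
    lines = []
    for ex in examples:
        messages = [
            {"role": "user", "content": f"{ex['instruction']}\n\n{ex['input']}"},
            {"role": "assistant", "content": ex["output"]},
        ]
        lines.append(json.dumps({"messages": messages}))
    return "\n".join(lines)
```

When you leave a provider, you regenerate the target platform's format from the canonical dataset instead of reverse-engineering one provider's export.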
Dimension 4: API Compatibility
How hard is migration from a technical standpoint?
OpenAI SDK compatibility: Most providers offer OpenAI-compatible endpoints, meaning that in theory, swapping a base URL and API key migrates basic chat completions. In practice:
- Anthropic: Separate SDK with different message format. Migration requires code changes.
- Google Vertex: Different SDK and auth model. OpenAI compatibility layers exist but are incomplete.
- Mistral: OpenAI-compatible endpoints available. Migration relatively straightforward for basic usage.
- Cohere: Separate SDK. No direct OpenAI compatibility.
What breaks across providers:
- System prompts behave differently
- Function calling / tool use syntax varies significantly
- Token limits and pricing models differ
- Response formatting and JSON mode implementations vary
- Rate limits and error handling patterns differ
Migration effort estimate: Based on community reports, migrating a production application from OpenAI to another provider typically consumes 20-50% of the original development time. Most of that goes into prompt re-engineering and behavioral testing. This is why observability and evaluation tooling matters before and after migration.
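The differences listed above are easiest to see in the request payloads themselves. A sketch with two hypothetical builder functions (the names `openai_payload` and `anthropic_payload` are chosen here, not SDK functions); the shapes reflect each provider's chat API, with model names left as caller-supplied placeholders:

```python
# Side-by-side request shapes for the same prompt. OpenAI treats the
# system prompt as a message; Anthropic takes it as a top-level field
# and requires max_tokens. An abstraction layer normalizes exactly this.

def openai_payload(system: str, user: str, model: str) -> dict:
    # OpenAI style: the system prompt is just another message in the list.
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
    }

def anthropic_payload(system: str, user: str, model: str) -> dict:
    # Anthropic style: system prompt is a top-level field, and
    # max_tokens is required rather than optional.
    return {
        "model": model,
        "system": system,
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": user}],
    }
```

Centralizing these builders behind one interface means a provider switch touches one module instead of every call site.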
Score rationale: OpenAI scores 4 because it's the de facto standard others imitate. Mistral scores 4 for genuine compatibility. Anthropic, Google, and Cohere score 3 because migration requires meaningful code changes.
Dimension 5: Operational Risk
When the provider goes down, your product goes down.
Outage data:
- OpenAI: 1,383 documented outages since August 2021 (StatusGator data). Major incidents: December 11, 2024 (4.5 hours), December 26, 2024 (9 hours), June 10, 2025 (12+ hours). Stated uptime ~99.3%, which equals roughly 5 hours of downtime per month.
- Anthropic: Fewer documented outages. Generally more stable but lower visibility into historical data.
- Google Vertex: Enterprise SLAs available. June 2025 outage affected multiple dependent services.
- Mistral: Smaller customer base means fewer outage reports. No enterprise SLA documentation.
- Cohere: Enterprise-focused with SOC 2 compliance. Limited public outage history.
SLA commitments:
- OpenAI: 99.9% SLA for Enterprise tier only
- Anthropic: SLAs available for Enterprise tier
- Google Vertex: 99.9% SLA with financial credits
- Mistral: No published SLA
- Cohere: Enterprise SLAs available
Score rationale: OpenAI scores 2 due to the documented frequency of outages. The others score 3 based on available data and SLA offerings.
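The arithmetic behind "99.3% uptime equals roughly 5 hours of downtime per month" is worth making explicit, because it's how you translate any SLA percentage into an operational expectation:

```python
# Convert an uptime percentage into expected monthly downtime.

HOURS_PER_MONTH = 8760 / 12  # average month: 730 hours

def monthly_downtime_hours(uptime_percent: float) -> float:
    """Expected downtime per month implied by an uptime percentage."""
    return (1 - uptime_percent / 100) * HOURS_PER_MONTH

# 99.3% stated uptime  -> ~5.1 hours of downtime per month
# 99.9% enterprise SLA -> ~0.7 hours of downtime per month
```

The gap between 99.3% and a 99.9% SLA is roughly a factor of seven in expected downtime, which is the real content of an enterprise-tier commitment.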
Dimension 6: Contractual Risk
What legal protections do you actually have?
IP Indemnification:
- OpenAI: Copyright Shield for ChatGPT Enterprise and API customers. Covers output IP claims with standard exclusions.
- Anthropic: Enterprise tier includes IP indemnification for authorized use.
- Google: Indemnifies Workspace and Cloud customers for both training data and output claims.
- Microsoft (Azure OpenAI): Copilot Copyright Commitment extends indemnification with guardrail requirements.
- Mistral: No published indemnification policy.
- Cohere: Indemnification provisions in enterprise agreements.
Terms change frequency:
- OpenAI: Multiple terms updates per year. API terms changed significantly in May 2025.
- Anthropic: Major consumer terms overhaul in September 2025.
- Google: Regular updates with enterprise stability guarantees.
- Mistral: November 2025 commercial terms update.
- Cohere: Relatively stable terms structure.
Score rationale: Most major providers now offer comparable IP indemnification. Mistral scores lower due to less mature enterprise legal structure.
How to Use This Scorecard
For procurement teams: Weight dimensions by your specific risk tolerance. If you're in regulated industries, Data Policy and Contractual Risk matter most. If you're shipping consumer products, Operational Risk and Model Stability drive decisions.
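The weighting exercise can be made concrete. A sketch using the dimension scores from the table above; the "regulated industry" weight profile is an illustrative example, not a recommendation:

```python
# Reweight the scorecard's 1-5 dimension scores by your own risk
# priorities. Scores are from the article's table; the example weights
# are illustrative only.

SCORES = {
    "OpenAI":    {"data": 3, "stability": 2, "finetune": 1, "api": 4, "ops": 2, "contract": 4},
    "Anthropic": {"data": 3, "stability": 4, "finetune": 1, "api": 3, "ops": 3, "contract": 4},
    "Google":    {"data": 4, "stability": 3, "finetune": 2, "api": 3, "ops": 3, "contract": 4},
    "Mistral":   {"data": 4, "stability": 4, "finetune": 4, "api": 4, "ops": 3, "contract": 3},
    "Cohere":    {"data": 4, "stability": 4, "finetune": 3, "api": 3, "ops": 3, "contract": 4},
}

def weighted_score(scores: dict[str, int], weights: dict[str, float]) -> float:
    """Weighted average on the 1-5 scale; higher means lower lock-in risk."""
    total_weight = sum(weights.values())
    return sum(scores[d] * w for d, w in weights.items()) / total_weight

# Regulated-industry profile: data policy and contractual risk dominate.
regulated = {"data": 3, "stability": 1, "finetune": 1, "api": 1, "ops": 1, "contract": 3}
ranked = sorted(SCORES, key=lambda p: weighted_score(SCORES[p], regulated), reverse=True)
```

Note that rankings shift with the profile: under this particular weighting, Cohere edges out Mistral because contractual risk counts for more than fine-tune portability.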
For technical teams: API Compatibility and Fine-Tuning Portability determine your migration cost. Score these higher if you anticipate provider changes within 2-3 years.
For risk management: Model Stability history predicts future disruption probability. OpenAI's aggressive deprecation schedule isn't a bug. They optimize for iteration speed over backward compatibility.
Multi-provider strategy: The highest-scoring providers (Mistral, Cohere) aren't necessarily the best choice. They're the lowest lock-in choice. If you need the most capable models, accept higher lock-in scores and invest in abstraction layers. Some teams use small language models for cost-sensitive workloads while routing complex tasks to frontier models.
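A fallback router is the simplest form of that abstraction layer. A minimal sketch, with providers as plain callables so no particular SDK is assumed; retries and backoff are omitted for brevity:

```python
# Ordered fallback across providers: try the primary, fall through to
# backups on failure. Keeping providers as plain callables isolates the
# routing logic from any vendor SDK.

def route(prompt: str, providers: list[tuple]) -> tuple:
    """providers: ordered (name, callable) pairs. Returns (name, result)."""
    errors = {}
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # real code would catch narrower error types
            errors[name] = exc
    raise RuntimeError(f"all providers failed: {list(errors)}")
```

In production this is where per-provider prompt variants, cost-based routing, and circuit breakers attach, which is the 20-30% maintenance overhead mentioned in the FAQ below.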
PremAI takes a different approach to this problem entirely. Rather than choosing between providers with varying lock-in risks, self-hosting infrastructure eliminates provider dependency. Fine-tuning on sovereign infrastructure means your weights stay under your control. And comprehensive evaluation tooling lets you benchmark model performance before committing to any provider. Teams with strict compliance requirements can deploy entirely on-premise with zero external dependencies.
Provider Quick Profiles
- OpenAI (16/30): Strongest capabilities. Highest lock-in. Aggressive deprecation schedule. Fine-tuned models non-portable. 1,383+ documented outages since 2021. Best for: teams that prioritize frontier model performance and can absorb migration costs. Consider cost optimization strategies to manage spend.
- Anthropic (18/30): Conservative model updates. Recent consumer data policy changes. Strong enterprise protections. Fine-tuning limited. Best for: teams prioritizing stability over bleeding-edge features.
- Google Vertex (19/30): Enterprise-grade SLAs. Zero data retention options. Google cloud ecosystem integration. Best for: teams already invested in Google Cloud infrastructure.
- Mistral (22/30): Lowest lock-in due to open-source model availability. Smaller model selection. European data sovereignty focus. Open-source models enable local deployment options without API dependency. Best for: teams that want portability optionality.
- Cohere (21/30): Enterprise-focused from day one. Strong RAG and embeddings capabilities. Flexible deployment options including private cloud. Best for: teams building search and retrieval applications or advanced RAG implementations.
FAQ
Which provider has the best data privacy?
For zero data retention, Google Vertex and Mistral are most straightforward to configure. All providers offer enterprise-tier options that don't train on your data.
Can I export my fine-tuned model from OpenAI?
No. OpenAI fine-tuned models exist only as API endpoints. You cannot download weights. If you leave, you restart fine-tuning on a new platform with your original training data.
How often does OpenAI deprecate models?
OpenAI has announced deprecations for 50+ model versions since 2023. The January 4, 2024 retirement event removed 33 models simultaneously. GA models typically get 6-12 months notice. Preview models get 30-90 days.
What happens to my fine-tuned model when the base model is deprecated?
With OpenAI, your fine-tuned model continues to work until its own deprecation date. But you cannot create new fine-tunes on deprecated bases. Plan migration to new base models before deprecation.
Is IP indemnification standard across providers?
Yes, for enterprise tiers. OpenAI, Anthropic, Google, and Microsoft all offer copyright indemnification for API customers using their products as intended. Exclusions exist for misuse, tampering with safety systems, and knowing infringement.
Which provider is most reliable?
Based on documented outages, OpenAI has the most frequent disruptions. However, they also handle the highest traffic volume. Enterprise tiers with SLAs provide financial credits for downtime but don't prevent the outage itself.
Should I use multiple providers?
Yes, if your application can tolerate it. Multi-provider routing reduces single-point-of-failure risk and can reduce costs through arbitrage. Expect 20-30% of development time to go into abstraction layer maintenance. Some teams use agentic frameworks to route requests intelligently across providers.
For teams evaluating sovereign AI infrastructure that eliminates provider lock-in entirely, explore PremAI's enterprise platform. See also: Enterprise AI Trends for 2025.