It's clear to me that OpenAI is quickly realizing they have no moat. Even this obfuscation of the chain-of-thought isn't really a moat. On top of CoT being pretty easy to implement and tweak, there's a serious push to on-device inference (which imo is the future), so the question is: will GPT-5 and beyond be really that much better than what we can run locally?
I wonder if they'll be able to push the chain-of-thought directly into the model. I'd imagine there could be some serious performance gains achievable if the model could "think" without doing IO on each cycle.
In terms of moat, I think people underestimate how much of OpenAI's moat is based on operations and infrastructure rather than being purely based on model intelligence. As someone building on the API, it is by far the most reliable option out there currently. Claude Sonnet 3.5 is stronger on reasoning than gpt-4o but has a higher error rate, more errors conforming to a JSON schema, much lower rate limits, etc. These things are less important if you're just using the first-party chat interfaces but are very important if you're building on top of the APIs.
I don't understand the idea that they have no moat. Their moat is not technological. It's sociological. Most AI through APIs uses their models. Most consumer use of AI involves their models, or ChatGPT directly. They're clearly not in the "train your own model on your data in your environment" game, as that's a market for someone else. But make no mistake, they have a moat and it is strong.
There are countless tools competitive with or better than what I use for email, and yet I still stick with my email client. Same is true for many, many other tools I use. I could perhaps go out of my way to make sure I'm always using the most technically capable and easy-to-use tools for everything, but I don't, because I know how to use what I have.
This is the exact dynamic that gives OpenAI a moat. And it certainly doesn't hurt them that they still produce SOTA models.
That's not a strong moat (arguably, not a moat at all, since as soon as any competitor has any business, they benefit from it with respect to their existing customers), it doesn't effect anyone who is not already invested in OpenAI's products, and because not every customer is like that with products they are currently using.
Now, having a large existing customer base and thus having an advantage in training data that feeds into an advantage in improving their products and acquiring new (and retaining existing customers) could, arguably, be a moat; that's a network effect, not merely inertia, and network effects can be a foundation of strong (though potentially unstable, if there is nothing else shoring them up) moats.
Yeah but the lock-in wrt email is absolutely huge compared to chatting with an LLM. I can (and have) easily ended my subscription to ChatGPT and switched to Claude, because it provides much more value to me at roughly the same cost. Switching email providers will, in general, not provide that much value to me and cause a large headache for me to switch.
Switching LLMs right now can be compared to switching electricity providers or mobile carriers - generally it's pretty low friction and provides immediate benefit (in the case of electricity and mobile, the benefit is cost).
You simply cannot compare it to an email provider.
It was pretty simple for me to switch email providers about ~6 years ago or so when I decided I'd do it. Although it's worth noting that my reasons for doing so were motivated by a strong desire around privacy, not noticing that another email provider did email better.
Everyone building is comfortable with OpenAI's API, and have an account. Competing models can't just be as good, they need to be MUCH better to be worth switching.
Even as competitors build a sort of compatibility layer to be plug an play with OpenAI they will always be a step behind at best every time OpenAI releases a new feature.
Only a small fraction of all future AI projects have even gotten started. So they aren't only fighting over what's out there now, they're fighting over what will emerge.
This is true, and yet, many orgs who have experimented with OpenAI and are likely to return to them when a project "becomes real". When you google around online for how to do XYZ thing using LLMs, OpenAI is usually in whatever web results you read. Other models and APIs are also now using OpenAI's API format since it's the apparent winner. And for anyone who's already sent out subprocessor notifications with them as a vendor, they're locked in.
This isn't to say it's only going to be an OpenAI market. Enterprise worlds move differently, such as those in G Cloud who will buy a few million $$ of Vertex expecting to "figure out that gemini stuff later". In that sense, Google has a moat with those slices of their customers.
But I believe that when people think OpenAI has no moat because "the models will be a commodity", I think that's (a) some wishful thinking about the models and (b) doesn't consider the sociological factors that matter a lot more than how powerful a model is or where it runs.
Doesn't that make it less of a moat? If the average consumer is only interacting with it through a third party, and that third party has the ability to switch to something better or cheaper and thus switch thousands/millions of customers at once?
LiteLLM proxies their API to all other providers and there are dozens of FOSS recreations of their UI, including ones that are more feature-rich, so neither the UI nor the API are a moat.
Branding and first mover is it, and it's not going to keep them ahead forever.
I don't see why on-device inference is the future. For consumers, only a small set of use cases cannot tolerate the increased latency. Corporate customers will be satisfied if the model can be hosted within their borders. Pooling compute is less wasteful overall as a collective strategy.
This argument can really only meet its tipping point when massive models no longer offer a gotta-have-it difference vs smaller models.
On-device inference will succeed the way Linux does: It is "free" in that it only requires the user to acquire a model to run vs. paying for processing. It protects privacy, and it doesn't require internet. It may not take over for all users, but it will be around.
This assumes that openly developed (or at least weight-available) models are available for free, and continue being improved.
Based on their graphs of how quality scales well with compute cycles, I would expect that it would indeed continue to be that much better (unless you can afford the same compute locally).