The Light from a Dying Star is from the Past

You're not taking AI overhangs seriously enough

Trey Causey

If frontier AI development stopped today, we’d see years of continued progress. It wouldn’t come from new capabilities, but from the realization and implementation of capabilities that already exist. This is the overhang problem: the gap between what is already possible and what is currently deployed. Capability that exists but hasn’t yet converted to widespread use. The overhangs associated with AI coding agents are so large today that many parts of knowledge work are irreversibly changed even if that change has yet to become legible.

This is such an important concept for understanding the world right now, but I see evidence that it’s still widely unrecognized (is that an overhang overhang…?). Overhangs are clearly under-appreciated in the anti-AI discourse, but being a job seeker right now has also made it abundantly clear to me that most organizations, especially in their recruiting and hiring, have not internalized the concept.

This post is primarily about the coding abilities of models, especially the agentic coding capabilities that have been developed on top of them. I’ve written it to be as accurate as possible, not based on any personal wishes or hopes about where AI will go. If you want to be convinced of that approach, Kelsey Piper has an absolutely outstanding piece about how to think carefully and logically about AI even if, especially if, you hate AI.

Coding is an important thing to focus on because:

  1. It produces verifiable outputs that models love to use.
  2. It’s arguably the domain where the most progress has been made already.
  3. Many, many problems can be converted into things that software can solve.

And coding is how we get software.

Overhangs

I see four different kinds of overhangs that currently exist:

  1. Capability overhangs (those continue to be discovered and explored)
  2. Tooling overhangs
  3. Knowledge overhangs
  4. Organizational overhangs

Capability overhangs

Right now, models are being released with tremendous capabilities. GPT-5.2 was released last week with extremely impressive benchmarks. Anthropic’s Opus 4.5 has been glowingly reviewed and benchmarked. Gemini 3 came out only a couple of weeks ago!

Even if these were the last models these labs ever released, we’ve still only scratched the surface of their capabilities. How they are prompted, the tools they have access to, the data they have access to, and so on all continue to reveal new things they are good at (and not good at). Models are simply released too fast, and benchmarks take too long to develop, to accurately cover the entire universe of things that models can do. As a result, when a model is released we often have only a very rough estimate of what it is capable of. I don’t say this speculatively or out of evangelism. It’s simply empirically true and well-documented.

However, I don’t expect these will be the last models ever released. I suspect the labs will continue to release very impressive models, and a good bet would be that every model released over the next year will either saturate existing benchmarks or make significant progress on both existing and new ones.

Tooling overhangs

How models’ capabilities are realized and deployed depends a lot on the specific tools that work with or on top of the models. Indeed, many of the tools being developed right now are direct attempts to close a capability overhang by making latent capabilities explicit. IDE integrations, command-line tools, MCPs, and so on all fall into this category. These are very, very early days, and it’s not clear what the best or right tools are for many use cases, or whether they even exist yet. These tools are all forms of scaffolding that help available models become more effective in their current state. It’s a symbiotic relationship, too. Some tools aren’t ready for prime time below a certain level of model capability; Cursor famously didn’t ‘work’ until GPT-4 was released.

And other tools make existing models super-effective. Tools like Claude Code, Codex, Gemini, and Amp all build harnesses1 on top of models that turn them into coding agents. These harnesses continue to change, improving the performance of the underlying model even when the model itself doesn’t change day-to-day.
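
To make footnote 1 concrete, here is a minimal, hypothetical sketch of what a harness is: a system prompt, a handful of tools, and a loop that executes the model’s tool calls and feeds the results back. None of this reflects how Claude Code, Codex, or Amp are actually built; the JSON tool-call protocol, the `call_model` stub, and the two example tools are placeholders purely for illustration.

```python
import json
import subprocess

SYSTEM_PROMPT = (
    "You are a coding agent. When you need to inspect files or run tests, "
    'reply with a JSON tool call like {"tool": "read_file", "args": {"path": "..."}}. '
    "Otherwise reply with plain text."
)

def read_file(path: str) -> str:
    """Tool: return the contents of a file."""
    with open(path) as f:
        return f.read()

def run_tests(command: str = "pytest -q") -> str:
    """Tool: run the test suite and return its output."""
    result = subprocess.run(command.split(), capture_output=True, text=True)
    return result.stdout + result.stderr

TOOLS = {"read_file": read_file, "run_tests": run_tests}

def call_model(messages: list[dict]) -> str:
    """Hypothetical stand-in for whatever model API you actually use."""
    raise NotImplementedError("wire this up to your model provider")

def agent_loop(task: str, max_steps: int = 10) -> str:
    """The harness: keep calling the model, executing any tool calls it makes,
    and feeding the results back until it produces a plain-text answer."""
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": task},
    ]
    for _ in range(max_steps):
        reply = call_model(messages)
        messages.append({"role": "assistant", "content": reply})
        try:
            call = json.loads(reply)  # the model asked to use a tool
        except json.JSONDecodeError:
            return reply              # plain text: treat it as the final answer
        output = TOOLS[call["tool"]](**call.get("args", {}))
        messages.append({"role": "user", "content": f"Tool output:\n{output}"})
    return "Stopped after max_steps without a final answer."
```

Real harnesses differ mainly in the quality of these pieces: the prompts, the tool set, permissioning, context management, and so on. That is why harness improvements keep raising effective performance even when the underlying model doesn’t change.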

Knowledge overhangs

Most people are unaware of how large the capability overhang is. If I had to make a specific prediction, I would say that the majority of people have formed their opinion about AI capabilities through a very limited number of unfocused trials, often with a model that is 2+ generations behind the frontier. To be fair, it’s a fast-moving space, older models are almost always cheaper, and engineers famously hate to use the most recent version of things, preferring instead to wait for a stable release with fewer bugs. (I honestly think this is exactly the wrong instinct with AI models, but I’ll leave that for now.) People are making decisions, based on outdated information, not only about how useful AI might be, but about how useful AI actually is today for software development.

There’s a lot of entrepreneurship in the #thoughtleadership space right now when it comes to effective AI usage for coding. Some of it is quite good! Very accomplished and ambitious engineers like Steve Yegge, Armin Ronacher, and Simon Willison are all providing tremendous advice and wisdom to anyone who will listen. A lot of the thought leadership and advice on how to code with AI is… not very good, though.

This knowledge is extremely unevenly distributed. Even assuming the best possible intent, where an individual wants to use coding agents effectively, it’s hard to find accurate information, and “accurate” changes monthly. If you then factor in individuals who are anti-AI for a variety of legitimate reasons, their motivation to seek out accurate information is quite low. All in all, it means that quite powerful information is concentrated among a very small number of people worldwide. This is a really challenging problem to solve, although I’m optimistic that it is solvable.

Organizational overhangs

One primary way knowledge overhangs are overcome is via organizations. Most organizations are slow to adapt, just by nature, and are juggling many priorities simultaneously with varying levels of incomplete information. Oftentimes the level of incompleteness seems critical, so the rational move is to wait and see what happens. See if another model becomes “the standard”, see if the pace of change slows down, see what competitors are doing, and so on. Better to adopt the “right” tools than to move too quickly. This is going to be fatal for some organizations, precisely because capability overhangs are growing and because organizations that recognize this will completely outmaneuver those that don’t.

It takes a long time for big companies, and really even mid-sized companies, to do things. There’s security, there’s privacy, there’s compliance, there’s training. There’s the purchasing cycle. There are the changes to how teams are incentivized to operate. And there are all of the individual variations in knowledge and willingness distributed across the company that aggregate up to the organizational level.

We’ve seen a lot of really ham-fisted rollouts of AI where the instructions amount to “use AI or get fired”, which has created a lot of resentment, as we’ve seen here in Seattle as well. But these major missteps don’t change the fact that these are critical organizational survival questions. Some organizations are going to figure out how to overcome the organizational overhang. They’re going to follow what people who are already quite sophisticated at deploying coding agents are doing, and they’re going to adopt those practices. They’re going to see increased velocity and output, and they’re going to see individuals able to ship things that were previously not possible for individuals at all, or even for teams.

Solving the organizational overhang is tremendously hard, and failing to solve it will be fatal for some organizations. If they don’t solve it, and if they don’t change who they are recruiting, how they are recruiting, and what they are incentivizing, that is the likely outcome for many of them.

Not convinced? Some helpful analogies

Climate: Climate-aware readers already understand overhangs. Even if carbon emissions stopped today, significant warming is already “committed”, in that the CO2 already in the atmosphere continues to affect the system regardless of what we do next. This is the entire argument behind carbon removal or sequestration in addition to reducing our greenhouse gas output. There’s sufficient carbon in the atmosphere that warming will continue until some equilibrium state is reached.

For this argument, it doesn’t matter exactly how much warming will happen, or whether you think it will reach 1.5º or 2º Celsius. The important part is recognizing that the climate’s trajectory is semi-independent of what we emit next, because the greenhouse gases already in the atmosphere have their own dynamics.

Demography: If that analogy doesn’t work for you, or you’re too angry that I used a climate example in an essay about AI capabilities, you could think about demographic momentum instead. This is a hot topic right now, for good reason. If you look at countries like Japan or Korea, it’s well established that far fewer children are being born than were two generations ago. Even if people suddenly started having kids at much higher rates, a demographic shift in those societies over the next 20 to 50 years is unavoidable. It can’t be undone, because you can’t go back in time and grow the population through children that were never born.

I’m not making a normative or natalist argument about the “right” population growth rate. I’m simply stating what’s empirically true and that the consequences of this will require solutions even if the underlying process (fertility) changes dramatically today. A population that has a high average age requires a lot of labor to provide services, medical care, and so on. That labor will need to come from somewhere. If it doesn’t come via new babies that grow up into adults, it will require immigration or automation or other forms of innovation.
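
If it helps to see the mechanism in numbers, here is a toy cohort sketch of demographic momentum. All of the numbers are invented purely for illustration; the point is only that today’s age structure constrains the next several decades regardless of what fertility does now.

```python
# Toy illustration of demographic momentum with three 25-year cohorts:
# children, working-age adults, and elderly. All numbers are invented.

def step(children, adults, elderly, births_per_adult):
    """Advance one generation: children age into adults, adults into elderly."""
    new_children = adults * births_per_adult
    return new_children, children, adults

# A society where the youngest cohort is already much smaller than the
# working-age cohort, i.e. fertility was low in the recent past.
children, adults, elderly = 60.0, 100.0, 80.0

# Suppose fertility instantly recovers to replacement (one child per adult).
for gen in range(1, 4):
    children, adults, elderly = step(children, adults, elderly, births_per_adult=1.0)
    print(f"generation {gen}: children={children:.0f}, adults={adults:.0f}, elderly={elderly:.0f}")

# generation 1: adults=60 -- the workforce shrinks anyway, because the only
# people who can age into it are the small cohort that was already born.
```

The same logic is why the labor gap described above has to be filled by immigration, automation, or other innovation rather than by a fertility rebound alone.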

What does this mean for recruiting and hiring?

How does all of this show up concretely? As I said above, I’m a job seeker for the first time in a long time. How companies recruit and screen potential hires and the roles they open say a lot about the problems they see as critical to solve. And I’m not seeing a lot of companies trying to solve these problems.

Many of the interview processes still begin with some version of leetcoding, demonstrating that you can produce specific answers to specific coding exercises in some kind of shared-screen environment and without the assistance of AI. This has always been a bad way to find talent, but we standardized on it as an industry because it’s legible.

It’s particularly bad, though, for the world we already live in, a world that exists but is unevenly visible. Memorizing specific coding problems and then reproducing them is going to be of almost no value in the near future.

I see a lot of complaints about coding agents producing a lot of repetitive code, or choosing the wrong abstraction, or being too verbose. Those are all valid criticisms of code that is written for humans to consume. But if coding agents are the primary consumers of code, in addition to being its producers, the criticism probably doesn’t hold. In that case, abstractions designed for humans may even be illegible to models. And agents have no problem reading a lot of code quickly to get the full context, something humans are bad at, which is why humans try to reduce the amount of code produced. Of course, this can shift as model capabilities improve, but thinking that good, well-written code means the same thing for humans and for coding agents is probably a mistake. Hiring practices that still screen for LeetCode and human code preferences miss that.

Organizations need to be finding role entrepreneurs, those individuals who care more about building than about protecting swim lanes and role boundaries. Product managers who can build prototypes of their ideas and explore architectural and design decisions without involving engineers. Data scientists who can ship their own instrumentation and experiments. Engineers who are comfortable spending time up front thinking, designing, and planning.

What you really want to be looking for is someone who can “collapse the talent stack”, as Scott Belsky has put it. Someone who can remove layers, play multiple roles, and tighten decision-making and feedback loops. Coding agents do this naturally, and anyone who uses them for a sustained period of time comes to realize what a force multiplier they are.

Of course, identifying people that can do this is very hard! Doubly so when you expect recruiters to do so, as they almost never work directly in the space. They’ve always relied on partnerships with their hiring managers within the engineering & product teams to guide them on who they need to hire. Asking them to identify and evaluate individuals who can work in totally new ways is a really tough ask, but whoever gets it right is going to do very well.

(By the way, if this sounds like something you’d like to try, discuss, or hire for, book a call with me or get in touch!)

Counterarguments

It’s definitely true that analogies are analogies. They’re not destinies; they’re not even really falsifiable predictions. But they do provide ways of reasoning about how things might go.

AI is a bubble and will fade, just like crypto did. I’m belaboring these points precisely because a lot of AI criticism seems to be built on the assumption that since AI could be (or is) a bubble, it’s just going to go away if we wait long enough, potentially after cratering the economy. Not only is this wishful thinking, it’s also extremely short-sighted. Waiting to see if AI goes away is going to be a losing proposition, certainly for individuals and likely for many organizations.

Capabilities are mostly hype and overstated. It could be that models are benchmaxxing (i.e., overfitting to benchmarks in ways that look impressive on paper but don’t represent progress on real-world scenarios). But this argument is too dependent on individual experience to provide much analytical clarity, because lots of really smart, capable, accomplished engineers are finding the opposite. It’s also clear that models are making unambiguous progress on mathematics, as their performance in competitions shows. It is entirely possible that agents are not good at the tasks that you care about, but it’s not credible to assert that they’re not really progressing.

Capability overhangs sometimes reverse. It is possible that capabilities stop progressing, or even go backwards. Or the capability overhang is never actually realized and the gap never closes. Nuclear energy is a good example: we had the capability to deploy it widely but rolled it back. That wasn’t a capability problem so much as a risk-perception, political, and organizational problem, but, that said, nuclear just didn’t continue to grow and get better.

Virtual reality is another example; VR has had several false starts and has been “the next big thing” a few times now. I am personally a big fan of VR and consider it a pretty magical technology, but it is just true that the market has never been the size that VR aficionados would like it to be despite even major investments like Google Glass and Meta Quest.

Theoretically, this could happen with coding agents. However, open source models like Kimi K2 make this argument hard to sustain. Open source models are at or near state-of-the-art performance, the weights have been published, and in some cases the datasets have been published too. You can’t un-publish those weights or un-release those models.

The worst-case scenario is that AI is a huge bubble, all of the major labs shut down because they can’t afford the infrastructure budgets to continue training models, they never release another model, and they take all their models offline without ever releasing the weights. Even in that scenario, there are still models out there like Kimi and DeepSeek that were trained on much less capable hardware, for much less money, are widely available, and provide the recipe for building more models. It’s not like nuclear or VR, where the cost of continuing to develop the hardware or capabilities is prohibitive for all but a select few ultra-capitalized innovators.

Another AI winter is due. This is a common “conversation-ender” counterargument. All previous AI booms have been followed by a period of stalled advancement and a lack of commercial viability. So the argument is that this could be happening again, or that we have reached the limits of progress.

This is also possible, although the capability overhang means we don’t actually know where the wall currently is. Secondly, this is a ubiquitous possibility for any innovation, and therefore I don’t think it provides any analytical clarity about where AI may or may not go. It’s not a falsifiable prediction. In this sense, “AI winter is coming” arguments have a bit of a wishcasting smell to them.

The important part of all this is that, regardless of whether you think AI is good or bad, or whether you think accelerating AI is good or bad, understanding that this overhang exists matters because it means we are, in a sense, “living in the past” today. As people close these organizational, tooling, and knowledge gaps, they reveal capabilities that we already have.

Denying that these things are already baked in will only leave you, and your organization, behind. It’s your choice what to do with this. You don’t need to become a full-time AI coder or abandon your preferred way of developing software as an individual. But organizations cannot afford to “wait and see” if this moment lasts or if AI continues to improve. They cannot wait to update how they source, screen, and ultimately hire talent.

Organizations need to start changing this today. Maybe not universally across roles, or all at once. Take a diversified portfolio strategy: let some parts of the organization accelerate, or find some roles where this will be easier, and see how those experiments play out within your organization if you must. I don’t think those partial approaches are likely to be successful, since they create “classes” of employees and promote resentment2, but they may provide some much-needed data about overhangs.

Organizational change is hard, but acting as if the need to change isn’t already baked into the next 12 months is a fundamental error.


Footnotes

  1. Harnesses are things like the system prompt, the instructions the model receives, tools available for it to call, and so on.

  2. Sean Goedecke has an amazing post on this phenomenon, Seeing Like a Software Company.
