I hope all 9480 of you had a productive week 🔥
Welcome to the 51 new members of the TOC community.
If you're enjoying my writing, please share Terminally Onchain with your friends in Crypto!
Happy Friday TOC fam!
Since Tuesday's post, I've been out and about. Lots happening in NYC this week. Normally, I try to avoid conferences or events because of how draining they can be, but I was excited by the lineup this week.
Here's the rundown.
On Tuesday, I went to the Archetype Agent day where friends from Pond, AICC, Coinbase, Clanker, and Tenbase gave presentations on what they're building and how they think the agent vertical progresses. To be honest, the speakers were great, but content-wise nothing really stood out to me. I didn't come out of agent day with a takeaway that I was excited to share with all of you. However, the presentations did reaffirm some of the takes I've shared here on TOC over the last couple of weeks (gaming, TEEs, security, etc).
On Wednesday, the roles were reversed. I was invited by VectorDAO to give a presentation on crypto x AI at their monthly lunch 'n learn. I was pretty happy with my rundown. I specifically didn't want to make slides but instead created this primer that I want to share with all of you as well. Think of it like a quick overview on the crypto x AI space with bullet point takeaways and relevant links to my posts.
And yesterday, I was at the Nous x Solana event to learn more about the Psyche launch, which is bringing some of their training efforts onchain on Solana. There was a panel with folks like Shaw from ai16z, somewheresy, and a few others. I wasn't the biggest fan of how the discussion was curated. Other than a few practical examples of jailbreaking that caught my attention, the rest felt a bit too philosophical and "up in the air" for my taste.
But I definitely enjoyed the other two talks. One was by sxysun, who works at Teleport and is experimenting with TEEs to better understand agent autonomy. The other was by Bowen Peng, the chief scientist at Nous, who walked through the Nous roadmap and what to expect from DisTrO and Psyche in 2025.
With that being said, let's continue this Nous thread and dive into today's post.
Technical breakthrough in December, Mental breakthrough in January
In Tuesday's post I gave a rundown of my thoughts on Deepseek and mentioned how it was a bullish signal for decentralized AI.
To be clear, this is also a huge win for anyone excited about decentralized AI. I'm not saying everything is solved now, but there's some newfound hope that the vertical's biggest bottleneck (centralized hardware clusters) might be getting smaller over time. Of course, the capital that the big dogs have is still going to pour into making their general models 10x better, but that doesn't mean distributed players can't use less capital to focus on specialized, niche models.
To be honest, I thought I was done with that train of thought, but then on Wednesday I kept going down the rabbit hole.
Ben Thompson published this Deepseek FAQ, which was helpful for understanding the implications of the new model. The key takeaway, in my opinion, was in this section:
This blew my mind.
A bunch of researchers were given a hardware constraint (not having access to H100s) and found a way around it.
But! This was old news for the most part. A lot of researchers (clearly not me lmao) who were paying attention to Deepseek's v3 launch in December already had their "woah moment".
And that's when I realized...December of last year had a series of breakthroughs that were critical for open source & decentralized AI.
On November 29th, Prime Intellect announced a 10 billion parameter model that was trained on hardware spread across three continents
On December 2nd, Nous announced they pre-trained a 15 billion parameter language model over the internet using heterogeneous hardware (not just H100s)
On December 27th, Deepseek announced v3, which was trained in a centralized cluster BUT with H800s, which lack the inter-chip bandwidth of H100s (what OpenAI, Anthropic, and X use)
Now, you might be thinking..."YB it feels like you're stretching this whole Deepseek - DeAI link a bit too much". And maybe you're right. But, here's my counter.
All of the examples above showed that it's possible to train large and effective models without having the optimal hardware setup that American AI companies have. Optimal meaning gigantic, centralized warehouses of H100s.
I believe that the Deepseek R1 craziness this past week was less of a technical breakthrough (that was v3) and more of a mental breakthrough on the future of AI.
The R1 news cycle finally helped everyone realize that there may actually be a future of AI that is not solely dependent on OpenAI, Anthropic, Google, and the other 2-3 behemoths that have hundreds of billions of dollars to spend on compute.
It gave hope not only to the teams building distributed training processes, but also to the independent researchers and smaller enterprises that had been restricted to approaching AI the way big tech wanted them to.
And that's why my takeaway from all the Deepseek craziness is that it's time to start taking DecentralizedAI very seriously. Because it's only a matter of time before these projects hit an inflection point and the majority of people looking for AI start shifting toward options that give them more freedom and better economics around how they train and use models.
Heck, let's take it a step further. It won't be long before people stop caring about the underlying models used. OpenAI's god-like stronghold over the industry might start to slowly wither away as distributed, open source solutions become more available, useful, and cost-efficient.
This time last year, the tweet below by Balaji would have been an absurd statement. Now, we're starting to see a real narrative shift on where value will move to next in the AI stack.
The first moat of AI was the model/training layer. Continuing to break those walls down depends on three things:
Reducing how often GPUs need to communicate during training, so there's no need for one huge central node in the system (there's a rough sketch of this idea right after this list)
Being able to efficiently use multiple hardware solutions together in order to be less dependent on access to H100s specifically
Properly aligned economic incentives to make sure the independent nodes in the distributed training process are not degrading quality
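To make that first point a bit more concrete, here's a minimal toy sketch of DiLoCo-style low-communication training. The numbers, the tiny regression task, and the hyperparameters are all made up for illustration, so treat this as a cartoon of the idea rather than anyone's actual implementation. The thing to notice is that each worker grinds through many local steps on its own data, and the only cross-worker communication is one averaged "pseudo-gradient" per outer round.

```python
# Toy sketch of DiLoCo-style low-communication training (illustrative only).
# Each "worker" runs many local SGD steps on its own shard, and workers only
# synchronize once per outer round by averaging a pseudo-gradient and applying
# it with an outer momentum step.

import numpy as np

rng = np.random.default_rng(0)

# Toy problem: recover w_true from noisy linear data, split across 4 workers.
DIM, N_PER_WORKER, N_WORKERS = 8, 256, 4
w_true = rng.normal(size=DIM)
shards = []
for _ in range(N_WORKERS):
    X = rng.normal(size=(N_PER_WORKER, DIM))
    y = X @ w_true + 0.01 * rng.normal(size=N_PER_WORKER)
    shards.append((X, y))

def local_sgd(w, X, y, steps=50, lr=0.05, batch=32):
    """Many cheap local steps with NO cross-worker communication."""
    w = w.copy()
    for _ in range(steps):
        idx = rng.integers(0, len(y), size=batch)
        grad = 2 * X[idx].T @ (X[idx] @ w - y[idx]) / batch
        w -= lr * grad
    return w

# Outer loop: the only communication happens here, once per round.
w_global = np.zeros(DIM)
momentum = np.zeros(DIM)
OUTER_LR, OUTER_MOMENTUM = 0.7, 0.5  # made-up values for the toy problem

for outer_round in range(20):
    # Each worker trains independently from the same starting point.
    local_weights = [local_sgd(w_global, X, y) for X, y in shards]

    # Pseudo-gradient = how far the workers moved, averaged across workers.
    pseudo_grad = w_global - np.mean(local_weights, axis=0)

    # Outer optimizer (SGD with momentum) applies that update to the global model.
    momentum = OUTER_MOMENTUM * momentum + pseudo_grad
    w_global = w_global - OUTER_LR * momentum

    loss = np.mean([np.mean((X @ w_global - y) ** 2) for X, y in shards])
    print(f"round {outer_round:02d}  loss={loss:.5f}")

print("error vs true weights:", np.linalg.norm(w_global - w_true))
```

If workers only need to exchange updates once every few hundred steps instead of every step, the fat interconnects inside a single data center stop being a hard requirement, and that's exactly the property that lets training spread across continents and mixed hardware.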
Now, this may be obvious, but I can't help pointing out that the points above look exactly like the requirements that went into the Bitcoin whitepaper...
No central nodes. No dependencies on a single piece of hardware (and hardware provider). And economic consequences incorporated to enforce accuracy.
With that being said, it's not surprising that Nous - a company in the "AI trenches" - announced their partnership with Solana to build an end-to-end distributed, open source AI stack. To be clear, this is significantly different than Meta's Llama. With Llama, the weights are open source but Meta controls the training process, data, and hyperparameters.
I haven't gotten a chance to finish going through the Nous Psyche docs, so I'll save the details for Tuesday's post, where I'll explain exactly how the Solana integration will happen.
But it's starting to seem that I have no choice but to understand the key technical implementations that teams like Nous and Prime are delivering.
Why?
Because I want to be able to give all of you an accurate forecast of what I think the growth of DeAI will look like this upcoming year. What are the margins? What are the bottlenecks? And how can onchain verifiable states and tokenomics help accelerate progress?
Not only that, but I want to understand the open source AI lore in more depth as well. This week, I started reading the following papers to get better context:
Google's DiLoCo (Distributed Low-Communication) paper, which was released in late 2023 and seems to be the foundational breakthrough for training with far less frequent synchronization between nodes
The Nous DeMo (Decoupled Momentum Optimization) paper, released in 2024, which takes a separate approach to reducing inter-hardware communication (rough sketch of the core idea below)
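Since I'm still working through the DeMo paper, take this as my loose, toy mental model rather than the actual algorithm (the real thing uses a DCT-based transform and much more careful bookkeeping): keep the optimizer's momentum local on each worker and only ship a small, compressed slice of it over the network, instead of all-reducing full dense gradients every step.

```python
# Toy, hand-wavy sketch of the general DeMo idea: keep momentum local and only
# communicate a small compressed slice of it. NOT the actual DeMo algorithm;
# it just shows how aggressively this cuts bandwidth vs. a dense all-reduce.

import numpy as np

rng = np.random.default_rng(1)
DIM, K = 10_000, 100          # parameters per worker vs. values actually sent

def compress_top_k(vec, k):
    """Keep only the k largest-magnitude entries; everything else stays local."""
    idx = np.argpartition(np.abs(vec), -k)[-k:]
    sparse = np.zeros_like(vec)
    sparse[idx] = vec[idx]
    return sparse

# Pretend these are per-worker momentum buffers after a local gradient step.
momenta = [rng.normal(size=DIM) for _ in range(4)]

shared_update = np.zeros(DIM)
for i, m in enumerate(momenta):
    fast_part = compress_top_k(m, K)   # the only thing that crosses the network
    momenta[i] = m - fast_part         # the slow remainder stays on the worker
    shared_update += fast_part / len(momenta)

full_bytes = DIM * 8
sent_bytes = K * (8 + 8)               # roughly: one value + one index per entry
print(f"communicated ~{sent_bytes / full_bytes:.1%} of a full dense all-reduce")
```

Even this crude top-k version makes it obvious why the bandwidth savings are dramatic compared to syncing full gradients every step.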
I wanted to end today's post with this video snippet of Dylan Patel (founder of SemiAnalysis and the go-to semiconductor guy), who said the following:
"I think there's a lot of interesting stuff happening on the distributed training side like with Nous and Prime Intellect..."
Apologies if today's post was more narrative driven and less rooted in technical details. Like I said earlier, I was out most of Wednesday and Thursday so this write-up was a bit last minute. However, I do think the core thesis in the section above is critical to understanding how the crypto x AI vertical progresses in the coming months.
I have a lot of reading ahead of me, but honestly feel so excited to get through all of these papers! The bull case is starting to get more and more clear for the parts of the crypto x AI stack that aren't strictly agent based.
Have a great weekend and I'll see all of you on Tuesday!
- YB