GPT3 Meet Web3
AI has a bad press problem. While it wows us, it also scares us. Both GPT3 and DALL-E 2 broke the internet when they came out, but the awe was mixed with concerns. Those concerns range from the commonplace (human obsolescence) to the obscure (deceptively-aligned mesa-optimizers). At the same time, a growing discourse has been pitting crypto against AI, hailing the former as some kind of savior. Punk6529 argues “we have to decentralize the world with crypto… before it is irrevocably centralized with AI”. Peter Thiel, in the penultimate paragraph of his preface to the Sovereign Individual, famously wrote: “If AI is communist, crypto is libertarian.” More recently, Vitalik Buterin went deep into the comparison, contrasting crypto’s ethos of principled theorizing with AI’s “unprincipled tinkering”, resulting in opposite outcomes in terms of public trust.
In the table below I summarize some of the common contrasts I’ve encountered here and there.
Is there something to this dichotomy? What do people mean when they say crypto will save us from AI dystopia? What happens when GPT3 meets Web3? In this post, I walk through the common apprehensions surrounding the development of AI and examine whether the visions for a new web spurred by the crypto revolution can help mitigate those concerns. In particular, I argue that our fears of AI-driven obsolescence are likely overblown, and that we should be more concerned about AI’s centralizing tendencies. Finally, I contend that data dignity–the idea that people should be compensated for the data they produce–will be a key ingredient of the future of the web, and suggest that, to realize that vision, we need to adopt a web architecture centered on identity hubs mediated by self-sovereign agents.
The way of the horse
The top AI FUD is probably fear of obsolescence. Typical of this discourse is a 2018 Atlantic piece by Yuval Noah Harari, in which he worries that AI, unlike previous technological shocks, will put us out of work permanently: “Perhaps in the 21st century, populist revolts will be staged not against an economic elite that exploits people but against an economic elite that does not need them anymore.” He goes on to predict the growth of a “useless class” by 2050, lacking “relevant education” and sufficient “mental stamina to continue learning new skills.”
This tired meme of human obsolescence fundamentally hinges on your position on two related questions: (1) does technological progress imply the need to acquire higher cognitive skills? And, if you believe so, (2) can we all adapt? The discourse tends to focus on the second question, with the first usually assumed to be answered in the affirmative. Technological progress is thus seen as this rising IQ floor for job market entry, leading to the anxiety of being born on the wrong side of the bell curve, with the inevitable conclusion that humans are headed the way of the horse.
However, the first question is actually more debatable than it seems. It’s unclear that today’s “bullshit jobs” engage higher cognitive functions than were required of medieval farmers. Higher productivity does not inevitably map to higher cognition, however you define cognition. Progress means greater freedom to pursue whatever appeals to us more, not necessarily more cognitive occupations. The two may have been correlated to some extent in the past, but that does not guarantee the correlation will persist. With respect to the rise of AI, I find Chris Beiser’s take–“I don’t think we’re prepared for how many jobs AI is going to create”–to sit squarely in the camp of those who don’t think we’re about to hit a cognitive glass ceiling.
Harari’s “useless class” is also economically inconsistent. The existence of such a class would only be possible if consumption goods were to be essentially free, in which case we have just reached abundance, and that’s kind of a good problem to have. Furthermore, as long as the theory of comparative advantage applies to AI, then the fact that AIs can do everything better than us may not mean that we won’t have anything left to do. And as long as AIs run on some kind of hardware substrate, involving rivalrous capital investment, they are subject to opportunity costs and therefore don’t escape the theory of comparative advantage. In fact, I’m pretty sure there already are entities out there (of the homo sapiens type) better than me at literally everything, and yet somehow I’ve been able to make a living thus far. Those people are to me what hypothetical superintelligent AIs will be to everyone.
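To make the comparative-advantage point concrete, here is a toy calculation. Every number in it is invented for illustration: a hypothetical AI that out-produces a human at both research and plumbing, but that, like the human, only has ten hours to allocate.

```python
# Toy illustration of comparative advantage. All numbers are made up.
# Output per hour for two tasks; the AI is absolutely better at both.
AI = {"research": 100.0, "plumbing": 10.0}
HUMAN = {"research": 1.0, "plumbing": 5.0}


def opportunity_cost(producer, task, other):
    """Units of `other` forgone to produce one unit of `task`."""
    return producer[other] / producer[task]


def total_output(ai_hours_plumbing, human_hours_plumbing, hours=10.0):
    """Combined output when each party splits `hours` between the two tasks."""
    plumbing = (AI["plumbing"] * ai_hours_plumbing
                + HUMAN["plumbing"] * human_hours_plumbing)
    research = (AI["research"] * (hours - ai_hours_plumbing)
                + HUMAN["research"] * (hours - human_hours_plumbing))
    return {"research": research, "plumbing": plumbing}


# Both plans deliver the same 50 units of plumbing.
ai_does_plumbing = total_output(ai_hours_plumbing=5.0, human_hours_plumbing=0.0)
human_does_plumbing = total_output(ai_hours_plumbing=0.0, human_hours_plumbing=10.0)
print(ai_does_plumbing)
print(human_does_plumbing)
```

In this made-up setup, each unit of plumbing costs the AI far more forgone research than it costs the human, so the plan in which the human does the plumbing yields more research for the same amount of plumbing. The human still has something to do even though the AI is better at everything.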
The black box
Serena Booth, a PhD student at MIT, recently exclaimed: “AI is utterly infuriating. I just changed a hyperparameter from 0.99 to 1, and something I’ve been tinkering with for WEEKS just started working.” If you’ve ever worked in the industry, you can definitely relate. And although there has been much progress lately in interpretable AI, the resolution of the scientific understanding of a neural network is nowhere near that of a crypto system. The uncanniness of AI has a lot to do with its blackboxiness. Unlike in other fields, where technology only seems like magic to the uninitiated, in AI, even its high priests marvel at the unreasonable effectiveness of neural nets. Vitalik isn’t wrong to point out that AI is a lot of tinkering. Despite the sophistication of the field, there remains a lot of finger-crossing going on.
This idea of unreasonable effectiveness has led some to theorize the paperclip maximizer problem, whereby an AI would become so good at whatever objective it was given, as innocuous as that of making paper clips, that it would turn the Earth into a giant ball of fire to maximize its output. This is the same problem that we encounter when people follow the letter of the law instead of its spirit, or when algorithms are gamed, but turbocharged by superhuman skills. Some claim this is humanity’s biggest existential threat, even leading some prominent researchers to believe AI will extinguish humanity by 2030.
Lest you think that surely the programmers can adjust the program’s “drive” (objective function) or its training process before the paperclip maker gets out of hand, some argue otherwise. No matter how encompassing its training and how well-railed-off its objective function, an AI could spontaneously spawn “mesa-optimizers” (emergent submodules with delegated proxy drives) over which the programmers have no control. These would not be aligned with the welfare of the programmers, yet would be capable of deceiving them into believing they are, and could therefore wreak havoc once deployed in the wild. Such mesa-optimizers are akin to humans’ ability to be driven by things other than sexual reproduction, arguably our base objective function. Such drives are necessarily a side effect of our relatively narrow primary interest, and God, when He designed replicators, could not have foreseen that it would one day lead to creatures looking at the sky and wondering about stars instead of being busy humping one another. Much like God was powerless in foreseeing our weird quirks, we are powerless in anticipating AI behavior.
In fact, one might argue something kind of like a mesa-optimizer emerged in GPT3. Indeed, the paper introducing its predecessor, GPT-2, argues that “a language model with sufficient capacity will begin to learn to infer and perform the tasks demonstrated in natural language sequences in order to better predict them, regardless of their method of procurement.” What this is saying, roughly, is that, in order to predict the ending of a sentence that begins with “I pushed a glass of water off the edge of my desk and then…”, GPT3 will go to the immense trouble of encoding the laws of physics in its neurons, i.e. developing a kind of theory of gravity, for the sole purpose of completing the stupid task of predicting the next word in a sentence, which is GPT3’s base objective function, much like our own objective function is simply to replicate. Arguably, this mesa-optimizer has taken a step towards looking at the sky and wondering about stars.
So AI practitioners come across as tinkering around with a blackbox that could at any time unleash a monster. In contrast, the ethos surrounding crypto systems couldn’t be more different. As Vitalik says, “it’s all about avoiding ugly unprincipled exceptions”. Crypto systems are typically highly legible, grounded in carefully proven mathematical theorems, and developed slowly. Crypto’s design principle is rooted in credible neutrality and is the opposite of a blackbox. And while AI’s failure mode is fundamentally self-reinforcing, or explosive, in the sense that it has to do with optimizing something, crypto’s failure mode is fundamentally self-dampening: a smart contract gets hacked, money evaporates, end of story.
Breakaway rent
Another common strand of criticism of AI relates to its centralization tendencies, and this is where the contrast between AI and crypto is greatest. The first kind of centralization is political. Harari’s Atlantic piece warns that AI may erode democracy: “We tend to think about the conflict between democracy and dictatorship as a conflict between two different ethical systems, but it is actually a conflict between two different data-processing systems.” Where Soviet communism fails, AI will succeed, by solving central planning’s informational and computational bottleneck. And because data and compute are concentrated in a few nodes, they can be co-opted by authoritarian governments, as many argue is happening in China. In contrast, crypto is famously focused on “censorship resistance”.
More interesting is economic centralization. While the rise of crypto has tended to build the wealth of a fairly large constituency, one can argue that the rise of AI has mostly benefited the shareholders of relatively few AI-intensive tech companies. The root cause of this tendency of AI towards centralization is that it typically requires loads of data and computational power. Chris Beiser illustrates this well when he points out that “the cost of compute to train a state of the art AI model right now is in the same order of magnitude as the cost to raise a child and send them to a top art school”. Such high costs naturally favor centralization, which leads Russell Kaplan, of the company Scale AI, to predict a future of “compute-rich” and “compute-poor”, with the latter becoming “existentially dependent” on the former.
This in turn would lead to a self-sustaining economic rent unlike other technological advantages present or past, if we believe, as Weyl and Posner argue in Radical Markets, that there are increasing returns to data in machine learning. The figure below, taken from the book, captures that idea:
Their argument is that, while for each individual machine learning task, there are diminishing returns to data, if you zoom out and conceive tasks as building upon each other hierarchically, with higher-level tasks being more valuable than lower-level tasks, then you actually end up with increasing returns to data, and therefore a risk of breakaway economic rent. Suggestively, 7 out of the 10 most valuable companies in the world today would not exist without the internet and the web. This is related to Jaron Lanier’s concept of a “siren server”, which he defines as “the biggest and best computer on a network”, adding that “whoever has the most powerful computer would be the most powerful person, whether they plan to be or not.”
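The shape of that argument can be sketched numerically. The ladder of tasks, the thresholds, the values, and the saturation curve below are all assumptions of mine, chosen only to reproduce the qualitative claim; none of them come from the book.

```python
import math

# Hypothetical ladder of tasks:
# (examples needed before the task becomes learnable, value of the task).
TASK_LADDER = [(0, 1.0), (1_000, 5.0), (10_000, 25.0)]
SCALE = 1_000.0  # assumed speed at which each task saturates


def task_quality(n_examples, threshold):
    """Diminishing returns within one task: quality saturates toward 1."""
    if n_examples <= threshold:
        return 0.0
    return 1.0 - math.exp(-(n_examples - threshold) / SCALE)


def total_value(n_examples):
    """Value of a dataset summed over every task it makes learnable."""
    return sum(value * task_quality(n_examples, threshold)
               for threshold, value in TASK_LADDER)


def marginal_value(n_examples, step=500):
    """Extra value from adding `step` more examples to the dataset."""
    return total_value(n_examples + step) - total_value(n_examples)


for n in (0, 1_000, 10_000):
    print(n, round(marginal_value(n), 3))
```

Within any single task, the second batch of 500 examples adds less than the first. But in this toy ladder, the marginal value of data jumps each time a higher, more valuable rung unlocks, so whoever already holds the most data gains the most from the next batch.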
You might think that surely there is more to AI than access to raw data and computational power. After all, technological cemeteries are full of powerful incumbents who had locked up their supply chain and yet were defeated by competitors who found a way around those incumbents’ resource advantage and, through cleverness, found an alternative way to do the same thing better. Not so fast. The bitter lesson of machine learning is that more data and more compute outcompete programmer cleverness. In meme format:
And yet, while Sutton’s bitter lesson focuses on the algorithms, Weyl and Posner argue that AI capability may be constrained by the quality of its input. Narrow AI tasks like computer vision and speech recognition, near the bottom of the AI ladder, may be learnable with freely harvested data, from likes and tweets and utterances to a smart speaker. But what if broader AI tasks require a larger quantity of higher-quality data?
AI glass ceiling
There’s much talk about a “glass ceiling” for humans, but what if there was an AI glass ceiling? At first glance, this seems like a strange proposition. AI seems to keep hitting breakthroughs, despite persistent skepticism toward the “just throw more compute at it” camp.
Just ask the AI itself:
Notwithstanding, Weyl and Posner argue that passive data harvesting (likes and retweets) won’t get us to artificial general intelligence (AGI). AI will hit a ceiling if it can’t summon more sophisticated data from humans, which would require some kind of “homework” and, therefore, compensation. This leads them to advocate for a scheme which Jaron Lanier dubbed “data dignity”, whereby producers of data like you and me would be compensated for their “data labor”, and by the same token, incentivized into more sophisticated types of data production. This view assumes the path to AGI goes through human supervision in the form of a steady supply of examples for how to accomplish tasks.
It’s worth noting in passing that noted AI researcher Yann LeCun pushes an alternative paradigm, which he calls self-supervised learning: the idea that the path to AGI requires less human supervision and mostly raw data. Self-supervised learning was pioneered in natural language processing (NLP) and made a splash in the AI scene with the publication of the BERT paper. The basic idea is that you artificially hide words in a sentence and train the algorithm to guess the hidden words. There is no need for humans to provide examples; the algorithm creates its own. This simple technique turns out to be extremely successful at endowing neural nets with a good “model of the world”–in the case of language, a sense of what a meaningful and grammatical sentence looks like. Transplant that technique into computer vision, by occluding parts of a picture and asking the neural net to guess them, and you end up with an AI that builds a good internal visual model of the world. Self-supervised algorithms in both computer vision and NLP, which traditionally use quite different techniques, tend to use transformers, the closest thing we have today to a universal primitive in AI.
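The word-hiding trick is simple enough to sketch in a few lines. This is a simplified illustration, not BERT's actual preprocessing: it splits on whitespace instead of using a real tokenizer, and the function name and `[MASK]` handling are my own choices. The 15% masking rate does follow the BERT paper.

```python
import random

MASK = "[MASK]"


def make_masked_example(sentence, mask_prob=0.15, seed=0):
    """Turn raw text into a (masked_tokens, targets) pair without human labels.

    `targets` maps each hidden position to the word the model must guess.
    Assumes a non-empty sentence.
    """
    rng = random.Random(seed)
    tokens = sentence.split()
    masked, targets = [], {}
    for position, token in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(MASK)
            targets[position] = token
        else:
            masked.append(token)
    if not targets:  # make sure at least one word is hidden
        position = rng.randrange(len(tokens))
        targets[position] = tokens[position]
        masked[position] = MASK
    return masked, targets


masked, targets = make_masked_example(
    "I pushed a glass of water off the edge of my desk"
)
print(" ".join(masked))
print(targets)
```

The labels come for free from the text itself, which is why this family of methods scales to any corpus you can harvest.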
In reality, the path to AGI will likely be a mix of human supervision and self-supervision, rendering the question of data dignity pretty central.
Data dignity
The logical end of today’s paradigm of free data harvesting is paradoxical. As Russell Kaplan points out, “StackOverflow [a Q&A website for programmers] is valuable, but why would you visit it when your editor already knows the answer to your question?” Similarly, Roon flippantly predicts “[GitHub] Copilot [an AI pair programmer based on a GPT3 sibling] will fully kill StackOverflow”. But of course Copilot needs StackOverflow, hence the paradox.
More generally, if language models replace traditional web search, which redirects you to actual websites for the answer, and if instead they give you the answer right away, so that you don’t have to visit those websites anymore, there goes the incentive to create a website in the first place. Similarly, if AI gets so good at producing art that humans find it pointless to produce art themselves, then AI will have drained its source of inspiration (at least until it closes its own creative loop, or in other words, feeds on its own output).
This leads Russell to predict: “Web properties with user-generated content will change their licensing terms to demand royalties when their data is used to train AI models.” Which is essentially what Weyl and Lanier have been advocating for.
Whether because AI is prone to breakaway economic rents, or because it would otherwise hit a glass ceiling or lead to paradoxical endgames, or because we want decentralized control over the data that ultimately determines AI capability, data dignity–the idea that people should be compensated for the data they produce–will need to be part of the future of AI.
Yet Jaron Lanier has been beating this drum for a while, to no avail. I believe the reason for that lack of progress is that the set of technologies that could shift the balance of power between siren servers and users is still nascent, and in my view, they are to be found in the underappreciated self-sovereign identity (SSI) movement.
Self-sovereign identity
I will now take a small detour to explain what SSI is all about before going back to how it helps us achieve data dignity.
SSI is rooted in the observation that online identity management sucks. Accessing online services requires us to juggle dozens of username-password pairs, which, against best practices, we inevitably reuse across services. Web properties therefore become honeypots for hackers, with sensational breaches like the Yahoo hack disclosed in 2016, which ultimately exposed all 3 billion Yahoo accounts, or the 2017 Equifax breach that ended up costing the company over $4 billion.
One “solution” that evolved over the years is the so-called “federated identity” model, whereby a few powerful web properties such as Facebook and Google become the trusted intermediaries (known as “identity providers”) for authenticating yourself to a variety of online services (the famous “sign in via” buttons). Highly convenient, but of course the downside is that every time you use some online service, Facebook or Google gets pinged.
SSI proposes a third way for managing our identity online. The vision is to transplant our offline authentication regime to the online world. When you access services offline, say a bar that serves alcohol, you walk in, grab your wallet, take out your driver’s license to establish you’re older than 21, and off you go. You don’t start whispering passwords into the bouncer’s ears.
Let’s model this interaction: an issuer (the DMV) issued a bundle of claims, known as a credential, to some subject (you). Both you and the issuer are specific entities, and as specific entities, you require unique identifiers. In the analog world, those identifiers are names or social security numbers. An important feature of this setup is that the DMV doesn’t get pinged whenever you use its card to identify yourself. That is because the driver’s license is relatively hard to fake, and therefore the service provider (the bar) doesn’t need to call the DMV every time someone walks in. The very fact that the credential is hard to forge constitutes in itself a proof of the authenticity of the credential.
So why can’t we do something similar online? Why can’t we ditch the password and, whenever we want to access some online service, just show some “object” that would be the digital equivalent of a hard-to-forge identity card? Just replace all the fancy physical features of a card with a cryptographic signature and call it a day. If anything, the digital equivalent should be easier to make, shouldn’t it?
Not so fast. How does the verifier verify the signature? For that, it needs the issuer’s public key. Well, can’t the issuer publish their public key somewhere? Where? Some non-standard proprietary database, which we can’t know if it’s been hacked or not? And what if the issuer rotates their cryptographic keys?
Deus ex machina
You guessed it. This is where we introduce the blockchain.
Any credential-issuing entity generates an identifier that is resolvable to an address on a public ledger, where you can find the public keys they use to sign their credentials. That resolvable identifier is known as a decentralized identifier (DID) and doesn’t change. The document it points to on the blockchain, though, is mutable, allowing the DID holder to rotate their cryptographic keys while persisting their identity.
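Here is a minimal sketch of that idea, with a plain dictionary standing in for the public ledger. The field names, the `did:example:` identifier, and the string placeholders for keys are illustrative assumptions, not the format of any specific DID method.

```python
# A dict stands in for the public ledger (assumption: a real system would
# use a blockchain or another tamper-evident registry).
ledger = {}


def register(did, public_key):
    """Publish the initial DID document for a new identifier."""
    ledger[did] = {"id": did, "public_keys": [public_key], "version": 1}


def rotate_key(did, new_public_key):
    """Replace the keys in the document; the DID itself never changes."""
    document = ledger[did]
    document["public_keys"] = [new_public_key]
    document["version"] += 1


def resolve(did):
    """Look up the current DID document, as a verifier would."""
    return ledger[did]


register("did:example:dmv", "public-key-2021")
rotate_key("did:example:dmv", "public-key-2022")
print(resolve("did:example:dmv"))
```

The identifier is the stable part and the document is the mutable part, which is what lets an issuer rotate keys without reissuing its identity.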
Whoever needs to verify the authenticity of a credential can query the public ledger using the issuer’s DID and fetch the associated public keys to check the credential’s signature. The credential is thus verifiable, which is why it is known as a verifiable credential (VC).
The entity the credential was issued to also has a DID, and so the verifier checks that whoever is presenting this credential does indeed control the DID the VC was issued to.
At no point in this process was the issuer pinged about the subject they credentialed. At no point in this process did the subject have to enter a password. In fact, all this happened frictionlessly via so-called SSI agents.
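The verification steps above can be sketched end to end. To stay dependency-free, this example uses textbook RSA built from small, well-known Mersenne primes. That choice is mine and is completely insecure; a real deployment would use a vetted signature scheme. The dictionary ledger, the `did:example:` identifiers, and every function name are likewise hypothetical. It needs Python 3.8+ for the modular inverse via `pow`.

```python
import hashlib
import json
import secrets

E = 65537  # public exponent


def make_keypair(p, q):
    """Toy textbook RSA keypair from two primes. Insecure, illustration only."""
    n = p * q
    d = pow(E, -1, (p - 1) * (q - 1))
    return {"n": n, "e": E}, {"n": n, "d": d}


def _digest(message, n):
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(private_key, message):
    n = private_key["n"]
    return pow(_digest(message, n), private_key["d"], n)


def verify(public_key, message, signature):
    n = public_key["n"]
    return pow(signature, public_key["e"], n) == _digest(message, n)


def canonical(obj):
    """Stable byte encoding so issuer and verifier hash the same thing."""
    return json.dumps(obj, sort_keys=True).encode()


issuer_pub, issuer_priv = make_keypair(2**61 - 1, 2**89 - 1)
holder_pub, holder_priv = make_keypair(2**107 - 1, 2**127 - 1)

# Stand-in for the public ledger: DID -> current public key.
LEDGER = {"did:example:dmv": issuer_pub, "did:example:alice": holder_pub}


def issue(issuer_did, issuer_private, subject_did, claims):
    body = {"issuer": issuer_did, "subject": subject_did, "claims": claims}
    return {"body": body, "proof": sign(issuer_private, canonical(body))}


def verify_presentation(credential, nonce, holder_signature):
    """Check the issuer's signature, then that the presenter controls the subject DID."""
    body = credential["body"]
    issuer_key = LEDGER[body["issuer"]]  # the issuer itself is never contacted
    if not verify(issuer_key, canonical(body), credential["proof"]):
        return False
    holder_key = LEDGER[body["subject"]]
    return verify(holder_key, nonce, holder_signature)


credential = issue("did:example:dmv", issuer_priv,
                   "did:example:alice", {"over_21": True})
nonce = secrets.token_bytes(16)  # fresh challenge from the verifier
ok = verify_presentation(credential, nonce, sign(holder_priv, nonce))
print(ok)
```

The fresh nonce is what proves control of the subject's DID: copying someone else's credential is not enough, because the presenter must also sign a challenge that the verifier just made up.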
A little help for cryptography
Many of the above steps involve a series of cryptographic operations (signing and encrypting), which the human brain doesn’t exactly excel at. We therefore need to delegate those tasks to some software agent that acts as a fiduciary on our behalf by holding the cryptographic keys that make up our online identity and by using those to manage interactions with other agents. As Daniel Hardman puts it: “On the digital landscape, humans and organizations (and sometimes, things) cannot directly consume and emit bytes, store and manage data, or perform the crypto that self-sovereign identity demands.”
Inter-agent communications follow standardized protocols interoperable across platforms. An example of such a protocol would be that of issuing a credential from one agent (e.g., an institutional agent) to another (your personal agent).
A web of hubs and agents
Now that we know a bit more about SSI, how does it help us realize Jaron Lanier’s vision of data ownership?
First, imagine how radically different such a world would be. If you’re going to own your data, that means apps (including websites), instead of fetching your data from their own servers, would fetch it from you somehow. If the app is social in any way, the app’s agent will need to negotiate not only with your agent but with your connections’ agents in order to render the right experience. When in the process of using the app you generate new data, the app will need the ability to write to your personal data store. That data will need to be encrypted by keys you own, which essentially means that one of your agents will need to shadow you as you use the app and encrypt data creation events for you. In this world, apps essentially become pure front-ends and the data they access is, properly speaking, decentralized into personal data stores.
People don’t want to run their own servers, and never will
In many ways, this adds complexity and transaction costs. What’s more, the idea of personal data stores has been around for a while and yet it hasn’t taken off. Some, like Moxie Marlinspike, co-founder of Signal, argue that “people don’t want to run their own servers, and never will.” He adds: “Even nerds do not want to run their own servers at this point. Even organizations building software full time do not want to run their own servers at this point. If there’s one thing I hope we’ve learned about the world, it’s that people do not want to run their own servers.” This seems to clash with the vision of data dignity.
It seems undeniable that few people are willing to run their own on-premise servers. But almost everyone has some kind of personal cloud in the form of an iCloud, a Google Drive, or a Dropbox, and many are paying customers. Those personal clouds, however, have very limited interfacing capability and no built-in smarts. But this will change with the proliferation of identity hubs, a concept evangelized by the Decentralized Identity Foundation (which recently renamed them “decentralized web nodes”). Identity hubs are essentially next-gen personal clouds that can secure, manage, and transact your data with others via standardized interfaces and routing mechanisms.
Importantly, hubs are meant to work closely with agents: hubs are “data-oriented”, “model[ing] operations as commits to a data object, or as reads of an object state”, while agents are “flow-oriented”, “support[ing] protocols for issuing credentials, negotiating payment, or dozens of other personal and business processes”. “Most work that agents need to do is rooted in and informed by data; an agent that has a hub to work with is likely to be far more useful to its master.”
Despite Moxie’s cutting remarks, I don’t see why we shouldn’t expect today’s personal clouds to evolve into identity hubs with mediating agents working for you. Properly designed, this infrastructure would go a long way toward mitigating the friction of a topological shift away from the client-server model toward the peer-to-peer model. Most likely we’ll land on something in-between, something that I would paradoxically call a mediated peer-to-peer topology.
Mediated peer-to-peer
A key design question that we will need to get right is the size of agents. One failure mode is for agents to become too bloated, to do too much. This is a natural temptation if you increase their responsibility beyond key management, agent-to-agent communication, and credentials management, and start putting them in charge of your personal data store. Agent developers might be tempted to give them capabilities matching the variety of data they interface with. That, however, would be unsustainable. Instead, personal agents will need to outsource skills to third-party agents, and let them render services to you by accessing your data on your behalf. This is achieved by giving power of attorney to third-party agents, using–you guessed it–verifiable credentials, so as to ensure personal agents remain lean and mean.
Third-party agents can be any type of software. Say Alice gives some scheduling app access to her calendar and contacts by issuing a credential to the app’s agent. Alice wants to schedule a meeting with Bob, so the calendar app initiates, on Alice’s behalf, the “rendezvous protocol” with Bob’s agent. In effect, Alice delegated her identity to the scheduling app’s agent. Superficially, this might feel like a return to the classic client-server model, but it comes with a crucial twist: Alice here is explicitly delegating, with her own terms of use, power of attorney to the third party.
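Alice's delegation to the scheduling app might look something like the sketch below. The class names, the scope strings, and the access check are hypothetical, and the signature on the delegation credential is deliberately left out to keep the focus on the terms of use; in practice the hub would first verify it like any other credential.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Delegation:
    """A power-of-attorney credential (signature check omitted in this sketch)."""
    grantor: str        # DID of the data owner
    delegate: str       # DID of the third-party agent
    scopes: frozenset   # e.g. {"calendar:read"}
    expires_at: float   # Unix timestamp


class Hub:
    """Hypothetical personal data store that only honors explicit delegations."""

    def __init__(self, owner, data):
        self.owner = owner
        self.data = data

    def read(self, requester, collection, delegation, now):
        allowed = (
            delegation.grantor == self.owner
            and delegation.delegate == requester
            and f"{collection}:read" in delegation.scopes
            and now < delegation.expires_at
        )
        if not allowed:
            raise PermissionError(f"{requester} may not read {collection}")
        return self.data[collection]


alice_hub = Hub("did:example:alice", {
    "calendar": ["Tue 10:00 dentist"],
    "contacts": ["did:example:bob"],
    "photos": ["beach.jpg"],
})
grant = Delegation(
    grantor="did:example:alice",
    delegate="did:example:scheduler",
    scopes=frozenset({"calendar:read", "contacts:read"}),
    expires_at=2_000_000_000.0,
)
print(alice_hub.read("did:example:scheduler", "calendar", grant, now=1_700_000_000.0))
```

The scheduler can read the calendar and contacts it was granted, but a request for Alice's photos, or any request after the expiry time, is refused. The terms are set by Alice, not by the app.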
At first glance, this new agent- and hub-oriented, decentralized web architecture seems much more complex. And we may, in fact, trade some efficiency, at least in the short term, for greater ownership. A parallel I like to make is to that of smart electric grids, in which the end consumer shoulders more responsibility and is asked for greater flexibility in their electricity consumption. That said, as the technology of agents and hubs matures, we can expect friction to diminish, and complex and sophisticated apps to flourish.
TBDex, the decentralized exchange the company Block is currently working on, gives us a glimpse of what this new web architecture might look like. In their white paper, they describe a protocol that would lie on top of hubs and use DID and VC technologies to allow counterparties to negotiate and establish the minimum information acceptable to exchange cryptocurrency for fiat currency or vice-versa, eschewing “the need for centralized intermediaries and trust brokers”. If we can build such a fairly complex protocol in this new web architecture, I’m bullish many existing apps can be effectively decentralized.
Data markets
More generally, with this new web architecture, we create a true market for data. No need to rely on GDPR-like regulation to enforce your rights. Imagine if there was a TBDex-like protocol for trading data seamlessly, encompassing data discoverability, querying, and trading. This also fits nicely with the recent surge in interest in federated learning, a branch of machine learning focused on training algorithms with data distributed across multiple decentralized edge devices, without moving the data around. A market for data built on an agent- and hub-oriented web architecture, combined with federated machine learning, makes for a paradigm that could play a large part in counteracting AI’s tendency toward economic centralization.
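For readers unfamiliar with federated learning, here is a bare-bones sketch in the spirit of federated averaging. The one-parameter model, the three clients, and their data are invented for illustration; real systems add client sampling, secure aggregation, and much larger models.

```python
def local_update(w, data, lr=0.01, epochs=5):
    """Gradient descent on mean squared error for y ~ w * x, using local data only."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(w, clients, rounds=50):
    """Each round, clients train locally; only their weights reach the server."""
    for _ in range(rounds):
        updates = [(local_update(w, data), len(data)) for data in clients]
        total = sum(n for _, n in updates)
        # Average the returned weights, weighted by local dataset size.
        w = sum(w_i * n for w_i, n in updates) / total
    return w


# Three hubs, each holding private (x, y) pairs generated by y = 3 * x.
clients = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0)],
    [(0.5, 1.5), (1.5, 4.5), (2.5, 7.5)],
]
w = federated_average(0.0, clients)
print(w)
```

The raw `(x, y)` pairs never leave their owners; the server only ever sees model weights. That is the property that makes this style of training a natural fit for data that lives in personal hubs.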
In the limit, there are two models to achieve shared prosperity with the rise of AI: decentralize-and-trade or centralize-and-tax. The traditional web architecture, and its model of unfettered data harvesting, favors the latter. The new web architecture, which you may call “web3”, favors the former. But note that this is a different vision from what’s usually hailed as “web3”. In this alternative formulation, your data lives in your hub rather than on-chain, your credentials are truly soulbound and held in your wallet instead of being NFTs on a blockchain, and your identity is actually private (unless you are an issuer, in which case you can make it public).
Daniel Buchner does a good job laying out this alternative vision in which the blockchain, while playing a crucial role, is not the be-all and end-all. There’s no tokenomics. It’s traditional governance, in the sense that credentials are rooted in traditional institutions: the value of credentials derives from the trust that the public bestows on the issuing institutions, and their terms of use are reliant on external enforcement. But it’s traditional governance made super-efficient and highly flexible. In contrast, true cryptoeconomic designs typically aim for trustlessness and self-enforcing rules–a very valuable design, but, I’d argue, only appropriate for a limited number of very important use cases.
Finally, a web of hubs and agents would lend itself well to the development of data unions, something which Weyl and Posner advocate for in their book Radical Markets, and which could constitute a further overlay on top of data markets. The idea, a clin d’oeil to traditional trade unions, is for a group of data producers, who agree on common policies, to pool their data for efficiently trading it with AI customers. And hubs’ standard interfaces and agents’ standard protocols are the right primitives to effect such consolidation.
Envoi
AI is unlikely to lead to massive unemployment and the development of a “useless class”. But it does have centralizing tendencies, both political and economic, that should worry us. One way to counteract those tendencies is to re-architect the web in a more decentralized way–something that has been trumpeted by many for some time, most prominently by Tim Berners-Lee, the father of the world wide web.
In my view, the self-sovereign identity movement has, to date, the most fully articulated vision for how to accomplish that and deserves more attention. What may unfold is a fundamental shift in the balance of power between a few platforms harvesting tons of data, and the multitude of internet users unwittingly giving it away. The outcome is a move away from a few powerful siren servers to a redistributive infrastructure where data, while decentralized, can be seamlessly traded, and the prosperity from the development of AI applications can be shared.
That said, there are still many technological improvements needed before we get to a web of hubs and agents that can rival the convenience of today’s web experience. And perhaps most importantly, we will need to coordinate the adoption of those technologies by both end users and online service providers. LFG.