Data Ownership in the Age of AI



You’re swimming in data. You’re creating new data day after day. If your health app counts your steps? That’s new data. The Oura ring that’s monitoring your biometrics? Valuable data. Your social media posts, even the silly jokes that got zero likes? More data.

This is all data that AI companies would love to harvest. You can’t build good AI without good data, which is why many view data as the “new oil” in the race for AI. The problem, though, is that while your data is valuable in theory, the reality is that it’s hard to monetize your own personal data, as you have no leverage as an individual. (OpenAI isn’t knocking at your door to buy your old tweets.)

Enter Vana. “I think data is this fundamental resource powering the next generation of AI, and really the next generation of our digital economy,” says Anna Kazlauskas, co-founder of Vana and CEO of Open Data Labs. “A lot of people frankly just don’t realize that they actually own their data.”

But you do own your data. And it’s valuable… if you can somehow join forces with millions of others who also own their data. This would give you bargaining power. And that’s the mission of Vana: to create an ecosystem for user-owned data, which in turn fuels user-owned AI.

That ecosystem involves a mix of Data DAOs (a “labor union” for data), decentralized data marketplaces, the recently launched VRC-20 token, and a new collaboration with Flower Labs to build the world’s first user-owned foundation model. (Exhibit A that decentralized AI is creeping into the mainstream: The Vana/Flower collaboration was covered by WIRED.)

Kazlauskas will give a keynote at the AI Summit at Consensus 2025 outlining this vision, and she gives a glimpse here. And she sees the momentum shifting. “We’re already starting to see this shift where more people realize that, ‘My data is really important to AI’ and ‘I’m actually the owner of that.’” She predicts that in a few years, over 100 million users will be on board. In 10 years? “World population. Above 10 billion.”

Interview has been condensed and lightly edited for clarity.

Why is user-owned data so important to you?

Anna Kazlauskas: Most people think data is owned by the platforms that it’s sitting on, but that’s not the case. In the same way, if you put your car in a parking lot, the parking lot doesn’t own your car. You can always take it back. You have full ownership over it.

And there’s a huge amount of money being made today, largely by big tech companies, off of that data, but users are the legal owners. So I think it’s important that we restore that ownership, both from a user perspective and from a developer’s perspective.

Can you connect the dots of how this helps developers?

As a developer, especially in an AI world, having access to the right data is really important. And it’s super hard to do right now, because most of the data is locked up inside the walled gardens of big tech. So many of my really smart friends who do stuff in AI go work at the big labs, because that’s where the data is and that’s where the compute is. But that doesn’t have to be the case.

How do Data DAOs fit into this vision exactly?

So a Data DAO is kind of like a labor union for data, where basically you have a large group of people who pool their data together, and then can make collective decisions over what happens to that data.

The reason why that’s important is that your data, on its own, is not that valuable, right? It’s much more valuable when there’s a big pool of it. When there’s enough of it to train an AI model.

What are some of the Data DAOs you’re most excited by?

There are a few in the health space that are really interesting. There’s an early one that’s actually doing full exports of patient medical records, which I think can really help advance a lot of research in the space. There are some related to biometrics, sleep, and health. There’s one with the DLP [Driver Loyalty Program] Labs; they’re building automotive data. And within their data-set, the Tesla data is really interesting, because most people think about Tesla as valuable because they have a data lead, right? Actually, the users can get a lot of that data-set.

You’re pivoting from theory to practice with the new collaboration with Flower Labs to build COLLECTIVE-1. What’s the goal there?

COLLECTIVE-1 is the first user-owned foundation model. Usually when people think about a foundation model, they think of one company running a very large training job in a single data center, right? Like OpenAI. And the reason why it’s usually done in a centralized way is because it requires, one, a whole lot of compute power, and two, a whole lot of data.

Flower AI is kind of the leader in federated [decentralized] training. They’ve done a really nice job of building these great open source libraries. They’ve come in from the training side and the algorithm side. And with Vana, we really focus on that data piece, right? So we basically have all this data that people can train on. Then you give users end-ownership of the model, and users can decide on what the model is allowed to do. So that’s the first foundation model of its kind.

And the theory is that eventually, with better data, you can build AI that’s not just competitive with the central players but better, is that right? So it’s not just about ideology, but also performance.

Exactly, yeah that’s 100% right. From a decentralized context, I think often people agree in principle that, “Yes, we should have AI that’s owned by the people. We should have decentralized AI.” But what’s the thing that we can actually do better in a decentralized context? Data is the answer. For each company, they only have their single slice of a data-set. Apple’s got their data. Google’s got their data. But if you’re going through the user, you can cut across platforms and actually build better data-sets than any single company. Data is the secret sauce that makes it all work.

Love it. Thanks Anna, see you at the AI Summit in Toronto.

Jeff Wilser will host the AI Summit at Consensus 2025, and is host of The People’s AI: The Decentralized AI Podcast.


