Last week we invited Paul Frazee, co-founder of the Beaker Browser, to speak at our very first We Love meetup about peer-to-peer. Our Tech Lead Viktor had first met Paul on Twitter when Viktor's blog post 'The end of the cloud: A truly serverless web' went viral.
We couldn't resist asking them to pick up where they left off on peer-to-peer and the end of the cloud at our Badger HQ, so we could share their thoughts and opinions with you.
So here's the fireside chat - we hope you enjoy it.
If you're on the go, we've also transcribed the video for you below.
Fireside chat with Viktor and Paul Frazee
Viktor - Hi everyone, I'm Viktor from Red Badger, and we're here with Paul Frazee, co-founder of the Beaker Browser. A little while ago I wrote a blog post about the peer-to-peer web which got picked up by VentureBeat. It caused a bit of a stir on Twitter, we started talking, and eventually we invited you to come to London to give a talk - so thanks for coming, and welcome.
So what is the Beaker Browser?
Paul - So Beaker is a peer-to-peer web browser I started with a colleague, Tara Vancil. We put together a company, Blue Link Labs, and we've been working on it for a year and a half. We took Chrome and put it into a new web browser, and we've been trying out some new technologies and seeing what we can get out of them.
Viktor - So what makes Beaker different from other browsers? You mentioned that it is peer-to-peer?
Paul - Yeah, that's right. The way the web works, you use a server for everything: if you want to make a website, you'll create a server up in the cloud, put a database on it, put all your code on there, so that server ends up running the show. With Beaker we're playing around with the idea that maybe we could get rid of servers and just have people's computers doing the work. You just go into your browser and say 'I'd like to make a new website', it makes a new address for you, and you can share and transact between your computers without any kind of service being involved at all. That way you have a much more personal connection with what's going on.
Viktor - So how does that work, and is it just for the browser? Or can it do other things as well? Because that seems like a fairly fundamental shift from what the web is doing.
Paul - There's a bunch of different technologies involved, and it doesn't just have to be the browser - it can be put into desktop applications, into the cloud, or onto mobile, things like that. The main technology we're using is called the Dat protocol, and it's actually a little bit like BitTorrent. I guess you might call it a cryptographic network, where you have these new tools that make it very easy to create web domains or shared folders, share them between computers, and have other people's devices contribute bandwidth so that everybody is acting together - which is something BitTorrent actually did 10-15 years ago. Now what's happening is we're looking at those old ideas and saying, OK, we can modernise them a little bit, add a few new tools, and they can actually be pretty mainstream. So going past the old file-sharing stuff and instead asking: can we build entire applications off of this sort of BitTorrent idea? It actually looks like the answer is yes - there's a lot of cool things you can do with it.
Viktor - Fantastic. So if you compare this peer-to-peer web with the web of today, what would be the main difference and advantage?
Paul - One of the things I like to focus on is this idea that the web is almost like a little society, and you have two kinds of citizens in that society: the server and the client, and we're all clients. The servers have a lot more control over everything - they have the rights to do the publishing, they set the code, they moderate everything - and then we're the ones trying to use this stuff. But we're just the clients: we can only browse around and maybe run an ad-blocker, but we don't have a lot of control over how any of this works.
Viktor - And someone else owns those servers and runs the show and sets the rules...
Paul - Exactly. So if we could get rid of this sort of two-tiered citizenship and make it so that there's just one kind of citizen - people's computers - then the people actually using this stuff can have a say in how the web works. They can change the code, they can see how the algorithms work and decide, maybe I'd just like a chronological feed without ads being injected, that kind of thing. Or, I'd like to know who gets my data and whether it's being sold to somebody or is actually private.
Viktor - That’s quite a current problem.
Paul - If we can make it so that people have better representation on the web and in online spaces - that's really where we're trying to get with it.
Viktor - Right, so if everyone's equal, then how do you decide what the current truth is? If we all decide we'll register a Twitter handle called 'Red Fox' in this environment where there is no central authority, how do you decide which one...?
Paul - That’s a sophisticated question actually, how technical do we want to get?
Viktor - I think we can go fairly technical.
Paul - Let's see, that's a tricky question: how do you deal with what you might call canonical information? It's the verbatim truth that everybody's going to go with. Traditionally, when you use a service, you just say the service is the authority - whatever the service says. Twitter, for instance, tells you who is who. When you're trying to get rid of any appointed authority, you have to come up with some other way of saying what the truth is. There are two different schools of thought. The first is the service and peer-to-peer hybrid, where you say: we'll have a service that helps us with names but nothing else, and we'll use peer-to-peer for everything else. That service is almost just keeping a bunch of pointers for you, and those point out to the peer-to-peer network. People do most of the work over the peer-to-peer stuff, but the service helps you with the hard question of deciding the canonical truths.
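As a sketch of that hybrid model, a deliberately minimal name service might do nothing but hand out first-come-first-served pointers into the peer-to-peer network. The class, method names, and dat:// URLs below are hypothetical, not a real Dat API:

```python
from typing import Optional

class NameService:
    """Hypothetical lightweight name service: it maps short names to
    dat:// pointers and does nothing else; content flows peer-to-peer."""

    def __init__(self) -> None:
        self._names = {}

    def register(self, name: str, dat_url: str) -> bool:
        # First come, first served: the service is the one appointed
        # authority deciding which claim on a name is canonical.
        if name in self._names:
            return False
        self._names[name] = dat_url
        return True

    def resolve(self, name: str) -> Optional[str]:
        # Resolution returns only a pointer; fetching the actual
        # content happens over the peer-to-peer network.
        return self._names.get(name)

ns = NameService()
assert ns.register("redfox", "dat://key-of-first-claimant")
assert not ns.register("redfox", "dat://key-of-second-claimant")  # rejected
assert ns.resolve("redfox") == "dat://key-of-first-claimant"
```

This is exactly the 'Red Fox' scenario from the question: whichever registration reaches the service first wins, and everything else stays off the service.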
Viktor - It's actually similar to how state is maintained in applications: traditionally you have stateful things all over the place, but it turns out you can pull that all to one side, into almost a single reference that says 'this is the current state', and everything underneath is just data that moves along. So it's quite similar in that sense: you have a central authority that decides the now, but all of the data lives in the swarm.
Paul - If you've used immutable systems like Git, you're pretty comfortable with that idea: the data is one whole immutable set, and we're just moving pointers along the log. Of the two schools of thought, the second is to use blockchains, and for that you have to have some kind of decentralised consensus algorithm. The most well known right now is proof of work - it's an interesting approach, but it's got some performance problems.
Viktor - Energy consumption problems as well.
Paul - Right. If they can't solve the efficiency of decentralised proof of work, then it may not work out. But things like proof of stake are being worked on to see if maybe they can figure it out. And even with proof of work, in some cases people are suggesting things like the Lightning Network, which tries to move as much as possible off of the blockchain - which is sort of what we're talking about: just putting the pointers inside the blockchain. That would possibly help with the efficiency problem, so maybe that will work.
Viktor - Would you say this problem exists in all of the applications you would build in the peer-to-peer space, or is it more of a special thing that not everything needs to solve?
Paul - It depends, actually. If you're just publishing media - blog posts or videos - then you don't need any source of truth; you can just put out a static website on a peer-to-peer network and you're good. We have a little clone of Twitter that's purely peer-to-peer.
Viktor - What is it called?
Paul - It's called 'Fritter' - a little joke - but it's designed to ask: OK, we have nothing but peer-to-peer in our technology stack right now, we don't have a solution yet for the centralised state, so how do we deal with that and how far can we go? It's an interesting experiment. You don't have any names; you have these cryptographic URLs, which are 64-character hex strings - so, not pretty.
Viktor - Not easy to remember.
Paul - You can share it, you'll share it over email or something like that, or SMS.
Viktor - Scan a QR code off someone’s phone.
Paul - There are interesting things you can do - hardware maybe that does a handshake in person, things like that. So there are interesting ideas about how you can deal with that but without that, you find some way to share it and then you follow the person, then you publish that you follow them. And that’s another interesting way to find people - you look at who is following the people you follow.
Viktor - It becomes this sort of web of trust like in PGP. If I follow you, and you follow a bunch of interesting people and so I start following them and they follow other people...
Paul - It's as if you took the PGP web of trust and merged it with Twitter - turned it into an application network. It works when you're trying to transact with people you're connected to: that's no problem, because you know who they are and you know they're trying to talk to you, so we have notifications when somebody mentions you, things like that. That all works fine. But if you're trying to talk to somebody you don't know yet, that you're not following, that's hard to do - because how would they know you're trying to talk to them? There's a global network, but it's decentralised, and that's how far we can get with the peer-to-peer stuff. Once you have an established connection with somebody, that works fine. But how do you get notified when someone you don't follow talks to you, or how do you look somebody up reliably by a short name? That would be nice to have, and that's when you need to have something like...?
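The friend-of-a-friend discovery described here - looking at who is followed by the people you follow - can be sketched as a walk over a follow graph. The names and data below are purely illustrative:

```python
def suggestions(follows: dict, me: str) -> set:
    """People followed by the people I follow, whom I don't follow yet."""
    mine = follows.get(me, set())
    found = set()
    for friend in mine:
        # Each person I follow publishes their own follow list, so this
        # discovery needs no central directory at all.
        found |= follows.get(friend, set())
    return found - mine - {me}

# A toy follow graph: each key maps to the set of accounts it follows.
follows = {
    "viktor": {"paul"},
    "paul": {"tara", "mafintosh", "viktor"},
}
assert suggestions(follows, "viktor") == {"tara", "mafintosh"}
```

The limitation Paul raises falls out of the code: anyone not reachable through some chain of published follow lists is simply invisible to this walk.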
Viktor - You need consensus again for that, because people want the same names. Is the protocol purely designed for web-like content, or can it do large volumes of data or streaming media?
Paul - The main engineer behind the protocol goes by Mafintosh, and he's been very interested in making sure that both of the things you just mentioned - large sets of data and streaming data - are big priorities. So you can stream video and audio off of it, and you can have dynamic data sets, where you keep appending information and it syncs out as it goes. It's an important use case.
Viktor - That’s a feature in Beaker isn’t it? You can follow a page and it live updates as it’s being edited?
Paul - Since we can sit there and listen for updates, we thought why not put live reloading just right into the browser? Because that’s a fun thing to have so you can turn that on.
Viktor - That's a good demonstration of what’s possible with a different protocol than what you’re used to.
Paul - Live data is an important part; it's also great for things like 'Fritter', because we want to know when somebody posts a new update, so we can just sit back and listen. Lots of data is also really important - there's some new protocol work being done right now specifically focused on this. Mafintosh has been taking Wikipedia and dumping it into just one giant folder - millions of files.
Viktor - That's actually really important, because part of the problem with the current web is things like link rot and content drift. Theoretically, if Wikipedia somehow runs out of money, we may lose it overnight: they turn off the servers and we're done.
Paul - The archival question - that's actually one of the things that motivated the Dat project in the first place. They were concerned about scientists and academics, and also cities that have civic data - different sorts of measurements and things like that. They wanted to make sure you could keep that stuff online. Even if, for instance, a university publishes a paper and has an admin group that's going to try and keep things online, those websites pretty frequently go away eventually. So how do you let someone publish something and then not have to worry about whether the university IT department is going to stay funded? The answer is this sort of BitTorrent-style network, because the address is not connected to any one computer - it's a public key, and you can actually use DNS with it.
Viktor - And it’s linked to the content itself, so you’re sure that address will only ever point to that particular piece of content and not a different one underneath.
Paul - So there are two ways these distributed file systems are designed, and the Dat protocol is that kind of distributed file system. The two main approaches are what you'd call content-hash addressing and public-key addressing. With content-hash addressing, you create a hash of the content, which gives you a unique number.
Viktor - Like in Git?
Paul - Like in Git, exactly. The hash specifically references that file, so it's actually very static, which has a nice benefit: whenever you look at the file, you can check the hash and make sure it's the correct information. As a linking structure on the web, the content-hash address is really powerful, because it captures the content you're pointing at and makes it so it can't be anything else.
Viktor - Just the same as Git - one of its great features is that you know that set of characters identifies your files and the entire history, and it can't possibly be anything else.
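The content-hash scheme described above is easy to sketch: the address is derived from the bytes themselves, so any copy can be verified no matter which peer served it. This is a minimal illustration using SHA-256, not the actual Dat wire format:

```python
import hashlib

def content_address(data: bytes) -> str:
    # The address *is* a hash of the content, like a Git object ID.
    return hashlib.sha256(data).hexdigest()

def verify(data: bytes, address: str) -> bool:
    # Whoever served the bytes, recomputing the hash proves integrity.
    return hashlib.sha256(data).hexdigest() == address

page = b"<html>hello, peer-to-peer web</html>"
addr = content_address(page)
assert verify(page, addr)
assert not verify(b"<html>tampered</html>", addr)
```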
Paul - So that's fantastic for archival. You can use those sorts of systems, but the constraint is that they're static, and sometimes - in fact, most of the time - you want to be able to change your data over time; that's how we think of websites. So as the primary addressing scheme for the Dat protocol, we've instead chosen public-key addressing. With public-key addressing, you create a key pair and use the public key as the URL, and then you can make changes. What you actually do is still have content hashes, but make the public key the pointer to the most recent content hash, and then you sign the pointer with the private key, so when somebody receives the content they can check that the signature is correct.
Viktor - And they can tell it’s you not someone else.
Paul - So the Dat protocol uses public-key addressing, and it's actually still quite good for archival, because anybody can host the content - it's all signed data. I can get the files for a Dat archive from you, or from my friend down the street, it doesn't matter: I can check that signature and say, yep, that's correct. And that's fantastic for archival, because you can publish using the university's resources at first, and then over time some other company or a charity can come along and say 'we need to make sure this data stays online' and take over the serving duty for some scientific work - or anything, really.
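The pointer-signing flow Paul outlines can be sketched end to end. This sketch uses tiny textbook-RSA numbers purely to make the asymmetry visible in a few lines; real Dat archives use Ed25519 signatures, and nothing here is production crypto:

```python
import hashlib

# Toy textbook-RSA keypair (illustration only; Dat really uses Ed25519).
p, q = 61, 53
n = p * q        # modulus, part of the public key (the "address")
e = 17           # public exponent
d = 413          # private exponent: (e * d) % lcm(p - 1, q - 1) == 1

def digest(data: bytes) -> int:
    # Reduce a SHA-256 digest into the toy-sized RSA group.
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def publish(content: bytes) -> dict:
    # The pointer names the latest content hash and is signed with the
    # private key; the public key (the URL) never changes, so the
    # address stays stable while the content moves forward.
    latest = hashlib.sha256(content).hexdigest()
    return {"latest": latest, "sig": pow(digest(latest.encode()), d, n)}

def verify_pointer(pointer: dict) -> bool:
    # Anyone holding only the public key (n, e) can check the pointer -
    # which is why any peer at all can serve the files.
    return pow(pointer["sig"], e, n) == digest(pointer["latest"].encode())

ptr = publish(b"<html>version 2 of my site</html>")
assert verify_pointer(ptr)
tampered = dict(ptr, sig=(ptr["sig"] + 1) % n)
assert not verify_pointer(tampered)
```

The design point is the split of duties: the content hash guarantees what the bytes are, and the signed pointer guarantees who published the latest version.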
Viktor - So if anyone decides a particular piece of content is socially important and significant, they can just take on the responsibility of serving it and making it available, rather than relying on a particular person or entity?
Paul - And there are some really interesting implications when we put that into a web browser, because we can actually automate that process. In Beaker, the way it works at the moment is that when you visit a site, you'll actually share those files for a little bit and contribute some bandwidth back. It's almost altruistic, really: you're saying, this is something I've benefited from, I want to help keep it online and serve it for a little bit, and it helps keep costs low. A big part of what we want to do is make publishing independent, so that you don't have to rely on really big cloud providers to make a website or share a video.
Viktor - Or try and pay for it with advertising.
Paul - Yeah, exactly. If we can offset the costs of making the website work, then we get much more freedom, as users don't have to worry about the economics. A big part of that is saying, OK, why don't we all help each other out: if you use something, contribute a little bit of bandwidth. That's why you can publish a video from home, and if it goes viral, you're not going to get slammed at home with every one of those requests.
Viktor - And you can decide how much bandwidth you're willing to spare on your home internet. How do you see the adoption of these technologies going? You must be thinking about that all the time.
Paul - Yeah, you have to make it accessible. It’s about making sure that you can put it into something that you don’t have to be technically savvy to really get the benefit from it.
Viktor - That to me is the main thing about Beaker: it's just a web browser. It's the same experience as getting Firefox - you download it, you start browsing, it does the normal internet. But it also does this new thing.
Paul - In a way we're just trying to surgically insert these new things and change as little as possible along the way, so that it does feel familiar and easy to use and so that’s probably the biggest challenge early on - to make sure the tools are completely approachable and if you can do that, then from there it’s about adding better features and more capabilities.
Viktor - The main one seems to be the ability to publish a website straight from your browser - that's the big new thing. Do you think there are use cases behind the scenes as well, for big companies that need to serve a lot of data? Would they use Dat or some of the other protocols within their data centres to synchronise things and reduce costs? You would almost sneakily make these protocols a tool for everyone - make them so prevalent that they're just there and everyone can pick them up. Kind of like Git: people use Git for things other than what it was designed for, just because all the developers are familiar with it.
Paul - Sharing files has always been a pain for everybody, and that continues to be the case. There are a couple of interesting advantages: the configuration you have to do is really minimal - you share around the links and then you can start syncing from one device to another - which is very handy. And the archival we were talking about is actually quite handy even within a company.
Viktor - An audit trail is a really typical requirement: almost everyone has to be able to see what the content was over time, who contributed, and who did what, so when something does go wrong you can tell what happened and how to prevent it next time. So that's a big thing.
Paul - So the audit trail is useful, and the minimal configuration and easier sharing of files between machines - that's great. And then within applications, you should be able to see some cost cutting for people that already have services, because this peer-to-peer network is almost like a CDN. To some degree, we want to find out how much you can offload your bandwidth costs to the network itself.
Viktor - If you think of someone like Netflix, who serve a quarter of the network traffic on the internet, and they have to serve all of that centrally from one place - that must be an insane network around that data centre. They could actually spread that around, and if they weren't concerned with DRM - which is probably not as easy to solve in that situation - that could really help. So that could be a killer app as well: some kind of video streaming service that doesn't rely on huge infrastructure. YouTube, but not actually controlled by a company - an independent forum where people could publish video content.
Paul - They would still use their service to give the canonical backing, but then you offload it as much as possible.
Viktor - But the content itself is actually not a huge problem, so what you're selling is a fairly small standard web app that says 'this is the truth', and it's out there somewhere.
Paul - You get a couple of neat advantages too: you're offloading the hosting to multiple people, and in some cases you're maybe getting better connection times, because you have co-located people sharing data. Even within the same WiFi, if people are looking for the same thing, their computers might actually just...
Viktor - What I really like is that the hosting capacity scales with the number of users interested in the piece of content, so it's quite natural - it doesn't have this weird imbalance of 'I am the one author and now I need to support millions and millions of people looking at the thing'.
Paul - If you think about it, in a way there's kind of a curse to making a successful web application: you're going to have to support everybody that could ever want to use it.
Viktor - Suddenly you can’t really spend time working on new features of the app because you need to stabilise it for hundreds and hundreds of thousands of people.
Paul - And that’s when the VC comes along and helps you grow your business and everything
Viktor - And then you give your money to Amazon. So what are you working on at the moment in Beaker?
Paul - Usability and accessibility. We had a prototype a year and a half ago that did all the basic things - browsing sites, creating peer-to-peer sites - but being able to just sit down and immediately get it, that takes a lot of work. So we're in the process of making sure everything is super easy and obvious, and that all the different flows have a little tooltip telling you exactly where to go, so you don't have to know anything ahead of time.
Viktor - The little annoying things that actually make it great.
Paul - That takes so much more time than you ever expect. Getting that usability really good is a hard job.
Viktor - Do you have people interested in building apps that specifically leverage Dat and work in Beaker?
Paul - Yeah, John Kyle is working on a CMS that's totally built on top of Beaker, and it's a very neat file-based design. It's really cool - he actually just did a live stream talking about it. He's trying to take that ease of use even further. As the browser, we have to stay a little bit unopinionated - we try to work at just the file level and be good for advanced users - and he's putting GUIs on top of every step. Kind of like the WordPress of peer-to-peer.
Viktor - So it can work for everyone. And browser-feature-wise, apart from accessibility, do you have anything interesting planned?
Paul - Yeah, the pipeline is that Mafintosh does protocol work and we stick it into the browser somehow. He's doing a lot of really great work adding the ability for multiple people to collaborate on a single Dat archive - that's going to be really important.
Viktor - So you could have sort of Google-Docs style collaborative editing but peer-to-peer?
Paul - So that's coming along, plus new primitives: a key-value database, sort of like LevelDB, implemented on top of the Dat network. That's part of the same protocol update, so we'll be able to offer a really good primary data store we can build applications on...
Viktor - It makes it way more approachable for app developers, who will wonder 'where do I store things - just files?' Like a database...
Paul - He's building some really amazing features into this. It works over the network - it can do random-access reads over the network, no problem, so you don't have to download the entire thing ahead of time. It's very efficient over the network, and it will have the multi-writer as well, so that's going to be a fantastic update to the stack. Other than getting the usability right, the stack of technologies you need to build all the applications you'd expect to build on the web is really where the focus is going to be for the next five years.
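One way to picture that key-value layer: values live in an append-only log, and a small table of offsets makes single-record random access possible without reading the whole log. This is only a local, in-memory caricature of the idea, not the real Dat database API:

```python
class AppendOnlyKV:
    """A toy key-value store over an append-only log (illustrative only)."""

    def __init__(self) -> None:
        self.log = []        # append-only (key, value) entries
        self._latest = {}    # key -> offset of its newest entry

    def put(self, key: str, value: str) -> None:
        # Old entries are never rewritten; the history stays intact,
        # which is also what gives you an audit trail for free.
        self.log.append((key, value))
        self._latest[key] = len(self.log) - 1

    def get(self, key: str):
        # Random access: only one log entry is read, mirroring how a
        # peer can fetch a single record over the network without
        # downloading the entire archive first.
        offset = self._latest.get(key)
        return None if offset is None else self.log[offset][1]

kv = AppendOnlyKV()
kv.put("bio", "p2p enthusiast")
kv.put("bio", "p2p browser builder")   # newer entry shadows the old one
assert kv.get("bio") == "p2p browser builder"
assert len(kv.log) == 2                # both versions remain in the log
```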
Viktor - So what key things do you think are missing from that stack? What stops you from building everything on the internet again on top of these things? I can think of one, which is a search engine for content hosted on Dat, or even IPFS or other places - something that indexes the public content so you can find it, play with it, and see what it does. When I tried Beaker, the only site served over Dat that I knew of was your homepage. I looked at that and thought, that's cool: I'm pulling this from someone, somewhere, no idea where. But other than that, it's quite difficult to find where things are at the moment.
Paul - The funny thing is there's only a handful of websites already, so we'll probably start with something like Yahoo - get a little portal going.
Viktor - Replicate the history of the internet again.
Paul - Just read the history books and copy everything. Discovery is still a hard problem. I guess if you're looking at how to build applications, you have to start with a way to publish and synchronise data, and that's what the Dat protocol gives you; the key-value database is a variation on it that makes certain use cases easier. You need to be able to get a good connection to somebody - not over a publishing network, but just a synchronous channel. You might be able to use WebRTC for that, but its reliability is a bit questionable, so we might just build something of our own. Then you need a way to handle canonical state, like you were talking about, so we may need either a blockchain for that, or we might end up creating a service layer that's designed to be lightweight.
In a way, one of the most important things is just getting away from hardcoded services - an endpoint that runs things. Maybe what we can do is have public-key-addressed services, so there's that level of indirection. And then for discovery and search, what we need is aggregators, and that's a whole layer of services. There's some interesting potential there, because when you're crawling the peer-to-peer network you have a little more information than you do on HTTP: these Dat archives have a full manifest of all the files inside them, which you don't get on HTTP. So crawlers are a little easier to write - you can list all the files and look at specific paths for different kinds of things. If you publish JSON, or once we get this key-value store in there - the key-value stores are actually kind of like websites, but instead of serving files they serve key-values - then crawling these networks of sites is possible in just your browser by itself, because you're not parsing HTML, and you can pretty quickly pull down a site manifest and say, 'OK, I'm interested in those little pieces and nothing else.' So what we might try to do is get a crawler built into the browser.
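A sketch of why manifests make crawling cheap: given each archive's file listing, an indexer can pick out just the paths it cares about, with no HTML parsing. The archive URLs and manifest shape below are made up for illustration:

```python
def index_archives(archives: dict, wanted_suffix: str = ".json") -> dict:
    """Map each interesting path to the archives that contain it."""
    index = {}
    for url, manifest in archives.items():
        for path in manifest:
            # The manifest lists every file up front, so filtering by
            # path is all the 'crawling' that's needed.
            if path.endswith(wanted_suffix):
                index.setdefault(path, []).append(url)
    return index

archives = {
    "dat://archive-a": ["/index.html", "/profile.json", "/posts/1.json"],
    "dat://archive-b": ["/profile.json", "/photos/cat.png"],
}
idx = index_archives(archives)
assert idx["/profile.json"] == ["dat://archive-a", "dat://archive-b"]
assert "/photos/cat.png" not in idx
```

Because the index is itself just data, it could in turn be published as an archive - which is the distributed-search idea picked up in the next exchange.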
Viktor - Because you can also take the search index it builds and serve it over Dat in the same way, so the browser itself has access to the whole search index and doesn't even have to talk to a service or fetch much to actually do a search.
Paul - With the key-value store we're talking about, that's one of the things we're looking at: maybe you can start to share those computed indexes. You start to get this distributed search architecture, with everybody contributing.
Viktor - You're slowly getting back to the original model of the web, which is 'everyone has a server, it's fine'. Turns out it's not as easy...
Well, thank you very much for your time, I'm looking forward to your talk tonight which we will link in the description somewhere below. Thank you for coming to London to speak.
Paul - Thanks for having me.