Let’s fix this thing: Results from our first hackathon
Dan Kaminsky on the progress made at O’Reilly Security’s first hackathon to make web security easier.
There are a lot of things to be said about Internet security, and “could be easier” is definitely one of them. So, White Ops Labs and O’Reilly Security put out the call last week: let’s get together and write some code, all of it open source, with one clear directive. Let’s make security easy.
I’d like to specifically thank Code for America, who took our hacker invasion in stride. They were fantastic hosts, and like us, they’re trying to make a lot of unnecessarily difficult things much more achievable.
So what did we nerd out on at the hackathon? It was just a week, and there was no advance notice to speak of, but it worked. We made progress on three key projects: Jump to Full Encryption (JFE), ratelocking, and Overflowd.
Jump to Full Encryption
JFE, our full system encryption daemon, is now a proper daemon, with the beginnings of proper error handling. Systems that mysteriously fail are systems that never get traction, so this sort of scut work is actually the point. Security doesn’t get to just work on the fun parts. We are part of IT. David Strauss, the founder of Pantheon, came by and we bashed on what I think is the biggest bug in JFE right now: when we automatically wrap a connection in TLS, the addressing gets changed to localhost, so the server no longer sees the real client address.
Network servers do need to know who they’re communicating with. There are designs that strip that knowledge, and they are deployed in the field, but you have to be pretty careful. That’s not easy.
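To make the addressing problem concrete, here is a minimal sketch in Python. It is not JFE's code: the helper names (`start_backend`, `start_wrapper`, `demo`) are hypothetical, there is no TLS involved, and everything runs on loopback purely for illustration. It only shows that once a local wrapper relays a connection, the backend's view of its peer is the wrapper's own socket rather than the client's.

```python
import socket
import threading


def _listener():
    """Open a TCP listener on an OS-chosen loopback port."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    return srv


def start_backend(seen):
    """Echo server that records the peer address it observes."""
    srv = _listener()
    addr = srv.getsockname()

    def serve():
        conn, peer = srv.accept()
        with conn:
            seen["backend_peer"] = peer
            conn.sendall(conn.recv(1024))
        srv.close()

    thread = threading.Thread(target=serve, daemon=True)
    thread.start()
    return addr, thread


def start_wrapper(backend_addr, seen):
    """Stand-in for a connection-wrapping daemon: relays one request and reply."""
    srv = _listener()
    addr = srv.getsockname()

    def serve():
        client, peer = srv.accept()
        seen["wrapper_peer"] = peer  # the wrapper still knows the real client
        with client, socket.create_connection(backend_addr) as upstream:
            seen["wrapper_upstream"] = upstream.getsockname()
            upstream.sendall(client.recv(1024))
            client.sendall(upstream.recv(1024))
        srv.close()

    thread = threading.Thread(target=serve, daemon=True)
    thread.start()
    return addr, thread


def demo():
    seen = {}
    backend_addr, backend_thread = start_backend(seen)
    wrapper_addr, wrapper_thread = start_wrapper(backend_addr, seen)
    with socket.create_connection(wrapper_addr) as client:
        seen["client_addr"] = client.getsockname()
        client.sendall(b"hello")
        seen["reply"] = client.recv(1024)
    backend_thread.join(timeout=5)
    wrapper_thread.join(timeout=5)
    return seen


if __name__ == "__main__":
    result = demo()
    print("client socket:         ", result["client_addr"])
    print("peer seen by wrapper:  ", result["wrapper_peer"])
    print("peer seen by backend:  ", result["backend_peer"])
```

The wrapper sees the client's address, but the backend only ever sees the wrapper's outbound socket. With a real remote client, that means the backend logs and authorizes against a localhost address, which is the knowledge we want to stop stripping.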
Linux can sort of do what we want, with the TPROXY support in iptables, its old firewall layer. Well, it’s the old layer for a reason. Linux has been moving towards a new solution to packet handling, nftables, and David and I have been exploring it. We’re hopeful it will give JFE the proper socket interposition powers it needs to silently and efficiently upgrade insecure connections to full, modern TLS. I think clever things are possible using nft with IP interface aliases. For instance, you can give an interface (eth0) another address (127.255.254.2), and then maybe do interesting things with traffic routed to that alternate address. David noticed that nftables actually lets you encapsulate entire execution environments declared in Linux “cgroups”, which is what containers use to associate resources, and then apply firewall and address translation rules per cgroup.
The interesting thing about cgroups, compared with address aliases, is that they integrate more closely with how container systems like Docker manage networking today. At the end of the day, as cool as full system ambient encryption is, full container support is almost more important. Containers have completely eaten how code is deployed, and an elegant solution that automatically encrypts container communication by default would be pretty compelling.
Ratelocking
Systems get compromised. Then what? Ratelocking is about putting a cap on how much data can be lost and how quickly, by separating data at rest from whatever might put it in motion. I’ve been working on this project with Mark Shlimovich, as well as Andy McMurry from getmedal.com. They secure medical records (clearly something we’d rather not lose en masse).
In the real world, risk management is not an all-or-nothing affair. There’s twenty dollars in the gas station cash register, not the entire corporate payroll for the month of July. We must scale our defenses to our potential losses. But we have this assumption in system design that, once an attacker hacks any part, they can just dig under our protection layers and always hit the underlying data store.
What if that data store, and the rules enforcing access to it, live somewhere else?
At the hackathon, we demonstrated how to factor a password database out into Lambda and DynamoDB. A client that can invoke a Lambda function can’t necessarily alter that function or retrieve data directly from its DynamoDB backend.
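The shape of that idea can be sketched in a few lines of Python. This is not the hackathon demo: `RatelockedVerifier` is a hypothetical class, an in-memory dict stands in for the DynamoDB table, and the attempt budget and window are made-up parameters. The point is only that callers get a yes/no/locked answer and never the stored hashes, and that the number of guesses per user is capped.

```python
import hashlib
import hmac
import os
import time


class RatelockedVerifier:
    """Hypothetical stand-in for a remote function guarding a password store."""

    def __init__(self, max_attempts, window_seconds, clock=time.monotonic):
        self._records = {}   # user -> (salt, digest); never returned to callers
        self._attempts = {}  # user -> timestamps of recent verification attempts
        self._max_attempts = max_attempts
        self._window = window_seconds
        self._clock = clock

    @staticmethod
    def _hash(password, salt):
        return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

    def enroll(self, user, password):
        salt = os.urandom(16)
        self._records[user] = (salt, self._hash(password, salt))

    def verify(self, user, password):
        """Return "ok", "denied", or "locked". Every attempt spends budget."""
        now = self._clock()
        recent = [t for t in self._attempts.get(user, []) if now - t < self._window]
        if len(recent) >= self._max_attempts:
            self._attempts[user] = recent
            return "locked"  # budget exhausted: refuse to even check
        recent.append(now)
        self._attempts[user] = recent

        record = self._records.get(user)
        if record is None:
            return "denied"
        salt, digest = record
        matches = hmac.compare_digest(digest, self._hash(password, salt))
        return "ok" if matches else "denied"


if __name__ == "__main__":
    verifier = RatelockedVerifier(max_attempts=3, window_seconds=60)
    verifier.enroll("alice", "correct horse")
    for guess in ["123456", "password", "letmein", "correct horse"]:
        print(guess, "->", verifier.verify("alice", guess))
```

In this sketch even a correct password is refused once the budget is spent, which is the trade being made: an attacker who compromises the caller can ask questions, but only slowly, and cannot walk away with the hashes.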
Some other hackers (paranoid to the end, those guys) got a little nervous at the idea of using these serverless functions to enforce security policy. Maybe some customer security representative could be convinced to muck with the source. Regardless of whether I think that’s possible now, it’s certainly something Amazon could be paid to actively prevent. Cloud function security is also something that somebody at Google has quietly implemented, as per this feature in App Engine:
Permanently prohibit code downloads? Whoever built that is pretty clearly one of us.
But there’s no sense waiting for the future to do what’s clearly possible today. We ended up building a ratelocking demo based not on the abstract capabilities of Lambda but on the zealously defended properties of Amazon’s own authentication platform, IAM. WallIAM uses AWS secret keys, and Amazon’s architectural refusal to ever reprovision them, to move us away from the offline attack, where you can try a million passwords a second (at worst), towards an online attack, where you have to guess right in few enough attempts or Amazon IAM stops you in your tracks.
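For a rough sense of scale, here is the arithmetic as a small Python sketch. Only the million-guesses-per-second figure comes from the text above. The candidate list size and the online cap of five attempts per hour are assumptions picked for illustration, not numbers from WallIAM or from Amazon.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def seconds_to_exhaust(candidates, guesses_per_second):
    """Worst-case time to try every candidate password at a fixed rate."""
    return candidates / guesses_per_second


CANDIDATES = 10_000_000    # assumption: size of the attacker's guess list
OFFLINE_RATE = 1_000_000   # guesses per second, the figure quoted in the text
ONLINE_RATE = 5 / 3600     # assumption: five attempts per hour before lockout

offline_seconds = seconds_to_exhaust(CANDIDATES, OFFLINE_RATE)
online_years = seconds_to_exhaust(CANDIDATES, ONLINE_RATE) / SECONDS_PER_YEAR

if __name__ == "__main__":
    print(f"offline: about {offline_seconds:.0f} seconds")
    print(f"online:  about {online_years:.0f} years")
```

Under these assumed numbers the same guess list takes about ten seconds offline and a couple of centuries online. The exact figures matter less than the gap between them.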
Overflowd
Overflowd is an early attempt to improve abuse response, by increasing the level to which the Internet’s own infrastructure participates, in band, in its defense. I didn’t originally expect to work on Overflowd, but Cosmo Mielke and Jeff Ward really wanted to bash on the DDoS problem. Managing the fallout of an attack is a lot of work. Addresses may be spoofed, routes are often asymmetric (meaning you can’t complain to transiting networks), and often it’s just hard figuring out who to get on the phone with to resolve issues. As progress continues, I’ll share more information on this particular project.
What’s next
This is just a sample of what we collectively worked on. I’ll be presenting full results at the O’Reilly Security Conference in New York, November 1-2. I’ll also be talking there about Autoclave, my design for strong, performant, and developer-friendly sandboxing. It’s how this Chrome browser, fully hosted in the cloud…
…ends up perfectly functional, while requiring only thirteen different calls to the Linux kernel. (The exact same code under traditional “runc” container hosting requires over a hundred. It’s a lot easier to secure thirteen syscalls than triple digits.) You can try an earlier version of this engine out yourself at https://autoclave.run.
Yes, it runs Windows.
Lend a hand!
A lot of you have been mailing me (dan@whiteops.com) asking how you can help. We plan to do another hackathon. Stay tuned to White Ops Labs, or follow me and O’Reilly Security—details will be announced soon. The Internet definitely needs more and better code. Let’s make security easy.