Four Short Links

Nat Torkington’s eclectic collection of curated links.

Four short links: 18 March 2020

Text Adventures, Startups & the Virus, Free Textbooks, and Software Engineering at Google

By Nat Torkington
  1. Inklewriter — open source interactive text adventure game creator. (Fun for adults, but also great to give to kids who love to read) (via Andy Baio)
  2. The Virus Survival Strategy Guide for Your Startup (Steve Blank) — Unfortunately, it’s no longer a normal market. All your assumptions about customers; sales cycle; and, most importantly, revenue, burn rate, and runway are no longer true. If you’re a startup, you’ve likely calculated your runway to last until you raise your next round of funding. Assuming there was going to be a next round. That may be no longer true.
  3. Free Cambridge University Textbooks — all available in HTML for free (gratis) until the end of May.
  4. Software Engineering at Google — a new O’Reilly book. Covers Google’s unique engineering culture, processes, and tools, and how these aspects contribute to the effectiveness of an engineering organization.

Four short links: 17 March 2020

Great Firewall, Security Liability, XOXO Talks, and Internet Censorship Research

By Nat Torkington
  1. How the Great Firewall Discovers Hidden Circumvention Servers — really interesting CCC talk from a few years ago.
  2. The Challenge of Software Liability — Liability for insecure software is already a reality. The question is whether Congress will step in to give it shape and a coherent legal structure.
  3. XOXO Talks — video archive of past talks. Suitable for the long nights of social isolation.
  4. Selected Research Papers on Internet Censorship — Most papers on CensorBib approach the topic from a technical angle, by proposing designs that circumvent censorship systems, or by measuring how censorship works.

Four short links: 16 March 2020

Uncensored Library, Shmoocon 2020, Differential Privacy, and Layoffs

By Nat Torkington
  1. The Uncensored Library — Reporters Without Borders built a library in Minecraft, in which you can read banned books. (via Gizmodo)
  2. Shmoocon 2020 Talk Recordings — everything from email addresses to Verilog by way of Zero Trust, social media, and choose-your-own-adventure ransomware.
  3. Differential Privacy: A Comparison of Libraries — We will have a look at how the dataset size affects accuracy and how the desired privacy level (epsilon) affects data accuracy. For each case, we will compare the results obtained using the various differential privacy libraries.
  4. Layoffs are Coming (Jacob Kaplan-Moss) — who is likely to get laid off and how to prepare, from a web elder who has lived through two recessions.
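
The epsilon-versus-accuracy trade-off mentioned in the differential privacy item can be illustrated with a minimal sketch of the Laplace mechanism on a counting query. This is not taken from the linked comparison or from any of the libraries it covers; the function names, the true count of 1000, and the trial count are assumptions chosen for illustration.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    # Adding or removing one person changes a count by at most 1,
    # so the sensitivity of a counting query is 1.
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


def mean_abs_error(epsilon: float, trials: int = 2000, seed: int = 0) -> float:
    # Average absolute error of the noisy count over many trials.
    # true_count is a hypothetical value; the error does not depend on it.
    rng = random.Random(seed)
    true_count = 1000
    total = sum(
        abs(private_count(true_count, epsilon, rng) - true_count)
        for _ in range(trials)
    )
    return total / trials


if __name__ == "__main__":
    for eps in (0.1, 1.0, 10.0):
        print(f"epsilon={eps}: mean absolute error {mean_abs_error(eps):.3f}")
```

A smaller epsilon (stronger privacy) means a larger noise scale and a larger error. Because the noise on a count is absolute, the same noise is a smaller relative error on a larger dataset, which is one plausible reason dataset size matters in such comparisons.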

Four short links: 13 March 2020

Access Management, Remote Relationships, API Security, and Hexagonal Architecture

By Nat Torkington
  1. OpenAM — an open-access management solution that includes authentication, SSO, authorization, federation, entitlements and web services security.
  2. Building Relationships as a Remote Engineering Manager — And if you haven’t realized it yet, get used to this—you’re going to spend a lot of time writing.
  3. API Security Maturity Model — I’m not sure if I agree with this specific framework, but I like the idea of a maturity model for APIs in general and security in particular. Level 0 – API Keys and Basic Authentication; Level 1 – Token-Based Authentication; Level 2 – Token-Based Authorization; Level 3 – Centralized Trust Using Claims.
  4. Hexagonal Architecture (Netflix) — The idea of Hexagonal Architecture is to put inputs and outputs at the edges of our design. Business logic should not depend on whether we expose a REST or a GraphQL API, and it should not depend on where we get data from—a database, a microservice API exposed via gRPC or REST, or just a simple CSV file. How Netflix used this architectural concept in practice.
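
The core idea in the Hexagonal Architecture item, business logic that does not care where data comes from, can be sketched as follows. This is an illustrative example only, not Netflix's code: `Movie`, `MovieSource`, `InMemorySource`, `CsvSource`, and `titles_released_after` are hypothetical names invented for the sketch.

```python
import csv
import io
from dataclasses import dataclass
from typing import Iterable, Protocol


@dataclass(frozen=True)
class Movie:
    title: str
    year: int


class MovieSource(Protocol):
    """Port: the only thing the business logic knows about data access."""

    def movies(self) -> Iterable[Movie]: ...


class InMemorySource:
    """Adapter backed by a plain list (handy for tests)."""

    def __init__(self, movies: list[Movie]) -> None:
        self._movies = movies

    def movies(self) -> Iterable[Movie]:
        return list(self._movies)


class CsvSource:
    """Adapter that parses CSV text with `title` and `year` columns."""

    def __init__(self, text: str) -> None:
        self._text = text

    def movies(self) -> Iterable[Movie]:
        reader = csv.DictReader(io.StringIO(self._text))
        return [Movie(row["title"], int(row["year"])) for row in reader]


def titles_released_after(source: MovieSource, year: int) -> list[str]:
    """Business logic: depends on the port, never on a concrete adapter."""
    return sorted(m.title for m in source.movies() if m.year > year)
```

Swapping `CsvSource` for a database or gRPC adapter would leave `titles_released_after` untouched, which is the property the item describes.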

Four short links: 12 March 2020

AWS Bills, Offline-first, Censorship, and Corporate Engineering Blogs

By Nat Torkington
  1. AWS Bill Analysis — always interesting to see how to approach lowering your costs. In this case, the project owner works for Amazon on AWS, but still there were savings to be had.
  2. A Design Guide to Writing Offline-first Apps — In this article, we will be diving into some of the engineering challenges that make designing robust offline-first applications with good user experience hard, and explore some architectures.
  3. Zero Trust Information — To that end, instead of trying to fight the internet—to try to build a castle and moat around information, with all of the impossible tradeoffs that result—how much more value might there be in embracing the deluge? The either-or is a false frame: you can fight the worst without giving up the best. (I think Ben and I would agree that limiting access to encryption is a bad idea.)
  4. How Some Good Corporate Engineering Blogs Are Written (Dan Luu) — In order to have a boring blog, the corporation has to actively stop engineers from putting interesting content out there. Unfortunately, it appears that the natural state of large corporations tends toward risk aversion and blocking people from writing, just in case it causes a legal or PR or other problem. Presents the process at a couple of different companies with interesting blogs, and some with boring blogs.

Four short links: 11 March 2020

Doctorow Newsletter, Map Library, Micro-FM Pi Nodes, and Evolving Algorithms

By Nat Torkington
  1. Pluralistic — Cory Doctorow’s news site and newsletter, where you can learn about African WhatsApp modders among other things.
  2. Mapnik — LGPLed software that combines pixel-perfect image output with lightning-fast cartographic algorithms, and exposes interfaces in C++, Python, and Node.
  3. pi node — A π-box is a modular system of radio/streaming broadcast, composed of multiple inputs and outputs. The π-box aims to provide a multi-functional and easy-to-use micro-FM and streaming micro radio station. It is based on the mini-FM approach developed in the 80’s by the Japanese artist and researcher Tetsuo Kogawa, which promotes radio transmissions of FM waves upon a tiny perimeter, such as a house, a block, or a small zone. The π-box combines this ultra local transmission with internet possibilities (through ethernet, Wi-Fi or 3G/4G) to leverage all the possibilities of hybrid transmissions. The system is open source and based on open source software / open hardware.
  4. AutoML-Zero — evolutionary search by modifying basic math operations with minimal human direction: evolutionary search shows promising results by discovering linear regression with gradient descent, 2-layer neural networks with backpropagation, and even algorithms that surpass hand-designed baselines of comparable complexity. Code available.

Four short links: 10 March 2020

ML Lifecycle, Eventing, Offline Support, and TensorFlow Quantum

By Nat Torkington
  1. MLflow — an open source platform to manage the ML lifecycle, including experimentation, reproducibility, and deployment. It currently offers three components: tracking, projects, and models.
  2. Eventing Facets (Tim Bray) — the word “eventing” makes my skin crawl, but this series of posts has A+ info in it.
  3. Workbox — JavaScript libraries for adding offline support to web apps, from Google.
  4. TensorFlow Quantum — a library for hybrid quantum-classical machine learning. See also arXiv paper, Google AI Blog, and source.

Four short links: 9 March 2020

Git for Geodata, Language Generation Bias, Online Conferences, and Multithreaded Gotchas

By Nat Torkington
  1. Sno — Distributed version control for geospatial and tabular data. Finally, git for (geo)data done right. Open source.
  2. The Woman Worked as a Babysitter: On Biases in Language Generation — plugging prompts like “the woman worked as” and “the white person worked as” into text generation systems, and the horrors you get back. (via Violet Peng)
  3. How to Run a Free Online Academic Conference: A Workbook — as f2f conferences are canceled left, right, and center, this might be of interest to folks. (via Franklin Sayre)
  4. Solving 11 Likely Problems in Your Multithreaded Code — There really is a fundamental set of concepts that you need to learn and become comfortable with. It’s likely that certain languages and libraries can hide some concepts over time, but if you’re doing concurrency today, you won’t have that luxury. This article describes some of the more common challenges to be aware of and presents advice for coping with them in your software.

Four short links: 6 March 2020

New Hardware, When Not to Kubernetes, Computability Proof, and Architecture Thinking

By Nat Torkington
  1. Soul of a New Machine: Rethinking the Computer (Bryan Cantrill) — talk at Stanford, about our vision for a new, rack-scale, server-side machine—and how we anticipate advances like open firmware, RISC-V, and Rust will play a central role in realizing that vision.
  2. Let’s Use Kubernetes: Now You Have 8 Problems — If you’re part of a small team, Kubernetes probably isn’t for you: it’s a lot of pain with very few benefits. See also the discussion on Lobsters and HN.
  3. Landmark Computer Science Proof Cascades Through Physics and Math (Quanta) — Computational complexity may seem entirely theoretical, but it’s also closely connected to the real world. The resources that computers need to solve and verify problems—time and memory—are fundamentally physical. For this reason, new discoveries in physics can change computational complexity. Readable and interesting.
  4. Millions of Tiny Databases — talking through the reasoning behind the design of the control plane for Elastic Block Storage. Over the decade since [the introduction of Availability Zones], our thinking on failure and availability has continued to evolve, and we paid increasing attention to blast radius and correlation of failure. Not only do we work to make outages rare and short, we work to reduce the number of resources and customers that they affect, an approach we call blast radius reduction. This philosophy is reflected in everything from the size of our datacenter, to the design of our services, to operational practices. (via Morning Paper)

Four short links: 5 March 2020

Face Detection, Mastery, NGINX, and Personal Organizer

By Nat Torkington
  1. libfacedetection — they claim 1000fps. Open source.
  2. Rich Hickey on Becoming a Better Developer — By constantly switching from one thing to another you are always reaching above your comfort zone, yes, but doing so by resetting your skill and knowledge level to zero. Mastery comes from a combination of at least several of the following: knowledge; focus; relentless considered practice over a long period of time; detected, recovered-from failures; mentorship by an expert; always working slightly beyond your comfort/ability zone, pushing it ever forward. This was my experience.
  3. NGINX Admin’s Handbook — This handbook is a set of rules and recommendations for the NGINX open source HTTP server. It also contains the best practices, notes, and helpers with countless examples. Many of them refer to external resources.
  4. void — terminal-based personal organizer.

Four short links: 4 March 2020

Accessibility, DOS, Downtime, and Cybersecurity Law

By Nat Torkington
  1. tota11y — an [open source] accessibility visualization toolkit from Khan Academy.
  2. DOS Pi — a DOS computer in a keyboard.
  3. Simple Systems Have Less Downtime — Why? (1) Proficiency takes less time; (2) Troubleshooting takes less time; (3) More alternative solutions.
  4. Cybersecurity Law, Policy, and Infrastructure — This is the full text of my interdisciplinary “eCasebook” designed from the ground up to reflect the intertwined nature of the legal and policy questions associated with cybersecurity. My aim is to help the reader understand the nature and functions of the various government and private-sector actors associated with cybersecurity in the United States, the policy goals they pursue, the issues and challenges they face, and the legal environment in which all of this takes place. It is designed to be accessible for beginners from any disciplinary background, yet useful to experienced audiences, too.

Four short links: 3 March 2020

Privacy, HTTP Proxies, Covid-19 and Remote Work, and Tech Writing

By Nat Torkington
  1. Facebook’s Incomplete Download Your Data (Privacy International) — Despite Facebook’s claim, “Download Your Information” doesn’t provide users with a list of all advertisers who uploaded a list with their personal data. As a user, this means you can’t exercise your rights under GDPR because you don’t know which companies have uploaded data to Facebook. Information provided about the advertisers is also very limited (just a name and no contact details), preventing users from effectively exercising their rights. Recently announced Off-Facebook feature comes with similar issues, giving little insight into how advertisers collect your personal data and how to prevent such data collection. (via Bruce Schneier)
  2. Proxy Verifier — an HTTP replay tool designed to verify the behavior of HTTP proxies. It builds a verifier-client binary and a verifier-server binary which each read a set of YAML or JSON files that specify the HTTP traffic for the two to exchange. Open source from Yahoo.
  3. Stripe’s Covid-19 Company Plan — great time to be a remote working consultant, although I’m sure nobody with entrenched on-site culture wants to hear what’s involved to enable long-term sustainable useful remote work (i.e., change your crappy practices that favor in-office staff, move comms to a slower channel for people who aren’t online all at once because they’re visiting the doctor, etc.).
  4. Google’s Tech-Writing Course — This collection of courses and learning resources aims to improve your technical documentation. Learn how to plan and author technical documents.

Four short links: 2 March 2020

Rollout Automation, Internet of Quarantined People, Draft Fast.ai Book, and p5.js Hits 1.0

By Nat Torkington
  1. Gandalf: An Intelligent, End-to-end Analytics Service for Safe Deployment in Cloud-scale Infrastructure — a paper on Azure’s rollout-monitoring software that analyzes more than 20TB of data per day: 270K platform events on average (770K peak), 600 million API calls, with data on over 2,000 different fault types. If Gandalf doesn’t like what that data is telling it, it will pause a rollout and send an alert to the development team. (That’s from Morning Paper, which has a readable summary of the paper)
  2. Quarantine Cooking (New Yorker) — we tend to think of the Chinese internet as just a battleground—activists and censors locked in an endless conflict. But, to many it is also homey and comforting, parts of it as familiar as a cozy kitchen. Quarantine cooking captures their boredom, their loneliness, their creativity, and their desire for connection amid anxiety and panic. The street will find a way.
  3. Draft of the fast.ai Book — These draft notebooks cover an introduction to deep learning, fast.ai, and PyTorch.
  4. p5.js 1.0 — the celebratory Medium post summarizes what’s gone into the release, but it’s everything from tooling to I18N, libraries, and docs.

Four short links: 28 February 2020

Responsible Mapping, Testing Desirability, Data Center Power, and Augmenting Web Apps with Spreadsheets

By Nat Torkington
  1. Mapping Coronavirus Responsibly (ESRI) — Let’s take a look at how maps can help shape the narrative and, as concern (fear?) grows, how to map the data responsibly.
  2. Don’t Use Low-fidelity Prototypes to Test Desirability — One of my favorite techniques for testing desirability of brand new products is the mock screencast: after creating realistic-looking pages using Web Inspector or your favorite design tool, you can then record your screen while “navigating” through the site by tabbing through mockups and narrating the value proposition.
  3. Data Center Power (Jess Frazelle) — fascinating deep dive into power concerns in data centers, talking about what the “hyperscalers” (MSFT, GOOG, etc.) do versus what you’re likely to find in a typical colocation center.
  4. Wildcard: Spreadsheet-driven Customization of Web Applications — In this paper, we present spreadsheet-driven customization, a technique that enables end users to customize software without doing any traditional programming. The idea is to augment an application’s UI with a spreadsheet that is synchronized with the application’s data. When the user manipulates the spreadsheet, the underlying data is modified and the changes are propagated to the UI, and vice versa.
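
The two-way synchronization described in the Wildcard item can be sketched with one shared table that both views observe. This is a rough illustration under stated assumptions, not the paper's implementation: `SharedTable`, `View`, and their methods are hypothetical names.

```python
from typing import Callable

Row = dict[str, str]
Listener = Callable[[list[Row]], None]


class SharedTable:
    """Hypothetical data model observed by both the app UI and the spreadsheet."""

    def __init__(self, rows: list[Row]) -> None:
        self._rows = [dict(r) for r in rows]
        self._listeners: list[Listener] = []

    def subscribe(self, listener: Listener) -> None:
        # Register a view and immediately give it the current state.
        self._listeners.append(listener)
        listener(self.snapshot())

    def snapshot(self) -> list[Row]:
        # Copy rows so views cannot mutate the model behind its back.
        return [dict(r) for r in self._rows]

    def _notify(self) -> None:
        for listener in self._listeners:
            listener(self.snapshot())

    def set_cell(self, row: int, column: str, value: str) -> None:
        # An edit from either side goes through the model, then fans out.
        self._rows[row][column] = value
        self._notify()

    def sort_by(self, column: str) -> None:
        self._rows.sort(key=lambda r: r[column])
        self._notify()


class View:
    """Stand-in for either the application UI or the spreadsheet pane."""

    def __init__(self) -> None:
        self.rows: list[Row] = []

    def render(self, rows: list[Row]) -> None:
        self.rows = rows
```

Routing every edit through the shared model means a spreadsheet edit and a UI edit follow the same path, so neither view needs to know about the other.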

Four short links: 27 February 2020

Map Typeface, Series A, Information Disorders, and Understanding Issues

By Nat Torkington
  1. BellTopo Sans — typeface and free font inspired by the sans serif in old maps. (via Flowing Data)
  2. YC’s Guide to Raising Series A — This guide is a distillation of everything we know about successfully raising an A. It includes insights learned from watching hundreds of founders succeed in raising, and in watching dozens fail.
  3. Information Disorders (Renee DiResta) — a survey of proposed regulatory approaches to addressing the range of challenges in the information environment, looking at regulatory proposals around ads, antitrust, and privacy, and how these proposed laws impact the privacy-security-free expression balance. (via Fast.ai)
  4. Issues — Issues come in many flavors, for example feature requests, bug reports, customer complaints, security alerts, team retrospectives, etc.; this page describes how our team uses issues, and how we communicate about them.

Four short links: 26 February 2020

WASM in the Kernel, Open Access, Instrumenting Complex Systems, and Attacking NLP

By Nat Torkington
  1. kernel-wasm — Safely run WebAssembly in the Linux kernel, with faster-than-native performance.
  2. Smithsonian Open Access — images, 3D models, and more. The ultimate goal is digitizing their whole collection. (via Smithsonian Magazine)
  3. Systems that Defy Understanding — In such systems, we must resort to empirical methods. Instead of reasoning about the system and reading the source to answer questions, we find ways to ask questions about the running system. We can perform such queries by looking at existing logs and metrics, or by adding new instrumentation.
  4. TextFooler — a model for natural language attack on text classification and inference.

Four short links: 25 February 2020

Incident Management, NLP, Antifragile Ideas, and Blogging

By Nat Torkington
  1. Dispatch — Netflix’s incident management framework. (via Netflix Tech Blog)
  2. Text-to-Text Transformer (Google) — In “Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer,” we present a large-scale empirical survey to determine which transfer learning techniques work best and apply these insights at scale to create a new model that we call the Text-To-Text Transfer Transformer (T5). We also introduce a new open source pre-training dataset called the Colossal Clean Crawled Corpus (C4). The T5 model, pre-trained on C4, achieves state-of-the-art results on many NLP benchmarks while being flexible enough to be fine-tuned to a variety of important downstream tasks. With code, notebook, and pre-trained models.
  3. Antifragile Ideas — a 2015 John Carmack internal talk at Facebook.
  4. fastpages — fast.ai’s blogging system designed to publish research outputs. SWEET. (via fast.ai)

Four short links: 24 February 2020

CV Unethical, Convert Schemas, Multiplayer Live-Coding Audio, and ML Risk Analysis

By Nat Torkington
  1. YOLO Creator Leaves Computer Vision — Joseph Redmon, creator of YOLO (You Only Look Once) has stopped doing computer vision because of its uses. But basically all facial recognition work would not get published if we took Broader Impacts sections seriously. There is almost no upside and enormous downside risk.
  2. schema — tool to infer and instantiate schemas and translate between data formats. Supports JSON, GraphQL, YAML, TOML, and XML.
  3. Overtone — an open source audio environment designed to explore new musical ideas from synthesis and sampling to instrument building, live-coding, and collaborative jamming. We combine the powerful SuperCollider audio engine with Clojure, a state-of-the-art Lisp, to create an intoxicating interactive sonic experience.
  4. An Architectural Risk Analysis of Machine Learning Systems — a comprehensive approach to identifying different types of risk in each component and process of a generic machine learning system.