Four Short Links
Nat Torkington’s eclectic collection of curated links.
Four short links: 9 December 2019
Learning from Incidents, ISBN Changes, Prisoner's Dilemma, and Load-bearing Skeletons
- Learning from Incidents — super useful articles on doing what it says on the box. (via duckalini)
- Say Goodbye to the 10-digit ISBN — ISBNs started out using a 10-digit number, but later transitioned to 13-digit numbers as the supply of unused numbers ran low. It has been standard up to this point to have a 10-digit ISBN that corresponded to the 13-digit ISBN [by prefixing it with 978], but the BISG reports that practice will be going away with the adoption of the 979 prefix. Parallels to “just put 19 on the front of the year” left as an exercise for the reader.
- RIP Social Darwinism (Cory Doctorow) — Prisoner’s Dilemma games are often cited as evidence of intrinsic selfishness, but what if it turns out that telling people that selfishness is OK is why they behave selfishly, whereas a normative statement of solidarity turns that on its head?
- Load-bearing Skeletons — a phrase that will stick with you.
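For the ISBN item above, a minimal sketch of why the 979 prefix ends the one-to-one pairing. The function names are made up for illustration, and the code assumes well-formed input (it does not validate incoming check digits). The check-digit rules are the standard ones: ISBN-13 weights alternate 1 and 3, ISBN-10 weights run from 10 down to 2 with a remainder modulo 11.

```python
from typing import Optional


def isbn13_check_digit(first12: str) -> str:
    """Check digit for an ISBN-13: weights alternate 1, 3 over the first 12 digits."""
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def isbn10_check_digit(first9: str) -> str:
    """Check digit for an ISBN-10: weights run 10 down to 2; a value of 10 is written 'X'."""
    total = sum(int(d) * w for d, w in zip(first9, range(10, 1, -1)))
    value = (11 - total % 11) % 11
    return "X" if value == 10 else str(value)


def isbn10_to_isbn13(isbn10: str) -> str:
    """Prefix 978 to the nine body digits and recompute the check digit."""
    digits = isbn10.replace("-", "")
    body = "978" + digits[:9]
    return body + isbn13_check_digit(body)


def isbn13_to_isbn10(isbn13: str) -> Optional[str]:
    """Only 978-prefixed numbers have a 10-digit twin; anything else returns None."""
    digits = isbn13.replace("-", "")
    if not digits.startswith("978"):
        return None
    body = digits[3:12]
    return body + isbn10_check_digit(body)


print(isbn10_to_isbn13("0-306-40615-2"))   # 9780306406157
print(isbn13_to_isbn10("9780306406157"))   # 0306406152
print(isbn13_to_isbn10("9791234567896"))   # None (made-up 979 number, no 10-digit form)
```

Prefixing alone is not enough because the two formats use different check-digit schemes, so the last digit has to be recomputed in both directions.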
Four short links: 6 December 2019
Zero Code, Programmers and Experience, Commuting Sucks, and Amazon's Docs
- Declarative Assembly of Web Applications From Predefined Concepts — To build an app, the developer imports concepts from the catalog, tunes them to fit the application’s particular needs via configuration variables, and links concept components together to create pages. Components of different concepts may be executed independently, or bound together declaratively with dataflows and synchronization. The instantiation, configuration, linking, and binding of components is all expressed in a simple template language that extends HTML. (via Morning Paper)
- Programmers and Experience — Uncle Bob’s rough estimate of the number of programmers doubling every five years has a necessary consequence: it means that half the programmers out there have less than five years’ experience. That sentence blew my mind.
- The Commuting Paradox — 2004 paper that finds people with longer commuting time report systematically lower subjective well-being. Something I feel acutely. Interestingly, the Hacker News comments have stories from people who feel invigorated by their commute.
- Amazon Builders Library — a lot of great documentation on how Amazon builds and operates software.
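The arithmetic behind the doubling claim in the Programmers and Experience item can be written out in one line. This is a sketch under simplifying assumptions that are mine, not the article's: growth is a steady doubling every five years, nobody leaves the profession, and experience is counted from the year someone joins. Here N(t) is the number of programmers in year t and N_0 is the count at t = 0.

```latex
N(t) = N_0 \cdot 2^{t/5}
\qquad\Longrightarrow\qquad
N(t) - N(t-5) = N(t) - \tfrac{1}{2}\,N(t) = \tfrac{1}{2}\,N(t)
```

The people who joined in the last five years, N(t) - N(t-5), make up exactly half of N(t) under these assumptions.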
Four short links: 5 December 2019
New Old Infocom, TikTok Privacy, COBOL Conference, and Difficult Conversations
- Rediscovered Incomplete Infocom Text Adventure: Hypochondriac — download link in the video description. Discovered by Adam Summerfield by rummaging through the directories of the Infocom Hard Drive. It’s not finished and it crashes, but wow—that’s like finding a new Shakespeare play. (via Renga in Blue)
- What TikTok Reports About You, and How (Matthias Ebert) — great Twitter thread where he shows how TikTok tracks you and where the data goes. I learned heaps, including Canvas Fingerprinting. They draw an image in the background using vector graphic commands. Afterward, they save the image to a rasterized PNG. This data is quite unique among different devices, depending on settings and hardware.
- COBOL Day — a conference for COBOL developers, in Italy. It’s a skill with immense employability.
- Practice Difficult Conversations (Lara Hogan) — details how to practice hard conversations, and how to have them. Includes sample situations to roleplay.
Four short links: 4 December 2019
Complexity Explorer, Information Awareness, Old School Colors, and Automatic Code Reviews
- The Complexity Explorer — online courses, tutorials, and resources essential to the study of complex systems. Complexity Explorer is an education project of the Santa Fe Institute.
- 52 Things I Learned in 2019 — Each year, humanity produces 1,000 times more transistors than grains of rice and wheat combined.
- How to Fight Lies, Tricks, and Chaos Online (The Verge) — When to look deeper: You have a strong emotional reaction; A story seems totally ridiculous—or perfectly confirms your beliefs; You’re going to spend money because of it; You immediately want to amplify the story. A lot of sound advice on spotting dodgy content and then what to do to dig into it. The trick is to find someone who wants to read it…
- Phosphor Colors — detailed answer on what colors the old amber and green-screen terminals were.
- AWS CodeGuru — a machine learning service for automated code reviews and application performance recommendations. Pricey: $0.75 per 100 lines of code scanned per month. Machine learning that helps programmers is here.
Four short links: 3 December 2019
On-Prem, Groupthink, Probability and Statistics, and Distributed Meetings
- Oxide.computer — a new hardware company, looking to make on-prem easy. (There are still a lot of applications for on-prem.) Read Jessie Frazelle and Bryan Cantrill’s blog posts for more background. The pendulum always swings between local and remote. Web was a huge breakthrough because it was remote info services, but eventually mobile had its day. Web 1.0 was built on pricey on-prem iron, which (with Moore’s Law) brought economies of scale that meant Google, Amazon, Twitter, etc., could build vast data centers for their own use—some of which then became clouds for others to use, the value being fast scaling from zero to zillions. Now there are signs of life in on-prem again, where the value is privacy, control, and so on. It’s always interesting times in this industry.
- Symptoms of Groupthink — Illusion of Invulnerability; Belief in Inherent Morality of the Group; Collective Rationalization; Out-group Stereotypes; Self-Censorship; Illusion of Unanimity; Direct Pressure on Dissenters; Self-Appointed Mindguards.
- Count Bayesie — Video and lecture notes from a tutorial on probability and statistics given at PyData NYC 2019. This tutorial provides a crash course in probability and statistics that will cover the essentials, including probability theory, parameter estimation, hypothesis testing, and using the generalized linear model—all in just 90 minutes! A truly great name.
- A Distributed Meeting Primer (Rands in Repose) — sound tactical advice for good meetings with remote team members. As the host, schedule meetings at X:05 or X:35 and get there at X:00 to make sure all technology is set up for a distributed meeting. Not only does this make sure the meeting starts on time, but it sends an important signal. How often have you had a meeting where seven minutes in someone asks, “Where’s Andy?” Well, Andy is distributed, and no one turned on the video camera. More importantly, Andy has been sitting in his home office for the last seven minutes wondering, “Did they forget me?”
Four short links: 2 December 2019
Experience, Webhooks, Learning Causal Theories, and Learning Vim With Fewer Tears
- Two Years at Dropbox — a lot of wisdom as he reflects on his experience. Nobody cares how you learned the trick, but you’re a wizard if you perform it in front of them.
- webhook — a lightweight incoming webhook server to run shell commands.
- Making Sense of Sensory Input — On our account, making sense of sensory input is a type of program synthesis, but it is unsupervised program synthesis. […] The Apperception Engine […] was designed to satisfy the above requirements. Our system is able to produce interpretable human-readable causal theories from very small amounts of data.
- Vim Adventures — what? Just having it beep at you until you kill the terminal window isn’t a good way to learn vi?!
Four short links: 29 November 2019
BERT, Linux in the Browser, Ethical Gifts, and Resilience
- A Visual Guide to Using BERT for the First Time — This post is a simple tutorial for how to use a variant of BERT to classify sentences. This is an example that is basic enough as a first intro, yet advanced enough to showcase some of the key concepts involved.
- jor1k — Online OR1K Emulator running Linux. Emulates hardware running Linux, in the browser. Wow.
- Ethical Tech-Giving Guide — EFF’s annual guide to presents that fit with their ethical principles.
- People are the Adaptable Element of Complex Systems (Vimeo) — John Allspaw’s talk about the apparent irony of finding sources of resilience (sustaining the capacity to adapt to the unforeseen) […] examining closely what would otherwise be categorized as failure: the messy details of critical incidents.
Four short links: 28 November 2019
Prepper Pi, Homomorphic Encryption, Reverse Engineering, and Synthesizing Data Structures
- Raspberry Pi Recovery Kit — Pi for Preppers.
- Machine Learning on Encrypted data without Decrypting it — an intro to homomorphic encryption, with examples in Julia.
- Reverse Engineering for Beginners (PDF) — a solid introduction to reading assembly language from decompiles, to understand wtf is going on.
- Learning Data Structure Alchemy — Harvard paper on the construction of an engine, a Data Alchemist, which learns how to blend fine-grained data structure design principles to automatically synthesize brand new data structures.
Four short links: 27 November 2019
Rewriting Code, Anomaly Detection, Program Synthesis, and Travel Tricks
- Comby — a tool for matching and rewriting code.
- Continuous Contrast Set Mining (Facebook) — effectively a paper on CCSM, an anomaly-detection framework that uses contrast set mining (CSM) techniques to locate statistically “interesting” (defined by several statistical properties) sets of features in groups. A novel algorithm we’ve developed extends standard contrast set mining from categorical data to continuous data, inspired by tree search algorithms and multiple hypothesis testing.
- Building Your First Program Synthesizer — In this post, I want to get concrete and show what’s involved in building a synthesis tool. By the end of this post, we’ll have a tool that can synthesize simple arithmetic expressions.
- My Travel Habits — some very good travel hacks from VM Brasseur, but I just needed to be told the answer to “should I pack this?” is always “NO.”
Four short links: 26 November 2019
Extending HTTP, Headless Chrome, Election Security, and a Rewrite Horror Story
- Braid — a set of extensions to HTTP which transform it from a state transfer protocol into a state synchronization protocol. When a resource is changed by one client or server, all other clients and servers update. Braid supports Operational Transform and CRDTs at web URLs, enabling peer-to-peer, offline-capable web applications. Interesting idea to extend HTTP rather than build on top of it.
- browserless Chrome — a web-service that allows for remote clients to connect, drive, and execute headless work, all inside of docker. It offers first-class integrations for puppeteer, selenium’s webdriver, and a slew of handy REST APIs for doing more common work.
- A Short Reading List on Election Security (Matt Blaze) — very short: it fits in a single tweet.
- An Etsy Rewrite Horror Story (Dan McKinley) — It was around this time that everyone got fired. My favorite genre of Twitter thread: the software development nightmare.
Four short links: 25 November 2019
Dates, Computational Propaganda, Ethics Review, and Massively Multiplayer Hackathon
- Why “Always use UTC” is Bad Advice — three use cases to consider, only one of which has UTC as the best answer.
- Industry Responses to Computational Propaganda and Social Media Manipulation (Oxford Information Labs) — a summary of the responses to the 2016 election tampering. tl;dr: no policy change, a flurry of initiatives, substantial differences between platforms in how they tackle it—which probably reflect the differences in their $ strategies.
- A Practical Way to Include an Ethics Review in Your Development Processes — nine questions: Is this work (project) illegal in any country? Does this work respect the dignity of all people? Is this work something that is sustainable? Does this work foster transparency, and honesty? Does this work require people to think about potential harm or good? Can I do this and tell my family about it proudly? Can we describe the balance between good use and harm? Does this protect and respect the moral rights of our customers, users, vendors, or employees? Are we treating everyone fairly?
- Massively Multiplayer Hackathon — this is a brilliant idea!
Four short links: 22 November 2019
FAQs, Privacy, APIs, and Deep Learning Hardware
- FAQ Off — open source software that lets you build gamebook-style FAQ websites to counteract sealioning and mob harassment on social media.
- Privacy Possum — not just content to block ads, Privacy Possum monkey wrenches common commercial tracking methods by reducing and falsifying the data gathered by tracking companies.
- Hyrum’s Law — With a sufficient number of users of an API, it does not matter what you promise in the contract: all observable behaviors of your system will be depended upon by somebody. (via Simon Willison)
- The Deep Learning Revolution and its Implications for Computer Architecture and Chip Design — paper by Jeff Dean (the Google name behind most of the web-scale distributed systems tech). Very readable guide to the reasoning behind Google’s TPU series of custom hardware for inference and learning, as well as future directions to apply deep learning to improve semiconductor design and manufacture, as well as compilers.
Four short links: 21 November 2019
Program Synthesis, Narrow Paragraphs, Tech Radar, and AI Snakeoil
- Program Synthesis and the Art of Programming by Intent with Dr. Sumit Gulwani — Microsoft podcast. It turns out that when you do not commit to a program yourself, but you rather program by intent, we can actually enable some unique debugging experiences that are not going to be there in the standard programming world. We can, for instance, synthesize multiple programs from few examples—each of those programs is consistent with these examples, so they’re user provided—and run all these programs in parallel on the remaining test inputs. If they all produce the same result, it doesn’t really matter which program you pick. But if these programs generate different results on some test input, it is a sign of ambiguity in the user’s intent on their test input.
- Wide vs. Narrow Paragraphs: An Eye Tracking Analysis — Comparing the wide and narrow formatting conditions, our analysis shows that for narrow formatting, subjects (a) read slightly faster, (b) have fewer regressions, (c) retain more information in a post-test of the material, but (d) tend to abandon the ends of longer paragraphs.
- Thoughtworks Radar — issue 21 is out. Cloud, software supply chain, interpreting ML, and development as a team sport are the themes. It does feel a little like they’re drinking their own buzzword koolaid (and I say this from my Web 2.0-toting O’Reilly enclave) when they say “run cost as architecture fitness function” to mean “you need to rearchitect your systems when they cost you too much to run.”
- How to Recognize AI Snake Oil — explains clearly, in a way that even managers can understand, that for predicting social outcomes, AI is not substantially better than manual scoring using just a few features.
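The disambiguation trick described in the program synthesis podcast item above can be sketched in a few lines. This is a toy illustration, not the system discussed in the podcast: the candidate "programs" are a hand-written table of lambdas rather than the output of a real synthesizer, and every name here is hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Program = Callable[[int], int]

# Hypothetical candidate space; a real synthesizer would generate this.
CANDIDATES: Dict[str, Program] = {
    "x + 2": lambda x: x + 2,
    "x * 2": lambda x: x * 2,
    "x * x": lambda x: x * x,
    "x - 1": lambda x: x - 1,
}


def consistent_programs(examples: List[Tuple[int, int]]) -> Dict[str, Program]:
    """Keep every candidate that reproduces all user-provided input/output examples."""
    return {
        name: prog
        for name, prog in CANDIDATES.items()
        if all(prog(inp) == out for inp, out in examples)
    }


def ambiguous_inputs(programs: Dict[str, Program], test_inputs: List[int]) -> Dict[int, Dict[str, int]]:
    """Run all surviving programs on each test input and report where they disagree."""
    flagged: Dict[int, Dict[str, int]] = {}
    for x in test_inputs:
        outputs = {name: prog(x) for name, prog in programs.items()}
        if len(set(outputs.values())) > 1:
            flagged[x] = outputs
    return flagged


survivors = consistent_programs([(2, 4)])
print(sorted(survivors))                  # three programs all map 2 to 4
print(ambiguous_inputs(survivors, [2, 3]))  # only input 3 separates them
print(sorted(consistent_programs([(2, 4), (3, 6)])))  # one more example settles it
```

An input where the survivors disagree is exactly the place to ask the user for one more example, which is cheaper than asking them to read the synthesized code.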
Four short links: 20 November 2019
Local First, Worker-Owned Apps, Differently Correct, and Recommender Simulations
- Local-First Software: You Own Your Data, in Spite of the Cloud — Findings: CRDT technology works; the user experience with offline work is splendid; developer experience is viable when combined with functional reactive programming (FRP); conflicts are not as significant a problem as we feared; visualizing document history is important; URLs are a good mechanism for sharing; peer-to-peer systems are never fully “online” or “offline,” and it can be hard to reason about how data moves in them; CRDTs accumulate a large change history, which creates performance problems; network communication remains an unsolved problem; cloud servers still have their place for discovery, backup, and burst compute. (via The Morning Paper)
- Worker-Owned Apps Are Trying to Fix the Gig Economy’s Exploitation (VICE) — workers are seizing the means of production and disintermediating the gig economy middlemen by building their own apps, co-op style. (via Boing Boing)
- A Pirate’s Guide to Accuracy, Precision, Recall, and Other Scores — essential knowledge for anyone who works with data, and prediction systems in particular, because they’re possible ways to answer the question “how correct are my predictions?”
- RecSim — Google AI’s configurable platform for authoring simulation environments to facilitate the study of RL algorithms in recommender systems (and CIRs in particular). (via Google AI Blog)
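For the accuracy, precision, and recall item above, a minimal sketch of how the three scores are computed from binary predictions. The function name and the sample data are made up for illustration; the formulas are the standard definitions, with zero returned when a denominator would be zero.

```python
from typing import Dict, Sequence


def classification_scores(actual: Sequence[bool], predicted: Sequence[bool]) -> Dict[str, float]:
    """Compute accuracy, precision, and recall for paired binary labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)          # true positives
    fp = sum(1 for a, p in pairs if not a and p)      # false positives
    fn = sum(1 for a, p in pairs if a and not p)      # false negatives
    tn = sum(1 for a, p in pairs if not a and not p)  # true negatives
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Ten items, two of them truly positive; the predictor finds only one.
actual = [True, True] + [False] * 8
predicted = [True, False] + [False] * 8
print(classification_scores(actual, predicted))
# accuracy 0.9, precision 1.0, recall 0.5
```

The sample shows why one number is not enough: accuracy looks healthy at 0.9 even though half of the real positives were missed.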
Four short links: 19 November 2019
GPS Spoofing, Multi-Arch Docker Images, People Problems, and Document Dumps
- Ghost Ships, Crop Circles, and Soft Gold: A GPS Mystery in Shanghai (MIT TR) — dangerous GPS spoofing to smuggle sand. A very William Gibson affair.
- Using Multi-arch Docker Images to Support Apps on Any Architecture — Using buildx, we were able to quickly build a multi-arch Docker image for arm, arm64, and amd64 without a single change to our Dockerfile, and push it up to Docker Hub, from where any Docker-supported platform could transparently pull down the correct image for its architecture.
- A Pragmatic Approach to People Problems — The very first thing you should enforce as a mod is that it is the moderator’s job to deal with problem people. Do not let it turn into a lynch mob.
- Cayman Islands Bank Document Dump (Twitter) — we may no longer need a Wikileaks to distribute leaked/hacked data like this.
Four short links: 18 November 2019
High Modernists, GitLab Runbooks, Wrist Exercises, and a Tiny Dynamicland
- The Efficiency-Destroying Magic of Tidying Up (Florent Crivello) — In his seminal book “Seeing Like a State,” James Scott describes what he calls “high modernists”: lovers of orders who mistake complexity for chaos, and rush to rearrange it from the ground up in a more centralized, orderly fashion. Scott argues that high modernists end up optimizing for a system’s legibility from their perspective, at the expense of its performance from that of the user. This insight is magic. Florent takes it good places.
- Runbooks — GitLab’s runbooks.
- Hand-Wrist Exercises for Computer Users — even if you don’t use these exercises, every computer professional should know the physiology and care of their wrists and hands. If you mentor young people in the industry, teach them this.
- Tinyland — a very, very small Dynamicland.
Four short links: 15 November 2019
- From Serverless to Elixir — always interesting to hear about other people’s technical journeys. I don’t recall the exceptions off hand because this was the quickest I’ve ever shut down a multi-variate test or rolled back code, but I drove the Logger straight into the ground. Request times sky rocketed, memory went off the rails, and I started seeing all sorts of crashes in the Logger process. Steam started coming out of everything and I swear I saw a sprocket fly off. I kinda backed away slowly from that approach.
- The Companies Venture Capital Isn’t Allowed to Invest In — The case of JUUL is quite divisive, and one I don’t have a major opinion to add on. There’s absolutely questions that need to be asked about underage use, and whether the product was designed to appeal to underage users. In these sorts of cases, VCs bear some responsibility for negative behavior when they support founders and decisions which go against the interests of society. Well, with firm and clear moral stances like this, I’m sure absolutely nothing can go wrong.
- The Difference Between Quick and Full Format of a Disk — far more than you ever wanted to know, but it’s surprising how much it turns out that you DO want to know about this.
- How Figma’s Multiplayer Technology Works — another “inside our tech” story, with a good explanation of pros and cons of the decisions they made. A favorite saying of mine is: “experience is a hard master, but fools will have no other.” Reading other people’s experiences is a much gentler master.
Four short links: 14 November 2019
Adversarial Interoperability, Open Virtual Assistants, Code Verification, and Distributed Graph Computation
- alt.interoperability.adversarial (EFF) — If adversarial interoperability still enjoyed its alt.-era legal respectability, then Facebook alternatives like Diaspora could use their users’ logins and passwords to fetch the Facebook messages the service had queued up for them and allow those users to reply to them from Diaspora, without being spied on by Facebook. Mastodon users could read and post to Twitter without touching Twitter’s servers. Hundreds or thousands of services could spring up that allowed users different options to block harassment and bubble up interesting contributions from other users—both those on the incumbent social media services and the users of these new upstarts.
- Stanford Open Virtual Assistant Workshop — video available for interesting talks about open VA platforms and the problems that VAs face.
- Scaling Symbolic Evaluation for Automated Verification of Systems Code with Serval (Google) — Through this paper, we address the key concerns facing developers looking to apply automated verification: the effort required to write verifiers, the difficulty of diagnosing and fixing performance bottlenecks in these verifiers, and the applicability of this approach to existing systems. Serval enables us, with a reasonable effort, to develop multiple verifiers, apply the verifiers to a range of systems, and find previously unknown bugs. (via Morning Paper)
- Plato — Tencent’s framework for distributed graph computation and machine learning at WeChat scale.