Four Short Links
Nat Torkington’s eclectic collection of curated links.
Four short links: 9 July 2019
Future of Work, GRAND Stack, Hilarious Law Review Article, and The Platform Excuse
- At Work, Expertise Is Falling Out of Favor (The Atlantic) — an interesting longform exploration of “the future of work” (aka automation, generalists, lifelong learning) in the context of the Navy’s Littoral Combat Ship experiment. So much applicability to the business world (“experiment” becomes “must succeed flagship project” when CEO changes; chaos is opportunity to learn; etc.).
- GRANDstack — GraphQL, React, Apollo, and Neo4j.
- The Most Important Law Review Article You’ll Never Read: A Hilarious (in the Footnotes) Yet Serious (in the Text) Discussion of Law Reviews and Law Professors (SSRN) — the best discussion of foolish academic publishing measures you’ll read today.
- The ‘Platform’ Excuse is Dying (The Atlantic) — The platform defense used to shut down the why questions: Why should YouTube host conspiracy content? Why should Facebook host provably false information? Facebook, YouTube, and their kin keep trying to answer, “We’re platforms!” But activists and legislators are now saying, “So what?”
Four short links: 8 July 2019
Algorithmic Governance, DevOps Assessment, Retro Language, and Open Source Satellite
- Algorithmic Governance and Political Legitimacy (American Affairs Journal) — Mechanized judgment resembles liberal proceduralism. It relies on our habit of deference to rules, and our suspicion of visible, personified authority. But its effect is to erode precisely those procedural liberties that are the great accomplishment of the liberal tradition, and to place authority beyond scrutiny. I mean “authority” in the broadest sense, including our interactions with outsized commercial entities that play a quasi-governmental role in our lives. That is the first problem. A second problem is that decisions made by an algorithm are often not explainable, even by those who wrote the algorithm, and for that reason cannot win rational assent. This is the more fundamental problem posed by mechanized decision-making, as it touches on the basis of political legitimacy in any liberal regime.
- The 27-Factor Assessment Model for DevOps — The factors are the cross-product of current best practices for three dimensions (people, process, and technology) with nine pillars (leadership, culture, app development/design, continuous integration, continuous testing, infrastructure on demand, continuous monitoring, continuous security, continuous delivery/deployment).
- Millfork — a middle-level programming language targeting 6502- and Z80-based microcomputers and home consoles.
- FossaSat-1 (Hackaday) — FossaSat-1 will provide free and open source IoT communications for the globe using inexpensive LoRa modules, where anyone will be able to communicate with a satellite using modules found online for under 5€ and basic wire monopole antennas.
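The 27 factors in the DevOps assessment model above are described as a cross-product of three dimensions and nine pillars. A minimal sketch of that enumeration follows; the dimension and pillar names come from the blurb, while the function name and the tuple representation are assumptions made for illustration.

```python
from itertools import product

# Names taken from the blurb above.
DIMENSIONS = ["people", "process", "technology"]
PILLARS = [
    "leadership",
    "culture",
    "app development/design",
    "continuous integration",
    "continuous testing",
    "infrastructure on demand",
    "continuous monitoring",
    "continuous security",
    "continuous delivery/deployment",
]


def assessment_factors():
    """Return every (dimension, pillar) pair; hypothetical helper."""
    return list(product(DIMENSIONS, PILLARS))


if __name__ == "__main__":
    factors = assessment_factors()
    print(len(factors))  # 3 dimensions x 9 pillars
    for dimension, pillar in factors[:3]:
        print(f"{dimension} / {pillar}")
```

Each pair is one cell you would score in an assessment, which is where the count of 27 comes from.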
Four short links: 5 July 2019
Online Not All Bad, Emotional Space, Ted Chiang, and Thread Summaries
- How a Video Game Community Filled My Nephew’s Final Days with Joy (Guardian) — you had a rough week. Treat yourself to this heart-warming story of people going the extra mile for someone.
- Self-Report Captures 27 Distinct Categories of Emotion Bridged by Continuous Gradients — Although reported emotional experiences are represented within a semantic space best captured by categorical labels, the boundaries between categories of emotion are fuzzy rather than discrete. By analyzing the distribution of reported emotional states we uncover gradients of emotion—from anxiety to fear to horror to disgust, calmness to aesthetic appreciation to awe, and others—that correspond to smooth variation in affective dimensions such as valence and dominance. Reported emotional states occupy a complex, high-dimensional categorical space. In addition, our library of videos and an interactive map of the emotional states they elicit are made available to advance the science of emotion. (via Dan Hon)
- Sci-Fi Author Ted Chiang on Our Relationship to Technology, Capitalism, and the Threat of Extinction (GQ) — Right now I think we’re beginning to see a correction to the wild techno-boosterism that Silicon Valley has been selling us for the last couple decades, and that’s a good thing as far as I’m concerned. I wish we didn’t swing back and forth from the extremes of Pollyannaish optimism to dystopian pessimism; I’d prefer it if we had a more measured response throughout, but that doesn’t appear to be in our nature. +1 to this. I don’t like the way we have spent 20 years imagining dystopias and then building them.
- Wikum — Summarize large discussion threads.
Four short links: 4 July 2019
Debugging AI, Serverless Foundations, YouTube Bans, and Pathological UI
- tensorwatch — open source from Microsoft: a debugging and visualization tool designed for data science, deep learning and reinforcement learning.
- Formal Foundations of Serverless Computing — the serverless computing abstraction exposes several low-level operational details that make it hard for programmers to write and reason about their code. This paper sheds light on this problem.
- YouTube Bans Videos Showing Hacking and Phishing (Kody) — We made a video about launching fireworks over Wi-Fi for the 4th of July only to find out @YouTube gave us a strike because we teach about hacking, so we can’t upload it. YouTube now bans: “Instructional hacking and phishing: Showing users how to bypass secure computer systems”.
- User Inyerface — an exercise in frustration.
Four short links: 3 July 2019
Models, More Models, robots.txt, and Event Sourcing
- On Models (Tom Stafford) — a Twitter thread where he lays out his work in models and the value of them.
- Why Model? — The [article] distinguishes between explanation and prediction as modeling goals, and offers 16 reasons other than prediction to build a model. It also challenges the common assumption that scientific theories arise from and ‘summarize’ data, when often, theories precede and guide data collection; without theory, in other words, it is not clear what data to collect. Among other things, it also argues that the modeling enterprise enforces habits of mind essential to freedom.
- Robots.txt — Google’s robots.txt parser and matcher as a C++ library (compliant to C++11). Released as part of standardization work.
- Mistakes We Made Adopting Event Sourcing (And How We Recovered) — a useful post for those also considering their first system built around events as the mechanism for changing state.
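For readers new to the idea in the event sourcing link above, here is a minimal sketch of "events as the mechanism for changing state". It is not taken from the linked post: the account example, the event names, and the function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str    # hypothetical event names: "deposited" or "withdrew"
    amount: int


def apply_event(balance: int, event: Event) -> int:
    """Return the new state produced by applying one event."""
    if event.kind == "deposited":
        return balance + event.amount
    if event.kind == "withdrew":
        return balance - event.amount
    raise ValueError(f"unknown event kind: {event.kind}")


def replay(events, initial: int = 0) -> int:
    """Rebuild current state by folding over the full event log."""
    balance = initial
    for event in events:
        balance = apply_event(balance, event)
    return balance


if __name__ == "__main__":
    log = [Event("deposited", 100), Event("withdrew", 30)]
    print(replay(log))
```

The state is never stored directly; it is derived from the append-only log, so the log is the source of truth.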
Four short links: 2 July 2019
Lock Convoys, AI Hardware, Lambda Observability, and AI for Science
- The Convoy Phenomenon (Adrian Colyer) — locks on resources lead to performance degradation which never recovers, a situation first described in 1979.
- AI is Changing the Entire Nature of Compute (ZD) — workloads have been doubling every 3.5 months while our post-Moore’s law chip speed increases have been 3.5% per year. What that means, both authors believe, is that the design of chips, their architecture, as it’s known, has to change drastically in order to get more performance out of transistors that are not of themselves producing performance benefits. The article explores some of those directions.
- The Annoying State of Lambda Observability — In the current state of the world, the available strategies boil down to either: (1) Send telemetry directly to external observability tools during Lambda execution. (2) Scrape or trigger off the telemetry sent to CloudWatch and X-Ray to populate external providers. Spoiler: neither option is ideal.
- Accelerating Science: A Computing Research Agenda — I found this quite challenging at first, because it seemed to be “cheating” somehow. But once I viewed it as the computer augmenting the human, not replacing them, then it was more acceptable. But I can imagine that better tools for each step of the scientific journey (e.g., Expressing, reasoning with, updating scientific arguments (along with supporting assumptions, facts, observations), including languages and inference techniques for managing multiple, often conflicting arguments, assessing the plausibility of arguments, their uncertainty and provenance) will create controversy no less than the software “proof” of the four-colour theorem did.
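The gap in the AI hardware link above is easier to see once both rates are on the same annual footing. A small sketch of that arithmetic, assuming "doubling every 3.5 months" means steady exponential growth and "3.5% per year" is a compound annual rate (the function name is hypothetical):

```python
def annual_growth_factor(doubling_months: float) -> float:
    """Convert a doubling period in months to a per-year multiplier."""
    return 2 ** (12 / doubling_months)


# Figures quoted in the blurb above.
workload_factor = annual_growth_factor(3.5)  # workloads double every 3.5 months
chip_factor = 1.035                          # chip speed grows 3.5% per year

if __name__ == "__main__":
    print(f"workload growth per year: {workload_factor:.2f}x")
    print(f"chip speed growth per year: {chip_factor:.3f}x")
    print(f"ratio: {workload_factor / chip_factor:.1f}")
```

Under these assumptions, workloads grow by roughly 10.8x per year while chip speed grows by about 1.04x, which is the mismatch the article argues architecture has to absorb.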
Four short links: 1 July 2019
General-Purpose Probabilistic Programming, Microsoft's Linux, Decolonizing Data, and Testing Statistical Software
- Gen — general-purpose probabilistic programming system with programmable inference. A Julia package, described as: Gen’s flexible modeling and inference programming capabilities unify symbolic, neural, probabilistic, and simulation-based approaches to modeling and inference, including causal modeling, symbolic programming, deep learning, hierarchical Bayesian modeling, graphics and physics engines, and planning and reinforcement learning.
- WSL2 Linux Kernel — source for the Linux kernel used in Windows Subsystem for Linux 2 (WSL2).
- Decolonizing Data — Decolonizing data means that the community itself is the one determining the information they want us to gather. Why are we gathering it? Who’s interpreting it? And are we interpreting it in a way that truly serves our communities? Decolonizing data is about controlling our own story and making decisions based on what is best for our people. That hasn’t been done in data before, and that’s what’s shifting and changing.
- Testing Statistical Software — In this post, I describe how I evaluate the trustworthiness of a modeling package, and in particular what I want from the test suite. If you use statistical software, this post will help you evaluate whether a package is worth using. If you write statistical software, this post will help you confirm the correctness of the code that you write.
Four short links: 28 June 2019
Heartbeat Identity, Seam Carving, Q&A Facilitation, and Secure Data in Distributed Systems
- The Pentagon Has a Laser That Can Identify People From a Distance By Their Heartbeat (MIT TR) — A new device, developed for the Pentagon after U.S. Special Forces requested it, can identify people without seeing their faces: instead, it detects their unique cardiac signature with an infrared laser. While it works at 200 meters (219 yards), longer distances could be possible with a better laser. […] It takes about 30 seconds to get a good return, so at present the device is only effective where the subject is sitting or standing.
- Real-world Dynamic Programming: Seam Carving — nifty explanation of using dynamic programming (which has a reputation as a technique you learn in school, then only use to pass interviews at software companies) to implement intelligent image resizing.
- How to Facilitate Q&As (Eve Tuck) — People don’t always bring their best selves to the Q&A—people can act out their own discomfort about the approach or the topic of the talk. We need to do better. I believe in heavily mediated Q&A sessions.
- Project Oak — a specification and a reference implementation for the secure transfer, storage, and processing of data in distributed systems. From Google.
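The dynamic programming step behind seam carving, from the link above, can be sketched in a few lines. This is an illustrative version, not the linked article's code: it assumes the per-pixel energy grid has already been computed, and the function name is hypothetical.

```python
def min_vertical_seam(energy):
    """Return one column index per row for the lowest-energy vertical seam.

    `energy` is a non-empty list of equal-length rows of numbers.
    """
    rows, cols = len(energy), len(energy[0])

    # cost[r][c] = minimum total energy of any seam ending at (r, c).
    cost = [row[:] for row in energy]
    for r in range(1, rows):
        for c in range(cols):
            lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
            cost[r][c] += min(cost[r - 1][lo:hi + 1])

    # Backtrack from the cheapest cell in the bottom row.
    c = min(range(cols), key=lambda j: cost[-1][j])
    seam = [c]
    for r in range(rows - 1, 0, -1):
        lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
        c = min(range(lo, hi + 1), key=lambda j: cost[r - 1][j])
        seam.append(c)
    seam.reverse()
    return seam


if __name__ == "__main__":
    grid = [
        [1, 4, 3],
        [5, 2, 6],
        [7, 8, 1],
    ]
    print(min_vertical_seam(grid))
```

Removing the returned pixel from each row narrows the image by one column; repeating the process gives content-aware resizing.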
Four short links: 27 June 2019
Security Mnemonics, Evidence Might Work, Misinformation Inoculation, and Spoofing Presidential Alerts
- STRIDE — mnemonic for remembering the different types of threats: Spoofing of user identity; Tampering; Repudiation; Information disclosure (privacy breach or data leak); Denial of service (D.o.S); Elevation of privilege. Use when you’re asking yourself, “what could possibly go wrong?” There’s probably a parallel “how things can be misused” mnemonic like Nazis, Anti-Vaxx, Spam, Threats, and Your Ex Follows You.
- Backfire Effect is Mostly a Myth (Nieman Lab) — some evidence that giving people evidence that shows they’re wrong can change their mind. Perhaps you no longer have to be careful to whom you show this story. Full Fact research manager Amy Sippett reviewed seven studies that have explored the backfire effect and found that “cases where backfire effects were found tended to be particularly contentious topics, or where the factual claim being asked about was ambiguous.” The studies where a backfire effect was not found also tended to be larger than the studies where it was found. Full Fact cautions that most of the research on the backfire effect has been done in the U.S., and “we still need more evidence to understand how fact-checking content can be most effective.”
- Bad News — a browser game by Cambridge University researchers that seems to inoculate users against misinformation. We conducted a large-scale evaluation of the game with N = 15,000 participants in a pre-post gameplay design. We provide initial evidence that people’s ability to spot and resist misinformation improves after gameplay, irrespective of education, age, political ideology, and cognitive style. (via Cambridge University)
- Spoofing Presidential Alerts — Their research showed that four low cost USRP or bladeRF TX capable software defined radios (SDR) with 1 watt output power each, combined with open source LTE base station software could be used to send a fake Presidential Alert to a stadium of 50,000 people (note that this was only simulated—real-world tests were performed responsibly in a controlled environment). The attack works by creating a fake and malicious LTE cell tower on the SDR that nearby cell phones connect to. Once connected an alert can easily be crafted and sent to all connected phones. There is no way to verify that an alert is legitimate. The article itself is paywalled, though Sci-Hub knows how to reach it.
Four short links: 26 June 2019
Ethics and OKRs, Rewriting Binaries, Diversity of Implementation, and Uber's Metrics Systems
- Ethical Principles and OKRs — Your KPIs can’t conflict with your principles if you don’t have principles. (So start by defining your principles; then consider your principles before optimizing a KPI; monitor user experience to see if you’re compromising your principles; and repeat) (via Peter Skomoroch)
- retrowrite — Retrofitting compiler passes through binary rewriting. Paper. The ideal solution for binary security analysis would be a static rewriter that can intelligently add the required instrumentation as if it were inserted at compile time. Such instrumentation requires an analysis to statically disambiguate between references and scalars, a problem known to be undecidable in the general case. We show that recovering this information is possible in practice for the most common class of software and libraries: 64-bit, position independent code (via Mathias Payer)
- Re: A libc in LLVM — very thoughtful post from a libc maintainer about the risks if Google implements an LLVM libc. Avoiding monoculture preserves the motivation for consensus-based standards processes rather than single-party control (see also: Chrome and what it’s done to the web) and the motivation for people writing software to write to the standards rather than to a particular implementation.
- M3 and M3DB — M3, a metrics platform, and M3DB, a distributed time series database, were developed at Uber out of necessity. After using what was available as open source and finding we were unable to use them at our scale due to issues with their reliability, cost and operationally intensive nature we built our own metrics platform piece by piece. We used our experience to help us build a native distributed time series database, a highly dynamic and performant aggregation service, query engine and other supporting infrastructure.
Four short links: 25 June 2019
Analog Deep Learning, Low-Trust Internet, Media Literacy, and Psych Experiments
- The Next Generation of Deep Learning: Analog Computing (IEEE) — Further progress in compute efficiency for deep learning training can be made by exploiting the more random and approximate nature of deep learning work flows. In the digital space that means to trade off numerical precision for accuracy at the benefit of compute efficiency. It also opens the possibility to revisit analog computing, which is intrinsically noisy, to execute the matrix operations for deep learning in constant time on arrays of nonvolatile memories. (Paywalled paper)
- The Internet is Increasingly a Low-Trust Society (Wired) — Zeynep Tufekci nails it. Social scientists distinguish high-trust societies (ones where you can expect most interactions to work) from low-trust societies (ones where you have to be on your guard at all times). People break rules in high-trust societies, of course, but laws, regulations, and norms help to keep most abuses in check; if you have to go to court, you expect a reasonable process. In low-trust societies, you never know. You expect to be cheated, often without recourse. You expect things not to be what they seem and for promises to be broken, and you don’t expect a reasonable and transparent process for recourse. It’s harder for markets to function and economies to develop in low-trust societies. It’s harder to find or extend credit, and it’s risky to pay in advance.
- Internet Awesome — Google’s media literacy materials. Be Internet Awesome is like an instruction manual for making smart decisions online. Kids today need a guide to the internet and media just as they need instruction on other topics. We need help teaching them about credible sources, the power of words and images and more importantly, how to be smart and savvy when seeing different media while browsing the web. All of these resources are not only available for classrooms, but also free and easily accessible for families as well. They’re in both English and in Spanish, along with eight other languages. (via Google Blog)
- PsyToolkit — create and run cognitive psychological experiments in your browser.
Four short links: 24 June 2019
Wacky Timestamps, Computers and Spies, Surveillance Capitalism, and Twitter Adventures
- NTFS Timestamps — a 64-bit value representing the number of 100-nanosecond intervals since January 1, 1601 (UTC). WTAF?
- Computers Changed Spycraft (Foreign Policy) — so much has changed, e.g., dead letter drops: It is easy for Russian counterintelligence to track the movements of every mobile phone in Moscow, so if the Canadian is carrying her device, observers can match her movements with any location that looks like a potential site for a dead drop. They could then look at any other phone signal that pings in the same location in the same time window. If the visitor turns out to be a Russian government official, he or she will have some explaining to do.
- Netflix Records All of your Bandersnatch Choices, GDPR Request Reveals (Verge) — that’s some next-level meta.
- Being Beyoncé’s Assistant for the Day (Twitter) — a choose-your-own-adventure implemented in Twitter. GENIUS!
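The NTFS timestamp format in the first link above (a count of 100-nanosecond intervals since January 1, 1601 UTC) converts to ordinary datetimes with a little arithmetic. A minimal sketch, with hypothetical function names; note that Python's `datetime` only resolves microseconds, so the last decimal digit of the tick count is dropped on the way in.

```python
from datetime import datetime, timedelta, timezone

# Epoch stated in the blurb above.
NTFS_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)
TICKS_PER_SECOND = 10_000_000  # one tick = 100 nanoseconds


def ntfs_to_datetime(ticks: int) -> datetime:
    """Convert an NTFS tick count to an aware UTC datetime."""
    # 10 ticks per microsecond; sub-microsecond precision is truncated.
    return NTFS_EPOCH + timedelta(microseconds=ticks // 10)


def datetime_to_ntfs(dt: datetime) -> int:
    """Convert an aware datetime to an NTFS tick count."""
    delta = dt - NTFS_EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * TICKS_PER_SECOND + delta.microseconds * 10


if __name__ == "__main__":
    unix_epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    ticks = datetime_to_ntfs(unix_epoch)
    print(ticks)
    print(ntfs_to_datetime(ticks).isoformat())
```

Integer arithmetic is used for the datetime-to-ticks direction to avoid floating-point rounding on values this large.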
Four short links: 21 June 2019
Private Computation, Robot Framework, 3D Objects, and Self-Supervised Learning
- Private Join and Compute (Google) — This functionality allows two users, each holding an input file, to privately compute the sum of associated values for records that have common identifiers. (via Wired)
- PyRobot — from CMU and Facebook. PyRobot is a framework and ecosystem that enables AI researchers and students to get up and running with a robot in just a few hours, without specialized knowledge of the hardware or of details such as device drivers, control, and planning.
- PartNet — a consistent, large-scale dataset of 3D objects annotated with fine-grained, instance-level, and hierarchical 3D part information. Our dataset consists of 573,585 part instances over 26,671 3D models covering 24 object categories. This dataset enables and serves as a catalyst for many tasks such as shape analysis, dynamic 3D scene modeling and simulation, affordance analysis, and others. (via IEEE Spectrum)
- Self-Supervised Learning (Andrew Zisserman) — 122 slides, very readable, about learning from images, from video, and from video with sound.
Four short links: 20 June 2019
Model Governance, Content Moderators, Interactive Fiction, and End-User Probabilistic Programming
- Model Governance and Model Operations — models built or tuned for specific applications (in reality, this means models + data) will need to be managed and protected.
- Bodies in Seats — the story of Facebook’s 30,000 content moderators: contractors, low pay (as little as $28,800 a year), and lots of PTSD for everyone. “Nobody’s prepared to see a little girl have her organs taken out while she’s still alive and screaming.” Moderators were told they had to watch at least 15 to 30 seconds of each video.
- Dialog — a domain-specific language for creating works of interactive fiction. Inspired by Inform and Prolog, they say.
- End-User Probabilistic Programming — We examine the sources of uncertainty actually encountered by spreadsheet users, and their coping mechanisms, via an interview study. We examine spreadsheet-based interfaces and technology to help reason under uncertainty, via probabilistic and other means. We show how uncertain values can propagate uncertainty through spreadsheets, and how sheet-defined functions can be applied to handle uncertainty. Hence, we draw conclusions about the promise and limitations of probabilistic programming for end-users.
Four short links: 19 June 2019
Voice2Face, DIY Minivac, Cloud Metrics, and Envoy for Mobile
- Speech2Face: Learning the Face Behind a Voice — complete with an interesting ethics discussion up-front. I wonder where this was intended to go: after all, it can’t perfectly reconstruct faces, so what you get is a stereotype based on the voice. Meh.
- Minivac 601 Replica (Instructables) — Created by information theory pioneer Claude Shannon as an educational toy for teaching digital circuits, the Minivac 601 Digital Computer Kit was billed as an electromechanical digital computer system.
- Nines Are Not Enough: Meaningful Metrics for Clouds — We show that this problem shares some similarities with the challenges of applying statistics to make decisions based on sampled data. We also suggest that defining guarantees in terms of defense against threats, rather than guarantees for application-visible outcomes, can reduce the complexity of these problems.
- Announcing Envoy Mobile (Lyft Engineering) — as Simon Willison said: Lyft’s Envoy proxy / service mesh has been widely adopted across the industry as a server-side component for adding smart routing and observability to the network calls made between services in microservice architectures. “The reality is that three 9s at the server-side edge is meaningless if the user of a mobile application is only able to complete the desired product flows a fraction of the time”—so Lyft are building a C++ embedded library companion to Envoy which is designed to be shipped as part of iOS and Android client applications. “Envoy Mobile in conjunction with Envoy in the data center will provide the ability to reason about the entire distributed system network, not just the server-side portion.” Their decision to release an early working prototype and then conduct ongoing development entirely in the open is interesting too.
Four short links: 18 June 2019
JavaScript Spreadsheets, Pessimism, Privacy Policies, and AI Ethics
- jExcel — a lightweight vanilla JavaScript plugin to create amazing web-based interactive tables and spreadsheets compatible with Excel or any other spreadsheet software. You can create an online spreadsheet table from a JS array, JSON, CSV, or XLSX files. You can copy from Excel and paste straight to your jExcel spreadsheet and vice versa. It is very easy to integrate any third-party JavaScript plugins to create your own custom columns, custom editors, and customize any feature into your application.
- Why Are We So Pessimistic? (Brookings) — The belief or perception that things are much worse than they really are is widespread, and I believe it comes with significant detrimental impacts on societies.
- We Read 150 Privacy Policies. They Were an Incomprehensible Disaster (NYT) — Only Immanuel Kant’s famously difficult “Critique of Pure Reason” registers a more challenging readability score than Facebook’s privacy policy.
- Perspectives and Approaches in AI Ethics: East Asia — Each country’s perspectives on and approaches to AI and robots on the tool-partner spectrum are evaluated by examining its policy, academic thought, local practices, and popular culture. This analysis places South Korea in the tool range, China in the middle of the spectrum, and Japan in the partner range.
Four short links: 17 June 2019
Multiverse Databases, Detecting Photoshopping, Simulation Platform, and Tail-Call Optimization: The Musical
- Towards Multiverse Databases (Morning Paper) — The central idea behind multiverse databases is to push the data access and privacy rules into the database itself. The database takes on responsibility for authorization and transformation, and the application retains responsibility only for authentication and correct delegation of the authenticated principal on a database call. Such a design rules out an entire class of application errors, protecting private data from accidentally leaking.
- Detecting Photoshopped Fakes (Verge) — Adobe worked with Berkeley researchers to develop software that can spot Photoshopping in an image. (via BoingBoing).
- Open Sourcing AI Habitat (Facebook) — a new simulation platform created by Facebook AI that’s designed to train embodied agents (such as virtual robots) in photo-realistic 3D environments. […] To illustrate the benefits of this new platform, we’re also sharing Replica, a data set of hyperrealistic 3D reconstructions of a staged apartment, retail store, and other indoor spaces.
- Tail-Call Optimization: The Musical (YouTube) — you’re welcome.
Four short links: 14 June 2019
Information Operations, Game Creator, History Lessons, and Physical Pen Testing
- Information Operations on Twitter: Principles, Process, and Disclosure (Twitter) — We believe that people and organizations with the advantages of institutional power and which consciously abuse our service are not advancing healthy discourse but are actively working to undermine it. By making this data open and accessible, we seek to empower researchers, journalists, governments, and members of the public to deepen their understanding of critical issues impacting the integrity of public conversation online, particularly around elections. This transparency is core to our mission. Twitter is leading in this area; it’s great to see. I hope this makes others lift their game.
- Create 3D Games with Friends, No Experience Required (Google) — Our prototype is called Game Builder, and it is free on Steam for PC and Mac.
- Five Lessons from History — all are relevant to business as well as to wider politics: People suffering from sudden, unexpected hardship are likely to adopt views they previously thought unthinkable. Reversion to the mean occurs because people persuasive enough to make something grow don’t have the kind of personalities that allow them to stop before pushing too far. Unsustainable things can last longer than you anticipate. Progress happens too slowly for people to notice; setbacks happen too fast for people to ignore. Wounds heal; scars last.
- I’ll Let Myself In: Tactics of Physical Pen Testers (YouTube) — As head of a Physical Penetration team, however, my deliverable day tends to be quite different. With faces agog, executives routinely watch me describe (or show video) of their doors and cabinets popping open in seconds. This presentation will highlight some of the most exciting and shocking methods by which my team and I routinely let ourselves in on physical jobs.