Four Short Links
Nat Torkington’s eclectic collection of curated links.
Four short links: 3 July 2020
Differential Privacy, Engineering Resumes, Evil C, and Next Web
- Open Differential Privacy — Open source software from Microsoft and Harvard. (via Microsoft’s announcement).
- Engineering Resumes — to help those of you looking for a new job in these uncertain times, here are some examples of what accomplishments look like for software engineers. These are oriented towards individual contributors (perhaps I’ll do an engineering managers version next).
- Evil C — A 29-byte source file that takes 27m to produce a 16GB executable.
- Platform Adjacency Theory — (Alex Russell) the web thrives or declines to the extent it can accomplish the lion’s share of the things we expect most computers to do. […] Growing a platform’s success requires unlocking use-cases not already serviced. That means finding needs that combine things your platform is already good at with a small number of missing capabilities. An interesting essay arguing that Apple and Mozilla are underinvesting in web feature development and thus threatening the web metaplatform.
Four short links: 2 July 2020
Data Viz, Fixing Bugs, Hardware, and SaaS Domains
- Sweetviz — an open source Python library that generates beautiful, high-density visualizations to kickstart EDA (Exploratory Data Analysis) with a single line of code. Output is a fully self-contained HTML application. (via Mike Loukides)
- DrRepair — Code from this paper, which tackles learning to repair programs from diagnostic feedback (e.g., compiler error messages).
- Lightning Cables — Here’s my little article about (almost) everything I know about Apple Lightning and related technologies: Tristar, Hydra, HiFive, SDQ, IDBUS and etc. “Little”. Dang, cables are complex these days.
- Domain Structure for SaaS Products — Opinions (with background and context that’s informative even if you disagree with the opinions) on the right use of paths and subdomains to separate marketing and product websites, and to separate customers on the product.
Four short links: 1 July 2020
Python in VS Code, Product Lessons, Voice Data, and Timeshared Robots
- PyLance — Python language server for VS Code that brings type information, auto-imports, type-checking, and multi-root workspace support to Visual Studio Code.
- 50 Short Product Lessons — A set of short thoughts from John Cutler on different elements of product management. (via Twitter).
- Mozilla Updates its Voice Dataset — 54 languages, 7,226 total hours of contributed voice data, 5,591 hours verified. New is a single word target segment: digits, plus “yes”, “no”, “hey”, and “Firefox”.
- Timeshared Robots — Remote access to a robot, and a dev environment, so you can control it. Designed to make parameter tuning very easy.
Four short links: 30 June 2020
Durable Teams, Big Tech, Photos to Cartoons, and Deep Chernoff Faces
- Durable Teams — Rael Dornfest called this “project mindset vs product mindset”. If you’re in a project mindset, you spin up and wind down teams and codebases. If you’re in a product mindset, you have code that lives forever so you need a team to stick with it. Unmaintained code is a security and operational risk, to paraphrase this article.
- Breaking Up Big Tech — I’m collecting these stated issues because there are a lot of suggested courses of action: digital taxes; breaking companies up; preventing companies from selling products in their own marketplaces, and so on and so forth. But what I haven’t seen when these suggestions are made is: A statement of which issue is to be addressed; An analysis of the issue, breaking it down into underlying causes; A hypothesis of how the proposed remedy will be effective; Consequence scanning for undesirable side-effects.
- Learning to Cartoonize Using White-box Cartoon Representations — Paper and code for making “cartoons” (not comic-book style cartoons, but images that look like they were drawn with a sketching app and a drawing tablet) from photos.
- Deep Chernoff Faces — Using a GAN to create Chernoff face visualisations. (This creates visualisations as unsuccessful as all the previous Chernoff face attempts, but it’s still … well … cool!)
Four short links: 29 June 2020
Research, SSH Keys, Breaking Up Google, and Indie VC
- Why Does DARPA Work? — Absolutely the best thing you’ll read this month. A very lucid essay on what makes DARPA work. Resonates with everything else I’ve read and what I’ve heard from program managers.
- Secretive — open-source app for storing and managing SSH keys in the Secure Enclave.
- Break Up Google — Following on from his post about Amazon, Tim Bray has a cogent summary of the strong arguments for breaking up Google. For many years, the astonishing torrent of money thrown off by Google’s Web-search monopoly has fueled invasions of multiple other segments, enabling Google to bat aside rivals who might have brought better experiences to billions of lives.
- More Harm Than Good — Tim O’Reilly talks about VC and the indie.vc approach that he and Bryce are taking now. I’m glad to hear the model getting the love it deserves. It was a bit of a struggle when we did fund four, which was focused on [this newer model]. It was about a third of the size of fund three. But for fund five, the fundraising is [going] like gangbusters. Everybody wants in because the model has proven itself.
Four short links: 28 June 2020
Recreating Painting, NLP Deep Learning, Microcopy, and Apple Chip Performance
- timecraft — synthesizing time lapse videos depicting the creation of paintings.
- Natural Language Processing Advancements By Deep Learning: A Survey — This survey categorizes and addresses the different aspects and applications of NLP that have benefited from deep learning. It covers core NLP tasks and applications and describes how deep learning methods and models advance these areas. We further analyze and compare different approaches and state-of-the-art models.
- How to Write Great Microcopy — Be clear, concise, and useful; Use consistent wording; Create a microcopy framework; Be conversational; Use humor and idioms carefully; Highlight your brand’s character; Be wary of word translations; (Almost) always use the active voice; Use the passive voice (sometimes); Provide context; Assume your user is smart; Keep it scannable; Write short paragraphs and sentences; Don’t overuse contractions; and many more short, digestible (and illustrated) bits of advice.
- Apple Chip Performance — Every time Apple comes out with an application processor, you get more details in terms of area transistors in performance than you get anyplace else. And I’ve just charted what happens with Apple’s chips every year. For the last 10 years, the performance, power, density — all of this has been increasing directly on a Moore’s Law pace. And I believe cost per transistor continues to go down. Now we have been hit by one really nasty effect, which is Dennard scaling is dead. So all the power gains don’t come just by shrinking devices. They have to come from materials science, with new types of transistors and new architectural approaches. And all of those prove you can still achieve these things.
Four short links: 25 June 2020
Art Software, Facial Recognition, No Code, and Firmware Update
- Krita — a professional FREE and open source painting program. Made by and for artists, rather than attempting to clone Photoshop 4.
- Facial Recognition Leads to False Arrest — Civil rights experts say Williams is the first documented example in the U.S. of someone being wrongfully arrested based on a false hit produced by facial recognition technology.
- Amazon Honeycode — This new fully-managed AWS service gives you the power to build powerful mobile & web applications without writing any code. It uses the familiar spreadsheet model and lets you get started in minutes. This is important because there’s a whole category of app that now doesn’t require developer time but also … spare a feel for AirTable. Early mover leads aren’t defensible when the FAANGs decide they can build what you’ve got. Companies with billions of profits to reinvest in competing with your startup are apex predators.
- Device Firmware Update Cookbook — Implementing OTA (Over The Air) firmware updates is a rite of passage for firmware engineers. […] I have worked on multiple firmware update systems over the years, and every time I have learned something new. How do I package my images? How do I make sure I don’t brick the device? How do I share information between my bootloader and my application? […] In this post, I share the device firmware update architecture I would implement knowing everything I know now. I also highlight a few design patterns that are particularly useful.
Four short links: 24 June 2020
CDA, WireViz, Geospatial, and Photorealistic Upscaling
- You’ve Been Referred Here Because You’re Wrong About Section 230 Of The Communications Decency Act — Really good mythbusting piece about Section 230 of the Communications Decency Act (the piece of legislation that places the liability for online content upon whoever created the content, not on whoever is hosting it).
- WireViz — a tool for easily documenting cables, wiring harnesses and connector pinouts. It takes plain text, YAML-formatted files as input and produces beautiful graphical output (SVG, PNG, …) thanks to GraphViz. It handles automatic BOM (Bill of Materials) creation and has a lot of extra features.
- 10 Opinions about the Geospatial Industry — The most successful and ambitious mapping project of all time, Google Maps, is an advertising platform. There is no “geospatial industry,” only industries with spatial problems. It follows, then, that the most valuable geospatial applications are always custom-built in service of a particular domain that doesn’t self-identify as “geospatial.” The closest you can get to a “geospatial app” is a UI on top of a dev tool. And much more. So good.
- Clear Statement of Why Photorealistic Upscaling is Bogus — ML-assisted upscaling doesn’t produce output that accurately represents the original full-resolution image. It produces output that humans perceive to be realistic-looking, or free of traditional upscaling artifacts. […] The problem with these ML hallucinated upscaled images is that they look and feel real enough that they bypass people’s suspicions. We can try to present them as “Here’s what the suspect might look like”, but when they look like a full-resolution photograph, people will simply assume that it’s exactly what the suspect looks like.
Four short links: 23 June 2020
WFH Security, Face Super-Resolution, A/B Street, and Covid Tokens
- Work From Home Cybersecurity Course — Secure your home Wi-Fi; Strong passwords; VPNs; Protecting confidential information; Personal devices; Phishing. It’s quite basic, but I imagine there are many companies we know (not the ones we’re in, of course!) who need this basic info.
- There Is No (Real World) Use Case for Face Super Resolution — Nothing good ever comes from face datasets. Such a powerful point.
- A/B Street — A/B Street is a game exploring how small changes to a city affect the movement of drivers, cyclists, transit users, and pedestrians. I’m a huge fan of simulations as a way of developing intuition for a field.
- On Contact Tracing and Hardware Tokens — Singapore recently gave a bunch of security people access to their TraceTogether token. Bunnie’s write-up is very good because it contains a discussion of the scenarios of contact tracing, against which any solution must be evaluated. See also Roland Turner and Sean Cross’s reports.
Four short links: 22 June 2020
Root Causes, Humor, Bluetooth Latency, and Computing with Vision
- Root Causes — here are a set of “root causes” that I think are close to exhaustive: (1) trade-off: we were aware of this concern but explicitly made the speed-vs-quality trade-off (IE, not adding tests for an experiment). This was tech debt coming back to bite us. (2) knowledge gap: the person doing the work was not aware that this kind of error was even possible (IE, tricky race conditions, worker starvation). (3) brain fart: now that we look at it, we should have caught this earlier. “Just didn’t get enough sleep that night” kind of thing.
- Software Engineer D&D Classes — Special ability: Hotfix. All party members immediately gain 1d4 + 1 HP and any damaged equipment is instantly repaired, but you must skip your next turn.
- Bluetooth Latency — Bluetooth headsets introduce 150-300ms of latency. (via Ben Kuhn)
- Computing with Vision — a research program with the goal of devising ways of converting digital logic circuits into visual stimuli – “visual circuits” – which, when presented to the eye, “tricks” the visual system into carrying out the digital logic computation and generating a perception that amounts to the “output” of the computation. That is, the technique amounts to turning our visual system into a programmable computer.
Four short links: 19 June 2020
Security Scanner, Quoteback, Online Events, and Proxy Scrapers
- Tsunami — a general purpose network security scanner with an extensible plugin system for detecting high severity vulnerabilities with high confidence. From Google.
- Quotebacks — like a quote retweet, but for any piece of content on the web. They work on any webpage, and gracefully fall back to a standard blockquote. (via Matt Webb)
- Conferences in the Age of Zoom — (Matt Webb) Can virtual conferences be designed for multi-tasking? (See also Marie Foulston’s spreadsheet party, which I found via Matt’s post.)
- Scrapoxy — hides your webscraper behind a cloud. It starts a pool of proxies to relay your requests. Now, you can crawl without thinking about blacklisting!
Four short links: 17 June 2020
Drive and Listen, Veblenian Entrepreneurship, Gestural UI, and Loglo
- Drive and Listen — Video of driving through a city, and you can flip through the local radio stations. Quite an impressive sense of place. (As I write this, I’m bicycling/scootering through Wuhan.)
- Veblenian Entrepreneurship — Veblenian Entrepreneurship. This is entrepreneurship pursued primarily as a form of conspicuous consumption. Aside from lowering average entrepreneurial quality, Veblenian Entrepreneurship has a range of (short-run) positive and (medium and long-run) negative effects for both individuals and society at large. We argue that the rise of the Veblenian Entrepreneur has contributed to creating an increasingly Untrepreneurial Economy. That is an economy which superficially appears innovation-driven and dynamic, but is actually rife with inefficiencies and unable to generate economically meaningful growth through innovation.
- Earbud Gestural UI — We propose EarBuddy, a real-time system that leverages the microphone in commercial wireless earbuds to detect tapping and sliding gestures near the face and ears.
- Loglo — “LOGO for the Glowforge”: an experimental — very experimental — programming environment by Avi Bryant. It’s currently focused on the narrow domain of producing SVG output to feed to a CNC machine or laser cutter. A really cute cross between Postscript and a spreadsheet.
Four short links: 16 June 2020
Totalitarian Software, Protobuf, Quantum Computing, and New Databases
- The Global Implications of “Re-education” Technologies in Northwest China — A great summary of the way that technology facilitates China’s Muslim “re-education” system in Northwest China. It feels like the 21C version of IBM helping the Nazis.
- Buf — A project aiming to make protobuf easier to use than JSON, by adding linters, breaking change detector, editor integration, and ultimately a schema library.
- Silq — a new high-level programming language for quantum computing with a strong static type system, developed at ETH Zürich. “High-level” is a relative term.
- Recent Database Tech — This is Part 1 of 2. It covers TileDB (multidimensional arrays); Materialize (SQL views on streaming data); and Prisma (a data layer that abstracts away the db layer, compatible with PostgreSQL, MySQL, and SQLite).
Four short links: 15 June 2020
Social Skills, Programming Languages, GDPR for Developers, and Scanning Microscope
- Team Players: How Social Skills Improve Group Performance — Some people consistently cause their group to exceed its predicted performance. We call these individuals “team players”. Team players score significantly higher on a well-established measure of social intelligence, but do not differ across a variety of other dimensions, including IQ, personality, education and gender. Social skills – defined as a single latent factor that combines social intelligence scores with the team player effect – improve group performance about as much as IQ. We find suggestive evidence that team players increase effort among teammates.
- PLDI 2020 Proceedings — Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation.
- CNIL’s GDPR Guide for Developers — 17 pages, one per major topic: Develop in compliance with the GDPR; Identify personal data; Prepare your development; Secure your development environment; Manage your source code; Make an informed choice of architecture; Secure your websites, applications and servers; Minimize the data collection; Manage user profiles; Control your libraries and SDKs; Ensure quality of the code and its documentation; Test your applications; Inform users; Prepare for the exercise of people’s rights; Define a data retention period; Take into account the legal basis in the technical implementation; Use analytics on your websites and applications.
- 3D Scanning Microscope for $250 — Using a single element, but a stepping platform and stitching software, you can take super-hi-res close-up photos of objects. (Kickstarter)
Four short links: 12 June 2020
Robots, Source Hacking, Anonymous Camera, and Apple in China
- OpenSHC — Syropod High-level Controller (SHC) is a versatile controller capable of generating body poses and gaits for quasi-static multilegged robots. It is implemented as a C++ ROS package that can be easily deployed on legged robots with different sensor, leg and joint configurations.
- Source Hacking — In this report, we identify four specific techniques of source hacking: 1. Viral Sloganeering: repackaging reactionary talking points for social media and press amplification; 2. Leak Forgery: prompting a media spectacle by sharing forged documents; 3. Evidence Collages: compiling information from multiple sources into a single, shareable document, usually as an image; 4. Keyword Squatting: the strategic domination of keywords and sockpuppet accounts to misrepresent groups or individuals. These four tactics of source hacking work.
- Anonymous Camera — Real-time removal of faces, voices, etc. to maintain privacy in video.
- Apple’s Success in China — A long article looking at the history of Apple in China, successes and failures. Part 1 introduces the essay series. Part 2 explains Apple’s product-zeitgeist fit in China. Part 3 looks at product localization. Part 4 looks at Apple’s services in China and relationship with Tencent. Part 5 looks at the complexities of operating in China. Part 6 and Part 7 look at Apple’s compliance efforts in respect of the App Store and iCloud respectively. Part 8 looks at Apple’s investment in DiDi. Part 9 concludes with lessons from Apple’s experience in China.
Four short links: 11 June 2020
Automation, Future, Table Library, and Procedurally-Generated Landscapes
- Testing the Automation Revolution Hypothesis — 25 simple job features explain over half the variance in which jobs are how automated. The strongest job automation predictor is: Pace Determined By Speed Of Equipment. Which job features predict job automation did not change from 1999 to 2019. Jobs that get more automated do not on average change in pay or employment. Labor markets change more often due to changes in demand, relative to supply.
- How to Plan for the 21st Century — So, while you may hope for a return to normal, and plan for that as one of your scenarios, it is worth taking the time to think through what you might do were the world we knew to be swept away as surely as the 19th century certainties were swept away by the events of the early 20th century. There will be a temptation to have a “new normal” that is really a minor adjustment of the old. It does not have to be that way.
- Regular Table — A Javascript library for the browser, regular-table exports a custom element named <regular-table>, which renders a regular HTML <table> to a sticky position within a scrollable viewport. Only visible cells are rendered and queried from a natively async virtual data model, making regular-table ideal for enormous or remote data sets. Use it to build Data Grids, Spreadsheets, Pivot Tables, File Trees, ….
- Procedurally-Generated Chinese Landscapes — Procedurally-generated vector-format infinitely-scrolling Chinese landscape for the browser. Open source.
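The Regular Table entry above rests on one idea: render only the visible cells and fetch them from an asynchronous virtual data model. Here is a minimal sketch of that idea in Python, not the library's own JavaScript API. Every name in it (`data_listener`, `render_viewport`, `NUM_ROWS`, `NUM_COLS`) is hypothetical, and the "remote" data set is simulated by computing cell values on demand.

```python
import asyncio

# Hypothetical data set: 1,000,000 rows x 50 columns, computed on demand
# so that nothing is materialized up front.
NUM_ROWS, NUM_COLS = 1_000_000, 50


async def data_listener(x0, y0, x1, y1):
    """Return only the cells inside the requested viewport window.

    Columns run from x0 (inclusive) to x1 (exclusive), rows from y0 to y1.
    The await stands in for a network or database round trip.
    """
    await asyncio.sleep(0)
    # Clamp the window to the bounds of the data set.
    x1, y1 = min(x1, NUM_COLS), min(y1, NUM_ROWS)
    return [[f"r{y}c{x}" for x in range(x0, x1)] for y in range(y0, y1)]


async def render_viewport(scroll_row, scroll_col, visible_rows=3, visible_cols=4):
    """Fetch and 'draw' just the visible cells for the current scroll offset."""
    cells = await data_listener(
        scroll_col, scroll_row, scroll_col + visible_cols, scroll_row + visible_rows
    )
    return "\n".join(" | ".join(row) for row in cells)


if __name__ == "__main__":
    # Jump straight to the middle of the table; only 12 cells are ever built.
    print(asyncio.run(render_viewport(scroll_row=500_000, scroll_col=10)))
```

The cost of a render depends on the viewport size rather than the size of the data set, which is why this approach suits enormous or remote data.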
Four short links: 10 June 2020
CapRover, Embedded Programming, Text Summarising, and Microservices
- CapRover — an extremely easy to use app/database deployment & web server manager for your NodeJS, Python, PHP, ASP.NET, Ruby, MySQL, MongoDB, Postgres, WordPress (and etc…) applications.
- Gravity — a powerful, dynamically typed, lightweight, embeddable programming language written in C without any external dependencies (except for stdlib). It is a class-based concurrent scripting language with a modern Swift like syntax.
- PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization — We evaluated our best PEGASUS model on 12 downstream summarization tasks spanning news, science, stories, instructions, emails, patents, and legislative bills. Experiments demonstrate it achieves state-of-the-art performance on all 12 downstream datasets measured by ROUGE scores. Our model also shows surprising performance on low-resource summarization, surpassing previous state-of-the-art results on 6 datasets with only 1000 examples. Finally we validated our results using human evaluation and show that our model summaries achieve human performance on multiple datasets. (Code)
- The Seven Deceptions of Microservices — “Lies” implies deliberate intent to deceive, but whether intentional or not, these are the false ideas about microservices (according to the author): Separation of concerns across services reduces complexity; Microservices increase development speed; It’s safer to deploy small services than an entire app; It is often advantageous to scale services independently; Microservice architectures are more performant; Managing multiple services won’t be hard; Microservices will work if you design them carefully from the ground up.
Four short links: 9 June 2020
Monopolies, Internet Voting, Trends, and Translating Programming Languages
- Anti-Monopoly Thinking — Tim Bray reviews “The Myth of Capitalism: Monopolies and the Death of Competition” by Jonathan Tepper and Denise Hearn. The quote that caught my eye:
“X companies control Y% of the US market in Z: X=2, Y=90, Z=beer; X=4, Y=almost all, Z=airlines; X=5, Y=50, Z=banks; X=2, Y=90, Z=health insurers in many states; X=1, Y=75%, Z=fast Internet, most places in the US; X=3, Y=70, Z=pesticides; X=3, Y=80, Z=seed corn.”
- Eugene Spafford on Internet Voting — Really one of the goals of an election should be that whoever loses in an election can look at what happened and acknowledge it was a fair loss. For the general population, if your candidate lost and if a majority of people are able to examine the methodology, they can go, “OK, it was fair. We didn’t have the votes.” That’s really the goal. The winner is always going to say, “Yeah, this is right.” I’d not thought of it this way, but obviously yes: we vote so we don’t have bloody revolutions, but voting without credibility will still get us bloody revolutions.
- Radar Trends to Watch: June 2020 — Interesting to see new languages coming up at regular intervals with strengths in particular areas.
- Unsupervised Translation of Programming Languages — We train our model on source code from open source GitHub projects, and show that it can translate functions between C++, Java, and Python with high accuracy. Our method relies exclusively on monolingual source code, requires no expertise in the source or target languages, and can easily be generalized to other programming languages. We also build and release a test set composed of 852 parallel functions, along with unit tests to check the correctness of translations. We show that our model outperforms rule-based commercial baselines by a significant margin.