Four Short Links

Nat Torkington’s eclectic collection of curated links.

Four short links: 2 January 2020

Voice Assistant, Public Domain, Bing Disinformation, and Knowledge Bases

By Nat Torkington
  1. Rhasspy: an open source, fully offline voice assistant toolkit for many languages that works well with Home Assistant, Hass.io, and Node-RED.
  2. Public Domain Day 2020 — Forster’s “A Passage to India,” Gershwin’s “Rhapsody in Blue,” and the first film adaptation of Peter Pan are amongst the works entering the public domain in the US.
  3. Bing’s Top Search Results Contain an Alarming Amount of Disinformation: In general, Bing returns disinformation and misinformation at a significantly higher rate than Google does. In general, Bing directs users to conspiracy-related content, even if they aren’t explicitly looking for it. Bing shows users Russian propaganda at a much higher rate than Google does. Bing places student-essay sites—sites where students post or sell past papers—in its top 50 results for certain queries. Bing dredges up gratuitous white-supremacist content in response to unrelated queries.
  4. Outline: wiki and knowledge base for growing teams. Beautiful, feature rich, markdown compatible, and open source.

Four short links: 1 January 2020

Think Like a Programmer, Do Good Deeds, Command-line Trello-like Tool, and Advice for a New Executive

By Nat Torkington
  1. Seven Ways to Think Like a Programmer: 1. It’s all just data. 2. Data doesn’t mean anything on its own—it has to be interpreted. 3. Programming is about creating and composing abstractions. 4. Models are for computers, and views are for people. 5. Paranoia makes us productive. 6. Better algorithms are better than better hardware. 7. The tool shapes the hand.
  2. Using FOIA Data and Unix to Halve Major Source of Parking Tickets — a reminder of one of the best things to happen in the 2010s: automating good deeds.
  3. Taskbook: Tasks, boards, and notes for the command-line habitat.
  4. Advice for a New Executive (Lara Hogan) — Chad’s advice to Lara when she joined Kickstarter. 1. Find/create a peer support group. 2. Partner absurdly closely with product and make sure you understand priorities and the head of product understands tradeoffs. 3. Focus on delivery of the roadmap and everything else will follow. 4. Ask your executive peers regularly what you can do to make their jobs easier—particularly the CEO. 5. Take a stand when you need to. 6. Always have a story. 7. Read widely—offline!—about management and leadership. 8. Realize the impact your mood and demeanor has on people. 9. Develop the right relationship with members of your company’s board. From August, but it holds up very well.

Four short links: 31 December 2019

Learn Assembly, Quantum Puzzles, Ghost Characters, and Computer Networks

By Nat Torkington
  1. Microcorruption: You’ve been given access to a device that controls a lock. Your job: defeat the lock by exploiting bugs in the device’s code. Fun way to learn assembly language and debugging.
  2. Meqanic: quantum computer puzzle game.
  3. Unicode’s Ghost Characters: after the JIS standard was released, people noticed something strange—several of the added characters had no obvious sources, and nobody could tell what they meant or how they should be pronounced. Nobody was sure where they came from. These are what came to be known as the ghost characters.
  4. Computer Networks: A Systems Approach (GitHub) — textbook released under CC.

Four short links: 30 December 2019

Dynamic Graphs, Gamification, hipsterDB, and JavaScript Testing

By Nat Torkington
  1. GraphStream: a Java library for the modeling and analysis of dynamic graphs. You can generate, import, export, measure, layout, and visualize them.
  2. Governing by Video Game: “Real participation—and this is important to clarify—is not a game. It takes time. It takes energy. That’s why not many people participate,” Sugeo says. On the other hand, making it clear that an activity is supposed to be a bit of fun, à la CitySwipe, immediately downgrades the seriousness with which participants engage. “So you’re probably attracting more people to the simplified version and still not solving the problem of engagement.”
  3. hipsterDB: hipsterDB is a key/value store that only returns data as long as it isn’t mainstream. The more often that you access a key, the more mainstream it becomes. After data has gone mainstream, you will have to wait for it to go out of style before using it again. Satire, duh.
  4. JavaScript and Node.js Testing Best Practices — covering test anatomy, back end, front end, measuring test effectiveness, and continuous integration.
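
The hipsterDB joke is easy to sketch in a few lines. The Python below is a toy written for this digest, not the project's actual code or API: the class name, the `max_reads` threshold, and the `cooldown_seconds` window are all assumptions made for illustration.

```python
import time


class MainstreamError(Exception):
    """Raised when a key has been read too often to still be cool."""


class HipsterStore:
    """Toy key/value store: popular keys become unreadable until they cool off.

    max_reads and cooldown_seconds are hypothetical knobs for this sketch,
    not settings taken from the real hipsterDB.
    """

    def __init__(self, max_reads=3, cooldown_seconds=60.0, clock=time.monotonic):
        self._data = {}
        self._reads = {}  # key -> timestamps of recent successful reads
        self._max_reads = max_reads
        self._cooldown = cooldown_seconds
        self._clock = clock  # injectable so the behavior can be tested

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        value = self._data[key]  # plain KeyError if the key was never stored
        now = self._clock()
        # Forget reads old enough to have gone out of style.
        recent = [t for t in self._reads.get(key, []) if now - t < self._cooldown]
        self._reads[key] = recent
        if len(recent) >= self._max_reads:
            raise MainstreamError(f"{key!r} is too mainstream right now")
        recent.append(now)
        return value
```

Passing a fake `clock` makes it possible to watch a key go mainstream and then come back into fashion without waiting for real time to pass.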

Four short links: 27 December 2019

Algorithmic Puzzles, Publishing, Microtask Programming, AI Introduction

By Nat Torkington
  1. Algorithmic Puzzles: History, Taxonomies, and Applications in Human Problem Solving: The paper concerns an important but underappreciated genre of algorithmic puzzles, explaining what these puzzles are, reviewing milestones in their long history, and giving two different ways to classify them. Also covered are major applications of algorithmic puzzles in cognitive science research, with an emphasis on insight problem solving, and the advantages of algorithmic puzzles over some other classes of problems used in insight research. The author proposes adding algorithmic puzzles as a separate category of insight problems, suggests 12 specific puzzles that could be useful for research in insight problem solving, and outlines several experiments dealing with other cognitive aspects of solving algorithmic puzzles.
  2. b-ber: both a method and an application for producing publications in a variety of formats—EPUB 3, Mobi/KF8, static website, PDF, and XML file, which can be imported into InDesign for print layouts—from a single source that consists of plain-text files and other assets. b-ber also functions as a browser-based EPUB reader, which explains the name.
  3. Microtask Programming: To more effectively harness potential contributions from the crowd, we propose a method for programming in which work occurs entirely through microtasks, offering contributors short, self-contained tasks such as implementing part of a function or updating a call site invoking a function to match a change made to the function. In microtask programming, microtasks involve changes to a single artifact, are automatically generated as necessary by the system, and nurture quality through iteration.
  4. Elements of AI: a series of free online courses created by Reaktor and the University of Helsinki. We want to encourage as broad a group of people as possible to learn what AI is, what can (and can’t) be done with AI, and how to start creating AI methods. The courses combine theory with practical exercises and can be completed at your own pace. Super high-level but also super-accessible, so something to give to non-coders who are curious.

Four short links: 26 December 2019

Paper Recommendations, SDR, JavaScript OCR, and Changing Minds

By Nat Torkington
  1. Stuff I Learned in 2019 — this cat is deep into their theory, and shares a lot of paper recommendations for topics that sound like they were generated by a neural net. This paper introduces cubical type theory and its implementation in Agda. […] This uses a different encoding of presburger sets, which allows them to bound a different quantity (the norm) rather than the bitwidth descriptions. But the best lesson learned may be: I now have a single file […] to which I add notes on things I find interesting. I kept a ruthless log as I learned at my last gig, and I miss that. 2020 is the year I pick this up again.
  2. New to SDR — a get-started guide, from the LuaRadio folks.
  3. Tesseract.js: a pure JavaScript port of the popular Tesseract OCR engine.
  4. This is How to Change Someone’s Mind: Six Secrets from Research — some help for those difficult holiday conversations. Be a partner, not an adversary. Use Rapaport’s Rules. Facts are the enemy. Use the “Unread Library Effect.” Use scales. Use disconfirmation. Serious beliefs are about values and identity. […] If absolutely nothing else works, they might just be a totally unreachable zealot. Or it could be that…you’re the zealot. And if you are unwilling to give any serious consideration to this possibility, that’s a big red flag.

Four short links: 25 December 2019

Multiple Regexps, NLP Beyond English, MSR Roundup, and Embedded Linux

By Nat Torkington
  1. Hyperscan — Intel’s library for fast testing a string against multiple regexps.
  2. Natural Language Isn’t Just English — English isn’t a great representative of the diversity of languages in the world: It’s a spoken language, not a signed language; it has a well-established, long-used roughly phone-based orthographic system; … with white space between words; … using (mostly) only lower-ascii characters; it has relatively little morphology; and, thus, fewer forms of each word; it has relatively fixed word order, etc. It just happens to have a massive training set.
  3. Microsoft Research 2019 Reflection — roundup of MSR’s work in ML, ethics, UI, security, and open source.
  4. Buildroot: a simple, efficient and easy-to-use tool to generate embedded Linux systems through cross-compilation.
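
For context on the Hyperscan link above, this is the naive baseline that a multi-pattern engine is meant to beat: run every regexp over the string, one at a time. It uses only Python's standard `re` module and says nothing about Hyperscan's own API; the function names and the sample patterns are made up for this sketch.

```python
import re


def compile_all(patterns):
    """Compile a dict of name -> regex source once, up front."""
    return {name: re.compile(source) for name, source in patterns.items()}


def matching_patterns(compiled, text):
    """Return the sorted names of every pattern that matches somewhere in text.

    Each pattern scans the text separately, so the cost grows with the number
    of patterns. That repeated scanning is the overhead a multi-pattern
    matcher tries to avoid.
    """
    return sorted(name for name, regex in compiled.items() if regex.search(text))


# Hypothetical patterns, chosen only to show the shape of the call.
rules = compile_all({
    "digits": r"\d+",
    "email": r"\S+@\S+",
    "hex": r"0x[0-9a-f]+",
})
```

Calling `matching_patterns(rules, some_string)` gives the names of the rules that fired, which is roughly the question a multi-pattern library answers in a single pass.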

Four short links: 24 December 2019

Cybersecurity Book, Real-time Code Collaboration, Content Moderation, and Dangerous Rust

By Nat Torkington
  1. Cybersecurity Book 1.0 Released — the UK’s National Cyber Security Centre has a comprehensive book covering everything from risks to incident management, laws, protocols, and more.
  2. RTCode: a web application to share real-time code with multiple connected users. RTCode takes the pain out of group development, avoiding problems such as: IDE settings, environment settings, diverging programming SDK versions, code version divergence, and difficulty in code collaboration between users.
  3. The Other Side of Stack Overflow Content Moderation — this post gives you a taste of the flood of questions from people who can’t or haven’t done any work themselves before turning to Stack Overflow. The result is a denial of service attack on mods, which means responses are frequently brusque. “The site’s not friendly!” is the criticism, but perhaps the real problem is that the site is too welcoming.
  4. Learn Rust the Dangerous Way: a series of articles putting Rust features in context for low-level C programmers who maybe don’t have a formal CS background—the sort of people who work on firmware, game engines, OS kernels, and the like.

Four short links: 23 December 2019

Choose Your Own Adventure, Crystal OS, Chinese Tone Language, and RIP Chuck Peddle

By Nat Torkington
  1. The Hidden Structures of “Choose Your Own Adventure” Books (Verge) — maps of the books reveal/illustrate the differences between the books.
  2. lilith: POSIX-like x86-64 kernel and userspace written in Crystal. The Crystal language is statically typed with compile-time checks for null references, a concurrency model, C bindings, and Ruby-like syntax. This is the first UI I’ve seen in it.
  3. Wenyan: an esoteric programming language that closely follows the grammar and tone of classical Chinese literature. Useless and incomprehensible to me, but a notable experiment. I see plenty of Chinese-language projects on GitHub now, often trending, and I feel like English’s position as the tech de facto lingua franca can no longer be presumed for the next decade.
  4. In Memoriam of Chuck Peddle — he created the 6502, the chip inside the C64, Apple II, Atari 2600, NES, BBC Micro, and other home computers that are where my generation coded, hacked, and BBSed. The book The Story of Commodore, A Company on the Edge gave me huge respect for his work. (via Slashdot)

Four short links: 20 December 2019

Homomorphic Encryption, Supply Chain Security, Location Tracking, Cognitive Uncertainty

By Nat Torkington
  1. ArcaneVM: A Fully Homomorphic Encryption Brainfuck virtual machine. A toy language but implementing a serious idea. It’s a positive sign that homomorphic encryption is spreading. However … Our research shows that there are many security pitfalls in fully homomorphic encryption from the perspective of practical application. The security problems of a fully homomorphic encryption in a real application is more severe than imagined.
  2. Supply Chain Security: If I were a Nation State (Bunnie Huang) — In this talk, we will calibrate expectations about how difficult (or easy) it may be for actors ranging from rogue individuals to Nation-States to infiltrate various points of our global supply chain.
  3. One Nation, Tracked (NY Times) — those apps on your phone, the ones that request access to your location, are frequently uploading your location to … well, nobody really knows, but it often ends up aggregated in giant data pools that are analyzed for insights. Or, in this case, leaked to the New York Times. They’re a massive privacy problem.
  4. Cognitive Uncertainty: This paper introduces a formal definition and an experimental measurement of the concept of cognitive uncertainty: people’s subjective uncertainty about what the optimal action is. This concept allows us to bring together and partially explain a set of behavioral anomalies identified across four distinct domains of decision-making: choice under risk, choice under ambiguity, belief updating, and survey expectations about economic variables. […] Building on existing models of noisy Bayesian cognition, we formally propose that cognitive uncertainty generates these patterns by inducing people to compress probabilities toward a mental default of 50:50. We document experimentally that the responses of individuals with higher cognitive uncertainty indeed exhibit stronger compression of probabilities in choice under risk and ambiguity, belief updating, and survey expectations.
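
One way to picture "compressing probabilities toward a mental default of 50:50" is as a weighted average between the true probability and 0.5. The linear weighting below is an assumption made for this illustration, not the model from the paper, and the function and parameter names are hypothetical.

```python
def compress_probability(p, uncertainty, default=0.5):
    """Shrink probability p toward a mental default.

    uncertainty is in [0, 1]: 0 leaves p untouched, 1 collapses it to default.
    The linear blend is an assumption of this sketch, not the paper's
    formal model of noisy Bayesian cognition.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    if not 0.0 <= uncertainty <= 1.0:
        raise ValueError("uncertainty must be in [0, 1]")
    return (1.0 - uncertainty) * p + uncertainty * default
```

Under this blend, the more uncertain someone is, the closer their stated probability sits to the default, whichever side of 0.5 the true value is on.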

Four short links: 19 December 2019

Organizational Complexity, YAGNI, Social Robotics, and Formal Reasoning

By Nat Torkington
  1. The #1 Bug Predictor is Not Technical; It’s Organizational Complexity: Microsoft Research published a paper in which they developed a new statistical model for predicting whether or not a software module was at risk of having bugs, based on a statistical analysis of the module itself. […] Organizational structure has the highest precision, and the highest recall.
  2. A Failed SaaS Postmortem — I can’t believe how powerful YAGNI is, and how hard it is to internalize. Like, you know YAGNI intellectually, but you still get suckered into building things you don’t actually need. “Sufficient” is hard to judge and even harder to stick to.
  3. What We Can Learn from Social Robots That Didn’t Make It: In my analysis, the current moment on the social robotics timeline is akin to the era following the failure of the Apple Newton, long before today’s ubiquity of smartphone devices.
  4. Formal Reasoning About Programming: As other engineering disciplines have their computer-aided-design tools, computer science has proof assistants, IDEs for logical arguments. We will learn how to apply these tools to certify that programs behave as expected. More specifically, introductions to two intertangled subjects: the Coq proof assistant, a tool for machine-checked mathematical theorem proving, and formal logical reasoning about the correctness of programs.

Four short links: 18 December 2019

Tech Leads, Securing Research, Factoids, and Prioritizing Technical Debt

By Nat Torkington
  1. Tech Lead Expectations for Engineering Projects (Gergely Orosz) — list of responsibilities and a checklist for new projects, from an engineering manager at Uber. On my team of software engineers, we rotate the project lead role. I thought it would be interesting to share the approach we came up with. (via Lobsters)
  2. Fundamental Research Security (NSF) — the section on “research integrity and foreign influence” is interesting reading.
  3. Coolest Things I Learned in 2019 — I like these types of posts—a lot of “huh, that IS cool” nuggets, including some visualizations.
  4. Prioritizing Technical Debt as if Time and Money Matters (Adam Tornhill) — In this presentation, you’ll see how easily obtained version-control data let us uncover the behavior and patterns of the development organization. This language-neutral approach lets you prioritize the parts of your system that benefit the most from improvements, so that you can balance short- and long-term goals guided by data.

Four short links: 17 December 2019

Shenzhen Guide, Content Moderation Damages, WebAssembly Recommended, and Best Business Books

By Nat Torkington
  1. The Essential Guide to Shenzhen (Bunnie Huang) — This book is designed to help non-Mandarin speakers navigate the sprawling electronics markets of Shenzhen. The book is out of print, so Bunnie released it free to download.
  2. The Terror Queue (The Verge) — User-submitted content opens the door to damaging material; laws require it to be identified and removed, which means someone is paid to look at damaging material…which damages them. Doing this in a way that doesn’t harm people is an unsolved problem.
  3. WebAssembly Becomes W3C Recommendation: WebAssembly improves web performance and power consumption by being a virtual machine and execution environment enabling loaded pages to run as native compiled code. In other words, WebAssembly enables near-native performance, optimized load time, and perhaps most importantly, a compilation target for existing code bases.
  4. Three Best Business Books of 2019: there were three books that appeared on every list this year: Range by David Epstein, Nine Lies About Work by Marcus Buckingham and Ashley Goodall, and Loonshots by Safi Bahcall.

Four short links: 16 December 2019

Screen Time and Depression, Crowdsourcing Bad Ideas, Parallel Programming, and Rust Idioms

By Nat Torkington
  1. Association of Screen Time and Depression in Adolescence: Time-varying associations between social media, television, and depression were found, which appeared to be more explained by upward social comparison and reinforcing spirals hypotheses than by the displacement hypothesis. Both screen time modes should be taken into account when developing preventive measures and when advising parents. The idea is that what you do on the screen matters more than the hours you spend doing it.
  2. Why Crowdsourcing Often Leads to Bad Ideas (HBR) — really good summary of research into crowdsourcing. Intrinsic and extrinsic motivations were associated with higher-quality solutions, whereas learning and prosocial motivations were negatively related to solution quality. Social motivation was not a significant predictor of the quality of ideas.
  3. Regent: a language for implicit dataflow parallelism. Regent discovers dataflow parallelism in sequential code by computing a dependence graph over tasks. […] Tasks execute as soon as all dependencies are satisfied, and can be distributed automatically over a cluster of (possibly heterogeneous) machines.
  4. Idiomatic Rust: A peer-reviewed collection of articles/talks/repos that teach concise, idiomatic Rust. Idioms matter in programming languages, and I’m always surprised by how often people don’t teach them.

Four short links: 13 December 2019

Cross-Platform UI, Marketing Games, 32 Into 64, Unhealthy Facebook

By Nat Torkington
  1. Flutter: Google’s UI toolkit for building beautiful, natively compiled applications for mobile, web, and desktop from a single codebase. Open source, uses Dart extensively, and has happy users. Worth watching.
  2. Azure Mystery Mansion — collect the keys by visiting Microsoft properties and answering questions. Microsoft made it with Twine. I am fond of games as narrative aids for learning, but it’s interesting to see Twine used for marketing purposes. (via Cloud Blogs)
  3. Win32 on macOS — answering the question “how did Wine developers get 32-bit Windows to run on 64-bit macOS?”
  4. Association of Facebook Use With Compromised Well-Being: A Longitudinal Study (NCBI) — Our results showed that overall, the use of Facebook was negatively associated with well-being. For example, a 1-standard-deviation increase in “likes clicked” (clicking “like” on someone else’s content), “links clicked” (clicking a link to another site or article), or “status updates” (updating one’s own Facebook status) was associated with a decrease of 5%-8% of a standard deviation in self-reported mental health. These associations were robust to multivariate cross-sectional analyses, as well as to 2-wave prospective analyses. The negative associations of Facebook use were comparable to or greater in magnitude than the positive impact of offline interactions, which suggests a possible tradeoff between offline and online relationships. Don’t read the comments, and beware websites that are all comments.

Four short links: 12 December 2019

Social Science One, Structuring Work, Power of Links, and JavaScript Flowcharts

By Nat Torkington
  1. Social Science One Advisory Group Fingers Facebook: As members of the European Advisory Committee of Social Science One, we—along with the co-chairs—are frustrated. On the one hand, we were genuinely interested in helping to build a model to support academic research, and we appreciate the efforts the specific data science teams within Facebook have made to this end. On the other hand, the eternal delays and barriers from both within and beyond the company lead us to doubt whether substantial progress can be made, at least under the current model. Their proposed next steps would be excellent.
  2. Work is a Queue of Queues: The ideal situation would be that once you’ve decided to work on a given queue, as an individual or a team, all the stack-based workflows would shift from push()’ing onto your stack and gaining your attention, to merely enqueue()’ing onto your backlog queue to get your ordered attention at a later date. Abstractions for productivity hack fetishists.
  3. The Power of Links (Anil Dash) — For a closed system, those kinds of open connections are deeply dangerous. If anyone on Instagram can just link to any old store on the web, how can Instagram—meaning Facebook, Instagram’s increasingly overbearing owner—tightly control commerce on its platform? If Instagram users could post links willy-nilly, they might even be able to connect directly to their users, getting their email addresses or finding other ways to communicate with them. Links represent a threat to closed systems.
  4. Flowy — flowcharts in JavaScript.
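
The push() versus enqueue() distinction in "Work is a Queue of Queues" comes down to the order in which requests get attention. A minimal Python sketch, with function names invented for this illustration:

```python
from collections import deque


def stack_order(requests):
    """Interrupt-driven work: each new request lands on top and is handled first."""
    stack = []
    for request in requests:
        stack.append(request)  # push()
    handled = []
    while stack:
        handled.append(stack.pop())  # newest request wins (LIFO)
    return handled


def queue_order(requests):
    """Backlog-driven work: requests wait their turn and are handled in arrival order."""
    backlog = deque()
    for request in requests:
        backlog.append(request)  # enqueue()
    handled = []
    while backlog:
        handled.append(backlog.popleft())  # oldest request first (FIFO)
    return handled
```

The same three requests come out in opposite orders, which is the whole argument: a stack rewards whoever interrupted most recently, a queue rewards whoever asked first.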

Four short links: 11 December 2019

Resilient Communications, Powered Exoskeleton, Software Making Software, and Disinformation Resources

By Nat Torkington
  1. disaster.radio: a disaster-resilient communications network powered by the sun. More specifically, a solar-powered LoRa-connected box running Scuttlebutt and other software. Still a WIP. I’m not sure about the choice of LoRa, though—the throughput isn’t great.
  2. Powered Exoskeleton (IEEE Spectrum) — and the demo heavy object to carry is a missile. Of course.
  3. The Road to Software 2.0 — the idea that software’s going to profoundly change how we develop software. Most companies don’t have the AI expertise to implement Karpathy’s vision. Traditional programming is well understood. Training models isn’t well understood yet, at least not within companies that haven’t already invested significantly in technology (in general) or AI (in particular). Nor are building data pipelines and deploying ML systems well understood. The companies that are systematizing how they develop ML and AI applications are companies that already have advanced AI practices.
  4. The ComProp Navigator (Oxford University) — an online resource guide for civil society groups looking to better deal with the problem of disinformation.

Four short links: 10 December 2019

By Nat Torkington
  1. The Hidden Worries of Facial Recognition Technology — excellent post by Tsinghua Professor Lao Dongyan, questioning Chinese authorities’ plans to put facial recognition into the Beijing subway, with some great rebuttals to common objections. First, some people may think that I am overthinking it, and I cannot appreciate and thank the government, as a father figure, for its protection and kindness. I can only say: forgive me, but I cannot accept this type of kindness.
  2. A Framework for Regulating Competition on the Internet (Ben Thompson) — an interesting framework, where platforms and aggregators are differentiated. Platforms are the most powerful economic and innovation engines in technology: they create the possibility for products that never existed previously and are the foundation for huge amounts of innovation. It is in the interest of society that there be more and larger platforms, not fewer and smaller. [… For aggregators,] regulatory priorities should be the opposite of platforms: given that aggregator power comes from controlling demand, regulators should look at the acquisition of other potential aggregators with extreme skepticism. At the same time, whatever an aggregator chooses to do on its own site or app is less important, because users and third parties can always go elsewhere, and if they don’t, that is because they are satisfied.
  3. Failure Modes in Machine Learning (Microsoft) — excellent roundup of intentional failures (perturbation attack, poisoning attack, model inversion, membership inference, model stealing, reprogramming ML system, adversarial example in the physical domain, malicious ML provider recovering training data, attacking the ML supply chain, backdoor ML, and exploit software dependencies) and unintentional failures (reward hacking, side effects, distributional shifts, natural adversarial examples, common corruption, and incomplete testing). (via BoingBoing)
  4. O(n^2) (Bruce Dawson): Dawson’s first law of computing: O(n^2) is the sweet spot of badly scaling algorithms: fast enough to make it into production, but slow enough to make things fall down once it gets there. [After some debugging work,] I found that WinMgmt.exe was executing roughly a branch instruction per cycle which meant that the loop (which I already knew was consuming most of the CPU time) was running extremely quickly, and the slowness was because it was executing hundreds of billions of times.
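
A generic example of how O(n^2) sneaks into production: the loop looks innocent, but it hides a linear scan inside every iteration. This Python sketch is not the WinMgmt.exe bug from Dawson's post, just a common shape of the same problem, and both function names are made up for illustration.

```python
def dedupe_quadratic(items):
    """Remove duplicates, keeping first occurrences. Fine on small inputs."""
    seen = []
    for item in items:
        if item not in seen:  # linear scan of a list: O(n) per item, O(n^2) overall
            seen.append(item)
    return seen


def dedupe_linear(items):
    """Same result, but membership checks use a set (items must be hashable)."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:  # hash lookup: O(1) on average
            seen.add(item)
            result.append(item)
    return result
```

Both functions return the same list, so tests on small inputs cannot tell them apart. Only input size exposes the difference, which is why this kind of code ships.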