Tags > big data

Four short links: 18 April 2014

By Nat Torkington
April 18, 2014

16 Interviewing Tips for User Studies — these apply to many situations beyond user interviews, too. The Backlash Against Big Data contd. (Mike Loukides) — Learn to be a data skeptic. That doesn’t mean becoming skeptical about the value of …

Four short links: 10 April 2014

By Nat Torkington
April 10, 2014

Rise of the Patent Troll: Everything is a Remix (YouTube) — primer on patent trolls, in language anyone can follow. Part of the fixpatents.org campaign. (via BoingBoing) Petabytes of Field Data (GigaOm) — Farm Intelligence using sensors and computer vision …

The backlash against big data, continued

By Mike Loukides
April 9, 2014

Yawn. Yet another article trashing “big data,” this time an op-ed in the Times. This one is better than most, and ends with the truism that data isn’t a silver bullet. It certainly isn’t. I’ll spare you all the links (most of …

The backlash against big data, continued

By Mike Loukides
April 8, 2014

Yawn. Yet another article trashing “big data,” this time an op-ed in the Times. This one is better than most, and ends with the truism that data isn’t a silver bullet. It certainly isn’t. I’ll spare you all the links (most of …

5 Fun Facts about HBase that you didn’t know

By Ben Lorica
April 6, 2014

With HBaseCon right around the corner, I wanted to take stock of one of the more popular components in the Hadoop ecosystem. Over the last few years, many more companies have come to rely on HBase to run key products …

Four short links: 2 April 2014

By Nat Torkington
April 2, 2014

Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing (PDF) — Berkeley research paper behind Apache Spark. (via Nelson Minar) Angular Tour — trivially add tour tips (“This is the widget basket, drag and drop for widget goodness!” type …

Wearable intelligence

By Glen Martin
April 1, 2014

The age of ubiquitous computing is accelerating, and it’s creating some interesting social turbulence, particularly where wearable hardware is concerned. Intelligent devices other than phones and screens — smart headsets, glasses, watches, bracelets — are insinuating themselves into our daily …

Four short links: 31 March 2014

By Nat Torkington
March 31, 2014

Game Programming Patterns — a book in progress. Search for the Next Platform (Fred Wilson) — Mobile is now the last thing. And all of these big tech companies are looking for the next thing to make sure they don’t …

Four short links: 28 March 2014

By Nat Torkington
March 28, 2014

WearScript — open source project putting Javascript on Glass. See story on it. (via Slashdot) Mining the World’s Data by Selling Street Lights and Farm Drones (Quartz) — Depending on what kinds of sensors the light’s owners choose to install, …

Four short links: 24 March 2014

By Nat Torkington
March 24, 2014

The Parable of Google Flu (PDF) — We explore two issues that contributed to [Google Flu Trends]’s mistakes—big data hubris and algorithm dynamics—and offer lessons for moving forward in the big data age. Overtrained and underfed? Duktape — a lightweight …

Podcast: thinking with data

By Jon Bruner
March 18, 2014

Max Shron and Jake Porway spoke with me a few weeks ago about frameworks for making reasoned arguments with data. Max’s recent O’Reilly book, Thinking with Data, outlines the crucial process of developing good questions and creating a plan to answer …

Four short links: 18 March 2014

By Nat Torkington
March 18, 2014

On Managers (Mike Migurski) — Managers might be difficult, hostile, or useless, but because they are parts of an explicit power structure they can be evaluated explicitly. Big Data: Humans Required (Sherri Hammons) — the heart of the problem with …

The dangers of data-driven list-making

By Alistair Croll
March 17, 2014

Editor’s note: this post originally appeared on Tilt the Windmill; it is republished here with permission. Startupfest’s Pamela Perotti asked for my thoughts on this great Forbes piece by Lightspeed’s Barry Eggers about using big data to build top ten …

Four short links: 13 March 2014

By Nat Torkington
March 13, 2014

Is Parallel Programming Hard? And, If So, What Can You Do About It? — book by Paul E. McKenney, on single-machine multi-CPU parallel programming. Malignant Computation — The bitcoin mining network would work just as well if it had far …

Four short links: 11 March 2014

By Nat Torkington
March 11, 2014

In-Game Graph Analysis (The Economist) — one MLB team has bought a Cray Urika graph-processing appliance for in-game analysis of data. Please hold, boggling. (via Courtney Nash) Disney Bets $1B on Technology (BusinessWeek) — MyMagic+ promises far more radical change. …

Big data and privacy: an uneasy face-off for government to face

By Andy Oram
March 5, 2014

Thrust into controversy by Edward Snowden’s first revelations last year, President Obama belatedly welcomed a “conversation” about privacy. As cynical as you may feel about US spying, that conversation with the federal government has now begun. In particular, the first …

The technical aspects of privacy

By Andy Oram
March 5, 2014

Thrust into controversy by Edward Snowden’s first revelations last year, President Obama belatedly welcomed a “conversation” about privacy. As cynical as you may feel about US spying, that conversation with the federal government has now begun. In particular, the first …

Healthcare Lessons from the Data Sages at Strata

By Bonnie Feldman
February 27, 2014

This article was written with Ellen M. Martin. Most healthcare clinicians don’t often think about donating or sharing data. Yet, after hearing Stephen Friend of Sage Bionetworks talk about involving citizens and patients in the field of genetic research at …

Four short links: 26 February 2014

By Nat Torkington
February 26, 2014

Librarybox 2.0 — fork of PirateBox for the TP-Link MR 3020, customized for educational, library, and other needs. Wifi hotspot with free and anonymous file sharing. v2 adds mesh networking and more. (via BoingBoing) Chicago PD’s Using Big Data to …

Four short links: 19 February 2014

By Nat Torkington
February 19, 2014

1746 Slippy Map of London — very nice use of Google Maps to recontextualise historic maps. (via USvTh3m) TPP Comic — the comic explaining TPP that you’ve been waiting for. (via BoingBoing) Synthetic Biology Investor’s Lament — some hypotheses about …

Four short links: 17 February 2014

By Nat Torkington
February 17, 2014

imsg — use iMessage from the commandline. Facebook Data Science Team Posts About Love — I tell people, “this is what you look like to SkyNet.” A System for Detecting Software Plagiarism — the research behind the undergraduate bête noire. …

Four short links: 10 February 2014

By Nat Torkington
February 10, 2014

Bruce Sterling at transmediale 2014 (YouTube) — “if it works, it’s already obsolete.” Sterling does a great job of capturing the current time: spies in your Internet, lost trust with the BigCos, the impermanence of status quo, the need to …

Big Data solutions through the combination of tools

By Ben Lorica
February 9, 2014

As a user who tends to mix-and-match many different tools, not having to deal with configuring and assembling a suite of tools is a big win. So I’m really liking the recent trend towards more integrated and packaged solutions. A …

The Challenge of Health Data Security

By Julie Steele
February 5, 2014

Dr. Andrew Litt, Chief Medical Officer at Dell, made a thoughtful blog post last week about the trade-offs inherent in designing for both the security and accessibility of medical data, especially in an era of BYOD (bring your own device) …

Four short links: 5 February 2014

By Nat Torkington
February 5, 2014

sigma.js — Javascript graph-drawing library (node-edge graphs, not charts). DARPA Open Catalog — all the open source published by DARPA. Sweet! Quantified Vehicle Meetup — Boston meetup around intelligent automotive tech including on-board diagnostics, protocols, APIs, analytics, telematics, apps, software …

Four short links: 28 January 2014

By Nat Torkington
January 28, 2014

Intel On-Device Voice Recognition (Quartz) — interesting because the tension between client-side and server-side functionality is still alive and well. Features migrate from core to edge and back again as cycles, data, algorithms, and responsiveness expectations change. Meet Microsoft’s Personal …

Four short links: 22 January 2014

By Nat Torkington
January 22, 2014

How a Math Genius Hacked OkCupid to Find True Love (Wired) — if he doesn’t end up working for OK Cupid, productising this as a new service, something is wrong with the world. Humin: The App That Uses Context to …

Four short links: 21 January 2014

By Nat Torkington
January 21, 2014

On Being a Senior Engineer (Etsy) — Mature engineers know that no matter how complete, elegant, or superior their designs are, it won’t matter if no one wants to work alongside them because they are assholes. Control Theory (Coursera) — …

Four short links: 15 January 2014

By Nat Torkington
January 15, 2014

Hackers Gain ‘Full Control’ of Critical SCADA Systems (IT News) — The vulnerabilities were discovered by Russian researchers who over the last year probed popular and high-end ICS and supervisory control and data acquisition (SCADA) systems used to control everything …

Four short links: 8 January 2014

By Nat Torkington
January 8, 2014

Launching the Wolfram Connected Devices Project — Wolfram Alpha is cognition-as-a-service, which they hope to embed in devices. This data-powered Brain-in-the-Cloud play will pit them against Google, but G wants to own the devices and the apps and the eyeballs …

The Snapchat Leak

By Alasdair Allan
January 2, 2014

While the site crumbled quickly under the weight of so many people trying to get to the leaked data—and has now been suspended—there isn’t really such a thing as putting the genie back in the bottle on the Internet. Just before …

Four short links: 26 December 2013

By Nat Torkington
December 26, 2013

Nest Protect Teardown (Sparkfun) — initial teardown of another piece of domestic industrial Internet. Logs — The distributed log can be seen as the data structure which models the problem of consensus. Not kidding when he calls it “real-time data’s …

Four short links: 16 December 2013

By Nat Torkington
December 16, 2013

Suro (Github) — Netflix data pipeline service for large volumes of event data. (via Ben Lorica) NIPS Workshop on Data Driven Education — lots of research papers around machine learning, MOOC data, etc. Proofist — crowdsourced proofreading game. 3D-Printed Shoes …

Four short links: 10 December 2013

By Nat Torkington
December 10, 2013

ArangoDB — open-source database with a flexible data model for documents, graphs, and key-values. Build high performance applications using a convenient sql-like query language or JavaScript extensions. Google’s Seven Robotics Companies (IEEE) — The seven companies are capable of creating …

Four short links: 9 December 2013

By Nat Torkington
December 9, 2013

Reform Government Surveillance — hard not to view this as a demarcation dispute. “Ruthlessly collecting every detail of online behaviour is something we do clandestinely for advertising purposes, it shouldn’t be corrupted because of your obsession over national security!” Brian …

Four short links: 6 December 2013

By Nat Torkington
December 6, 2013

Society of Mind — Marvin Minsky’s book now Creative-Commons licensed. Collaboration, Stars, and the Changing Organization of Science: Evidence from Evolutionary Biology — The concentration of research output is declining at the department level but increasing at the individual level. …

Four short links: 5 December 2013

By Nat Torkington
December 5, 2013

Deducer — An R Graphical User Interface (GUI) for Everyone. Integration of Civil Unmanned Aircraft Systems (UAS) in the National Airspace System (NAS) Roadmap (PDF, FAA) — first pass at regulatory framework for drones. (via Anil Dash) Bitcoin Stats — …

Four short links: 3 December 2013

By Nat Torkington
December 3, 2013

SAMOA — Yahoo!’s distributed streaming machine learning (ML) framework that contains a programming abstraction for distributed streaming ML algorithms. (via Introducing SAMOA) madlib — an open-source library for scalable in-database analytics. It provides data-parallel implementations of mathematical, statistical and machine-learning …

23andMe flap at FDA indicates fundamental dilemma in health reform

By Andy Oram
November 26, 2013

The FDA order stopping 23andMe from offering its genetic test kit strikes right into the heart of the major issue in health care reform: the tension between individual care and collective benefit. Health is not an individual matter. As I …

Day-Long Immersions and Deep Dives at Strata Santa Clara 2014

By Ben Lorica
November 16, 2013

As the Program Development Director for Strata Santa Clara 2014, I am pleased to announce that the tutorial session descriptions are now live. We’re pleased to offer several day-long immersions including the popular Data Driven Business Day and Hardcore Data …

Four short links: 11 November 2013

By Nat Torkington
November 11, 2013

Living Light — 3D printed cephalopods filled with bioluminescent bacteria. PAGING CORY DOCTOROW, YOUR ORGASMATRON HAS ARRIVED. (via Sci Blogs) Repacking Lego Batteries with a CNC Mill — check out the video. Patrick programmed a CNC machine to drill out …

Four short links: 5 November 2013

By Nat Torkington
November 5, 2013

Influx DB — open-source, distributed, time series, events, and metrics database with no external dependencies. Omega (PDF) — flexible, scalable schedulers for large compute clusters. From Google Research. GraspJS — Search and replace your JavaScript code based on its structure …

How Secure is Your Old and Inactive User Data?

By Jon Callas
November 4, 2013

A couple weeks ago Brian Krebs announced that Adobe had a serious breach, of customer data as well as source code for a number of its software products. Nicole Perlroth of The New York Times updated that to say that …

Four short links: 31 October 2013

By Nat Torkington
October 31, 2013

Insect-Inspired Collision-Resistant Robot — clever hack to make it stable despite bouncing off things. The Battle for Power on the Internet (Bruce Schneier) — the state of cyberspace. [M]ost of the time, a new technology benefits the nimble first. [...] …

Four short links: 30 October 2013

By Nat Torkington
October 30, 2013

Offline.js — Javascript library so web app developers can gracefully deal with users going offline. Android Guides — lots of info on coding for Android. Statistics Done Wrong — learn from these failure modes. Not medians or means. Modes. Streaming, …

Four short links: 29 October 2013

By Nat Torkington
October 29, 2013

Mozilla Web Literacy Standard — things you should be able to do if you’re to be trusted to be on the web unsupervised. (via BoingBoing) Berg Cloud Platform — hardware (shield), local network, and cloud glue. Caution: magic ahead! Shark …

Cloudera Impala: Bringing the SQL and Hadoop Worlds Together

By O'Reilly Strata
October 23, 2013

By John Russell. When I came to work on the Cloudera Impala project, I found many things that were familiar from my previous experience with relational databases, UNIX systems, and the open source world. Yet other aspects were all new …

Mining the social web, again

By Mike Loukides
October 22, 2013

When we first published Mining the Social Web, I thought it was one of the most important books I worked on that year. Now that we’re publishing a second edition (which I didn’t work on), I find that I agree with …

Four short links: 18 October 2013

By Nat Torkington
October 18, 2013

Science Not as Self-Correcting As It Thinks (Economist) — REALLY good discussion of the shortcomings in statistical practice by scientists, peer-review failures, and the complexities of experimental procedure and fuzziness of what reproducibility might actually mean. Reproducibility Initiative Receives Grant …

Four short links: 16 October 2013

By Nat Torkington
October 14, 2013

Scientific Data Has Become So Complex, We Have to Invent New Math to Deal With It (Jennifer Ouellette) — Yale University mathematician Ronald Coifman says that what is really needed is the big data equivalent of a Newtonian revolution, on …

