Robots.txt and the .gov TLD
By Carl Malamud, November 20, 2009
The robots.txt file should be used sparingly by government organizations and only in a non-discriminatory fashion.
Asia Continues to be Facebook's Strongest Growth Region
By Ben Lorica, November 20, 2009
With Facebook topping 330 million active users over the past week, the company's strongest growth region continues to be Asia. Over the last 12 weeks, Facebook added close to 17M active users in Asia alone. Since my previous post, the share of active users from Asia grew by 2% (to 13.5% of all users), and roughly 1 in 7 users...
Four short links: 20 November 2009
By Nat Torkington, November 20, 2009
Spokeo -- abysmal indictment of society, first prize in mankind's race to the bottom. Uncover personal photos, videos, and secrets ... GUARANTEED! Spokeo deep searches within 48 major social networks to find truly mouth-watering news about friends and coworkers. PS, anybody who gives their gmail username and password to a site that specializes in dishing dirt can only be...
Four short links: 12 November 2009
By Nat Torkington, November 12, 2009
Fat Free CRM -- open source (Affero GPL) Ruby on Rails CRM system. Bixo -- open source data mining toolkit that runs as a series of pipes on top of Hadoop. Built on Cascading workflow system for Hadoop that hides MapReduce. (via kdnuggets) Andy Kessler's Keynote at Defrag Stank (Pete Warden) -- I'm sorry to hear it, because I...
Four short links: 2 November 2009
By Nat Torkington, November 2, 2009
Your Botnet is My Botnet (PDF) -- 2008 USENIX Security paper analysing >70G of data gathered when security researchers hijacked the Torpig botnet. A major limitation of analyzing a botnet from the inside is the limited view. Most current botnets use stripped-down IRC or HTTP servers as their command and control channels, and it is not possible to make...
Four short links: 30 October 2009
By Nat Torkington, October 30, 2009
The3is In Three -- PhD students must explain their thesis topic in three minutes and one Powerpoint slide. Winner had written on the last words of Shakespearean characters as they met unlikely ends. No video alas, but what a great idea for an Ignite! (via sciblogs) Google Wave: We Came, We Saw, We Played D&D (ArsTechnica) -- gamers using...
Four short links: 25 September 2009
By Nat Torkington, September 25, 2009
Diesel: A Case Study In That Thing I Just Said -- a new asynchronous I/O library in Python, which earned this fabulous review from Glyph Lefkowitz who wrote the granddaddy of all asynch libraries in Python, Twisted. Again, I don't want to dump on Diesel here; for what it is, i.e. an experiment in how to idiomatically structure asynchronous...
There are Over a Million People Actively Using Facebook Right Now
By Ben Lorica, September 24, 2009
A little over a week ago Facebook reached a major milestone: 300 million active users. The fastest-growth region continues to be Asia, but growth in other overseas regions such as the Americas and Africa has also been strong. Currently reaching only 1% of potential users in Asia and Africa, Facebook has barely scratched the surface in both regions: Growth in...
Four short links: 8 September 2009
By Nat Torkington, September 7, 2009
jQTouch -- jQuery library for mobile web app development. (via brian on Delicious) GData API to Google Book Search -- search full text, get back metadata, modify "my library" collections, etc. Open and Free Courses at the CMU Open Learning Initiative -- rather than just a lecture and handout dump, it has interactive exercises and questions to help you...
Four short links: 26 August 2009
By Nat Torkington, August 26, 2009
Better BBQ Through Chemistry -- food is the perfect ground for geek training: there are measurements, there's science, it's easy to know whether you've succeeded, and you can eat all but the worst of your failures. (via BoingBoing) NoSQL (East) -- conference on East Coast for relationless databases. Human Brain Processing Speed -- clocked at 60bits/second, according to this...
Four short links: 25 August 2009
By Nat Torkington, August 24, 2009
Tineye -- reverse search engine; you upload an image and they find you similar images so you know where else it's used. Check out their cool searches. PDF Pirate -- upload a PDF and this web site will give it back to you minus the restrictions on copying/printing/etc. Flare -- an ActionScript library for creating visualizations that run in...
Four short links: 11 August 2009
By Nat Torkington, August 10, 2009
The Slowing Growth of Wikipedia and More Details of Changing Editor Resistance -- researchers at PARC analysed Wikipedia and found the number of new articles and number of new editors have flattened off, and more edits from first-time contributors are being reverted. This is a writeup in their blog, with the numbers and charts. It's interesting that coverage in...
Four short links: 7 August 2009
By Nat Torkington, August 6, 2009
Defragging the Stimulus -- each [recovery] site has its own silo of data, and no site is complete. What we need is a unified point of access to all sources of information: firsthand reports from Recovery.gov and state portals, commentary from StimulusWatch and MetaCarta, and more. Suggests that Recovery.gov should be the hub for this presently-decentralised pile of recovery...
Use APIs to do market research
By Andrew Odewahn, July 30, 2009
Basic product attribute questions (what's the best price, size, length, etc) are crucial elements in any product or marketing strategy, but it's often too difficult or expensive to get timely market information. However, a quick script that pulls data from a relevant website's API can often give you an answer that's good enough. This post provides a few techniques for using this powerful new resource for market research.
Four short links: 14 July 2009
By Nat Torkington, July 14, 2009
Twenty Questions about GPLv3 (Jacob Kaplan-Moss) -- twenty very challenging questions about the GPLv3. foo.js is a JavaScript library released under the GPLv3. bar.js is a library with all rights reserved. For performance reasons, I would like to minimize all my site’s JavaScript into a single compressed file called foobar.js. If I distribute this file, must I also distribute...
Four short links: 10 July 2009
By Nat Torkington, July 9, 2009
Ceph -- open source distributed filesystem from UCSC. Ceph is built from the ground up to seamlessly and gracefully scale from gigabytes to petabytes and beyond. Scalability is considered in terms of workload as well as total storage. Ceph is designed to handle workloads in which tens of thousands of clients or more simultaneously access the same file, or write...
Bing's Sanaz Ahari on Query Level Categorization (1 of 2)
By Brady Forrest, June 29, 2009
A couple of weeks ago Bing had a small search summit for analysts, bloggers, SEO experts, entrepreneurs and advertisers. It was held in Bellevue; they put us up in the hotel and fed us. While there we received demos from Bing project teams. I was able to snag an interview with Sanaz Ahari, Lead PM on Bing. She led the...
Facebook Adds Millions of Users in Asia
By Ben Lorica, June 19, 2009
Since my previous post on Facebook users by country, the company has grown rapidly in Asia. Over the last 12 weeks, Facebook grew 90% in Asia, going from 11.4 to 21.7 million active users. With a market penetration of only 0.6% in Asia, Facebook has barely scratched the surface in the region. The company also gained 11.3M users in Europe...
Four short links: 18 June 2009
By Nat Torkington, June 18, 2009
Harvard Study Finds Weaker Copyright Protection Has Benefited Society (Michael Geist) -- Given the increase in artistic production along with the greater public access, the authors conclude that "weaker copyright protection, it seems, has benefited society." This is consistent with the authors' view that weaker copyright is "unambiguously desirable if it does not lessen the incentives of artists and entertainment companies...
Google Squared is an Exponential Improvement in Search
By James Turner, June 4, 2009
One of the things I've learned about Google is that the most amazing things will come out of them with barely a whisper of fanfare. Such is the case with Google Squared, a new Google Labs tool that was released today. What does Google Squared do? It organizes and tables information from searches for you in a way that makes it much more useful.
Google's Browser-Based Plan for Ebook Sales
By Mac Slocum, June 1, 2009
BEA '09 may be remembered as the moment when Google formally entered the ebook market. From the New York Times: Mr. [Tom] Turvey [director of strategic partnerships at Google] said...
Google Engineering Explains Microformat Support in Searches
By James Turner, May 12, 2009
Today, Google is releasing support for parsing and display of microformat data in their search results. While the initial launch will be limited to a specific set of partners (including LinkedIn, Yelp and CNet reviews), the intent is that very quickly, anyone who marks their pages up with the appropriate microformat data will be able to make their information understandable...
Coming this Month: Wolfram|Alpha Search
By Michael Fitzgerald, May 12, 2009
Sometime this month, a new, more-interesting-than-your-average-search-engine will launch: Wolfram|Alpha. Wolfram Research, who has brought us Mathematica and A New Kind of Science, has set out to "create a true computational knowledge engine" that I am dying to play with. According...
Results from Wolfram Alpha: All the Questions We Ever Wanted to Ask About Software as a Service
By Andy Oram, May 6, 2009
Software as a Service, known in earlier decades as Application Service Providers, upends the relationship between computer users and software. I'm seriously tempted to say that Wolfram Alpha takes the SaaS model to its extreme. So Wolfram Alpha's chances at scaling the heights of fame should force us to stop for a moment and run our own calculations concerning the value to us of data integrity, reliability, privacy, and innovation.
Four short links: 29 Apr 2009
By Nat Torkington, April 29, 2009
Moot Wins, Time Inc. Loses -- summary of how the 4chan group Anonymous rigged the voting in Time's 100 Most Influential poll to not just put their man at the top, but also spell an in-joke with the initial letters of the first 21 people. Time tried weakly to prevent the vote-rigging, and ReCAPTCHA gave the Internet scalliwags their...
Four short links: 27 Apr 2009
By Nat Torkington, April 27, 2009
Google Server and Data Center Details -- Greg Linden reports on an Efficient Data Center Summit. Google uses single volt power and on-board uninterruptible power supply to raise efficiency at the motherboard from the norm of 65-85% to 99.99%. There is a picture of the board on slide 17 (and this is a 2005 board). Greg has left Microsoft...
Four short links: 21 Apr 2009
By Nat Torkington, April 21, 2009
Space arrays, mobile hell, book scanners, and open source brains: Great Brazilian Sat-Hack Crackdown (Wired) -- Satellite hackers in Brazil are bouncing ham signals off a disused US military satellite array. Truck drivers love the birds because they provide better range and sound than ham radios. Rogue loggers in the Amazon use the satellites to transmit coded warnings when authorities...
Active Facebook Users By Country
By Ben Lorica, April 19, 2009
Since I last posted numbers on Facebook's user base six weeks ago, the company has added close to 20 million active users. I've had a few requests for detailed numbers by country so I quickly assembled an update for each of the regions shown above....
Simplify business research with Google Ajax Search API
By Andrew Odewahn, April 13, 2009
Business research usually starts with a list -- brands, competitors, people, products, whatever. This post describes a quick Python script that uses the Google Search API to automate the routine parts of the task, giving you more time to analyze and understand the results.
Four short links: 7 Apr 2009
By Nat Torkington, April 7, 2009
Maps, meaning, makers, and orphaned works: Lens Tools and Fisheye Map Browsing -- a summary of magnification in maps through history, culminating in use of the fisheye/lens as a way to explore layers and data in thematic maps. (via Titine's delicious stream) Socially Relevant Computing -- frustrated by the meaningless examples and work in computer science classes, Mike Buckley started...
Four short links: 2 Apr 2009
By Nat Torkington, April 2, 2009
Predictions, PDF, source code control, and recommendation engines: Wrong Tomorrow -- track pundits' predictions and see how accurate they really are. From the ever-awesome Maciej Ceglowski. PDFMiner -- Unlike other PDF-related tools, it allows to obtain the exact location of texts in a page, as well as other layout information such as font size or font name, which could be...
At Risk: Universal Online Access to All Knowledge
By Linda Stone, March 11, 2009
"After digesting the proposed Google Book Settlement, it becomes clear that the dizzyingly complex agreement is, in essence, an elaborate scheme for the exploitation of orphan works… The upshot, if the Settlement is approved, would be legal protection for Google, and only for Google, to scan and provide digital access to the orphan works."
Four short links: 5 Mar 2009
By Nat Torkington, March 5, 2009
Google Books, conference books, a museum API, and some number silliness that makes me happy. Jon Orwant on Google Book Search at TOC -- Jon drops info on conversion rates, future plans, mobile, etc. See this post for a roundup of blog-world commentary on the talk. Brooklyn Museum Collection API -- I've linked to this amazing museum work before. Now...
No Need for Weed. Or, Misadventures in Google Book Search
By Allen Noren, March 3, 2009
We all know that Google is known for their groundbreaking software, their keen ability to mine the collective intelligence of users and deep data sets to deliver just the right bit of information one needs at a point in...
How Entity Extraction is Fueling the Semantic Web Fire
By Dan McCreary, February 23, 2009
New open source entity extraction programs are becoming easier than ever for non-programmers to use. Apache UIMA is one example of a revolutionary technology that will make it easier than ever for non-programmers to tap the power of the Semantic Web.
Four short links: 16 Feb 2009
By Nat Torkington, February 16, 2009
A lot of Python and databases today, with some hardware and Twitter pranking/security worries to taste: Free Telephony Project, Open Telephony Hardware -- professionally-designed mass-manufactured hardware for telephony projects. E.g., IP04 runs Asterisk and has four phone jacks and removable Flash storage. Software, schematics, and PCB files released under GPL v2 or later. Don't Click Prank Explained -- inside the...
Google Opens Mobile Access to Public-Domain Books
By Andrew Savikas, February 5, 2009
Via a Google press release, word that visiting books.google.com/m provides mobile access to 1.5 million public-domain books from within Google Book Search: Today, we're making it possible for anyone...
Competition in the eBook Market
By Tim O'Reilly, January 25, 2009
There's been a lot of buzz on forward-looking publisher mailing lists in the past few days about Robert Darnton's piece in the New York Review of Books, Google and the Future of Books. When it hit techmeme today, I thought it might be appropriate to share more broadly the comments I made on the Reading 2.0 list (links added, minor...
Making Site Architecture Search-Friendly: Lessons From whitehouse.gov
By Vanessa Fox, January 22, 2009
Guest blogger Vanessa Fox is co-chair of the new O'Reilly conference Found: Search Acquisition and Architecture. Find more from Vanessa at ninebyblue.com and janeandrobot.com. Vanessa is also entrepreneur in residence at Ignition Partners, and Features Editor at Search Engine Land. Yesterday, as President-elect Obama became president Obama, we geeky types filled the web with chatter about change. That change of...
Magazines Now in Google Book Search
By Mac Slocum, December 11, 2008
Google is adding back issues of magazines to its Book Search index. From the Official Google Blog: Try queries like [obama keynote convention], [hollywood brat pack] or [world's most challenging...
Looking under what rises to the top: personal information in online searches
By Andy Oram, December 10, 2008
The search for self remains a powerful force, driving the flood of social networks, microblogging, and the posting of photos and videos to the Web. The urge toward self-definition exerts itself also when we search for information on other people--and that's where it becomes a problem.
Why Are Newspapers Dying?
By Kurt Cagle, December 9, 2008
While newspapers are likely on their way to the recycle bin, editorial journalism isn't. We are moving to an era where the journalistic integrity and personal prestige of the individual journalist are becoming more important than the prestige of the newspaper or other media that the journalist writes for. Journalism is becoming decentralized, and there are many indications that this is, just perhaps, a good thing.
Education of software project members: New API posted
By Andy Oram, December 6, 2008
Over the past month I've made a few significant updates to my API for educating software project members.
EFF Attorney: Google Book Search Settlement Weakens Innovation
By Peter Brantley, November 20, 2008
In an editorial in The Recorder, Fred von Lohmann of the Electronic Frontier Foundation says Google's settlement with publishers and authors signals an implicit abandonment of Google's legal team...
Point-Counterpoint: On Digital Book DRM
By Peter Brantley, November 20, 2008
In the first part of a point-counterpoint exchange, Peter Brantley outlines reasons why DRM is bad for book publishers.
Slides from "What Publishers Need to Know about Digitization" Webcast
By Liza Daly, November 13, 2008
Slides from the "What Publishers Need to Know about Digitization" webcast.
APIs, New "Transactions" and the Google Book Search Registry
By Peter Brantley, November 13, 2008
At PersonaNonData, Michael Cairns discusses the Google Book Search registry, and muses whether it might support certain types of transactions through an API: How the registry may be formed is...
Android Barcode App Connects to Google Book Search
By Peter Brantley, November 12, 2008
Google has released a nifty Android app that permits the scanning of a book's barcode, enabling the linkage with the corresponding work in Google Book Search. From E-Reads: "Google has...
Google Responds to Some Book Search Questions
By Mac Slocum, November 6, 2008
Shortly after last week's Google Book Search announcement, Siva Vaidhyanathan posed a number of questions about the agreement's impact on publishers, libraries and consumers. Google responded, and today Vaidhyanathan...
A Call for Tiered Access to Google Book Search Terminals
By Mac Slocum, November 4, 2008
Peter Brantley says proposed public access (pdf) to Google Book Search library terminals is too restrictive, particularly in areas serving underprivileged populations: This is not an economic matter; it is...