Chapter 1. Reputation Systems Are Everywhere

Reputation systems impact your life every day, even when you don’t realize it. You need reputation to get through life efficiently, because reputation helps you make sound judgments in the absence of any better information. Reputation is even more important on the Web, which has trillions of pages to sort through—each one competing for your attention. Without reputation systems for things like search rankings, ratings and reviews, and spam filters, the Web would have become unusable years ago.

This book will clarify the concepts and terminology of reputation systems and define their mechanisms. With these tools, you can analyze existing models, or even design, deploy, and operate your own online reputation systems.

But, before all that, let us start at the beginning….

An Opinionated Conversation

Imagine the following conversation—maybe you’ve had one like it yourself. Robert is out to dinner with a client, Bill, and proudly shares some personal news.

He says, “My daughter Wendy is going to Harvard in the fall.”

“Really! I’m curious—how did you pick Harvard?” asks Bill.

“Why, it has the best reputation. Especially for law, and Wendy wants to be a lawyer.”

“Did she consider Yale? My boss is a Yale man—swears by the law school.”

“Heh. Yes, depending on who you ask, their programs are quite competitive. In the end, we really liked Harvard’s proximity. We won’t be more than an hour away.”

“Won’t it be expensive?”

“It’s certainly not cheap…but it is prestigious. We’ll make trade-offs elsewhere if we have to—it’s worth it for my little girl!”

It’s an unremarkable story in the details (OK, maybe most of us haven’t been accepted to Harvard), but this simple exchange demonstrates the power of reputation in our everyday lives. Reputation is pervasive and inescapable. It’s a critical tool that enables us to make decisions, both large (like Harvard versus Yale) and small (what restaurant would impress my client for dinner tonight?). Robert and Bill’s conversation also yields other insights into the nature of reputation.

People Have Reputations, but So Do Things

We often think of reputation in terms of people (perhaps because we’re each so conscious of our own reputation), but of course a reputation can also be acquired by many types of things. In this story, Harvard, a college, obviously has a reputation, but so may a host of other things: the restaurant in which Bill and Robert are sharing a conversation, the dishes that they’ve ordered, and perhaps the wine that accompanies their meal.

It’s probably no coincidence that Bill and Robert have made the specific set of choices that brought them to this moment: reputation has almost certainly played a part in each choice. This book describes a formal, codified system for assessing and evaluating the reputations of both people and things.

Reputation Takes Place Within a Context

Robert praises Harvard for its generally excellent reputation, but that is not what led his family to choose the school: it was Harvard’s reputation as a law school in particular. Reputation is earned within a context. Sometimes its value extends outside that context (for example, Harvard is well regarded for academic standards in general). And reputations earned in one context certainly influence reputations in other contexts.

Things can have reputations in multiple contexts simultaneously. In our example, domains of academic excellence are important contexts. But geography can define a context as well, and it can sway a final decision. Furthermore, all of an item’s reputations need not agree across contexts. In fact, it’s highly unlikely that they will. It’s entirely possible to have an excellent reputation in one context, an abysmal one in another, and no reputation at all in a third. No one excels at everything, after all.

For example, a dining establishment may have a five-star chef and the best seafood in town, but woefully inadequate parking. Such a situation can lead to seemingly oxymoronic statements such as Yogi Berra’s famous line: “No one goes there anymore—it’s too crowded.”
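One way to picture reputation-per-context is a lookup keyed by both the entity and the context. The sketch below is purely illustrative: the restaurant name, the context labels, and the 1-to-5 scale are all assumptions invented for this example.

```python
from typing import Optional

# Hypothetical scores on an assumed 1-to-5 scale, keyed by (entity, context).
reputations = {
    ("Harbor Grill", "seafood"): 4.8,
    ("Harbor Grill", "parking"): 1.5,
}

def reputation_in(entity: str, context: str) -> Optional[float]:
    """Return the entity's score in one context, or None if it has none there."""
    return reputations.get((entity, context))

# Excellent in one context, abysmal in another, absent in a third.
for context in ("seafood", "parking", "wine list"):
    print(context, reputation_in("Harbor Grill", context))
```

Returning None rather than a default score keeps "no reputation at all" distinct from "a bad reputation," which is the distinction the paragraph above draws.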

We Use Reputation to Make Better Decisions

A large part of this book is dedicated to defining reputation in a formal, systematized fashion. But for now, put simply (and somewhat incompletely), reputation is information used to make a value judgment about a person or a thing. It’s worth examining this assertion in a little more detail.

Reputation is information used to make a value judgment about an object or a person.

Where does this information come from? It depends—some of it may be information that you, the evaluator, already possess (perhaps through direct experience, longstanding familiarity, or the like). But a significant component of reputation has to do with assimilating information that is externally produced, meaning that it does not originate with the person who is evaluating it. We tend to rely more heavily on reputation in circumstances where we don’t have firsthand knowledge of the object being evaluated, and the experiences of others can be an invaluable aid in our decision. This is even more true as we move our critical personal and professional decisions online.

What kinds of value judgments are we talking about? All kinds. Value judgments can be decisive, continuous, and expressive. Sometimes a judgment is as simple as declaring that something is noteworthy (thumbs up or a favorite). Other times you want to know the relative rank or a numeric scale value of something in order to decide how much of your precious resources—attention, time, or money—to dedicate to it. Still other judgments, such as movie reviews or personal testimonials, are less about calculation and more about freeform analysis and opinion. Finally, some judgments, such as “all my friends liked it,” make sense only in a small social context.

What about the people and things that we’re evaluating? We’ll refer to them as reputable entities (that is, people or things capable of accruing reputation) throughout this book. Some entities are better candidates for accruing reputation than others, and we’ll give guidance on the best strategies for identifying them.

Finally, what kind of information do we mean? Well, almost anything. In a broad sense, if information can be used to judge someone or something, then it informs—in some part—the reputation of that person or thing. In approaching reputation in a formal, systematized way, it’s beneficial to think of information in small, discrete units; throughout this book, we’ll show that the reputation statement is the building block of any reputation system.

The Reputation Statement

Explicit: Talk the Talk

So what are Robert and Bill doing? They’re exchanging a series of statements about an entity, Harvard. Some of these statements are obvious: “Harvard is expensive,” says Bill. Others are less direct: “Their programs are quite competitive” implies that Robert has in fact compared Harvard to Yale and chosen Harvard. Robert might have said more directly, “For law, Harvard is better than Yale.” These direct and indirect assertions feed into the shared model of Harvard’s reputation that Robert and Bill are jointly constructing. We will call an asserted claim like this an explicit reputation statement.

Implicit: Walk the Walk

Other reputation statements in this story are even less obvious. Consider for a moment Wendy, Robert’s daughter—her big news started the whole conversation. While her decision was itself influenced by Harvard’s many reputations—as being a fine school, as offering a great law program, as an excellent choice in the Boston area—her actions themselves are a form of reputation statement, too. Wendy applied to Harvard in the first place. And, when accepted, she chose to attend over her other options. This is a very powerful claim type that we call an implicit reputation statement: action taken in relation to an entity. The field of economics calls the idea revealed preference; a person’s actions speak louder than her words.

The Minimum Reputation Statement

Any of the following types of information might be considered viable reputation statements:

  • Assertions made about something by a third party. (Bill, for instance, posits that Harvard will be expensive.)

  • Factual statistics about something.

  • Prizes or awards that someone or something has earned in the past.

  • Actions that a person might take toward something (for example, Wendy’s application to Harvard).

All of these reputation statements—and many more—can be generalized in this way:

[Figure: a reputation statement, in which a source makes a claim about a target.]

As it turns out, this model may be a little too generalized; some critical elements are left out. For example, as we’ve already pointed out, these statements are always made in a context. But we’ll explore other enhancements in Chapter 2. For now, the general concepts to get familiar with are source, target, and claim. Here’s an example of a reputation statement broken down into its constituent parts. This one happens to be an explicit reputation statement by Bill:

[Figure: an explicit reputation statement. Source: Bill. Claim: is expensive. Target: Harvard.]

Here’s another example, an action, which makes an implicit reputation statement about the quality of Harvard:

[Figure: an implicit reputation statement. Source: Wendy. Claim: chose to attend. Target: Harvard.]

You may be wrestling a bit with the terminology here, particularly the term claim. (“Why, Wendy’s not claiming anything,” you might be thinking. “That’s simply what she did.”) It may help to think of it like this: we are going to make the claim—by virtue of watching Wendy’s actions—that she believes Harvard is a better choice for her than Yale. We are drawing an implicit assumption of quality from her actions. There is another possible reputation statement hiding in here, one with a claim of did-not-choose and a target of Yale.

These are obviously two fairly simple examples. And, as we said earlier, our simplified illustration of a reputation statement is omitting some critical elements. Later, we’ll revise that illustration and add a little rigor.
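For readers who think in code, the source, claim, and target triple maps naturally onto a small record type. The following is a minimal sketch, not a definitive model: the class name, the field names, the explicit flag, and the wording of each claim are our own assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class ReputationStatement:
    source: str     # who (or what) makes the statement
    claim: str      # what is asserted, or what action was taken
    target: str     # the entity the statement is about
    explicit: bool  # True for an assertion, False for an observed action

# The statements from Robert and Bill's conversation (claim wording assumed).
statements = [
    ReputationStatement("Bill", "is expensive", "Harvard", explicit=True),
    ReputationStatement("Wendy", "chose to attend", "Harvard", explicit=False),
    ReputationStatement("Wendy", "did not choose", "Yale", explicit=False),
]

def about(target: str) -> List[ReputationStatement]:
    """Collect every statement made about one target."""
    return [s for s in statements if s.target == target]

for s in about("Harvard"):
    kind = "explicit" if s.explicit else "implicit"
    print(f"{s.source} -> {s.claim} -> {s.target} ({kind})")
```

Note that one action by Wendy yields two records, one per target, which mirrors the did-not-choose statement about Yale described above.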

Reputation Systems Bring Structure to Chaos

By what process do these random and disparate reputation statements cohere and become a reputation? In real life, it’s sometimes hard to say: boundaries and contexts overlap, and impressions get muddied. Often, real-world reputations are no more advanced than irregular, misshapen lumps of collected statements, coalescing to form a haphazard whole. Ask someone, for example, “What do you think about Indiana?” Or George W. Bush? You’re liable to get 10 different answers from eight different people. It’s up to you to keep those claims straight and form a cohesive thought from them.

Systems for monitoring reputation help to formalize and delineate this process. A (sometimes, but not always) welcome side effect is that reputation systems also end up defining positive reputations, and suggesting exactly how to tell them from negative ones. (See the sidebar Negative and Positive Reputation.) Next, we’ll discuss some real-world reputation systems that govern all of our lives.

Then, the remainder of this book proposes a system that accomplishes that very thing for the social web. For the multitude of applications, communities, sites, and social games that might benefit from a reputation-enriched approach, we’ll take you—the site designer, developer, or architect—through the following process:

  • Defining the targets (or the best reputable entities) in your system

  • Identifying likely sources of opinion

  • Codifying the various claims that those sources may make

Reputation Systems Deeply Affect Our Lives

We all use reputation every day to make better decisions about anything, from the mundane to choices critical for survival. But the flip side is just as important and pervasive—a multitude of reputation systems currently evaluate you, your performance, and your creations. This effect is also true for the groups that you are a member of: work, professional, social, or congregational. They all have aggregated reputations that you are a part of, and their reputation reflects on you as well. These reputations are often difficult to perceive and sometimes even harder to change.

Local Reputation: It Takes a Village

Many of your personal and group reputations are limited in scope: your latest performance evaluation at work is between you, your boss, and the human resources department; the family living on the corner is known for never cutting their grass; the hardware store on Main Street gives a 10% discount to regular customers. These are local reputations that represent much of the fabric that allows neighbors, coworkers, and other small groups to make quick, efficient decisions about where to go, whom to see, and what to do.

Local reputation can be highly valuable to those outside of the original context. If the context can be clearly understood and valued by a larger audience, then surfacing a local reputation more broadly can create significant real-world value for an entity. For example, assuming a fairly standard definition of a good sushi restaurant, displaying a restaurant’s local reputation to visitors can increase the restaurant’s business and local tax revenue. This is exactly what the Zagat’s guide does—it uses local reputation statements to produce a widely available and profitable reputation system.

Note that—even in this example—a reputation system has to create a plethora of categories (or contexts) in order to overcome challenges of aggregating local reputation on the basis of personal taste. In Manhattan, Zagat’s lists three types of American cuisine alone: new, regional, and traditional. We will discuss reputation contexts and scope further in Chapter 6.

On the other hand, a corporate performance review would not benefit from broader publication. On the contrary, it is inappropriate, even illegal in some places, to share that type of local reputation in other contexts.

Generally, local reputation has the narrowest context, is the easiest to interpret, and is the most malleable. Sources are so few that it is often possible—or even required—to change or rebuild collective local perception. A retailer displaying a banner that reads “Under New Management” is probably attempting to reset his business’s reputation with local customers. Likewise, when you change jobs and get a new boss, you usually have to (or get to, depending on how you look at it) start over and rebuild your “good worker” reputation.

Global Reputation: Collective Intelligence

When strangers who do not have access to your local reputation contexts need to make decisions about you, your stuff, or your communities, they often turn to reputations aggregated in much broader contexts. These global reputations are maintained by external formal entities—often for-profit corporations that typically are constrained by government regulation.

Global reputations differ from local ones in one significant way: the sources of the reputation statements do not know the personal circumstances of the target. That is, strangers generate reputation claims for other strangers.

You may think, “Why would I listen to strangers’ opinions about things I don’t yet know how to value?” The answer is simply that a collective opinion is better than ignorance, especially if you are judging the value of the target reputable entity against something precious—such as your time, your health, or your money.

Here are some global reputations you may be familiar with:

  • The FICO credit score represents your ability to make payments on credit accounts, among many other things.

  • Television advertising revenues are closely tied to Nielsen ratings. They measure which demographic groups watch which programming.

  • For the first 10 years after the Web came into widespread use, page views were the primary metric for the success of a site.

  • Before plunking down their $10 or more per seat, over 60% of U.S. moviegoers report consulting online movie reviews and ratings created by strangers.

  • Statistics such as the Dow Jones Industrial Average, the trade deficit, the prime interest rate, the consumer confidence index, the unemployment rate, and the spot price of crude oil are all used as proxies for indicating America’s economic health.

Again, these examples are aggregated from both explicit (what people say) and implicit (what people do) claims. Global reputations exist on such a large scale that they are very powerful tools in otherwise information-poor contexts. In all the previous examples, reputation affects the movement of billions of dollars every day.

Even seemingly trivial scores such as online movie ratings have so much influence that movie studios have hired professional review writers to pose as regular moviegoers, posting positive ratings early in an attempt to inflate opening weekend attendance figures. This is known in the industry as buzz marketing, and it’s but one small example of the pervasive and powerful role that formal reputation systems have assumed in our lives.

FICO: A Study in Global Reputation and Its Challenges

Credit scores affect every modern person’s life at one time or another. A credit score is the global reputation that has the single greatest impact on the economic transactions in your life. Several credit scoring systems and agencies exist in the United States, but the prevalent reputation tool in the world of creditworthiness is the FICO credit score devised by the company Fair Isaac. We’ll touch on how the FICO score is determined, how it is used and misused, and how difficult it is to change.

The lessons we learn from the FICO score apply nearly verbatim to reputation systems on the Web.

The FICO score is based on the following factors (all numbers are approximate; see Figure 1-1):

  • Start with 850 points—the theoretical maximum score. Everything is downhill from here.

  • The largest share, up to 175 points, is deducted for late payments.

  • The next most important share, up to 150 points, penalizes you for outstanding balances close to or over available credit limits (capacity).

  • Up to 75 points are deducted if your credit history is short. (This effect is reduced if your scores for other factors are high.)

  • Another 50 points may be deducted if you have too many new accounts.

  • Up to 50 points are reserved for other factors.

Figure 1-1. Your credit score is a formalized reputation model made up of numerous inputs.
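The breakdown above amounts to subtraction from a fixed ceiling. Here is a minimal sketch of that arithmetic, using only the approximate point values listed. The real scoring formula is proprietary, so the function name, the factor keys, and the idea of scaling each deduction linearly by a severity between 0 and 1 are assumptions made for illustration.

```python
MAX_SCORE = 850

# Maximum deduction per factor, in points (approximate values from the list above).
MAX_DEDUCTIONS = {
    "late_payments": 175,
    "capacity": 150,
    "short_history": 75,
    "new_accounts": 50,
    "other": 50,
}

def score(severity: dict) -> int:
    """Subtract each factor's deduction from the maximum score.

    `severity` maps a factor name to a value from 0.0 (no penalty) to
    1.0 (full penalty); missing factors count as 0.0. The linear scaling
    is an assumption for illustration only.
    """
    total = MAX_SCORE
    for factor, cap in MAX_DEDUCTIONS.items():
        fraction = min(max(severity.get(factor, 0.0), 0.0), 1.0)
        total -= round(cap * fraction)
    return total

print(score({}))                                           # 850
print(score({f: 1.0 for f in MAX_DEDUCTIONS}))             # 350
print(score({"late_payments": 0.4, "new_accounts": 1.0}))  # 730
```

The listed maximum deductions sum to 500 points, so the lowest score this simplified model can produce is 350. Because the chapter's numbers are approximate, that floor should not be read as the floor of the real scale.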

Like all reputation scores, the FICO score is aggregated from many separate reputation statements. In this case, the reputation statements are assertions such as “Randy was 15 days late with his Discover payment last month,” all made by various individual creditors. So, for the score to be correct, the system must be able to identify the target (Randy) consistently and be updated in a timely and accurate way.

When new sources (creditors) appear, they must comply with the claim structure and be approved by the scoring agency; a bogus source or bad data can seriously taint the resulting scores. Given these constraints and a carefully tuned formulation, the FICO score may well be a reasonable representation of something we can call creditworthiness.
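These two requirements, consistent identification of the target and vetting of sources, can be sketched in a few lines. Everything named here is hypothetical: the creditor names other than Discover, the target ID format, and the accept function are invented for the example and do not describe how any scoring agency actually works.

```python
from collections import defaultdict

# Hypothetical registry of creditors approved by the scoring agency.
APPROVED_SOURCES = {"Discover", "First Example Bank"}

def accept(statements):
    """Group claims by target ID, setting aside any from unapproved sources."""
    by_target = defaultdict(list)
    rejected = []
    for source, target_id, claim in statements:
        if source in APPROVED_SOURCES:
            by_target[target_id].append((source, claim))
        else:
            rejected.append((source, target_id, claim))
    return dict(by_target), rejected

# Each statement is (source, target ID, claim); the ID stands in for
# whatever stable identifier lets the system recognize Randy every time.
incoming = [
    ("Discover", "randy-001", "15 days late"),
    ("Shady Loans", "randy-001", "defaulted"),
    ("First Example Bank", "randy-001", "paid on time"),
]

accepted, rejected = accept(incoming)
print(accepted)
print(rejected)
```

Keying on a stable identifier rather than a name is the design choice that matters here: two statements about the same person only aggregate correctly if the system can tell they share a target.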

For most of its history of more than 50 years, the FICO score was shrouded in mystery and nearly inaccessible to consumers, except when they were opening major credit lines (such as when purchasing a home). At the time, this obscurity was considered a benefit. A benefit, that is, to lenders and the scoring agencies, which, in operating a high-fee-per-transaction business, were happy to be talking only with one another. But this lack of transparency meant that an error in your FICO score could go undetected for months—or even years—with potentially deleterious effects on your cash flow: increased interest rates, decreased credit limits, and higher lending fees.

However, as it has in most other businesses, the Internet has brought about a reform of sorts in credit scoring. Nowadays you can quickly get a complete credit report or take advantage of a host of features related to it: flags to alert you when others are looking at your credit data, or alarms whenever your score dips or an anomalous reputation statement appears in your file.

[In the United States] an employer is generally permitted to [perform a credit check], primarily because there is no federal discrimination law that specifically prohibits employment discrimination on the basis of a bad credit report.

EmployeeIssues.com

As access to credit reports has increased, the credit bureaus have kept pace with the trend and have steadily marketed the reports for a growing number of purposes. More and more transaction-based businesses have started using them (primarily the FICO score) for less and less relevant evaluations. In addition to their original purpose—establishing the terms of a credit account—credit reports are now used by landlords for the less common but somewhat relevant purpose of risk mitigation when renting a house or apartment and by some businesses to run background checks on prospective employees—a legal but unreasonably invasive requirement.

Global reputation scores are so powerful and easily accessible that the temptation to apply them outside of their original context is almost irresistible. The rise and spread of the FICO score illustrates what can happen when a reputation that is powerful and ubiquitous in one specific context is used in other, barely related contexts: it transforms the reputation beyond recognition. In this ironic case, your ability to get a job (to make money that will allow you to pay your credit card bills) can be seriously hampered by the fact that your potential boss can determine that you are over your credit limit.

Web FICO?

Several startup companies have attempted to codify a global user reputation to be employed across websites, and some try to leverage a user’s preexisting eBay seller’s Feedback score as a primary value in their rating. They are trying to create some sort of real person or good citizen reputation system for use across all contexts. As with the FICO score, it is a bad idea to co-opt a reputation system for another purpose, and it dilutes the actual meaning of the score in its original context. The eBay Feedback score reflects only the transaction-worthiness of a specific account, and it does so only for particular products bought or sold on eBay. The user behind that identity may in fact steal candy from babies, cheat at online poker, and fail to pay his credit card bills. Even eBay displays multiple types of reputation ratings within its singular limited context. There is no web FICO because there is no kind of reputation statement that can be legitimately applied to all contexts.

Reputation on the Web

Over the centuries, as human societies became increasingly mobile, people started bumping into one another. Increasingly, we began to interact with complete strangers and our locally acquired knowledge became inadequate for evaluating the trustworthiness of new trading partners and goods. The emergence of various formal and informal reputation systems was necessary and inevitable. These same problems of trust and evaluation are with us today, on the Web. Only…more so. The Web has no centralized history of reputable transactions and no universal identity model. So we can’t simply mimic real-world reputation techniques, where once you find someone (or some group) that you trust in one context, you can transfer that trust to another. On the Web, no one knows who you are, or what you’ve done in the past. There is no multi-context reputation at large for users of the Web, at least for the vast majority of users.

Consider what people today are doing online. Popular social media sites are the product of millions of hands and minds. Around the clock and around the globe, the world is pumping out contributions small and large: full-length features on Vimeo, video shorts on YouTube, entries on Blogger, discussions on Yahoo! Groups, and tagged-and-titled Del.icio.us bookmarks. User-generated content and robust crowd participation have become the hallmarks of Web 2.0.

But the booming popularity of social media has created a whole new set of challenges for those who create websites and online communities (not to mention the challenges faced by the users of those sites and communities). Here are just a few of them.

Attention Doesn’t Scale

Attention Economics: An approach to the management of information that treats human attention as a scarce commodity…

Wikipedia

If there ever was any question that we live in an attention economy, YouTube has put a definitive end to it. According to YouTube’s own data, 10 hours of video are uploaded to YouTube every minute. That’s over 14,000 hours of video each and every day. If you started watching just today’s YouTube contributions nonstop, end to end, you’d be watching for the next 600 days; at an hour a day, it would take you nearly 40 years. That’s a lot of sneezing pandas!
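A quick check of the arithmetic, taking the quoted upload rate at face value: one day of uploads is 14,400 hours, which is 600 days of round-the-clock viewing, or roughly 40 years at one hour per day. The variable names are ours.

```python
HOURS_UPLOADED_PER_MINUTE = 10  # upload rate quoted in the text

hours_per_day = HOURS_UPLOADED_PER_MINUTE * 60 * 24
days_nonstop = hours_per_day / 24              # watching 24 hours a day
years_at_one_hour_daily = hours_per_day / 365  # watching 1 hour a day

print(hours_per_day)                      # 14400
print(days_nonstop)                       # 600.0
print(round(years_at_one_hour_daily, 1))  # 39.5
```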

Clearly, no one has the time to personally sift through the massive amount of material uploaded to YouTube. This situation is a problem for all concerned.

  • If I’m a visitor to YouTube, it’s a problem of time management. How can I make sure that I’m finding the best and most relevant stuff in the time I have available?

  • If I’m a video publisher on YouTube, I have the opposite problem: how can I make sure that my video gets noticed? I think it’s good content, but it risks being lost in a sea of competitors.

  • And, of course, YouTube itself must manage an overwhelming inflow of user contributions, with the attendant costs (storage, bandwidth, and the like). It’s in YouTube’s best interest to quickly identify abusive content to be removed, and popular content to promote to its users. This decision-making process also has significant cost implications—the most viewed videos can be cached for the best performance, while rarely viewed items can be moved to slower, cheaper storage.
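The caching decision in the last item can be pictured as a simple rule driven by an implicit reputation signal, the view count. This is a hypothetical sketch and not a description of YouTube's infrastructure: the function name, the tier names, the 30-day window, and both thresholds are made up.

```python
def storage_tier(views_last_30_days: int,
                 hot_threshold: int = 10_000,
                 cold_threshold: int = 10) -> str:
    """Pick a storage tier from recent view counts (thresholds are made up)."""
    if views_last_30_days >= hot_threshold:
        return "cache"    # most viewed: keep on the fastest storage
    if views_last_30_days <= cold_threshold:
        return "archive"  # rarely viewed: slower, cheaper storage
    return "standard"

for views in (250_000, 500, 3):
    print(views, storage_tier(views))
```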

There’s a Whole Lotta Crap Out There

Sturgeon’s Law: Ninety percent of everything is crud.

Theodore Sturgeon, author, March 1958

Even in contexts where attention is abundant and the sheer volume of user-generated content is not an issue, there is the simple fact that much of what’s contributed just may not be that good. Filtering and sorting the best and most relevant content is what web search engines such as Google are all about. Sorting the wheat from the chaff is a multibillion-dollar industry.

The great content typically is identified by reputation systems, local site editors, or a combination of the two, and it is often featured, promoted, highlighted, or rewarded (see Figure 1-2).

Figure 1-2. Content at the higher end of the scale should be rewarded, trumpeted, and showcased; stuff on the lower registers will be ignored, hidden, or reported to the authorities.

The primary goal of a social media site should be to make user-generated content of good quality constitute the bulk of what users interact with regularly. To reach that goal, user incentive reputation systems are often combined with content quality evaluation schemes.

Like an off-color joke delivered in mixed company, seemingly inappropriate content may become high-quality content when it’s presented in another context. The quality of such content may be OK, but moving or improving the content will move it up the quality scale. On an ideal social media site, community members would regularly encounter only content that is OK or better.

Unfortunately, when a site has the minimum possible social media features—such as blog comments turned on without oversight or moderation—the result is usually a very high ratio of poor content. As user-generated content grows, content moderation of some sort is always required: typically, either employees scan every submission or the site’s operators deploy a reputation system to identify bad content. Simply removing the bad content usually isn’t good enough—most sites depend on search engine traffic, advertising revenue, or both. To get search traffic, external sites must link to the content, and that means the quality of the content has to be high enough to earn those links.

Then there are submissions that violate the terms of service (TOS) of a social website. Such content needs to be removed in a timely manner to avoid dragging down the average quality of content and degrading the overall value of the site.

Finally, if illegal content is posted on a site, not only must it be removed, but the site’s operators may be required to report the content to local government officials. Such content obviously must be detected and dealt with as quickly and efficiently as possible.

For sites large and small, the worst content can be quickly identified and removed by a combination of reputation systems and content moderators. But that’s not all reputation systems can do. They also provide a way to identify, highlight, and reward the contributors of the highest quality content, motivating them to produce their very best stuff.

People Are Good. Basically.

Of course, content on your site does not just appear, fully formed, like Athena from the forehead of Zeus. No, we call it user-generated content for a reason. And any good reputation system must consider this critical element—the people who power it—before almost anything else.

Visitors to your site will come for a variety of reasons, and each will arrive prearmed with her own motivations, goals, and prejudices. On a truly successful social media site, it may be impossible to generalize about those factors. But it does help to consider the following guidelines, regardless of your particular community and context.

Know thy user

Again, individual motivations can be tricky—in a community of millions like the Web, you’ll have as many motivating factors as users (if not more; people are a conflicted lot). But be prepared, at least, to anticipate your contributors’ motivations and desires. Will people come to your site and post great content because…

  • They crave attention?

  • It’s intrinsically rewarding to them in some way?

  • They expect some monetary reward?

  • They’re acting altruistically?

In reality, members of your community will act (and act out) for all of these reasons and more. And the better you can understand why they do what they do, the better you can fine-tune your reputation system to reflect the real desires of the people that it represents. We’ll talk more about your community members and their individual motivations in Chapter 5, but we’ll generalize about them a bit here.

Honor creators, synthesizers, and consumers

Not everyone in your community will be a top contributor. This is perfectly natural, expected, and—yes—even desired. Bradley Horowitz (vice president of product management at Google) makes a distinction among creators, synthesizers, and consumers (see Figure 1-3) and speculates on the relative percentages of each that you’ll find in a given community:

Creators

1% of the user population might start a group (or a thread within a group).

Synthesizers

10% of the user population might participate actively and actually submit content, whether starting a thread or responding to one.

Consumers

100% of the user population benefits from the activities of these two groups.

Figure 1-3. In any community, you’ll likely find a similar distribution of folks who actively administer the site, those who contribute, and those who engage with it in a more passive fashion.
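To get a feel for these proportions, it helps to apply them to a concrete community size. The sketch below simply multiplies out the speculative 1%, 10%, and 100% figures; the function name and the 50,000-member community are assumptions for the example, and real communities will vary.

```python
def participation_estimate(total_users: int) -> dict:
    """Apply the 1% / 10% / 100% rule of thumb to a community size."""
    return {
        "creators": total_users // 100,     # about 1% start groups or threads
        "synthesizers": total_users // 10,  # about 10% actively submit content
        "consumers": total_users,           # everyone benefits
    }

print(participation_estimate(50_000))
# {'creators': 500, 'synthesizers': 5000, 'consumers': 50000}
```

The groups overlap rather than partition the population: creators are also counted among synthesizers, and both are counted among consumers, which is why the percentages do not sum to 100.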

Again, understanding the roles that members of a community naturally fall into will help you formulate a reputation system that enhances this community dynamic (rather than fights against it). A thoughtful reputation system can help you reward users at all levels of participation and encourage them to move continually toward higher levels of participation, without ever discouraging those who are comfortable simply being site consumers.

Throw the bums out

And then there are the bad guys. Not every actor in your community has noble intentions. Attention is a big motivator for many community participants, but for some of them, known as trolls, that crassest of motivations is the only one that really matters. Trolls are after your attention, plain and simple, and will stoop to any behavioral ploy to get it. Luckily, they can be deterred, often with only a modicum of effort when that effort is directed in the right way.

A far more persistent and methodical group of problem users has a financial motive: if your application is successful, spammers will want to reach your audience and will create robots that abuse your content creation tools to do it. More generally, when given too much prominence, almost any motivation can lead to bad behavior that transgresses the values of the larger community.

The Reputation Virtuous Circle

Negative reputation systems keep virtual neighborhoods garbage-free, and their chief value is generally seen as cost reduction. For example, a virtual army of robots keeps watch over controversial Wikipedia pages and reverts obvious abuse, such as “blanking” (removing all of an article’s content), nearly instantaneously. That task would cost millions of dollars a year if paid human moderators had to perform it. Like a town’s police force, negative reputation systems are often necessary, but they don’t actually make things more attractive to visitors.
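
As a rough illustration of how a robot might spot blanking, the sketch below flags an edit that removes nearly all of a substantial page and keeps the previous text live. This is not how Wikipedia's bots are actually implemented; the function names and both thresholds (`max_kept_ratio`, `min_old_length`) are assumptions chosen for the example.

```python
def looks_like_blanking(
    old_text: str,
    new_text: str,
    max_kept_ratio: float = 0.05,  # assumed threshold: 5% or less of the page survives
    min_old_length: int = 200,     # assumed threshold: ignore very short pages
) -> bool:
    """Return True when an edit removes nearly all of a substantial page."""
    old_len = len(old_text.strip())
    new_len = len(new_text.strip())
    if old_len < min_old_length:
        return False  # too short to judge automatically; leave it to humans
    return new_len <= old_len * max_kept_ratio


def review_edit(old_text: str, new_text: str) -> str:
    """Return the text that should stay live after the edit is reviewed."""
    if looks_like_blanking(old_text, new_text):
        return old_text  # revert obvious abuse
    return new_text      # accept everything else


article = "Reputation systems are everywhere. " * 40
print(review_edit(article, "") == article)  # the blanked edit is reverted
```

A real system would combine a signal like this with the reputation of the editor and the page, since a trusted user may legitimately remove most of a page.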

Where reputation systems really add value to a site’s bottom line is by focusing on identifying the very best user-generated content and offering incentives to users for creating it. Surfacing the best content creates a virtuous circle (Figure 1-4): consumers of content visit a site and link to it because it has the best content, and the creators of that content share their best stuff on that site because all the consumers go there.

Figure 1-4. Quality contributions attract more attention, which begets more reward, which inspires more quality contributions….

Who’s Using Reputation Systems?

Reputation systems are the underlying mechanisms behind some of the best-known consumer websites. For example:

  • Amazon’s product reviews are probably the best-known example of object reputation, complete with a built-in meta-moderation model: “Was this review helpful?” Its Top Reviewers program tracks reputable reviews and trusted reviewers, giving potential buyers context for evaluating a reviewer’s potential biases.

  • eBay’s feedback score is based on the number of transactions that a seller or buyer has completed. It is aggregated from hundreds or thousands of individual purchase transaction ratings.

  • Built on a deep per-post user rating and classification system, Slashdot’s karma is an often-referenced program used to surface good content and silence trolls and spammers.

  • Xbox Live’s (very successful) Achievements reward users for beating minor goals within games and cumulatively add to community members’ gamerscores.
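
To show what aggregating individual transaction ratings into a single feedback score can look like, here is a simplified sketch. It is not eBay's actual formula: the +1 / 0 / -1 encoding, the function name, and the sample ratings are all assumptions made for illustration.

```python
from collections import Counter


def feedback_summary(ratings: list[int]) -> dict:
    """Roll up per-transaction ratings (+1, 0, or -1) into summary figures.

    Hypothetical scheme: the score is positives minus negatives, and the
    percentage ignores neutral ratings.
    """
    counts = Counter(ratings)
    positive, negative = counts[1], counts[-1]
    rated = positive + negative
    return {
        "score": positive - negative,
        "transactions": len(ratings),
        "percent_positive": round(100 * positive / rated, 1) if rated else None,
    }


# Ten made-up transactions: eight positive, one neutral, one negative.
summary = feedback_summary([1, 1, 1, 0, -1, 1, 1, 1, 1, 1])
print(summary)
```

Keeping the transaction count next to the score matters: a seller with a handful of ratings and a seller with thousands can share the same percentage while deserving very different levels of trust.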

Table 1-1 illustrates that all of the top 25 websites listed on Alexa.com use at least one reputation system as a critical part of their business, many use several, and quite a few would fail without them. (Note that multiple Google and Yahoo! sites are collapsed in this table.)

Table 1-1. Use of reputation systems on top websites
Website | Vote to promote | Content rating and ranking | Content reviews and comments | Incentive karma (points) | Quality karma | Competitive karma | Abuse scoring
yahoo.*††[a]†††††††††††
google.*†††††--†††
youtube.com†††[b]†††††-††
live.com††††††††††-†††
facebook.com[c]††††††††††
msn.com††††††-††††
wikipedia.org-††--†††
blogger.com†††--†††
baidu.com††††††-††
rapidshare.com- [d]--†††--††
microsoft.com-----†††
hi5.com††
sina.com.cn
ebay.com-†††-†††
mail.ru---†††
fc2.com-----††
vkontakte.ru††††††††††

[a] Multiple types

[b] Extensively

[c] Yes

[d] Unknown

Challenges in Building Reputation Systems

User-generated sites and online games of all shapes and sizes face common challenges. Even fairly intimate community sites struggle with the same issues as large sites. Regardless of the media types on a site or the audience for which a site is intended, once a reputation system hits a certain threshold of community engagement and contribution, the following problems are likely to affect it:

Problems of scale

How to manage and present an overwhelming inflow of user contributions

Problems of quality

How to tell the good stuff from the bad

Problems of engagement

How to reward contributors in a way that keeps them coming back

Problems of moderation

How to stamp out the worst stuff quickly and efficiently

Fortunately, a well-considered strategy for employing reputation systems on your site can help you make headway on all of these problems. A reputation system compensates for an individual’s scarcest resource—his attention—by substituting a community’s greatest asset: collective energy.

Sites with applications that skillfully manage reputations (both of the site’s contributors and of their contributions) will prosper. Sites that ignore the reputation of users and content, or address it in only the crudest or most reactive way, do so at their own peril. Those sites will see the quality of their content sag and participation levels falter, and will themselves earn a reputation as places to avoid.

This book will help you understand, in detail, how reputation systems work and give you the tools you need to apply that knowledge to your site, game, or application. It will help you see how to create your own virtuous circle, producing real value to you and your community. It will also help you design and develop systems to reduce the costs of moderating abuse, especially by putting much of the power back into the hands of your most ardent users.

We have limited our examination of reputation systems to context-aggregated reputations, and therefore we will touch only lightly on the following reputation-related subjects. Each of them is covered in detail in other reference works or academic papers (see Appendix B for references to these works):

Search relevance

Algorithms such as search rank and page rank are massively complex and require teams of people with doctorates to understand, build, and operate them. If you need a good search engine, license one or use a web service.

Recommender systems

These are information filters for identifying information items of interest on the basis of similarities of attributes or personal tastes.

Social network filters

Though this book will help you understand the mechanics of most social network filters, it does not cover in depth the engineering challenges required to generate unique reputation scores for every viewing user.

We will not be addressing personal or corporate identity reputation management services, such as search engine optimization (SEO), WebPR, or trademark-monitoring. These are techniques to track and manipulate the very online reputation systems described in this book.

Conceptualizing Reputation Systems

We’ve demonstrated that reputation is everywhere and that it brings structure to chaos by allowing us to proxy trust when making day-to-day decisions. Therefore reputation is critical for capturing value on the Web, where everything and everybody is reduced to a set of digital identifiers and database records. We demonstrated that all reputation exists in a context. There is no overall web trust reputation—nor should there be. The abuses of the FICO credit score serve well as examples of the dangers therein.

Now that we’ve named this domain and limited its scope, we next seek to understand the nature of the existing examples, both successes and failures, to help create derivative and original reputation systems for new and existing applications. In order to talk consistently about these systems, we have begun to define a formal grammar, with The Reputation Statement as its core element. The remainder of this book builds on this premise, beginning with Chapter 2, which provides the formal definition of our graphical reputation system grammar. That foundation is used throughout the book, so Chapter 2 is recommended for all readers.
