Harnessing the Power of Disruptive Technologies
Edited by Andy Oram
0-596-00110-X, Order Number: 110X
448 pages, $29.95
Jeremie Miller, Jabber
Conversations are an important part of our daily lives. For most people, in fact, they are the most important way to acquire and spread knowledge during a normal working day.
Conversations provide a comfortable medium in which knowledge flows in both directions, and where contributors share an inherent context through their subjects and relationships. In addition to old forms of conversations--direct interaction and communication over the phone and in person--conversations are becoming an increasingly important part of the networked world. Witness the popularity of email, chat, and instant messaging, which enable users to increase the range and scope of their conversations to reach those that they may not have before.
Still, little attention has been paid in recent years to the popular Internet channels that most naturally support conversations. Instead, most people see the Web as the driving force, and they view it as a content delivery platform rather than as a place for exchanges among equals. The dominance of the Web has come about because it has succeeded in becoming a fundamentally unifying technology that provides access to content in all forms and formats. However, it tends toward being a traditional one-way broadcast medium, with the largest base of users being passive recipients of content.
Conversations have a stubborn way of reemerging in any human activity, however. Recently, much of the excitement and buzz around the Web have centered on sites that use it as a conversational medium. These conversations take place within a particular web site (Slashdot, eBay, Amazon.com) or an application (Napster, AIM/ICQ, Netshow).
And repeating the history of the pre-Web Internet, the new conversations sprout up in a disjointed, chaotic variety where the left hand doesn't know what the right hand is doing. The Web was a godsend for lowering the barrier to access information; it increased the value of all content by unifying the technologies that described and delivered that content. In the same way, Internet conversations stand to benefit significantly by the introduction of a common platform designed to support the rich dynamic and flexible nature of a conversation.
Jabber could well become this platform. It's not a single application (although Jabber clients can be downloaded and used right now) nor even a protocol. Instead, using XML, Jabber serves as a glue that can tie together an unlimited range of applications that tie together people and services. Thus, it will support and encourage the growth of diverse conversational systems--and this moment in Internet history is a ripe one for such innovations.
Conversations and peers
So what really is a conversation? A quick search using Dictionary.com reveals the following:
con·ver·sa·tion (kän-vr-'sá-shn) n. 1. A spoken exchange of thoughts, opinions, and feelings; a talk. 2. An informal discussion of a matter by representatives of governments, institutions, or organizations. 3. Computer Science. A real-time interaction with a computer.
Essentially, a conversation is the rapid transfer of information between two or more parties. A conversation is usually characterized by three simple traits: it happens spontaneously, it is transient (lasting a short time), and it occurs among peers--that is, all sides are equal contributors.
Let's turn then to the last trait. The term "peer" is defined by Dictionary.com:
peer (pîr) n. 1. A person who has equal standing with another or others, as in rank, class, or age; children who are easily influenced by their peers.
The Internet expands this definition to include both people (P) and applications (A). Inherently, when peers exchange information, it is a conversation, since both sides are equal and are transiently exchanging information with each other. Person-to-person conversations (P-P) include email, chat, and message boards. But crucial conversations also include application-to-application (A-A) ones such as web services, IP routing, and UUCP. Least common, but most intriguing for future possibilities, are person-to-application (P-A) conversations such as smart agents and bots.
It's interesting to take a step back and look at the existing conversations happening on the Internet today. How well does each technology map to the kind of natural conversational style we know from real life? Let's identify a few important metrics to help evaluate these traditional forms of Internet communication as conversational channels:
- The more rapidly messages can be created and delivered, and the more rapidly the recipient can respond, the more productive the conversation is for both participants.
- A technology provides greater potential for future innovation if it inherently supports applications as well as people.
- Participants in a conversation should be equal and the conversation bidirectional.
- Conversations may be constrained if there is a central form of control or authority.
We can now evaluate a few technologies along some of the metrics just defined.
Email comes to mind first as the most popular form of conversation now happening on the Internet. It is relatively fast, with each message typically taking between 30 seconds and a few days to deliver, but it is certainly not real-time. It is predominantly P-P, with some P-A applications, but it is not a natural fit for A-A, because it provides no structure for content. Usenet is similar to email but is focused on group discussions. Both are innately distributed, and participants are peers.
Internet Relay Chat (IRC) is a very popular conversational medium, primarily supporting real-time group discussions. As with email, it's primarily P-P with some P-A and very little A-A. Participants are peers. IRC is a distributed application within a network of groups, but it is restricted to that particular network--it does not extend beyond a single collection of groups.
The traditional Web is real-time, but in a strict sense it does not support conversations, because the participants are not peers. The content may be produced by a person, but it has a natural flow in only one direction. Applications that support conversations can be built and made available on the Web, but they are pretty rigid--each conversation is specific and centralized to that application.
The next-generation Web--also called the Two-Way Web by visionary developer Dave Winer--is represented by Microsoft's .NET, and it tries to address the shortcomings in the evolution of the Web. It involves personal/fractional-horsepower (specialized) HTTP and DAV servers. These systems support peers and conversations more naturally than the traditional Web, but the conversations between these peers are still predominantly one-way (consumer or producer) and are often centralized based on the application or content.
Traditional instant messaging services, such as AOL Instant Messenger, ICQ, Yahoo! Messenger, and MSN Messenger, come the closest to a real-world conversation yet, and that is the reason for their soaring popularity. They unfortunately focus primarily on P-P. The most significant drawback is that they are commercial and completely centralized around a single closed service. You must be part of the service to communicate with others on it.
None of these existing technologies provides a common platform for Internet conversations as the Web does for content. Each is either limited in some important dimension or is specific to one application.
What could people do with an ideal, standardized conversational platform open to applications that can cross boundaries and access end user content? Here are some fanciful future possibilities:
- I could ask a coworker's word processor or source editor what documents they are editing and discuss revisions.
- My spell checker could ask the entire department to check the validity of unknown acronyms and project or employee names.
- Instead of trying to combine the details of everybody's lives in a central address book or schedule, each application that needs to discover this information could ask other peers for it. Different conversations could be with different communities I define, such as my department, my family (for holiday card or birthday lists), or my friends (for event invitations).
- My television set or video recorder could ask my friends what programs they are watching and use their recorders' extra space to save the programs in case I want to watch them too. With broadband, the television sets could have a conversation exchanging the actual video.
- My games could exchange scores and playing levels with my friends' games and schedule times to play collaboratively (possibly invoking some of the other peers above to schedule conversations). I could also ask another game to deliver an important message or to join a game.
- Businesses could reproduce some of the warmth and responsiveness of a phone conversation online, replacing the cold, faceless e-commerce store or customer support site that serves to drive us to our phones. The new sites could combine a rich context and content with the kind of conversational medium we all like to have.
Evolving toward the ideal
A look back at a bit of the World Wide Web's brief history proves quite interesting and enlightening. Back in its pioneering days, the Web was idealized as a revolutionary peer platform that would enable anyone on the Internet to become a publisher and editor. It empowered individuals to publish their unique collections of knowledge so that they were accessible by anyone. The vision was of a worldwide conversation where everyone could be both a voice and a resource. Here are a few quotes from Tim Berners-Lee to pique your interest:
The World Wide Web was designed originally as an interactive world of shared information through which people could communicate with each other and with machines (http://www.w3.org/People/Berners-Lee/1996/ppf.html).
I had (and still have) a dream that the web could be less of a television channel and more of an interactive sea of shared knowledge. I imagine it immersing us as a warm, friendly environment made of the things we and our friends have seen, heard, believe or have figured out. I would like it to bring our friends and colleagues closer, in that by working on this knowledge together we can come to better understandings (http://www.w3.org/Talks/9510_Bush/Talk.html).
Although the Web fulfills this vision for many people, it has quickly evolved into a traditional consumer/producer relationship. If it had instead evolved as intended, we might be in a different world today. Instead of passively receiving content, we might be empowered individuals collectively producing content, publishing parts of ourselves online to our family and friends, and collectively editing the shared knowledge within our communities.
So where did it go wrong in this respect? It could be argued that the problem was technological, in that the available tools were browsing-centric, and it wasn't easy to become an editor or publisher. A more thought-provoking answer might be that the problem was social, in that there was little demand for those empowering tools. Perhaps only a few people were ready to become individual publishers, and the rest of society wasn't ready to take that step.
The Web did not stagnate, however. It continued to evolve from a content distribution medium to an application distribution medium. Few users are publishing content, but a huge number of companies, groups, and talented individuals are building dynamic applications with new characteristics that reach beyond the original design of the Web. The most exciting of these exhibit characteristics of a peer medium and empower individuals to become producers as well as consumers. Examples include eBay, Slashdot, IMDB, and MP3.com. Although the applications provide a new medium for conversations between P-P peers, the mechanisms for doing so are application-specific. These new web-driven peer applications also have the drawbacks of being centralized, of not being real-time in the sense of a conversation, and of requiring their own form of internal addressing.
So instead of the Web being used primarily as a peer publishing medium, it has become a client/server application medium upon which a breed of peer applications are being built.
Elsewhere in the computer field we can find still other examples of systems that are incorporating greater interactivity. Existing desktop applications are evolving in that direction. They are becoming Internet-aware as they face competition from web sites, so that they can take advantage of the Internet in order to remain competitive and provide utility to the user. Thus, they are evolving from static, standalone, self-contained applications into dynamic, networked, componentized services.
Microsoft, recognizing the importance of staying competitive with online services, is pushing the evolution of desktop applications with their .NET endeavor. By turning applications into networked services, .NET blurs the lines even further between the desktop and the Internet.
The evolution of the Web and the desktop shows a definite trend towards applications becoming peers and having conversations with other applications, services, and people. The common language of conversations in both mediums is XML. As a way of providing a hierarchical structure and a meaningful context for data, XML is being adopted worldwide as the de facto language for moving this data between disparate applications. As Tim Bray puts it, "XML is the ASCII of the future."
Jabber is created
To fully realize the potential for unifying the conversations ranging throughout the Internet today, and enabling applications and services to run on top of a common platform, a community of developers worldwide has developed a set of technologies collectively known as Jabber (http://jabber.org). Jabber was designed from the get-go for peer conversations, both P-P and particularly A-A, and for real-time as well as asynchronous/offline conversations. Jabber is fully distributed, while allowing a corporation or service to manage its own namespace. Its design is a response to the popularity of the closed IM services. We are trying to create a simple and manageable platform that offers the conversational traits described earlier in this chapter, traits that none of the existing systems come close to providing in full.
Jabber began in early 1998 out of a desire to create a truly open, distributed platform for instant messaging and to break free from the centralized, commercial IM services. The design began with XML, which we exploited for its extensibility and for its ability to encapsulate data, which lowers the barrier to accessing it. The use of XML is pervasive across Jabber, allowing new protocols to be transparently implemented on top of a deployed network of servers and applications. XML is the native protocol, and it is translated to other formats as necessary so that Jabber applications can communicate with other messaging protocols.
The Jabber project emerged from that early open collaboration of numerous individuals and companies worldwide. The name Jabber symbolizes its existence as numerous independent projects sharing common goals, each building a part of the overall architecture. These projects include:
- A modular open source server written in C
- Numerous open source and commercial clients for nearly every platform
- Gateways to most existing IM services and Internet messaging protocols
- Libraries for nearly every programming language
- Specialized agents and services such as RSS and language translations
Jabber is simply a set of common technologies that all of these projects agree on collaboratively when building tools for peer-to-peer systems. One important focus of Jabber is to empower conversations between both people and applications.
The Jabber team hopes to create an open medium in which the user has choice and flexibility in the software used to manage conversations, instead of being hindered by the features provided by a closed, commercial service. We hope to accelerate the development of peer applications built on an open foundation, by enabling them to have intelligent conversations with other people and applications, and by providing a common underlying foundation that facilitates conversations and the accessibility of dynamic data from different services.
The centrality of XML
Fundamentally, Jabber enables software to have conversations in XML. When people use Jabber-based software as a messaging platform to have conversations with other people, data exchanges use XML under the surface. Applications use Jabber as an XML storage and exchange service on behalf of their users.
XML is not only the core format for encoding data in Jabber; it is also the protocol, the transport layer between peers, the storage format, and the internal data model within most applications. XML permeates every conversation.
The Jabber architecture is also aware of XML namespaces, which permit different groups of people to define different sets of XML tags to represent data. Thus, using a namespace, one group (Dublin Core) has developed a set of tags for talking about the titles, authors, and other elements of a document. Another group might define a namespace for describing music. An instant messaging community using Jabber could combine the two namespaces to exchange information on books about music. Chapter 13, Metadata, looks at the promise of Dublin Core and other namespaces for peer-to-peer applications.
Here is a simple message using Jabber's XML format:
<message to="hamlet@denmark" from="horatio@denmark" type="chat">
  <body>Here, sweet lord, at your service.</body>
</message>
And here's a hypothetical message with additional data in a namespace included:
<message to="horatio@denmark" from="hamlet@denmark">
  <body>Angels and Ministers of Grace, defend us!</body>
</message>
By supporting namespaces, Jabber enables the inclusion of any XML data in any namespace anywhere within the conversation. This allows applications and services to include, intercept, and modify their own XML data at any point. Jabber is thus reduced to serving as a conduit between peers. Ironically, this lowly status provides the power that Jabber offers to Internet conversations.
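To make the namespace idea concrete, here is a minimal Python sketch that attaches extra XML in its own namespace to a message and then picks it back out on the receiving side. The namespace URI and the sighting and place element names are invented purely for illustration and are not part of Jabber; the helper function names are likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, chosen only for this illustration.
GHOST_NS = "http://example.org/ghost"


def build_message(to, sender, body, payload=None):
    """Build a Jabber-style <message>, optionally carrying extra namespaced XML."""
    msg = ET.Element("message", {"to": to, "from": sender})
    ET.SubElement(msg, "body").text = body
    if payload is not None:
        msg.append(payload)
    return msg


def find_payload(msg, namespace):
    """Return the direct children of msg that belong to the given namespace."""
    prefix = "{" + namespace + "}"
    return [child for child in msg if child.tag.startswith(prefix)]


# Sender side: ordinary body text plus application data in its own namespace.
sighting = ET.Element("{%s}sighting" % GHOST_NS)
ET.SubElement(sighting, "{%s}place" % GHOST_NS).text = "platform"
msg = build_message("horatio@denmark", "hamlet@denmark",
                    "Angels and Ministers of Grace, defend us!", sighting)
wire = ET.tostring(msg, encoding="unicode")

# Receiver side: an application that knows the namespace extracts its data;
# one that does not can simply read <body> and ignore the rest.
parsed = ET.fromstring(wire)
extras = find_payload(parsed, GHOST_NS)
```

A client that has never heard of the extra namespace still sees a normal message with a body, which is what lets new kinds of data ride on an already deployed network.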
Pieces of the infrastructure
While the goal of Jabber is to support other naming conventions and protocols, rather than to create brand-new ones, it depends on certain new concepts that require new types of syntax and binding technologies. These help create a common architecture.
Naming is at the heart of any system--each resource must have a unique identity. In Jabber, each resource is identified by a three-part name consisting of a user, a server, and a resource.
The user is often an individual, and the server is a system that runs a Jabber-based application. In a name, the user and server are formatted just like email, user@server. This provides a general way to pass identification between people that is already well understood and socially accepted. Since the server resolves the username, the format also allows a user's identity to be managed by a service or corporation the way America Online and Napster manage their usernames. This is an important point for Internet services that are providing a public utility to consumers or companies, and especially for corporations that want to or are required to manage their identities very carefully. This also allows any user to use a third party, such as Dynamic DNS Network Services (http://www.dyndns.org), for transient access to a permanent hostname so as not to be forced to rely on someone else's identity.
The server component of the identity could also provide a community aspect to naming, as it may be shared between a small group of friends, a family, or a special interest group. The name then stands out and identifies the user's relationship as part of that community.
The third part of the identity is the resource. As in a Unix filename or URL, the resource follows the server and is delimited by a slash, as in user@server/resource. Outside Jabber, the name is formatted like a combination of an email address and a web URL: jabber://user@server/resource/data.
This third aspect of the identity, the resource, allows any Jabber application to provide public access to any data within itself, analogous to a web server providing access to any file it can serve. It also serves to identify different applications that might be operating for a single user. For example, my Jabber ID is email@example.com, and when I'm online at home my client application might be identified as firstname.lastname@example.org/desktop.
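The three-part name can be sketched in a few lines of Python. This is only an illustration of the user@server/resource layout described above; the JabberID type and parse_jid function are hypothetical names, and real implementations apply additional validation that is omitted here.

```python
from typing import NamedTuple, Optional


class JabberID(NamedTuple):
    user: Optional[str]      # None when the name refers to a server itself
    server: str
    resource: Optional[str]  # None when no resource is given


def parse_jid(text: str) -> JabberID:
    """Split a name of the form user@server/resource into its three parts."""
    # The resource is everything after the first slash, so it may itself
    # contain slashes (as in user@server/resource/data).
    address, slash, resource = text.partition("/")
    if "@" in address:
        user, _, server = address.partition("@")
    else:
        user, server = None, address
    if not server:
        raise ValueError("a Jabber ID needs a server part: %r" % text)
    return JabberID(user, server, resource if slash else None)


home = parse_jid("hamlet@denmark/desktop")
```

Here home.user is "hamlet", home.server is "denmark", and home.resource is "desktop"; a bare server name such as "denmark" parses with both user and resource set to None.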
Presence is a concept fundamental to conversations, because it supports the arbitrary coming and going of participants. Technically, presence is simply a state that a user or application is in. Traditional states in instant messaging include online, offline, and somewhere in between (away, do not disturb, sleeping, etc.). The Jabber architecture automatically manages presence information for users and applications, distributing the information as needed while strictly protecting privacy. It is often this single characteristic that adds the most value to the peers in a conversation: just knowing that the other peer is available to have a conversation.
Presence can go beyond simple online/offline state information. XML could be used to convey location, activity, and contextual (work/project) or application-specific data. Presence information itself provides an inherent context for P-P conversations, as well as status and location context for A-A conversations.
Here is a simple presence example in XML:
<presence>
  <status>Gone to England</status>
</presence>
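As a rough sketch of how an application might track this state, the following Python keeps a table of who is currently available. It assumes each update arrives as a presence element carrying a from attribute and an optional status child, and that type="unavailable" marks a peer going offline; the PresenceTable class is a hypothetical helper, not part of any Jabber library.

```python
import xml.etree.ElementTree as ET


class PresenceTable:
    """Remember the latest presence announced by each peer."""

    def __init__(self):
        self._status = {}  # sender name -> status text

    def update(self, xml_text):
        presence = ET.fromstring(xml_text)
        if presence.tag != "presence":
            raise ValueError("not a presence element: %r" % presence.tag)
        sender = presence.get("from")
        if presence.get("type") == "unavailable":
            # Assumed convention: this type means the peer has gone offline.
            self._status.pop(sender, None)
        else:
            self._status[sender] = presence.findtext("status", default="")

    def is_available(self, name):
        return name in self._status

    def status(self, name):
        return self._status.get(name)


table = PresenceTable()
table.update('<presence from="hamlet@denmark">'
             '<status>Gone to England</status></presence>')
```

After this update, table.is_available("hamlet@denmark") is true and table.status("hamlet@denmark") returns the status text; a later unavailable presence from the same sender removes the entry.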
Another powerful feature of a traditional instant messaging service is the buddy list or roster. The importance of this list is often underestimated. It is a valuable part of the user's reality that they've stored and made available to their applications.
In social terms, each user's roster is his or her community. It defines the participants in this community or relationships to larger communities. A roster is an actualization of personal trust and relationships with peers. Applications should use this list intelligently to share their functionality and filter conversations.
The circle of trust in which a user has chosen to include his or her computer is a starting point for applications to locate other devices the user utilizes. It should also be used for choosing to collaborate with the resources available from trusted peers. This single, simple feature begins to open the door to the future possibilities mentioned near the beginning of this chapter, and it forms a step toward the warm, friendly environment envisioned by Tim Berners-Lee for the World Wide Web.
The Jabber architecture closely resembles email. Peers are connected and route data in a chain until it reaches the desired recipient. A client is connected to its server only, and its server is responsible for negotiating the delivery and receipt of that client's data with other servers or networks using whatever protocol is available. All data within the architecture is processed immediately and passed on to the next peer, or stored offline for immediate delivery once that peer is available again.
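The deliver-now-or-store-for-later behavior described above can be sketched as a toy in-memory server. This is an illustration under simplifying assumptions (one process, plain Python lists standing in for network connections), and the StoreAndForwardServer class and its method names are hypothetical rather than taken from the Jabber server.

```python
from collections import defaultdict, deque


class StoreAndForwardServer:
    """Pass data on at once when the recipient is connected; otherwise hold it."""

    def __init__(self):
        self._online = {}                   # user -> list standing in for a connection
        self._offline = defaultdict(deque)  # user -> messages waiting for delivery

    def connect(self, user):
        """Mark a user as connected and flush anything stored while away."""
        inbox = []
        self._online[user] = inbox
        waiting = self._offline.pop(user, deque())
        while waiting:
            inbox.append(waiting.popleft())  # oldest stored message first
        return inbox

    def disconnect(self, user):
        self._online.pop(user, None)

    def deliver(self, user, message):
        inbox = self._online.get(user)
        if inbox is not None:
            inbox.append(message)
            return "delivered"
        self._offline[user].append(message)
        return "stored"


server = StoreAndForwardServer()
first = server.deliver("hamlet@denmark", "Here, sweet lord, at your service.")
inbox = server.connect("hamlet@denmark")
second = server.deliver("hamlet@denmark", "My lord, I came to see your father's funeral.")
```

The first message is stored because the recipient is not yet connected and is handed over as soon as the connection appears; the second is passed straight through.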
Peers can play traditional client and server roles within the Jabber architecture. Every server acts as a peer with respect to another server, using SRV DNS records to locate the actual server. Servers also use hostname dialback, independently contacting the sending server to validate incoming data. This prevents spoofing and helps ensure an overall more reliable and secure trust system.
All clients are peers with respect to other clients, and, after establishing a conversation with their servers, are able to establish real-time conversations in XML with any other client. Clients can also include or embed a server internally so that they can operate in any role and provide additional flexibility and security.
Along with support for all major instant messaging services (AIM, ICQ, MSN, Yahoo!), Jabber is also protocol agnostic. It uses a variety of applications between the endpoints of the conversations to transparently translate the XML data to and from another protocol. In its immediate applications, Jabber's translation capabilities let it support P-P relationships across traditional instant messaging services, IRC, and email. But the same flexibility also allows the construction of A-A bridges, such as transparent access to SIP, IMXP, and PAM applications, as well as access to Jabber's native presence and messaging functionality from those protocols.
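As one small illustration of such a translation, the sketch below turns a Jabber-style message into an email message using Python's standard library. It covers only the to, from, and body parts shown earlier, the fixed subject line is an assumption made for the example, and the jabber_to_email function is a hypothetical name; an actual gateway would handle far more than this.

```python
import xml.etree.ElementTree as ET
from email.message import EmailMessage


def jabber_to_email(xml_text, subject="(Jabber message)"):
    """Translate a Jabber-style <message> into an email message object."""
    msg = ET.fromstring(xml_text)
    if msg.tag != "message":
        raise ValueError("not a message element: %r" % msg.tag)
    mail = EmailMessage()
    mail["To"] = msg.get("to")
    mail["From"] = msg.get("from")
    mail["Subject"] = subject  # assumed default; the sample message has no subject
    mail.set_content(msg.findtext("body", default=""))
    return mail


mail = jabber_to_email(
    '<message to="hamlet@denmark" from="horatio@denmark" type="chat">'
    '<body>Here, sweet lord, at your service.</body></message>'
)
```

Because user@server names already look like email addresses, the addressing carries over unchanged; only the envelope around the text differs.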
Finally, the protocol-agnostic design of Jabber allows it to participate in the exciting evolution of the Web mentioned earlier in the section "Evolving toward the ideal": an evolution including such technologies as WebDAV, the use of XML over HTTP in the SOAP protocol, the RSS service that broadcasts information about available content, and other web services. We hope to set up revolving door access so that HTTP applications can access native Jabber functionality and so that Jabber applications can transparently access conversations happening over HTTP.
A recent addition to Jabber is browsing, which is similar to the feature of the same name in the Network Neighborhood on Microsoft systems. Browsing lets users retrieve lists of peers from other peers and establish relationships between peers. It can be used to see what services might be available from a server, as well as what applications and paths of communication a user has made available to other users and their applications.
Peers that a user might make available could include their normal instant messaging client (home, work, laptop, etc.), a pager transport, an offline inbox, a cell phone, a PDA, a TV, a scheduling application, a 3-D game, or a word processor. Additionally, XML information can be made browsable by a user or application, so that a user's vCard (electronic business card), public key, personal recipes, music list, bookmarks, or other XML information could be read by both people and applications. Browsing also allows people and applications to locate public peers, such as the messaging gateways mentioned earlier, web services, group chats, and agents (searching, translation, fortune, announcements, Eliza).
By centralizing and coordinating all of your conversations via a central identity, the software managing that identity for you may be empowered to act upon incoming conversations and intelligently filter them. This feature can be used to modify the content of a transmission or, even more often, to make decisions about what to do with a conversation when you're not available (store it offline, copy it to a pager, forward it to another account, etc.). The same feature is also useful to manage the conversations between applications. For instance, if you maintain a personal peer and a work-scheduling peer, conversation management software can redirect incoming conversations to the correct agent based on the relationship to the sender stored in the roster. When you have all of your conversations managed by a common identity, they can be managed directly from one single point, enabling you to have more control over your conversations.
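A minimal sketch of this kind of conversation management follows, assuming the roster is a simple mapping from sender to a relationship label and that some relationships have a dedicated peer registered for them. The labels, destination names, and the choose_destination function are all hypothetical.

```python
def choose_destination(sender, roster, agents, user_online):
    """Decide where an incoming conversation should go.

    roster maps a sender's name to a relationship label such as "work";
    agents maps a relationship label to the peer that handles it.
    """
    relationship = roster.get(sender)
    if relationship in agents:
        # A dedicated peer exists for this relationship (e.g. work scheduling).
        return agents[relationship]
    if relationship is not None and user_online:
        # A known sender with no special handling goes to the normal client.
        return "client"
    # Unknown senders, or known ones while the user is away, are held for later.
    return "offline-store"


roster = {"horatio@denmark": "friend", "claudius@denmark": "work"}
agents = {"work": "scheduler"}
destination = choose_destination("claudius@denmark", roster, agents, user_online=False)
```

With these inputs the work contact is routed to the scheduling peer even while the user is away, whereas a friend's message would reach the client only when the user is online and would otherwise be stored.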
For more information about Jabber, or to become involved in the project (we openly welcome anyone interested), visit http://jabber.org or contact the core team at email@example.com. The 1.0 server was released in May of 2000 and rapidly evolved into a 1.2 release in October, due to popularity and demand. The development focus is now on helping the architecture mature and further developing many of the ideas mentioned here. The development team is collaborating to quickly realize the future possibilities described in this paper, so that they're not so "future" after all.
© 2001, O'Reilly & Associates, Inc.