Java Distributed Computing By Jim Farley
January 1998
Pages: 384

Table of Contents

Chapter 1: Introduction
For the past decade, "distributed computing" has been one of the biggest buzz phrases in the computer industry. At this point in the information age, we know how to build networks; we use thousands of engineering workstations and personal computers to do our work, instead of huge behemoths in glass-walled rooms. Surely we ought to be able to use our networks of smaller computers to work together on larger tasks. And we do—an act as simple as reading a web page requires the cooperation of two computers (a client and a server) plus other computers that make sure the data gets from one location to the other. However, simple browsing (i.e., a largely one-way data exchange) isn't what we usually mean when we talk about distributed computing. We usually mean something where there's more interaction between the systems involved.
You can think about distributed computing in terms of breaking down an application into individual computing agents that can be distributed on a network of computers, yet still work together to do cooperative tasks. The motivations for distributing an application this way are many. Here are a few of the more common ones:
  • Computing things in parallel by breaking a problem into smaller pieces enables you to solve larger problems without resorting to larger computers. Instead, you can use smaller, cheaper, easier-to-find computers.
  • Large data sets are typically difficult to relocate, or easier to control and administer located where they are, so users have to rely on remote data servers to provide needed information.
  • Redundant processing agents on multiple networked computers can be used by systems that need fault tolerance. If a machine or agent process goes down, the job can still carry on.
There are many other motivations, and plenty of subtle variations on the ones listed here.
Assorted tools and standards for assembling distributed computing applications have been developed over the years. These started as low-level data transmission APIs and protocols, such as RPC and DCE, and have recently begun to evolve into object-based distribution schemes, such as CORBA, RMI, and OpenDoc. These programming tools essentially provide a protocol for transmitting structured data (and, in some cases, actual runnable code) over a network connection. Java offers a language and an environment that encompass various levels of distributed computing development, from low-level network communication to distributed objects and agents, while also having built-in support for secure applications, multiple threads of control, and integration with other Internet-based protocols and services.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Anatomy of a Distributed Application
A distributed application is built upon several layers. At the lowest level, a network connects a group of host computers together so that they can talk to each other. Network protocols like TCP/IP let the computers send data to each other over the network by providing the ability to package and address data for delivery to another machine. Higher-level services can be defined on top of the network protocol, such as directory services and security protocols. Finally, the distributed application itself runs on top of these layers, using the mid-level services and network protocols as well as the computer operating systems to perform coordinated tasks across the network.
At the application level, a distributed application can be broken down into the following parts:
Processes
A typical computer operating system on a computer host can run several processes at once. A process is created by describing a sequence of steps in a programming language, compiling the program into an executable form, and running the executable in the operating system. While it's running, a process has access to the resources of the computer (such as CPU time and I/O devices) through the operating system. A process can be completely devoted to a particular application, or several applications can use a single process to perform tasks.
Threads
Every process has at least one thread of control. Some operating systems support the creation of multiple threads of control within a single process. Each thread in a process can run independently from the other threads, although there is usually some synchronization between them. One thread might monitor input from a socket connection, for example, while another might listen for user events (keystrokes, mouse movements, etc.) and provide feedback to the user through output devices (monitor, speakers, etc.). At some point, input from the input stream may require feedback from the user. At this point, the two threads will need to coordinate the transfer of input data to the user's attention.
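The two-thread arrangement described above can be sketched in a few lines. This is only an illustration under stated assumptions: the socket input is simulated with a list of strings, the class name ThreadHandoffDemo and the END sentinel are hypothetical, and the hand-off uses a BlockingQueue from java.util.concurrent, which arrived in the JDK well after this text was written.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical demo: one thread gathers input, another presents it to the user.
class ThreadHandoffDemo {
    // Sentinel telling the feedback thread that no more input will arrive.
    private static final String END = "\u0000END";

    // Runs both threads to completion and returns what the feedback thread reported.
    static List<String> run(List<String> incoming) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        List<String> shown = new ArrayList<>();

        // Stands in for the thread that monitors a socket connection.
        Thread reader = new Thread(() -> {
            for (String msg : incoming) {
                queue.add(msg);
            }
            queue.add(END);
        });

        // Stands in for the thread that gives feedback to the user.
        Thread feedback = new Thread(() -> {
            try {
                for (String msg = queue.take(); !msg.equals(END); msg = queue.take()) {
                    shown.add("notify user: " + msg);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        reader.start();
        feedback.start();
        try {
            reader.join();
            feedback.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return shown;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("price update", "new message")));
    }
}
```

The queue is the only point of contact between the two threads, so each can run at its own pace and the synchronization stays in one place.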
Additional content appearing in this section has been removed.
Requirements for Developing Distributed Applications
Now that we've defined some terms that can be used to discuss distributed applications, we can start to look at what goes into developing these applications. In this section we'll discuss some of the issues that you face when developing distributed systems, and what kinds of tools and capabilities you'll need in order to address these issues. The next section will describe how Java provides these tools and capabilities.
If you think of the computer hosts and network connections available for a distributed application to use as a "virtual machine," then one of the primary tasks you have is to engineer an optimal mapping of processes, objects, threads and agents to the various parts of this virtual machine. In some cases, a straightforward client/server partitioning based on data requirements can be used. Computational tasks can be distributed based on the data needs of the application: maximize local data needed for processing, and minimize data transfers over the network. In other, more compute-intensive applications, you can partition the system based upon the functional requirements of the system, with data mapped to the most logical compute host. This method of partitioning is especially useful when the overhead associated with data transfers is negligible compared to the computing time spent at the various hosts.
In the best of all possible worlds, you could develop modules based upon either data- or functionally driven partitioning. You could then distribute these modules as needed throughout a virtual machine composed of computers and communication links, and easily connect the modules to establish the data flow required by the application. These module interconnections should be as flexible and transparent as possible, since they may need to be adjusted at any point during development or deployment of the distributed system.
Additional content appearing in this section has been removed.
What Does Java Provide?
The original design motivations behind Java and its predecessor, Oak, were concerned mainly with reliability, simplicity, and architecture neutrality. Subsequently, as the potential for Java as an "Internet programming language" was seen by its developers at Sun Microsystems, support for networking, security, and multithreaded operations was incorporated or improved. All of these features of the Java language and environment also make for a very powerful distributed application development environment. This is, of course, no accident. The requirements for developing an Internet-based application overlap to a great extent with those of distributed application development.
In this section, we review some of the features of Java that are of particular interest in distributed applications, and how they help to address some of the issues described in the previous section.
Java is a "pure" object-oriented language, in the sense that the smallest programmatic building block is a class. A data structure or function cannot exist or be accessed at runtime except as an element of a class definition. This results in a well-defined, structured programming environment in which all domain concepts and operations are mapped into class representations and transactions between them. This is advantageous for systems development in general, but also has benefits specifically for you as the distributed system developer. An object, as an instance of a class, can be thought of as a computing agent. Its level of sophistication as an autonomous agent is determined by the complexity of its methods and data representations, as well as its role within the object model of the system, and the runtime object community defining the distributed system. Distributing a system implemented in Java, therefore, can be thought of as simply distributing its objects in a reasonable way, and establishing networked communication links between them using Java's built-in network support. If you have the luxury of designing a distributed system from the ground up, then your object model and class hierarchy can be specified with distribution issues incorporated.
Additional content appearing in this section has been removed.
Chapter 2: Networking in Java
We saw in Chapter 1 how the socket and stream classes in the java.net and java.io packages could be used to do basic networking between agents. In this chapter we take a more detailed look at the networking support in Java, as the foundation for distributed systems. The topics we'll cover include:
  • Sockets for low-level network connections
  • Streams for formatted data and messaging protocols
  • URL, URLConnection, and ContentHandler classes
  • The ClassLoader as an object distribution scheme
We'll look at these topics in increasing pecking order from the networking perspective. Sockets first, since they are the most primitive communication object in the Java API; then streams, which let you impose some order on the data flowing over these sockets; next, the classes associated with the HTTP protocol, namely, the URL, URLConnection, and ContentHandler classes; finally, the ClassLoader, which, when coupled with the others, offers the ability to transmit actual Java classes over the wire.
Additional content appearing in this section has been removed.
Sockets and Streams
The java.net package provides an object-oriented framework for the creation and use of Internet Protocol (IP) sockets. In this section, we'll take a look at these classes and what they offer.
Before communicating with another party, you must first know how to address your messages so they can be delivered correctly. Notice that I didn't say that you need to know where the other party is located—once a scheme for encoding a location is established, I simply need to know my party's encoded address to communicate. On IP networks, the addressing scheme in use is based on hosts and port numbers.
A given host computer on an IP network has a hostname and a numeric address. Either of these, in their fully qualified forms, is a unique identifier for a host on the network. The JavaSoft home page, for example, resides on a host named www.javasoft.com, which currently has the IP address 204.160.241.98. Either of these addresses can be used to locate the machine on an IP network. The textual name for the machine is called its Domain Name System (DNS) name, which can be thought of as a kind of alias for the numeric IP address.
In the Java API, the InetAddress class represents an IP address. You can query an InetAddress for the name of the host using its getHostName() method, and for its numeric address using getAddress(). Notice that, even though we can uniquely specify a host with its IP address, we do not necessarily know its physical location. I look at the web pages on www.javasoft.com regularly, but I don't know where the machine is (though I could guess that it's in California somewhere). Conversely, even if I knew where the machine was physically, it wouldn't do me a bit of good if I didn't know its IP address (unless someone was kind enough to label the machine with it, or left a terminal window open on the server's console for me to get its IP address directly).
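As a small illustration of the two queries just mentioned, the sketch below builds an InetAddress from the hostname and numeric address quoted in the text and reads both back. The class name InetAddressDemo is hypothetical. It uses InetAddress.getByAddress(), which performs no name-service lookup, so it runs without a network; in a live program you would more likely call InetAddress.getByName() and let DNS resolve the name.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical demo of getHostName() and getAddress().
class InetAddressDemo {
    // Pairs a hostname with four address octets and reports what InetAddress returns for each.
    static String describe(String host, int a, int b, int c, int d) {
        byte[] raw = { (byte) a, (byte) b, (byte) c, (byte) d };
        try {
            // No DNS lookup happens here; the object simply holds the values we supply.
            InetAddress addr = InetAddress.getByAddress(host, raw);

            // getAddress() returns the raw octets as signed bytes, so mask each one to 0..255.
            byte[] bytes = addr.getAddress();
            StringBuilder dotted = new StringBuilder();
            for (int i = 0; i < bytes.length; i++) {
                if (i > 0) {
                    dotted.append('.');
                }
                dotted.append(bytes[i] & 0xFF);
            }
            return addr.getHostName() + " -> " + dotted;
        } catch (UnknownHostException e) {
            // Thrown only if the byte array has an illegal length.
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Prints: www.javasoft.com -> 204.160.241.98
        System.out.println(describe("www.javasoft.com", 204, 160, 241, 98));
    }
}
```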
Additional content appearing in this section has been removed.
URLs, URLConnections, and ContentHandlers
The java.net package, in addition to object-oriented representations of IP sockets, also provides objects that support the HTTP protocol for accessing data in the form of addressable documents. HTTP is an application-level protocol layered on top of the TCP/IP transport we discussed earlier, designed specifically to provide a way to address different kinds of documents, or pieces of data, distributed on the network. In the rest of this book, we'll see numerous examples of distributed applications whose agents use customized or standard communications protocols to talk to each other. If there is an HTTP server "agent" available on one of the hosts in our distributed application, then we can use the classes discussed in this section to ask it for data documents using the standard HTTP protocol.
To address a specific document or data object, we use a Uniform Resource Locator (URL), which includes four address elements: the protocol, host, port, and document. The Java representation for a URL is the URL class, which is constructed with a given protocol, host, port, and document filename. Once the URL object is constructed, it allows the user to make the necessary requests to connect to the HTTP server of the data object, query for information about the object, and download the object. The content of the object can be accessed using the getContent(), openConnection(), or openStream() methods on the URL object. Of these three methods, openStream() is simplest. The openStream() method returns an InputStream that can be used to read the data contents directly.
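Here is a minimal sketch of the URL class in use, under a few assumptions: the class name UrlDemo and its helper methods are hypothetical, the example document is a temporary local file reached through a file: URL so that no HTTP server is needed, and the code relies on conveniences (URI.create(), readAllBytes(), Files.writeString()) from recent JDKs.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo of URL address elements and openStream().
class UrlDemo {
    // Returns the four address elements: protocol, host, port, and document.
    static String parts(String spec) {
        try {
            URL url = URI.create(spec).toURL();
            return url.getProtocol() + " | " + url.getHost() + " | "
                    + url.getPort() + " | " + url.getFile();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Reads the whole document behind a URL through the stream from openStream().
    static String readAll(URL url) {
        try (InputStream in = url.openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes text to a temporary file, then reads it back through a file: URL.
    static String roundTrip(String text) {
        try {
            Path tmp = Files.createTempFile("doc", ".txt");
            try {
                Files.writeString(tmp, text);
                return readAll(tmp.toUri().toURL());
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Prints: http | www.javasoft.com | 80 | /index.html
        System.out.println(parts("http://www.javasoft.com:80/index.html"));
        System.out.println(roundTrip("hello from a URL"));
    }
}
```

The same readAll() call would work unchanged against an http: URL; only the protocol element of the address differs.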
When you call openConnection() on a URL object, you get a URLConnection in return. You can use the URLConnection to query the data connection's header information for the data object's length, the type of data it contains, the data encoding, etc. You can also control aspects of the data connection that determine when the data object can be pulled from a local cache, whether input or output is to be done over the data connection, and when unmodified data should be read from the server.
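The header queries and connection controls mentioned above might be used as in the following sketch. It is illustrative only: UrlConnectionDemo and headers() are hypothetical names, and the connection is opened on a temporary local file rather than an HTTP server, so the cache and if-modified-since settings are shown for their call shape and have little practical effect here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo of configuring and querying a URLConnection.
class UrlConnectionDemo {
    // Stores text in a temporary file and reports the length and type the connection advertises.
    static String headers(String text) {
        try {
            Path tmp = Files.createTempFile("doc", ".txt");
            try {
                Files.write(tmp, text.getBytes(StandardCharsets.UTF_8));
                URLConnection conn = tmp.toUri().toURL().openConnection();

                // These settings must be made before the connection is established.
                conn.setUseCaches(false);     // do not serve the document from a local cache
                conn.setDoInput(true);        // we intend to read from the connection
                conn.setIfModifiedSince(0L);  // accept the document whatever its modification time

                // Opening the stream connects; closing it releases the underlying file.
                try (InputStream in = conn.getInputStream()) {
                    return "length=" + conn.getContentLengthLong()
                            + ", type=" + conn.getContentType();
                }
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(headers("hello"));
    }
}
```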
Additional content appearing in this section has been removed.
The ClassLoader
The Java runtime environment is based upon a virtual machine that interprets, verifies, and executes classes in the form of platform-independent bytecodes. In addition, the Java API includes a mechanism for you to load class definitions in their bytecode form, and integrate them into the runtime environment so that instances of the classes can be constructed and used. When your Java files are compiled, a similar mechanism is invoked whenever an import statement is encountered. The referenced class or package of classes is loaded from files in bytecode format, using the CLASSPATH environment variable to locate them on the local file system.
In addition to this default policy for loading classes, the java.lang.ClassLoader class allows the user to define custom policies and mechanisms for locating and loading classes into the runtime environment. The ClassLoader is an abstract class. Subclasses must define an implementation for the loadClass() method, which is responsible for locating the class based upon the given string name, loading the bytecodes comprising the class definition, and (optionally) resolving the class. A class has to be resolved before it can be constructed or before any of its methods can be called. Resolving a class includes finding all of the other classes that it depends on, and loading them into the runtime as well.
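A custom loading policy can be sketched as follows. Note one difference from the description above: in current JDKs the usual extension point is findClass(), which the inherited loadClass() calls only after the parent loader has failed to find the class. The class name BytecodeSourceLoader and the lookup function it takes (class name in, bytecodes or null out) are assumptions made for this illustration; a real loader might fetch the bytes from a socket or an HTTP server.

```java
import java.util.function.Function;

// Hypothetical loader that asks a caller-supplied source for bytecodes.
class BytecodeSourceLoader extends ClassLoader {
    // Maps a class name to its bytecodes, or returns null if the class is unknown.
    private final Function<String, byte[]> source;

    BytecodeSourceLoader(ClassLoader parent, Function<String, byte[]> source) {
        super(parent);
        this.source = source;
    }

    // Called by loadClass() once the parent loader has failed to find the class.
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        byte[] bytes = source.apply(name);
        if (bytes == null) {
            throw new ClassNotFoundException(name);
        }
        // Turns the raw bytecodes into a Class object inside the running virtual machine.
        return defineClass(name, bytes, 0, bytes.length);
    }

    // Convenience check: can the given loader produce a class with this name?
    static boolean canLoad(ClassLoader loader, String name) {
        try {
            loader.loadClass(name);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A source that knows no classes, so everything must come from the parent loader.
        ClassLoader loader =
                new BytecodeSourceLoader(ClassLoader.getSystemClassLoader(), name -> null);
        System.out.println(canLoad(loader, "java.lang.String")); // found by the parent
        System.out.println(canLoad(loader, "no.such.Type"));     // not found anywhere
    }
}
```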
The ClassLoader is an important element of the network support in the Java API. It's used as the basis for supporting Java applets in most Java-enabled web browsers, for example. When an HTML page includes an APPLET tag that references a Java class on the HTTP server, a ClassLoader instance within the browser's Java runtime is used to load the bytecodes of the class into the virtual machine, create an instance of the class, and then execute methods on the new object. Note that this is different from the concept of distributing objects using RMI or CORBA. Rather than creating an object on one host and allowing a process on a remote host to call methods on that object, the
Additional content appearing in this section has been removed.
Chapter 3: Distributing Objects
Distributed objects are a potentially powerful tool that has only become broadly available for developers at large in the past few years. The power of distributing objects is not in the fact that a bunch of objects are scattered across the network. The power lies in that any agent in your system can directly interact with an object that "lives" on a remote host. Distributed objects, if they're done right, really give you a tool for opening up your distributed system's resources across the board. And with a good distributed object scheme you can do this as precisely or as broadly as you'd like.
The first three sections of this chapter go over the motivations for distributing objects, what makes distributed object systems so useful, and what makes up a "good" distributed object system. Readers who are already familiar with basic distributed object issues can skip these sections and go on to the following sections, where we discuss two major distributed object protocols that are available in the Java environment: CORBA and RMI.
Although this chapter will cover the use of both RMI and CORBA for distributing objects, the rest of the book primarily uses examples that are based on RMI, where distributed objects are needed. We chose to do this because RMI is a simpler API and lets us write relatively simple examples that still demonstrate useful concepts, without getting bogged down in CORBA API specifics. Some of the examples, if converted to be used in production environments, might be better off implemented in CORBA.
Additional content appearing in this section has been removed.
Why Distribute Objects?
In Chapter 1, we discussed some of the optimal data/function partitioning capabilities that you'd like to have available when developing distributed applications. These included being able to distribute data/function "modules" freely and transparently, and have these modules be defined based on application structure rather than network distribution influences. Distributed object systems try to address these issues by letting developers take their programming objects and have them "run" on a remote host rather than the local host. The goal of most distributed object systems is to let any object reside anywhere on the network, and allow an application to interact with these objects exactly the same way as they do with a local object. Additional features found in some distributed object schemes are the ability to construct an object on one host and transmit it to another host, and the ability for an agent on one host to create a new object on another host.
The value of distributed objects is more obvious in larger, more complicated applications than in smaller, simpler ones. That's because much of the trade-off between distributed objects and other techniques, like message passing, is between simplicity and robustness. In a smaller application with just a few object types and critical operations, it's not difficult to put together a catalog of simple messages that would let remote agents perform all of their critical operations through on-line transactions. With a larger application, this catalog of messages gets complicated and difficult to maintain. It's also more difficult to extend a large message-passing system if new objects and operations are added. So being able to distribute the objects in our system directly saves us a lot of design overhead, and makes a large distributed system easier to maintain in the long run.
Additional content appearing in this section has been removed.
What's So Tough About Distributing Objects?
OK, so we think distributing objects is a good idea, but why do distributed object systems like CORBA and, to a lesser degree, Java RMI, seem so big and complicated? In Chapter 2 we saw how the core Java API, especially the java.net and java.io packages, gives us easy access to the network and key network protocols. They also let us layer application-specific operations on top of the network pretty easily. It seems like all that we'd need to do is extend these packages to allow objects to invoke each other's methods over the network, and we'd have a basic distributed object system. To get a feeling for the complexity of distributed object systems, let's look at what it would take to put together one of our own using just the core Java API, without utilizing the RMI package or the object input/output streams in the java.io package.
The essential requirements in a distributed object system are the ability to create or invoke objects on a remote host or process, and interact with them as if they were objects within our own process. It seems logical that we would need some kind of message protocol for sending requests to remote agents to create new objects, to invoke methods on these objects, and to delete the objects when we're done with them. As we saw in Chapter 2, the networking support in the Java API makes it very easy to implement a message protocol. But what kinds of things does a message protocol have to do if it's supporting a distributed object system?
To create a remote object, we need to reference a class, provide constructor arguments for the class, and receive a reference to the created object in return. This object reference will be used to invoke methods on the object, and eventually to ask the remote agent to destroy the object when we are done with it. So the data we will need to send over the network include class references, object references, method references, and method arguments.
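To make the data requirements concrete, here is one way such a message could be laid out on the wire using only the data streams in java.io. Everything about the format is an assumption invented for this sketch: the opcodes, the field order, the RemoteCallMessage class, and the restriction of arguments to floating-point values. The name field carries a class name for a CREATE request and a method name for an INVOKE request.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical wire format for requests to a remote object manager.
class RemoteCallMessage {
    // Invented opcodes for the three kinds of request discussed in the text.
    static final byte CREATE = 1;
    static final byte INVOKE = 2;
    static final byte DESTROY = 3;

    // Layout: opcode, object reference, class or method name, argument count, arguments.
    static byte[] encode(byte op, int objectId, String name, double... args) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeByte(op);
            out.writeInt(objectId);
            out.writeUTF(name);
            out.writeInt(args.length);
            for (double arg : args) {
                out.writeDouble(arg);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Reads the fields back in the same order and renders them as text.
    static String decode(byte[] message) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(message))) {
            byte op = in.readByte();
            int objectId = in.readInt();
            String name = in.readUTF();
            double[] args = new double[in.readInt()];
            for (int i = 0; i < args.length; i++) {
                args[i] = in.readDouble();
            }
            String opName = op == CREATE ? "CREATE" : op == INVOKE ? "INVOKE" : "DESTROY";
            return opName + " object=" + objectId + " name=" + name
                    + " args=" + Arrays.toString(args);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Prints: INVOKE object=7 name=solve args=[2.5]
        System.out.println(decode(encode(INVOKE, 7, "solve", 2.5)));
    }
}
```

Even this toy format hints at the difficulty: as soon as arguments can be arbitrary objects rather than numbers, the encoding has to describe their classes and structure as well.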
The first item is easy—we already saw in Chapter 2 how the
Additional content appearing in this section has been removed.
Features of Distributed Object Systems
From our exercise in the previous section, we uncovered some of the features that distributed object systems need. These features, plus some others, are illustrated in Figure 3.1. An object interface specification is used to generate a server implementation of a class of objects, an interface between the object implementation and the object manager, sometimes called an object skeleton, and a client interface for the class of objects, sometimes called an object stub. The skeleton will be used by the server to create new instances of the class of objects and to route remote method calls to the object implementation. The stub will be used by the client to route transactions (method invocations, mostly) to the object on the server. On the server side, the class implementation is passed through a registration service, which registers the new class with a naming service and an object manager, and then stores the class in the server's storage for object skeletons.
Figure 3.1: General architecture for distributed object systems
With an object fully registered with a server, the client can now request an instance of the class through the naming service. The runtime transactions involved in requesting and using a remote object are shown in Figure 3.2. The naming service routes the client's request to the server's object manager, which creates and initializes the new object using the stored object skeleton. The new object is stored in the server's object storage area, and an object handle is issued back to the client in the form of an object stub interface. This stub is used by the client to interact with the remote object.
Figure 3.2: Remote object transactions at runtime
Additional content appearing in this section has been removed.
Distributed Object Schemes for Java
While there are several distributed object schemes that can be used within the Java environment, we'll only cover two that qualify as serious options for developing your distributed applications: CORBA and RMI. Both of them have their advantages and their limitations, which we'll look at in detail in the following sections.
During this discussion, we'll be using an example involving a generic problem solver, which we'll distribute using both CORBA and RMI. We'll show in each case how instances of this class can be used remotely using these various object distribution schemes. A Java interface for the example class, called Solver, is shown in Example 3.1. The Solver acts as a generic compute engine that solves numerical problems. Problems are given to the Solver in the form of ProblemSet objects; the ProblemSet class is shown in Example 3.2. The ProblemSet holds all of the information describing a problem to be solved by the Solver. The ProblemSet also contains fields for the solution to the problem it represents. In our highly simplified example, we're assuming that any problem is described by a single floating-point number, and the solution is also a single floating-point value.
Example 3.1. A Problem Solver Interface
package dcj.examples;

import java.io.OutputStream;

//
// Solver:
// An interface to a generic solver that operates on ProblemSets
//

public interface Solver
{
  // Solve the current problem set
  public boolean solve(); 

  // Solve the given problem set
  public boolean solve(ProblemSet s, int numIters); 

  // Get/set the current problem set
  public ProblemSet getProblem();
  public void setProblem(ProblemSet s);

  // Get/set the current iteration setting
  public int getIterations();
  public void setIterations(int numIter);

  // Print solution results to the output stream
  public void printResults(OutputStream os);
}
Example 3.2. A Problem Set Class
package dcj.examples;

public class ProblemSet
{
  protected double value = 0.0;
  protected double solution = 0.0;

  public double getValue() { return value; }
  public double getSolution() { return solution; }
  public void setValue(double v) { value = v; }
  public void setSolution(double s) { solution = s; }
}
Additional content appearing in this section has been removed.
CORBA
CORBA, the Common Object Request Broker Architecture, is a distributed object standard developed by the Object Management Group (OMG) and its corporate members and sponsors. The first versions of the CORBA standard were developed long before Java was publicized by Sun (the OMG was formed in 1989, the CORBA 1.1 specification was released in 1991, and the first pre-release versions of Java and the HotJava browser were made public in the 1994-1995 timeframe). CORBA is meant to be a generic framework for building systems involving distributed objects. The framework is meant to be platform- and language-independent, in the sense that client stub interfaces to the objects, and the server implementations of these object interfaces, can be specified in any programming language. The stubs and skeletons for the objects must conform to the specifications of the CORBA standard in order for any CORBA client to access your CORBA objects.
The CORBA framework for distributing objects consists of the following elements:
  • An Object Request Broker (ORB), which provides clients and servers of distributed objects with the means to make and receive requests of each other. ORBs can also provide object services, such as a Naming Service that lets clients look up objects by name, or Security Services that provide for secure inter-object communications.
  • Methods for specifying the interfaces that objects in the system support. These interfaces specify the operations that can be requested of the object, and any data variables available on the object. CORBA offers two ways to specify object interfaces: an Interface Definition Language (IDL) for static interface definitions, and a Dynamic Invocation Interface (DII), which lets clients access interfaces as first-class objects from an Interface Repository. The DII is analogous in some ways to the Java Reflection API, which was introduced in JDK 1.1.
Additional content appearing in this section has been removed.
Java RMI
The Java Remote Method Invocation (RMI) package is a Java-centric scheme for distributed objects that is now a part of the core Java API. RMI offers some of the critical elements of a distributed object system for Java, plus some other features that are made possible by the fact that RMI is a Java-only system. RMI has object communication facilities that are analogous to CORBA's IIOP, and its object serialization system provides a way for you to transfer or request an object instance by value from one remote process to another.
Since RMI is a Java-only distributed object scheme, all object interfaces are written in Java. Client stubs and server skeletons are generated from this interface, but using a slightly different process than in CORBA. First, the interface for the remote object has to be written as extending the java.rmi.Remote interface. The Remote interface doesn't introduce any methods to the object's interface; it just serves to mark remote objects for the RMI system. Also, all methods in the interface must be declared as throwing the java.rmi.RemoteException. The RemoteException is the base class for many of the exceptions that RMI defines for remote operations, and the RMI engineers decided to expose the exception model in the interfaces of all RMI remote objects. This is one of the drawbacks of RMI: it requires you to alter an existing interface in order to apply it to a distributed environment.
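Applied to a solver-style interface, the two rules just described might look like the sketch below. The names RemoteSolver, SimpleProblem, and RemoteInterfaceCheck are hypothetical stand-ins chosen so the example is self-contained; they are not the book's own RMI listings. The problem class is made Serializable because RMI passes non-remote arguments by value, and the printResults(OutputStream) method is left out of this sketch because an OutputStream is not serializable.

```java
import java.io.Serializable;
import java.lang.reflect.Method;
import java.rmi.Remote;
import java.rmi.RemoteException;

// Hypothetical stand-in for a problem description; Serializable so RMI can pass it by value.
class SimpleProblem implements Serializable {
    private static final long serialVersionUID = 1L;
    double value;
    double solution;
}

// A solver interface altered for RMI: it extends Remote, and every method throws RemoteException.
interface RemoteSolver extends Remote {
    boolean solve() throws RemoteException;
    boolean solve(SimpleProblem s, int numIters) throws RemoteException;
    SimpleProblem getProblem() throws RemoteException;
    void setProblem(SimpleProblem s) throws RemoteException;
    int getIterations() throws RemoteException;
    void setIterations(int numIter) throws RemoteException;
}

// Small reflective check of the two rules, for illustration only.
class RemoteInterfaceCheck {
    static boolean followsRmiRules(Class<?> iface) {
        if (!iface.isInterface() || !Remote.class.isAssignableFrom(iface)) {
            return false;
        }
        for (Method m : iface.getMethods()) {
            boolean declaresRemoteException = false;
            for (Class<?> ex : m.getExceptionTypes()) {
                // RemoteException itself, or one of its superclasses, satisfies the rule.
                if (ex.isAssignableFrom(RemoteException.class)) {
                    declaresRemoteException = true;
                }
            }
            if (!declaresRemoteException) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(followsRmiRules(RemoteSolver.class)); // true
        System.out.println(followsRmiRules(Runnable.class));     // false: not a Remote interface
    }
}
```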
Once the remote object's Java interface is defined, a server implementation of the interface can be written. In addition to implementing the object's interface, the server also typically extends the java.rmi.server.UnicastRemoteObject class. UnicastRemoteObject is an extension of the RemoteServer class, which acts as a base class for server implementations of objects in RMI. Subclasses of
Additional content appearing in this section has been removed.
RMI vs. CORBA
In this chapter we've implemented the simple distributed compute engine using both CORBA and RMI, and we've seen many similarities between the two in terms of functionality. There are also some critical differences between the two technologies. In order for you to understand which distributed object scheme is right for whatever system you're facing, it's important to spell out these differences.
As we mentioned before, RMI is a Java-centric distributed object system. Currently, the only way to integrate code written in other languages into an RMI system is to use the Java native-code interface to link a remote object implementation in Java to C or C++ code. This is a possibility, but definitely not something for the faint of heart. The native-code interface in Java is complicated, and can quickly lead to fragile or difficult-to-maintain code. CORBA, on the other hand, is designed to be language-independent. Object interfaces are specified in a language that is independent of the actual implementation language. This interface description can then be compiled into whatever implementation language suits the job and the environment.
This distinction is really at the heart of the split between the two technologies. RMI, as a Java-centric system, inherits all of the benefits of Java. An RMI system is immediately cross-platform; any subsystem of the distributed system can be relocated to any host that has a Java virtual machine handy. Also, the virtual machine architecture of Java allows us to do some rather interesting things in an RMI system that just aren't possible in CORBA. For example, using RMI and the object serialization in the java.io package, we can implement an agent-based system where clients subclass and specialize an Agent interface, set the operating parameter values for the agent, and then send the object in its entirety to a remote "sandbox" server, where the object will act in our name to negotiate on some issue (airline ticket prices, stocks and bonds, order-fulfillment schedules, etc.). The remote server only knows that each agent has to implement an agreed-upon interface, but doesn't know anything about how each agent is implemented, even though the agent is running on the server itself. In CORBA, objects can never really leave their implementation hosts; they can only roam the network in the virtual sense, sending stub references to themselves and to clients. We don't have the option of offloading an object from one host to another.
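The agent idea can be sketched with object serialization alone. In the sketch below, Agent, TicketAgent, and AgentTransferDemo are hypothetical names, and a byte array stands in for the network connection between client and sandbox server. It does not show how the server would obtain the agent's class file, which a real system would also have to arrange.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical agreed-upon interface; the server knows only this type.
interface Agent extends Serializable {
    String negotiate(int offeredPrice);
}

// A client-side specialization carrying its own operating parameter.
class TicketAgent implements Agent {
    private static final long serialVersionUID = 1L;
    private final int maxPrice;

    TicketAgent(int maxPrice) {
        this.maxPrice = maxPrice;
    }

    @Override
    public String negotiate(int offeredPrice) {
        return offeredPrice <= maxPrice ? "accept" : "reject";
    }
}

class AgentTransferDemo {
    // Client side: turn the whole agent, state included, into bytes.
    static byte[] pack(Agent agent) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(agent);
        }
        return bytes.toByteArray();
    }

    // Server side: rebuild the agent and treat it only as an Agent.
    static Agent unpack(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Agent) in.readObject();
        }
    }

    // Convenience wrapper that sends an agent "across" and back by value.
    static Agent roundTrip(Agent agent) {
        try {
            return unpack(pack(agent));
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("agent transfer failed", e);
        }
    }

    public static void main(String[] args) {
        Agent received = roundTrip(new TicketAgent(300));
        System.out.println(received.negotiate(250)); // prints accept
        System.out.println(received.negotiate(400)); // prints reject
    }
}
```

The received object is a copy, not a stub: its maxPrice travels with it, and negotiate() runs wherever the copy was rebuilt.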
Additional content appearing in this section has been removed.
Chapter 4: Threads
In this chapter we will take a look at Java's support for multithreaded applications. The ability to create multithreaded applications is critical in distributed computing systems, since in many cases you'll want multiple clients to be able to make requests to agents in your system, and you'd like the agents to be as responsive as possible. Supporting asynchronous transactions introduces some new issues in developing any distributed application, and we'll take a look at how the thread support in Java helps you manage these issues.
Additional content appearing in this section has been removed.
Thread and Runnable
The Java API includes two classes that embody the core thread support in the language. These classes are java.lang.Thread and java.lang.Runnable. They allow you to define threads of control in your application, and to manage threads in terms of runtime resources and running state.
As the name suggests, java.lang.Thread represents a thread of control. It offers methods that allow you to set the priority of the thread, to assign a thread to a thread group (more on these in a later section), and to control the running state of the thread (e.g., whether it is running or suspended).
The java.lang.Runnable interface represents the body of a thread. Classes that implement the Runnable interface provide their own run() methods that determine what their thread actually does while running. In fact, run() is the only method defined by the Runnable interface. If a Thread is constructed with a Runnable object as its body, the run() method on the Runnable will be called when the thread is started.
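A short sketch of this division of labor follows. SumTask and RunnableDemo are hypothetical names chosen for the example; the Thread object carries the name and priority, while the Runnable supplies the run() body.

```java
// The body of the thread: what gets run.
class SumTask implements Runnable {
    private final int limit;
    private long total; // written by the worker, read only after join()

    SumTask(int limit) {
        this.limit = limit;
    }

    @Override
    public void run() {
        long sum = 0;
        for (int i = 1; i <= limit; i++) {
            sum += i;
        }
        total = sum;
    }

    long getTotal() {
        return total;
    }
}

class RunnableDemo {
    static long sumInThread(int limit) {
        SumTask task = new SumTask(limit);
        // The Thread describes how the task runs: its name and priority.
        Thread worker = new Thread(task, "sum-worker");
        worker.setPriority(Thread.NORM_PRIORITY);
        worker.start(); // the new thread calls task.run()
        try {
            worker.join(); // wait for the worker to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return task.getTotal();
    }

    public static void main(String[] args) {
        System.out.println(sumInThread(100)); // prints 5050
    }
}
```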
Additional content appearing in this section has been removed.
Making a Thread
The choice between extending the Thread class or implementing the Runnable interface with your application objects is sometimes not an obvious one. It's also usually not very important. Essentially, the difference between the two classes is that a Thread is supposed to represent how a thread of control runs (its priority level, the name for the thread), and a Runnable defines what a thread runs. In both cases, defining a subclass usually involves implementing the run() method to do whatever work you want done in the separate thread of control.
Most of the time we want to specify what runs in a thread, so in most cases you may want to implement the Runnable interface. With a Runnable subclass, you can use the same object with different types of Thread subclasses, depending on the application. You might use your implementation of Runnable inside a standard Thread in one case, and in another you might run it in a subclass of Thread that sends a notice across the network when it's started.
On the other hand, directly extending Thread can make your classes slightly easier to use. You just create one of your Thread subclasses and run it, instead of creating a Runnable subclass, putting it into a Thread, and running that. Also, if your application objects are subclasses of Thread, then you can access them directly by asking the system for the current thread, or the threads in the current thread group, etc. Then you can cast the object to its subclass and call specialized methods on it, maybe to ask it how far it's gotten on whatever task you gave it.
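Here is a small sketch of the Thread-subclass approach, including the cast from the current thread. CountingThread and ThreadSubclassDemo are hypothetical names, and the "work" is just a counter so that the progress query has something to report.

```java
// A Thread subclass that can report how far it has gotten.
class CountingThread extends Thread {
    private final int steps;
    private volatile int completed; // volatile so other threads see updates

    CountingThread(String name, int steps) {
        super(name);
        this.steps = steps;
    }

    @Override
    public void run() {
        for (int i = 1; i <= steps; i++) {
            completed = i; // stand-in for one unit of real work
        }
    }

    int getCompleted() {
        return completed;
    }

    // Ask the system for the current thread and, if it is one of ours,
    // cast it and call the specialized method. Returns -1 otherwise.
    static int progressOfCurrentThread() {
        Thread current = Thread.currentThread();
        if (current instanceof CountingThread) {
            return ((CountingThread) current).getCompleted();
        }
        return -1;
    }
}

class ThreadSubclassDemo {
    static int runAndReport(int steps) {
        CountingThread counter = new CountingThread("counter", steps);
        counter.start(); // no separate Runnable to create and wrap
        try {
            counter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return counter.getCompleted();
    }

    public static void main(String[] args) {
        System.out.println(runAndReport(5)); // prints 5
        // The main thread is not a CountingThread, so this prints -1.
        System.out.println(CountingThread.progressOfCurrentThread());
    }
}
```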
In the next sections we'll look at how to both implement Runnable and extend Thread to make an object that executes in an independent thread. We'll return to our Solver example, making it usable in a multithreaded agent within a distributed system. The examples in this section will use fairly basic network communications, based on sockets and I/O streams, but the concepts extend pretty easily to distributed object scenarios.
Additional content appearing in this section has been removed.
Managing Threads at Runtime
In addition to changing the running state of your application threads, the Java API allows you to do some basic thread management at runtime. The functionality provided includes thread synchronization, organization of threads into thread groups, and influencing the thread scheduler by setting thread priorities. Before we see how all of these can come into play in a distributed application, let's go over them briefly so that we have a feeling for what kinds of capabilities they provide.
When you have multiple threads in an application, it sometimes becomes necessary to synchronize them with respect to a particular method or block of code. This usually occurs when multiple threads are updating the same data asynchronously. To ensure that these changes are consistent throughout the application, we need to make sure that one thread can't start updating the data before another thread is finished reading or updating the same data. If we let this occur, then the data will be left in an inconsistent state, and one or both threads will not get the correct result.
Java allows you to define critical regions of code using the synchronized statement. A method or block of code is synchronized on a class, object, or array, depending on the context of the synchronized keyword. If you use the synchronized modifier on a static method of a class, for example, then before the method is executed, the Java virtual machine obtains an exclusive "lock" on the class. A thread that attempts to enter this block of code has to get the lock before the code in the synchronized block is executed. If another thread is executing in this critical section at the time, the thread will block until the running thread exits the critical section and the lock on the class is released.
If a non-static method is declared synchronized, then the virtual machine obtains a lock on the object on which the method is invoked. If you define a synchronized block of code, then you have to specify the class, object, or array on which to synchronize.
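The three cases can be put side by side in a small sketch. HitCounter and SynchronizedDemo are hypothetical names; the static method locks the class, the instance methods lock the object they are invoked on, and the synchronized block names an array as its lock.

```java
class HitCounter {
    private static int totalHits;             // guarded by the lock on HitCounter.class
    private int hits;                         // guarded by the lock on this object
    private final int[] buckets = new int[2]; // guarded by the lock on the array

    // Static synchronized method: locks the class.
    static synchronized void recordGlobal() {
        totalHits++;
    }

    static synchronized int getTotalHits() {
        return totalHits;
    }

    // Non-static synchronized method: locks the object it is invoked on.
    synchronized void record() {
        hits++;
    }

    synchronized int getHits() {
        return hits;
    }

    // Synchronized block: the lock object is named explicitly.
    void recordBucket(int index) {
        synchronized (buckets) {
            buckets[index]++;
        }
    }

    int getBucket(int index) {
        synchronized (buckets) {
            return buckets[index];
        }
    }
}

class SynchronizedDemo {
    // Runs several threads against one counter and returns
    // { hits, buckets[0], buckets[1] } once they have all finished.
    static int[] hammer(int threadCount, int hitsPerThread) {
        HitCounter counter = new HitCounter();
        Thread[] workers = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < hitsPerThread; i++) {
                    HitCounter.recordGlobal();
                    counter.record();
                    counter.recordBucket(i % 2);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return new int[] { counter.getHits(), counter.getBucket(0), counter.getBucket(1) };
    }

    public static void main(String[] args) {
        System.out.println(hammer(4, 1000)[0]); // prints 4000
    }
}
```

Without the synchronized keyword, the increments from different threads could interleave and some updates would be lost, which is the inconsistent state described above.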
Additional content appearing in this section has been removed.
Networked Threads
We've seen how to make separate threads of control in a Java applet or application, and we've discussed the various ways that the Java API allows you to manage threads at runtime. Now we'll go over some of the issues that arise with multithreaded distributed applications, and how the Java environment helps you deal with them.
The threaded implementation of our Solver interface in Example 4.1 shows how multithreaded servers can be implemented in Java. This allows our server to respond to clients asynchronously and to service their requests in parallel, which can reduce the amount of time a client has to wait for a response. The alternative is to have a server with only one thread servicing clients on a first-come, first-served basis. So if client A is the first client to make a request, the server begins processing it right away. If client B makes a request while the server is processing client A's job, then B will have to wait for the server to finish A's job before its job can be started. In fact, client B won't even get an acknowledgment from the server until client A's job is done. With the multithreaded server, an independent thread can listen for client requests and acknowledge them almost immediately (or as soon as the thread scheduler gives it a CPU time slice). And with the jobs being allocated to separate threads for processing, the CPU resources will be spread out between the two jobs, and B's job will potentially finish sooner (though client A's job might finish later, since it is now getting less than 100% of the CPU).
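The following is a rough sketch of the thread-per-client arrangement, not the Solver code from Example 4.1. SquareServer and its one-integer-per-line protocol are assumptions made up for the illustration. The listening thread only accepts connections; each client's job runs in its own thread.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.net.ServerSocket;
import java.net.Socket;

class SquareServer {
    // One client's job: read an integer per line, reply with its square.
    static void handle(Reader input, Writer output) throws IOException {
        BufferedReader in = new BufferedReader(input);
        PrintWriter out = new PrintWriter(output, true);
        String line;
        while ((line = in.readLine()) != null) {
            long n = Long.parseLong(line.trim());
            out.println(n * n);
        }
    }

    // The listening thread: accept a client, hand it to a new thread,
    // and go straight back to listening. Runs until the socket fails.
    static void serve(ServerSocket listener) throws IOException {
        while (true) {
            Socket client = listener.accept();
            Thread worker = new Thread(() -> {
                try (Socket s = client) {
                    handle(new InputStreamReader(s.getInputStream()),
                           new OutputStreamWriter(s.getOutputStream()));
                } catch (IOException | NumberFormatException e) {
                    System.err.println("client failed: " + e);
                }
            });
            worker.start();
        }
    }

    // Runs the per-client logic on in-memory text, with no network involved.
    static String handleText(String requests) {
        StringWriter replies = new StringWriter();
        try {
            handle(new StringReader(requests), replies);
        } catch (IOException e) {
            throw new IllegalStateException("in-memory I/O failed", e);
        }
        return replies.toString();
    }

    public static void main(String[] args) {
        // Exercise the handler locally; a real server would instead call
        // serve(new ServerSocket(port)) with a port of its choosing.
        System.out.print(handleText("3\n4\n"));
    }
}
```

Because accept() returns to the loop as soon as the worker is started, client B's connection is accepted while client A's job is still running.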
Threads are useful in any distributed system where we want an agent to respond to asynchronous messages. By isolating communications in a separate thread, the other threads in the process can continue to do useful work while the communications thread blocks on a socket waiting for messages. The client process shown in Example 4.3 only has a single thread, since it doesn't really have anything else to do but wait for the server to send a response. But we could easily reuse these classes in a multithreaded client as a single communications thread, or as multiple threads talking to multiple servers.
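To illustrate the idea of isolating communications in one thread, here is a sketch in which a piped character stream stands in for a socket. MessageListener and CommsThreadDemo are hypothetical names. The listener thread blocks in readLine() while the calling thread stays free to do other work, which here is simply producing the messages.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PipedReader;
import java.io.PipedWriter;
import java.io.Reader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// The communications thread body: blocks waiting for incoming lines.
class MessageListener implements Runnable {
    private final BufferedReader source;
    private final List<String> received =
            Collections.synchronizedList(new ArrayList<>());

    MessageListener(Reader source) {
        this.source = new BufferedReader(source);
    }

    @Override
    public void run() {
        try {
            String line;
            // readLine() blocks until a message arrives or the stream ends.
            while ((line = source.readLine()) != null) {
                received.add(line);
            }
        } catch (IOException e) {
            received.add("error: " + e.getMessage());
        }
    }

    List<String> getReceived() {
        synchronized (received) {
            return new ArrayList<>(received);
        }
    }
}

class CommsThreadDemo {
    static List<String> exchange(String... messages) {
        try {
            PipedWriter sender = new PipedWriter();
            MessageListener listener = new MessageListener(new PipedReader(sender));
            Thread comms = new Thread(listener, "comms");
            comms.start();
            // The calling thread is not blocked by the listener; here its
            // "other work" is sending the messages through the pipe.
            for (String message : messages) {
                sender.write(message + "\n");
            }
            sender.close(); // end of stream lets the listener finish
            comms.join();
            return listener.getReceived();
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException("exchange failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(exchange("job accepted", "job done"));
    }
}
```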
Additional content appearing in this section has been removed.
Chapter 5: Security
Security becomes an issue as soon as you allow your computing resources to come in contact with the rest of the world. With the recent explosion in the use of networks, preserving the security of data and the resources that carry data has become a primary concern. An open communications port on any computing device almost always carries the potential for abuse: a malicious party may steal or damage sensitive information, network bandwidth, or any other resource associated with your site. Security measures can increase the effort needed for an intruder to gain access to these resources.
In this chapter, we'll look at the Java Security API and how you can use it to make the agents in your distributed application safe from network hostility. We'll briefly discuss the kinds of security concerns you should have as a distributed application developer, and what tools are available in the Java environment for addressing these issues. Some of the issues we'll discuss are common across most applications, so the Java language developers have provided integrated features in the runtime environment that attempt to address them. An example of one of these features is the bytecode verifier, which prevents some kinds of malicious code from running on your machine. Other issues are only important in specific domains and applications, and it's your duty to determine how important these issues are to your particular application, what kinds of measures need to be taken, and how much effort needs to be invested in implementing these measures. For example, consider data theft from communications links. Is your data valuable enough to protect with data encryption, and if so, what level of encryption is appropriate, given the value of the data and the level of effort you can expect from those trying to steal it?
The subject of security in networked environments is worthy of several books' worth of material, and you can find many readings on the subject. In this book, we will only have a superficial discussion of the technical aspects of network security and cryptography, with limited excursions into the details only where it is necessary to support a solid understanding of the topic. From this foundation, we can take an educated look at the security options available to you in the Java Security API, and where you might find them useful.
Additional content appearing in this section has been removed.
Security Issues and Concerns
Just about everything making up a site on a computer network is a resource with potential value. The most obvious resource you need to worry about is information—the data being sent over the network and the information residing on your host computers. Other resources that could be targets are the applications on your hosts, the CPU resources of your computers, even the bandwidth available on your communications links. A hostile party may want to steal these resources or do damage to them.
Following are some of the things an attacker may do to steal or destroy your resources:
Eavesdrop on network communications
The hostile agent may physically tap into network lines, or set up rogue programs on other hosts to watch for interesting traffic. They may be trying to steal information, or gather information that will help them steal or damage other resources.
Set up imposter agents or data sources
This will let them fool you into sending valuable information or giving access to resources they shouldn't have. Our Solver servers could be accessed by intruders acting as legitimate clients, and used to solve their numerical problems; or a hostile party could flood the server with ProblemSets to be solved, rendering the server useless to the legitimate users. Clients of the Solver are also vulnerable, since a hostile party could set up an imposter Solver meant to steal the problem data submitted by clients, or they could purposely generate erroneous results. If the attacker manages to figure out that the clients are trying to solve for stress levels in a finite-element model, for example, then the imposter server could be set up to return results that indicate abnormally high or low stress levels in critical sections of the model. This could cause someone to design a bridge that will collapse, or that's too expensive to build.
Additional content appearing in this section has been removed.
The java.security Package
The Java Security API is a framework for implementing and using security measures in the Java environment. The Java Security API is included in the core Java API, in the form of the java.security package.
The security package really provides two APIs: one for users of security algorithms, and another for the implementors or providers of these algorithms.

Section 5.2.1.1: The User API

The user API is designed to let you use different cryptographic algorithms in your application without having to know how they are implemented by providers. All you need to know is an algorithm's name. As an example, you can create a new key-pair generator using the Digital Signature Algorithm (DSA) with the following call:
KeyPairGenerator myGen = KeyPairGenerator.getInstance("DSA");
You can take this new object and ask it for key pairs to be used to sign messages, without knowing how the key pairs are being generated. If you wanted to use a different algorithm to implement your key-pair generator, you would just change the algorithm name in the preceding line of code. The rest of your code that uses the object can usually remain unchanged.
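A sketch of that name-based usage is shown below. KeyPairDemo is a hypothetical class, and the key size (2048 bits) and the signature algorithm names ("SHA256withDSA", "SHA256withRSA") are assumptions about what a current JDK's default providers offer; they are not taken from the text. Only the name strings differ between the DSA and RSA runs.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

class KeyPairDemo {
    // Ask for a generator by algorithm name only; the provider's
    // implementation stays hidden behind KeyPairGenerator.
    static KeyPair newKeyPair(String algorithm, int bits) throws GeneralSecurityException {
        KeyPairGenerator generator = KeyPairGenerator.getInstance(algorithm);
        generator.initialize(bits);
        return generator.generateKeyPair();
    }

    // Sign a message with the private key, then check it with the public key.
    static boolean signAndVerify(String keyAlgorithm, String signatureAlgorithm,
                                 int bits, String message) {
        try {
            KeyPair pair = newKeyPair(keyAlgorithm, bits);
            byte[] data = message.getBytes(StandardCharsets.UTF_8);

            Signature signer = Signature.getInstance(signatureAlgorithm);
            signer.initSign(pair.getPrivate());
            signer.update(data);
            byte[] signature = signer.sign();

            Signature verifier = Signature.getInstance(signatureAlgorithm);
            verifier.initVerify(pair.getPublic());
            verifier.update(data);
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("signing failed", e);
        }
    }

    public static void main(String[] args) {
        // Switching algorithms means changing only the name strings.
        System.out.println(signAndVerify("DSA", "SHA256withDSA", 2048, "hello"));
        System.out.println(signAndVerify("RSA", "SHA256withRSA", 2048, "hello"));
    }
}
```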
In the same way that cryptographic algorithms are specified by name, providers of these algorithms are also specified by name. If you wanted to use an implementation of DSA from a specific provider, then you could ask for it by name when you create an object:
KeyPairGenerator myGen =
    KeyPairGenerator.getInstance("DSA", "MY_PROVIDER");
Although the Security API lets you hide from the details of cryptographic algorithms if you want to, it also lets you use those details if you need to. Underneath the generic, algorithm-independent interfaces provided by the Security API, like the KeyPairGenerator in our example, implementations of these interfaces will use algorithm-specific subclasses. If you need to give details about the algorithm and its specific parameters, then you can access these algorithm-specific interfaces for the objects you create by casting:
Additional content appearing in this section has been removed.
Identities and Access Control
The Identity class represents an agent within the Security API. Identity implements the Principal interface, which is a generic representation of a person, group, or other named entity. An Identity has a name, which it inherits from the Principal interface, and other information that verifies the identity of the agent (a public key and assorted certificates, for example). A Signer is a subclass of Identity that also includes a private key that can be used to sign data. We'll discuss public and private keys and how they are created in more detail later in the chapter.
An Identity is created using a name for the agent being represented:
Identity fredsID = new Identity("Fred");
A public key and any available certificates can be added to Fred's identity to support the validity of his identity:
PublicKey fredsKey = ... // Get Fred's key
Certificate fredsCert = ... // Get Fred's certificate
Certificate fredsRSACert = ... // Get another certificate for Fred
fredsID.setPublicKey(fredsKey);
fredsID.addCertificate(fredsCert);
fredsID.addCertificate(fredsRSACert);
If we are also able to sign data using Fred's identity, then we'll also have a private key for Fred, and we can create a Signer object for him:
Signer signingFred = new Signer("Fred");
PrivateKey fredsSigningKey = ... // Get Fred's private key
PublicKey fredsPublicKey = ... // Get Fred's public key
signingFred.setKeyPair(new KeyPair(fredsPublicKey, fredsSigningKey));
The java.security.acl package includes interfaces that let you define specific access rights for individual agents or groups of agents. In the same style as the rest of the Security API, this package defines an API for access-control lists, with few of the interfaces actually implemented in the package. Sun has provided a default implementation of the ACL package in their sun.security.acl package. We'll use classes from Sun's implementation to demonstrate ACLs.
Additional content appearing in this section has been removed.
Keys: Public, Private, and Secret