Java I/O
Java I/O By Elliotte Rusty Harold
March 1999
Pages: 598


Table of Contents

Chapter 1: Introducing I/O
Input and output, I/O for short, are fundamental to any computer operating system or programming language. Only theorists find it interesting to write programs that don't require input or produce output. At the same time, I/O hardly qualifies as one of the more "thrilling" topics in computer science. It's something in the background, something you use every day—but for most developers, it's not a topic with much sex appeal.
There are plenty of reasons for Java programmers to find I/O interesting. Java includes a particularly rich set of I/O classes in the core API, mostly in the java.io package. For the most part I/O in Java is divided into two types: byte- and number-oriented I/O, which is handled by input and output streams; and character and text I/O, which is handled by readers and writers. Both types provide an abstraction for external data sources and targets that allows you to read from and write to them, regardless of the exact type of the source. You use the same methods to read from a file that you do to read from the console or from a network connection.
But that's just the tip of the iceberg. Once you've defined abstractions that let you read or write without caring where your data is coming from or where it's going to, you can do a lot of very powerful things. You can define I/O streams that automatically compress, encrypt, and filter from one data format to another, and more. Once you have these tools, programs can send encrypted data or write zip files with almost no knowledge of what they're doing; cryptography or compression can be isolated in a few lines of code that say, "Oh yes, make this an encrypted output stream."
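As a rough illustration of how a filter stream isolates compression in a line or two, the sketch below wraps an in-memory stream in the standard java.util.zip gzip streams and reads the data back. The class and method names (CompressedRoundTrip, compress, decompress) are made up for this example and do not come from the book.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical example class; the names are illustrative only.
class CompressedRoundTrip {

  // Writes the bytes through a compressing filter stream into memory.
  static byte[] compress(byte[] plain) throws IOException {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    OutputStream out = new GZIPOutputStream(sink); // "make this a compressed stream"
    out.write(plain);
    out.close(); // finishes the gzip data and flushes it into sink
    return sink.toByteArray();
  }

  // Reads the compressed bytes back through a decompressing filter stream.
  static byte[] decompress(byte[] packed) throws IOException {
    InputStream in = new GZIPInputStream(new ByteArrayInputStream(packed));
    ByteArrayOutputStream result = new ByteArrayOutputStream();
    byte[] buffer = new byte[1024];
    int n;
    while ((n = in.read(buffer)) != -1) { // -1 signals end of stream
      result.write(buffer, 0, n);
    }
    in.close();
    return result.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    byte[] original =
        "How are streams treating you?".getBytes(StandardCharsets.US_ASCII);
    byte[] restored = decompress(compress(original));
    System.out.println(new String(restored, StandardCharsets.US_ASCII));
  }
}
```

The code that calls write() and read() never mentions compression; only the two constructor calls do.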
In this book, I'll take a thorough look at all parts of Java's I/O facilities. This includes all the different kinds of streams you can use. We're also going to investigate Java's support for Unicode (the standard multilingual character set). We'll look at Java's powerful facilities for formatting I/O—oddly enough, not part of the java.io package proper. (We'll see the reasons for this design decision later.) Finally, we'll take a brief look at the Java Communications API (
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
What Is a Stream?
A stream is an ordered sequence of bytes of undetermined length. Input streams move bytes of data into a Java program from some generally external source. Output streams move bytes of data from Java to some generally external target. (In special cases streams can also move bytes from one part of a Java program to another.)
The word stream is derived from an analogy with a stream of water. An input stream is like a siphon that sucks up water; an output stream is like a hose that sprays out water. Siphons can be connected to hoses to move water from one place to another. Sometimes a siphon may run out of water if it's drawing from a finite source like a bucket. On the other hand, if the siphon is drawing water from a river, it may well provide water indefinitely. So too an input stream may read from a finite source of bytes like a file or an unlimited source of bytes like System.in. Similarly an output stream may have a definite number of bytes to output or an indefinite number of bytes.
Input to a Java program can come from many sources. Output can go to many different kinds of destinations. The power of the stream metaphor and in turn the stream classes is that the differences between these sources and destinations are abstracted away. All input and output are simply treated as streams.
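To make the abstraction concrete, here is a minimal sketch of a method that works on any input stream without knowing where the bytes come from. ByteCounter and countBytes are hypothetical names chosen for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical example class; the names are illustrative only.
class ByteCounter {

  // Counts the bytes remaining in any input stream, whatever its source.
  static long countBytes(InputStream in) throws IOException {
    long count = 0;
    while (in.read() != -1) { // read() returns -1 at end of stream
      count++;
    }
    return count;
  }

  public static void main(String[] args) throws IOException {
    // An in-memory source; a file or network stream would be passed the same way.
    InputStream memory = new ByteArrayInputStream(new byte[] {10, 20, 30});
    System.out.println(countBytes(memory)); // prints 3
    // The console (or a file redirected to stdin) works too:
    // System.out.println(countBytes(System.in));
  }
}
```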
The first source of input most programmers encounter is System.in. This is the same thing as stdin in C, generally some sort of console window, probably the one in which the Java program was launched. If input is redirected so the program reads from a file, then System.in is changed as well. For instance, on Unix, the following command redirects stdin so that when the MessageServer program reads from System.in, the actual data comes from the file data.txt instead of the console:
% java MessageServer < data.txt
               
The console is also available for output through the static field
Additional content appearing in this section has been removed.
Numeric Data
Input streams read bytes and output streams write bytes. Readers read characters and writers write characters. Therefore, to understand input and output, you first need a solid understanding of how Java deals with bytes, integers, characters, and other primitive data types, and when and why one is converted into another. In many cases Java's behavior is not obvious.
The fundamental integer data type in Java is the int, a four-byte, big-endian, two's complement integer. An int can take on all values between -2,147,483,648 and 2,147,483,647. When you type a literal integer like 7, -8345, or 3000000000 in Java source code, the compiler treats that literal as an int. In the case of 3000000000 or similar numbers too large to fit in an int, the compiler emits an error message citing "Numeric overflow."
longs are eight-byte, big-endian, two's complement integers with ranges from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807. long literals are indicated by suffixing the number with a lower- or uppercase L. An uppercase L is preferred because the lowercase l is too easily confused with the numeral 1 in most fonts. For example, 7L, -8345L, and 3000000000L are all 64-bit long literals.
There are two more integer data types available in Java, the short and the byte. shorts are two-byte, big-endian, two's complement integers with ranges from -32,768 to 32,767. They're rarely used in Java and are included mainly for compatibility with C.
bytes, however, are very much used in Java. In particular they're used in I/O. A byte is an eight-bit, two's complement integer that ranges from -128 to 127. Note that like all numeric data types in Java, a byte is signed. The maximum byte value is 127. 128, 129, and so on through 255 are not legal values for bytes.
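The following sketch shows what the signed range means in practice, and one common way to recover the unsigned value (masking with 0xFF). SignedBytes and toUnsigned are hypothetical names used only for this illustration.

```java
// Hypothetical example class; the names are illustrative only.
class SignedBytes {

  // Interprets the eight bits of a byte as an unsigned value from 0 to 255.
  static int toUnsigned(byte b) {
    return b & 0xFF;
  }

  public static void main(String[] args) {
    byte b = (byte) 200;               // 200 does not fit; the cast keeps the low eight bits
    System.out.println(b);             // prints -56
    System.out.println(toUnsigned(b)); // prints 200
  }
}
```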
Additional content appearing in this section has been removed.
Character Data
Numbers are only part of the data a typical Java program needs to read and write. Most programs also need to handle text, which is composed of characters. Since computers only really understand numbers, characters are encoded by matching each character in a given script to a particular number. For example, in the common ASCII encoding, the character A is mapped to the number 65; the character B is mapped to the number 66; the character C is mapped to the number 67; and so on. Different encodings may encode different scripts or may encode the same or similar scripts in different ways.
Java understands several dozen different character sets for a variety of languages, ranging from ASCII to Shift-JIS (SJIS, an encoding based on the Japanese Industrial Standards) to Unicode. Internally, Java uses the Unicode character set. Unicode is a two-byte extension of the one-byte ISO Latin-1 character set, which in turn is an eight-bit superset of the seven-bit ASCII character set.
ASCII, the American Standard Code for Information Interchange, is a seven-bit character set. Thus it defines 2^7, or 128, different characters whose numeric values range from 0 to 127. These characters are sufficient for handling most of American English and can make reasonable approximations to most European languages (with the notable exceptions of Russian and Greek). It's an often-used lowest common denominator format for different computers. If you were to read a byte value between 0 and 127 from a stream, then cast it to a char, the result would be the corresponding ASCII character.
ASCII characters 0-31 and character 127 are nonprinting control characters. Characters 32-47 are various punctuation and space characters. Characters 48-57 are the digits 0-9. Characters 58-64 are another group of punctuation characters. Characters 65-90 are the capital letters A-Z. Characters 91-96 are a few more punctuation marks. Characters 97-122 are the lowercase letters a-z. Finally, characters 123 through 126 are a few remaining punctuation symbols. The complete ASCII character set is shown in Table 2.1 in Appendix B.
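A small sketch of the cast described above, using a few of the values from the ranges just listed. AsciiCast and toChar are hypothetical names, and the range check is an addition for this example rather than something the cast itself requires.

```java
// Hypothetical example class; the names are illustrative only.
class AsciiCast {

  // Converts a value in the ASCII range (0 to 127) to the matching char.
  static char toChar(int byteValue) {
    if (byteValue < 0 || byteValue > 127) {
      throw new IllegalArgumentException("Not an ASCII value: " + byteValue);
    }
    return (char) byteValue;
  }

  public static void main(String[] args) {
    System.out.println(toChar(65)); // prints A
    System.out.println(toChar(97)); // prints a
    System.out.println(toChar(48)); // prints 0
    System.out.println((int) 'C');  // prints 67
  }
}
```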
Additional content appearing in this section has been removed.
Readers and Writers
In Java 1.1 and later, streams are primarily intended for data that can be read as pure bytes—basically byte data and numeric data encoded as binary numbers of one sort or another. Streams are specifically not intended for use when reading and writing text, including both ASCII text, like "Hello World," and numbers formatted as text, like "3.1415929." For these purposes, you should use readers and writers.
Input and output streams are fundamentally byte-based. Readers and writers are based on characters, which can have varying widths depending on the character set. For example, ASCII and ISO Latin-1 use one-byte characters. Unicode uses two-byte characters. UTF-8 uses characters of varying width (between one and three bytes). Since characters are ultimately composed of bytes, readers take their input from streams. However, they convert those bytes into chars according to a specified encoding format before passing them along. Similarly, writers convert chars to bytes according to a specified encoding before writing them onto some underlying stream.
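The difference in character widths can be observed by pushing the same text through writers with different encodings. This is a sketch under the assumption that an in-memory ByteArrayOutputStream is an acceptable stand-in for a real destination; EncodingWidths and encodedLength are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;

// Hypothetical example class; the names are illustrative only.
class EncodingWidths {

  // Encodes the text with the named encoding and reports how many bytes result.
  static int encodedLength(String text, String encoding) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    Writer writer = new OutputStreamWriter(bytes, encoding);
    writer.write(text);
    writer.close(); // flushes the encoded bytes into the underlying stream
    return bytes.size();
  }

  public static void main(String[] args) throws IOException {
    String text = "caf\u00e9"; // three ASCII letters plus an e with an acute accent
    System.out.println(encodedLength(text, "ISO-8859-1")); // prints 4
    System.out.println(encodedLength(text, "UTF-8"));      // prints 5
  }
}
```

The writer receives the same four chars in both cases; only the encoding chosen for the underlying stream changes the byte count.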
The java.io.Reader and java.io.Writer classes are abstract superclasses for classes that read and write character-based data. The subclasses are notable for handling the conversion between different character sets. There are nine reader and eight writer classes in the core Java API, all in the java.io package:
BufferedReader
BufferedWriter
CharArrayReader
CharArrayWriter
Additional content appearing in this section has been removed.
The Ubiquitous IOException
As computer operations go, input and output are unreliable. They are subject to problems completely outside the programmer's control. Disks can develop bad sectors while a file is being read; construction workers drop backhoes through the cables that connect your WAN; users unexpectedly cancel their input; telephone repair crews shut off your modem line while trying to repair someone else's. (This last one actually happened to me while writing this chapter. My modem kept dropping the connection and then not getting a dial tone; I had to hunt down the telephone "repairman" in my building's basement and explain to him that he was working on the wrong line.)
Because of these potential problems and many more, almost every method that performs input or output is declared to throw IOException. IOException is a checked exception, so you must either declare that your methods throw it or enclose the call that can throw it in a try/catch block. The only real exceptions to this rule are the PrintStream and PrintWriter classes. Because it would be inconvenient to wrap a try/catch block around each call to System.out.println(), Sun decided to have PrintStream (and later PrintWriter) catch and eat any exceptions thrown inside a print() or println() method. If you do want to check for exceptions inside a print() or println() method, you can call checkError():
public boolean checkError()
The checkError() method returns true if an exception has occurred on this print stream, false if one hasn't. It only tells you that an error occurred. It does not tell you what sort of error occurred. If you need to know more about the error, you'll have to use a different output stream or writer class.
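Here is a minimal sketch of checkError() in action. It assumes a deliberately broken underlying stream, BrokenOutputStream, which is invented for this example to simulate a failing device; CheckErrorDemo and printAndCheck are likewise hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;

// Hypothetical example class; the names are illustrative only.
class CheckErrorDemo {

  // An invented stream that fails on every write, standing in for a broken device.
  static class BrokenOutputStream extends OutputStream {
    public void write(int b) throws IOException {
      throw new IOException("simulated device failure");
    }
  }

  // Prints a line to a PrintStream wrapped around the given stream,
  // then reports whether the PrintStream recorded an error.
  static boolean printAndCheck(OutputStream target) {
    PrintStream out = new PrintStream(target);
    out.println("Hello");    // does not throw, even if the target does
    return out.checkError(); // flushes, then returns the error flag
  }

  public static void main(String[] args) {
    System.out.println(printAndCheck(new BrokenOutputStream()));    // prints true
    System.out.println(printAndCheck(new ByteArrayOutputStream())); // prints false
  }
}
```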
IOException has many subclasses—15 in java.io—and methods often throw a more specific exception that subclasses IOException. (However, methods usually only declare that they throw an IOException.) Here are the subclasses of IOException that you'll find in
Additional content appearing in this section has been removed.
The Console: System.out, System.in, and System.err
The console is the default destination for output written to System.out or System.err and the default source of input for System.in. On most platforms the console is the command-line environment from which the Java program was initially launched, perhaps an xterm (Figure 1.1) or a DOS shell window (Figure 1.2). The word console is something of a misnomer, since on Unix systems the console refers to a very specific command-line shell, rather than being a generic term for command-line shells overall.
Figure 1.1: An xterm console on Unix
Figure 1.2: A DOS shell console on Windows NT
Many common misconceptions about I/O occur because most programmers' first exposure to I/O is through the console. The console is convenient for quick hacks and toy examples commonly found in textbooks, and I will use it for that in this book, but it's really a very unusual source of input and destination for output, and good Java programs avoid it. It behaves almost, but not completely, unlike anything else you'd want to read from or write to. While consoles make convenient examples in programming texts like this one, they're a horrible user interface and really have little place in modern programs. Users are more comfortable with a well-defined graphical user interface. Furthermore, the console is unreliable across platforms. The Mac, for example, has no native console. Macintosh Runtime for Java 2 and earlier has a console window that works only for output, but not for input; that is, System.out works but System.in does not. Figure 1.3 shows the Mac console window.
Figure 1.3: The Mac console, used exclusively by Java programs
Additional content appearing in this section has been removed.
Security Checks on I/O
One of the original fears about downloading executable content like applets from the Internet was that a hostile applet could erase your hard disk or read your Quicken files. Nothing's happened to change that since Java was introduced. This is why Java applets run under the control of a security manager that checks each operation an applet performs to prevent potentially hostile acts.
The security manager is particularly careful about I/O operations. For the most part, the checks are related to these questions:
  • Can an applet read a file?
  • Can an applet write a file?
  • Can an applet delete a file?
  • Can an applet determine whether a file exists?
  • Can an applet make a network connection to a particular host?
  • Can an applet accept an incoming connection from a particular host?
The short answer to all these questions is "No, it cannot." A slightly more elaborate answer would specify a few exceptions. Applets can make network connections to the host they came from; applets can read a few very specific files that contain information about the Java environment; and trusted applets may sometimes run without these restrictions. But for almost all practical purposes, the answer is almost always no.
For more exotic situations, such as trusted applets, see Java Security by Scott Oaks (O'Reilly & Associates, 1998). Trusted applets are useful on corporate networks, but you shouldn't waste a lot of time laboring under the illusion that anyone on the Internet at large will trust your applets.
Because of these security issues, you need to be careful when using code fragments and examples from this book in an applet. Everything shown here works when run in an application, but when run in an applet, it may fail with a
Additional content appearing in this section has been removed.
Chapter 2: Output Streams
The OutputStream Class
The java.io.OutputStream class declares the three basic methods you need to write bytes of data onto a stream. It also has methods for closing and flushing streams.
public abstract void write(int b) throws IOException
public void write(byte[] data) throws IOException
public void write(byte[] data, int offset, int length) throws IOException
public void flush() throws IOException
public void close() throws IOException
OutputStream is an abstract class. Subclasses provide implementations of the abstract write(int b) method. They may also override the four nonabstract methods. For example, the FileOutputStream class overrides all five methods with native methods that know how to write bytes into files on the host platform. Although OutputStream is abstract, often you only need to know that the object you have is an OutputStream; the more specific subclass of OutputStream is hidden from you. For example, the getOutputStream() method of java.net.URLConnection has the signature:
public OutputStream getOutputStream() throws IOException
Depending on the type of URL associated with this URLConnection object, the actual class of the output stream that's returned may be a sun.net.TelnetOutputStream, a sun.net.smtp.SmtpPrintStream, a sun.net.www.http.KeepAliveStream, or something else completely. All you know as a programmer, and all you need to know, is that the object returned is in fact some instance of OutputStream. That's why the detailed classes that handle particular kinds of connections are hidden inside the sun packages.
Furthermore, even when working with subclasses whose types you know, you still need to be able to use the methods inherited from OutputStream. And since methods that are inherited are not included in the online documentation, it's important to remember that they're there. For example, the java.io.DataOutputStream class does not declare a close() method, but you can still call the one it inherits from its superclass.
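As a sketch of both points, the fragment below passes a DataOutputStream to a method that only knows about OutputStream, and then calls the inherited close(). InheritedMethods and writeMarker are hypothetical names invented for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical example class; the names are illustrative only.
class InheritedMethods {

  // Accepts any OutputStream; the concrete subclass is irrelevant here.
  static void writeMarker(OutputStream out) throws IOException {
    out.write(0x2A); // a single byte with the value 42
    out.flush();
  }

  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    DataOutputStream dout = new DataOutputStream(sink);
    dout.writeInt(1);  // declared by DataOutputStream itself
    writeMarker(dout); // used purely as an OutputStream
    dout.close();      // inherited, not declared in DataOutputStream
    System.out.println(sink.size()); // prints 5: four bytes for the int, one marker
  }
}
```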
Additional content appearing in this section has been removed.
Writing Bytes to Output Streams
The fundamental method of the OutputStream class is write():
public abstract void write(int b) throws IOException
This method writes a single unsigned byte of data whose value should be between 0 and 255. If you pass a number larger than 255 or smaller than zero, it's reduced modulo 256 before being written.
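The reduction can be checked with an in-memory stream. This is a sketch only; WriteModulo and writtenValue are hypothetical names, and ByteArrayOutputStream is used here simply because its contents are easy to inspect.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical example class; the names are illustrative only.
class WriteModulo {

  // Writes one int with write(int) and returns the byte that actually reached
  // the stream, expressed as an unsigned value from 0 to 255.
  static int writtenValue(int b) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    out.write(b); // ByteArrayOutputStream.write(int) declares no IOException
    return out.toByteArray()[0] & 0xFF;
  }

  public static void main(String[] args) {
    System.out.println(writtenValue(65));  // prints 65
    System.out.println(writtenValue(321)); // prints 65, since 321 = 256 + 65
    System.out.println(writtenValue(-1));  // prints 255
  }
}
```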
Example 2.1, AsciiChart, is a simple program that writes the printable ASCII characters (32 to 126) on the console. The console interprets the numeric values as ASCII characters, not as numbers. This is a feature of the console, not of the OutputStream class or the specific subclass of which System.out is an instance. The write() method merely sends a particular bit pattern to a particular output stream. How that bit pattern is interpreted depends on what's connected to the other end of the stream.
Example 2.1. The AsciiChart Program
import java.io.*;

public class AsciiChart {

  public static void main(String[] args) {
    
    for (int i = 32; i < 127; i++) {
      System.out.write(i);
      // break line after every eight characters.
      if (i % 8 == 7) System.out.write('\n');
      else System.out.write('\t');
    }
    System.out.write('\n');
   }
}
Notice the use of the char literals '\t' and '\n'. The compiler converts these to the numbers 9 and 10, respectively. When these numbers are written on the console, the console interprets those numbers as a tab and a linefeed, respectively. The same effect could have been achieved by writing the if clause like this:
if (i % 8 == 7) System.out.write(10);
else System.out.write(9);
Here's the output:
% java AsciiChart
            
!       "       #       $       %       &       '
(       )       *       +       ,       -       .       /
0       1       2       3       4       5       6       7
8       9       :       ;       <       =       >       ?
@       A       B       C       D       E       F       G
H       I       J       K       L       M       N       O
P       Q       R       S       T       U       V       W
X       Y       Z       [       \       ]       ^       _
`       a       b       c       d       e       f       g
h       i       j       k       l       m       n       o
p       q       r       s       t       u       v       w
x       y       z       {       |       }       ~
%
Additional content appearing in this section has been removed.
Writing Arrays of Bytes
It's often faster to write larger chunks of data than to write byte by byte. Two overloaded variants of the write() method do this:
public void write(byte[] data) throws IOException
public void write(byte[] data, int offset, int length) throws IOException
The first variant writes the entire byte array data. The second writes only the sub-array of data starting at offset and continuing for length bytes. For example, the following code fragment blasts the bytes in a string onto System.out:
String s = "How are streams treating you?";
byte[] data = s.getBytes();
System.out.write(data);
Conversely, you may run into performance problems if you attempt to write too much data at a time. The exact turnaround point depends on the eventual destination of the data. Files are often best written in small multiples of the block size of the disk, typically 512, 1024, or 2048 bytes. Network connections often require smaller buffer sizes, 128 or 256 bytes. The optimal buffer size depends on too many system-specific details for anything to be guaranteed, but I often use 128 bytes for network connections and 1024 bytes for files.
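One way to act on this advice is to hand a large array to the stream in fixed-size pieces using the three-argument write(). The sketch below assumes a caller-chosen chunk size; ChunkedWriter and writeInChunks are hypothetical names, and the sizes used are only examples.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical example class; the names are illustrative only.
class ChunkedWriter {

  // Writes data to out in pieces of at most chunkSize bytes and returns
  // the number of write() calls made.
  static int writeInChunks(OutputStream out, byte[] data, int chunkSize)
      throws IOException {
    if (chunkSize <= 0) {
      throw new IllegalArgumentException("chunkSize must be positive");
    }
    int calls = 0;
    for (int offset = 0; offset < data.length; offset += chunkSize) {
      int length = Math.min(chunkSize, data.length - offset);
      out.write(data, offset, length);
      calls++;
    }
    return calls;
  }

  public static void main(String[] args) throws IOException {
    byte[] data = new byte[2500];
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    int calls = writeInChunks(sink, data, 1024);
    System.out.println(calls);       // prints 3 (1024 + 1024 + 452 bytes)
    System.out.println(sink.size()); // prints 2500
  }
}
```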
Example 2.2 is a simple program that constructs a byte array filled with an ASCII chart, then blasts it onto the console in one call to write().
Example 2.2. The AsciiArray Program
import java.io.*;

public class AsciiArray {

  public static void main(String[] args) {
    
    byte[] b = new byte[(127-31)*2];
    int index = 0;
    for (int i = 32; i < 127; i++) {
      b[index++] = (byte) i;
      // Break line after every eight characters.
      if (i % 8 == 7) b[index++] = (byte) '\n';
      else b[index++] = (byte) '\t';
    }
    b[index++] = (byte) '\n';
    try {
      System.out.write(b);
    }
    catch (IOException e) { System.err.println(e); }
  }
}
The output is the same as in Example 2.1. Because of the nature of the console, this particular program probably isn't a lot faster than Example 2.1, but it certainly could be if you were writing data into a file rather than onto the console. The difference in performance between writing a
Additional content appearing in this section has been removed.
Flushing and Closing Output Streams
Many output streams buffer writes to improve performance. Rather than sending each byte to its destination as it's written, the bytes are accumulated in a memory buffer ranging in size from several bytes to several thousand bytes. When the buffer fills up, all the data is sent at once. The flush() method forces the data to be written whether or not the buffer is full:
public void flush() throws IOException
This is not the same as any buffering performed by the operating system or the hardware. Those lower-level buffers will not be emptied by a call to flush(). (The sync() method in the FileDescriptor class, discussed in Chapter 12, can sometimes be used to empty these buffers.) For example, assuming out is an OutputStream of some sort, you would call out.flush() to empty the stream's own buffer.
If you only use a stream for a short time, you don't need to flush it explicitly. It should be flushed when the stream is closed. This should happen when the program exits or when you explicitly invoke the close() method:
public void close() throws IOException
For example, again assuming out is an OutputStream of some sort, calling out.close() closes the stream and implicitly flushes it. Once you have closed an output stream, you can no longer write to it. Attempting to do so will throw an IOException.
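The following sketch tries a write after close() on a FileOutputStream backed by a temporary file. WriteAfterClose and failsAfterClose are hypothetical names; the temporary file exists only for the demonstration.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical example class; the names are illustrative only.
class WriteAfterClose {

  // Writes, closes, then tries to write again. Returns true if the
  // second write throws an IOException.
  static boolean failsAfterClose(OutputStream out) throws IOException {
    out.write(1);
    out.close();
    try {
      out.write(2); // the stream is already closed
      return false;
    } catch (IOException e) {
      return true;
    }
  }

  public static void main(String[] args) throws IOException {
    File temp = File.createTempFile("closed", ".bin"); // scratch file for the demo
    temp.deleteOnExit();
    System.out.println(failsAfterClose(new FileOutputStream(temp))); // prints true
  }
}
```

Not every subclass behaves this way. ByteArrayOutputStream, for instance, documents that closing it has no effect, so writes after close() still succeed there.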
Again, System.out is a partial exception because it is a PrintStream, which eats all the exceptions it would otherwise throw. Once you close System.out, you can't write to it, but trying to do so won't throw any exceptions. However, your output will not appear on the console.
You only need to flush an output stream explicitly if you want to make sure data is sent before you're through with the stream. For example, a program that sends a burst of data across the network periodically should flush after each burst of data is written to the stream.
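To see the effect of an explicit flush, this sketch puts a BufferedOutputStream in front of an in-memory stream and checks how many bytes have arrived before and after flush(). FlushDemo and sizesAroundFlush are hypothetical names, and the 512-byte buffer size is an arbitrary choice for the example.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

// Hypothetical example class; the names are illustrative only.
class FlushDemo {

  // Returns how many bytes have reached the sink before and after flush().
  static int[] sizesAroundFlush() throws IOException {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    BufferedOutputStream out = new BufferedOutputStream(sink, 512);
    out.write(new byte[] {1, 2, 3}); // small write; stays in the 512-byte buffer
    int before = sink.size();
    out.flush();                     // forces the buffered bytes into the sink
    int after = sink.size();
    out.close();
    return new int[] {before, after};
  }

  public static void main(String[] args) throws IOException {
    int[] sizes = sizesAroundFlush();
    System.out.println(sizes[0]); // prints 0
    System.out.println(sizes[1]); // prints 3
  }
}
```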
Flushing is often important when you're trying to debug a crashing program. All streams flush automatically when their buffers fill up, and all streams should be flushed when a program terminates normally. If a program terminates abnormally, however, buffers may not get flushed. In this case, unless there is an explicit call to
Additional content appearing in this section has been removed.
Subclassing OutputStream
OutputStream is an abstract class that mainly describes the operations available with any particular OutputStream object. Specific subclasses know how to write bytes to particular destinations. For instance, a FileOutputStream uses native code to write data in files. A ByteArrayOutputStream uses pure Java to write its output in a potentially expanding byte array.
Recall that there are three overloaded variants of the write() method in OutputStream, one abstract, two concrete:
public abstract void write(int b) throws IOException
public void write(byte[] data) throws IOException
public void write(byte[] data, int offset, int length) throws IOException
Subclasses must implement the abstract write(int b) method. They often choose to override the third variant, write(byte[] data, int offset, int length), for reasons of performance. The implementation of the three-argument version of the write() method in OutputStream simply invokes write(int b) repeatedly; that is:
public void write(byte[] data, int offset, int length) throws IOException {
  for (int i = offset; i < offset+length; i++) write(data[i]);
}
Most subclasses can provide more efficient implementations of this method. The one-argument variant of write() merely invokes write(data, 0, data.length); if the three-argument variant has been overridden, this method will perform reasonably well. However, a few subclasses may override it anyway.
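The delegation chain can be observed with a subclass that implements only write(int) and counts how often it is called. CountingOutputStream is a hypothetical class written for this illustration, not one of the book's examples.

```java
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical example class; the names are illustrative only.
class CountingOutputStream extends OutputStream {

  private int singleByteCalls = 0;

  // The only method a subclass is required to implement.
  public void write(int b) {
    singleByteCalls++;
  }

  int getSingleByteCalls() {
    return singleByteCalls;
  }

  public static void main(String[] args) throws IOException {
    CountingOutputStream out = new CountingOutputStream();
    out.write(new byte[] {1, 2, 3, 4, 5}); // inherited one-argument variant
    System.out.println(out.getSingleByteCalls()); // prints 5
  }
}
```

Five bytes cost five method calls here, which is the overhead a subclass avoids by overriding the three-argument variant.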
Example 2.3 is a simple program called NullOutputStream that mimics the behavior of /dev/null on Unix operating systems. Data written into a null output stream is lost.
Example 2.3. The NullOutputStream Class
package com.macfaq.io;

import java.io.*;

public class NullOutputStream extends OutputStream {

  public void write(int b) { }
  public void write(byte[] data) { }
  public void write(byte[] data, int offset, int length) { }

}
Additional content appearing in this section has been removed.
A Graphical User Interface for Output Streams
As a useful example, I'm going to show a subclass of java.awt.TextArea that can be connected to an output stream. As data is written onto the stream, it is appended to the text area in the default character set (generally ISO Latin-1). (This isn't ideal. Since text areas contain text, a writer would be a better source for this data; in later chapters I'll expand on this class to use a writer instead. For now this makes a neat example.) This subclass is shown in Example 2.4.
The actual output stream is contained in an inner class inside the StreamedTextArea class. Each StreamedTextArea component contains a TextAreaOutputStream object in its theOutput field. Client programmers access this object via the getOutputStream() method of the StreamedTextArea class. The StreamedTextArea class has five overloaded constructors that imitate the five constructors in the java.awt.TextArea class, each taking a different combination of text, rows, columns, and scrollbar information. The first four constructors merely pass their arguments and suitable defaults to the most general fifth constructor using this(). The fifth constructor calls the most general superclass constructor, then calls setEditable(false) to ensure that the user doesn't change the text while output is streaming into it.
I've chosen not to override any methods in the TextArea superclass. However, you might want to do so if you feel a need to change the normal abilities of a text area. For example, you could include a do-nothing append() method so that data can only be moved into the text area via the provided output stream or a setEditable() method that doesn't allow the client programmer to make this area editable.
Example 2.4. The StreamedTextArea Component
package com.macfaq.awt;

import java.awt.*;
import java.io.*;

public class StreamedTextArea extends TextArea {

  OutputStream theOutput = new TextAreaOutputStream();

  public StreamedTextArea() {
    this("", 0, 0, SCROLLBARS_BOTH);
  }

  public StreamedTextArea(String text) {
    this(text, 0, 0, SCROLLBARS_BOTH);
  } 

  public StreamedTextArea(int rows, int columns) {
    this("", rows, columns, SCROLLBARS_BOTH);
  }

  public StreamedTextArea(String text, int rows, int columns) {
    this(text, rows, columns, SCROLLBARS_BOTH);
  }

  public StreamedTextArea(String text, int rows, int columns, int scrollbars) {
    super(text, rows, columns, scrollbars);
    setEditable(false);
  }

  public OutputStream getOutputStream() {
    return theOutput;
  }

  class TextAreaOutputStream extends OutputStream {

    public synchronized void write(int b) {
      // recall that the int should really just be a byte
      b &= 0x000000FF;

      // must convert byte to a char in order to append it
      char c = (char) b;
      append(String.valueOf(c));
    }

    public synchronized void write(byte[] data, int offset, int length) {
      append(new String(data, offset, length));
    }
  }
}
Chapter 3: Input Streams
The InputStream Class
The java.io.InputStream class is the abstract superclass for all input streams. It declares the three basic methods needed to read bytes of data from a stream. It also has methods for closing streams, checking how many bytes of data are available to be read, skipping over input, marking a position in a stream and resetting back to that position, and determining whether marking and resetting are supported.
public abstract int read() throws IOException
public int read(byte[] data) throws IOException
public int read(byte[] data, int offset, int length) throws IOException
public long skip(long n) throws IOException
public int available() throws IOException
public void close() throws IOException
public synchronized void mark(int readlimit)
public synchronized void reset() throws IOException
public boolean markSupported()
The read( ) Method
The fundamental method of the InputStream class is read(), which reads a single unsigned byte of data and returns the integer value of the unsigned byte. This is a number between 0 and 255:
public abstract int read() throws IOException
The following code reads 10 bytes from the System.in input stream and stores them in the int array data:
int[] data = new int[10];
for (int i = 0; i < data.length; i++) {
  data[i] = System.in.read();
}
Notice that although read() is reading a byte, it returns an int. If you want to store the raw bytes instead, you can cast the int to a byte. For example:
byte[] b = new byte[10];
for (int i = 0; i < b.length; i++) {
  b[i] = (byte) System.in.read();
}
Of course, this produces a signed byte instead of the unsigned byte returned by the read() method (that is, a byte in the range -128 to 127 instead of 0 to 255). As long as you're clear in your mind and your code about whether you're working with signed or unsigned data, you won't have any trouble. Signed bytes can be converted back to ints in the range 0 to 255 like this:
int i = (b >= 0) ? b : 256 + b;
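The same conversion is often written with a bit mask instead of a conditional. The following is a minimal sketch of both forms side by side; the class name UnsignedByteDemo and the method toUnsigned() are made up for illustration and are not part of the book's examples.

```java
// Hypothetical demo class; not part of the book's examples.
class UnsignedByteDemo {

  // Convert a signed byte (-128 to 127) to an int in the range 0 to 255
  // by masking off everything except the low-order eight bits.
  static int toUnsigned(byte b) {
    return b & 0xFF;
  }

  public static void main(String[] args) {
    byte b = (byte) 200;                      // stored as the signed value -56
    int viaTernary = (b >= 0) ? b : 256 + b;  // the conversion shown above
    int viaMask = toUnsigned(b);              // the bit-mask equivalent
    System.out.println(b + " " + viaTernary + " " + viaMask); // prints -56 200 200
  }
}
```

Both forms give the same answer for every possible byte value, so the choice between them is a matter of taste.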
When you call read(), you also have to catch the IOException that it might throw. As I've observed, input and output are often subject to problems outside of your control: disks fail, network cables break, and so on. Therefore, virtually any I/O method can throw an IOException, and read() is no exception. You don't get an IOException if read() encounters the end of the input stream; in this case, it returns -1. You use this as a flag to watch for the end of stream. The following code shows how to catch the IOException and test for the end of the stream:
try {
  int[] data = new int[10];
  for (int i = 0; i < data.length; i++) {
    int datum = System.in.read();
    if (datum  == -1) break;
    data[i] = datum;
  }
}
catch (IOException e) {System.err.println("Couldn't read from System.in!");}
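The end-of-stream test is easier to experiment with on an in-memory stream than on System.in. The sketch below assumes a java.io.ByteArrayInputStream as the data source; the class name EndOfStreamDemo and the method countBytes() are hypothetical. It also shows why read() returns an int: a data byte of 255 comes back as 255, so it is never mistaken for the -1 that marks the end of the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class; not part of the book's examples.
class EndOfStreamDemo {

  // Read single bytes until read() returns -1, and count them.
  static int countBytes(InputStream in) throws IOException {
    int count = 0;
    while (in.read() != -1) {
      count++;
    }
    return count;
  }

  public static void main(String[] args) throws IOException {
    // The last byte is 255 unsigned (-1 as a signed byte); it is still data
    InputStream in = new ByteArrayInputStream(new byte[] {10, 20, (byte) 255});
    System.out.println(countBytes(in)); // prints 3
  }
}
```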
Reading Chunks of Data from a Stream
Input and output are often the performance bottlenecks in a program. Reading from or writing to disk can be hundreds of times slower than reading from or writing to memory; network connections and user input are even slower. While disk capacities and speeds have increased over time, they have never kept pace with CPU speeds. Therefore, it's important to minimize the number of reads and writes a program actually performs.
All input streams have overloaded read() methods that read chunks of contiguous data into a byte array. The first variant tries to read enough data to fill the array data. The second variant tries to read length bytes of data starting at position offset into the array data. Neither of these methods is guaranteed to read as many bytes as you ask for. Both methods return the number of bytes actually read, or -1 on end of stream.
public int read(byte[] data) throws IOException
public int read(byte[] data, int offset, int length) throws IOException
The default implementation of these methods in the java.io.InputStream class merely calls the basic read() method enough times to fill the requested array or subarray. Thus, reading 10 bytes of data takes 10 times as long as reading one byte of data. However, most subclasses of InputStream override these methods with more efficient methods, perhaps native, that read the data from the underlying source as a block.
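One way to watch the default implementation at work is to write a subclass that implements only the single-byte read() and counts how often it is called. This is a sketch under that assumption; CountingZeroStream and its singleByteReads field are hypothetical names, and the count reflects the behavior of the inherited InputStream methods as I understand them rather than anything a subclass is required to do.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stream that implements only the single-byte read().
class CountingZeroStream extends InputStream {

  int singleByteReads = 0;

  @Override
  public int read() {
    singleByteReads++;
    return 0;   // an endless supply of zero bytes
  }

  public static void main(String[] args) throws IOException {
    CountingZeroStream in = new CountingZeroStream();
    byte[] buffer = new byte[10];
    int n = in.read(buffer);   // inherited from InputStream, not overridden
    System.out.println(n + " bytes read using "
     + in.singleByteReads + " calls to read()");
  }
}
```

A subclass that can fetch a whole block at once would override the array-based methods so that this per-byte cost disappears.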
For example, to attempt to read 10 bytes from System.in, you could write the following code:
try {
  byte[] b = new byte[10];
  System.in.read(b);
}
catch (IOException e) {System.err.println("Couldn't read from System.in!");}
Reads don't always succeed in getting as many bytes as you want. Conversely, there's nothing to stop you from trying to read more data into the array than will fit. If you read more data than the array can hold, an ArrayIndexOutOfBoundsException will be thrown. For example, the following code loops repeatedly until it either fills the array or sees the end of stream:
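A loop of that kind might look like the following sketch. It is not the book's own listing; the class name FillArrayDemo and the method fill() are hypothetical, and an in-memory stream stands in for System.in so the result is predictable. Each pass asks only for the space that remains in the array, so the read can never run past the end.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class; not part of the book's examples.
class FillArrayDemo {

  // Keep reading until the array is full or the stream ends.
  // Returns the number of bytes actually stored in data.
  static int fill(InputStream in, byte[] data) throws IOException {
    int offset = 0;
    while (offset < data.length) {
      // Never ask for more than the remaining space in the array
      int bytesRead = in.read(data, offset, data.length - offset);
      if (bytesRead == -1) break;   // end of stream
      offset += bytesRead;
    }
    return offset;
  }

  public static void main(String[] args) throws IOException {
    byte[] data = new byte[10];
    // Only 4 bytes are available, so the loop stops at end of stream
    InputStream shortStream = new ByteArrayInputStream(new byte[4]);
    System.out.println(fill(shortStream, data)); // prints 4
  }
}
```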
Counting the Available Bytes
It's sometimes convenient to know how many bytes are available to be read before you attempt to read them. The InputStream class's available() method tells you how many bytes you can read without blocking. It returns 0 if there's no data available to be read.
public int available() throws IOException
For example:
try {
  byte[] b = new byte[100];
  int offset = 0;
  while (offset < b.length) {
    int a = System.in.available();
    int bytesRead = System.in.read(b, offset, a);
    if (bytesRead == -1) break; // end of stream
    offset += bytesRead;
  }
}
catch (IOException e) {System.err.println("Couldn't read from System.in!");}
There's a potential bug in this code. There may be more bytes available than there's space in the array to hold them. One common idiom is to size the array according to the number available() returns, like this:
try {
  byte[] b = new byte[System.in.available()];
  System.in.read(b);
}
catch (IOException e) {System.err.println("Couldn't read from System.in!");}
This works well if you're only going to perform a single read. For multiple reads, however, the overhead of creating multiple arrays is excessive. You should probably reuse the array and only create a new array if more bytes are available than will fit in the array.
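Reusing the array could be sketched as follows. The class ReusableBufferDemo, its 16-byte starting size, and the methods readAvailable() and bufferLength() are all assumptions made for illustration; in-memory streams stand in for System.in so that available() returns known values.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper; not part of the book's examples.
class ReusableBufferDemo {

  private byte[] buffer = new byte[16];

  // Read the bytes that available() reports, reusing the existing
  // buffer and allocating a larger one only when they would not fit.
  // Returns whatever read() returns.
  int readAvailable(InputStream in) throws IOException {
    int a = in.available();
    if (a > buffer.length) {
      buffer = new byte[a];
    }
    return in.read(buffer, 0, a);
  }

  int bufferLength() {
    return buffer.length;
  }

  public static void main(String[] args) throws IOException {
    ReusableBufferDemo demo = new ReusableBufferDemo();
    // 40 bytes do not fit in the initial 16-byte buffer, so it grows
    System.out.println(demo.readAvailable(new ByteArrayInputStream(new byte[40])));
    // 5 bytes fit, so the 40-byte buffer is reused
    System.out.println(demo.readAvailable(new ByteArrayInputStream(new byte[5])));
    System.out.println(demo.bufferLength());
  }
}
```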
The available() method in java.io.InputStream always returns 0. Subclasses are supposed to override it, but I've seen a few that don't. You may be able to read more bytes from the underlying stream without blocking than available() suggests; you just can't guarantee that you can. If this is a concern, you can place input in a separate thread so that blocked input doesn't block the rest of the program.
Skipping Bytes
Although you can just read from a stream and ignore the bytes read, Java provides a skip() method that jumps over a certain number of bytes in the input:
public long skip(long bytesToSkip) throws IOException
The argument to skip() is the number of bytes to skip. The return value is the number of bytes actually skipped, which may be less than bytesToSkip and may even be 0. Unlike read(), skip() is not specified to return -1 at the end of the stream, so a return value of 0 does not by itself tell you whether the stream has ended. Both the argument and return value are longs, allowing skip() to handle extremely long input streams. Skipping is often faster than reading and discarding the data you don't want. For example, when an input stream is attached to a file, skipping bytes just requires that an integer called the file pointer be changed, whereas reading involves copying bytes from the disk into memory. For example, to skip the next 80 bytes of the input stream in, falling back on read() to detect the end of the stream whenever skip() makes no progress:
try {
  long bytesSkipped = 0;
  long bytesToSkip = 80;
  while (bytesSkipped < bytesToSkip) {
    long n = in.skip(bytesToSkip - bytesSkipped);
    if (n <= 0) {
      // skip() made no progress; read one byte to tell the
      // end of stream (-1) apart from a temporary stall
      if (in.read() == -1) break;
      n = 1;
    }
    bytesSkipped += n;
  }
}
catch (IOException e) {System.err.println(e);}
Closing Input Streams
When you're through with a stream, you should close it. This allows the operating system to free any resources associated with the stream; exactly what these resources are depends on your platform and varies with the type of the stream. However, systems only have finite resources. For example, on most personal computer operating systems, no more than several hundred files can be open at once. Multiuser operating systems have larger limits, but limits nonetheless.
To close a stream, you invoke its close() method:
public void close() throws IOException
Not all streams need to be closed—System.in generally does not need to be closed, for example. However, streams associated with files and network connections should always be closed when you're done with them. For example:
try {
  URL u = new URL("http://www.javasoft.com/");
  InputStream in = u.openStream();
  // Read from the stream...
  in.close();
}
catch (IOException e) {System.err.println(e);}
Once you have closed an input stream, you can no longer read from it. Attempting to do so will throw an IOException.
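If reading throws an exception, a close() call placed at the end of the try block is never reached. One common way around this is to close the stream in a finally clause. The sketch below assumes a temporary file as the data source; CloseDemo, firstByte(), and readAfterCloseFails() are hypothetical names. The second method checks that a read on a closed file stream fails with an IOException.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class; not part of the book's examples.
class CloseDemo {

  // Read the first byte of a file, closing the stream even if read() fails.
  static int firstByte(File file) throws IOException {
    InputStream in = new FileInputStream(file);
    try {
      return in.read();
    }
    finally {
      in.close();
    }
  }

  // Returns true if reading from an already closed file stream
  // throws an IOException.
  static boolean readAfterCloseFails(File file) throws IOException {
    InputStream in = new FileInputStream(file);
    in.close();
    try {
      in.read();
      return false;
    }
    catch (IOException e) {
      return true;
    }
  }

  public static void main(String[] args) throws IOException {
    File temp = File.createTempFile("closedemo", ".dat");
    temp.deleteOnExit();
    FileOutputStream out = new FileOutputStream(temp);
    try { out.write(65); } finally { out.close(); }
    System.out.println(firstByte(temp));            // prints 65
    System.out.println(readAfterCloseFails(temp));  // prints true
  }
}
```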
Marking and Resetting
It's often useful to be able to read a few bytes and then back up and reread them. For example, in a Java compiler, you don't know for sure whether you're reading the token <, <<, or <<= until you've read one too many characters. It would be useful to be able to back up and reread the token once you know which token you've read. Compiler design and other parsing problems provide many more examples, and this need occurs in other domains as well.
Some (but not all) input streams allow you to mark a particular position in the stream and then return to it. Three methods in the java.io.InputStream class handle marking and resetting:
public synchronized void mark(int readLimit)
public synchronized void reset() throws IOException
public boolean markSupported()
The boolean markSupported() method returns true if this stream supports marking and false if it doesn't. If marking is not supported, reset() throws an IOException and mark() does nothing. Assuming the stream does support marking, the mark() method places a bookmark at the current position in the stream. You can rewind the stream to this position later with reset() as long as you haven't read more than readLimit bytes. There can be only one mark in the stream at any given time. Marking a second location erases the first mark.
The only two input stream classes in java.io that always support marking are BufferedInputStream (of which System.in is an instance) and ByteArrayInputStream. However, other input streams, like DataInputStream, may support marking if they're chained to a buffered input stream first.
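The token example can be sketched with mark() and reset() on a ByteArrayInputStream. This is only an illustration of the idea, not how any real compiler is written; MarkResetDemo and readShiftToken() are hypothetical names. Before each read, the method marks the current position, and if the byte turns out not to belong to the token, it resets so that the byte can be read again by whatever comes next.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class; not part of the book's examples.
class MarkResetDemo {

  // Read the longest of the tokens <, <<, or <<= at the current position.
  // A byte read past the end of the token is put back with reset().
  static String readShiftToken(InputStream in) throws IOException {
    if (!in.markSupported()) {
      throw new IOException("stream does not support mark/reset");
    }
    StringBuilder token = new StringBuilder();
    char[] pattern = {'<', '<', '='};
    for (int i = 0; i < pattern.length; i++) {
      in.mark(1);                 // remember the position before the next byte
      int c = in.read();
      if (c != pattern[i]) {
        if (c != -1) in.reset();  // read one too many; back up
        break;
      }
      token.append((char) c);
    }
    return token.toString();
  }

  public static void main(String[] args) throws IOException {
    InputStream in = new ByteArrayInputStream("<<a".getBytes("US-ASCII"));
    String token = readShiftToken(in);
    // The 'a' was read once, rejected, and is still there to be read again
    System.out.println(token + " then " + (char) in.read()); // prints << then a
  }
}
```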
Subclassing InputStream
Immediate subclasses of InputStream must provide an implementation of the abstract read() method. They may also override some of the nonabstract methods. For example, the default markSupported() method returns false, mark() does nothing, and reset() throws an IOException. Any class that allows marking and resetting must override these three methods. Furthermore, they may want to override methods that perform functions like skip() and the other two read() methods to provide more efficient implementations.
Example 3.2 is a simple class called RandomInputStream that "reads" random bytes of data. This provides a useful source of unlimited data you can use in testing. A java.util.Random object provides the data.
Example 3.2. The RandomInputStream Class
package com.macfaq.io;

import java.util.*;
import java.io.*;

public class RandomInputStream extends InputStream {

  private transient Random generator = new Random();

  public int read() {

    int result = generator.nextInt() % 256;
    if (result < 0) result = -result;
    return result;

  }

  public int read(byte[] data, int offset, int length) throws IOException {

    byte[] temp = new byte[length];
    generator.nextBytes(temp);
    System.arraycopy(temp, 0, data, offset, length);
    return length;

  }

  public int read(byte[] data) throws IOException {

    generator.nextBytes(data);
    return data.length;

  }

  public long skip(long bytesToSkip) throws IOException {
  
    // It's all random so skipping has no effect.
    return bytesToSkip;
  
  }
}
The no-argument read() method returns a random int in the range of an unsigned byte (0 to 255). The other two read() methods fill a specified part of an array with random bytes. They return the number of bytes read (in this case the number of bytes created).
An Efficient Stream Copier
As a useful example of both input and output streams, in Example 3.3 I'll present a StreamCopier class that copies data between two streams as quickly as possible. (I'll reuse this class in later chapters.) Its copy() method reads from the input stream and writes onto the output stream until the input stream is exhausted. A 256-byte buffer is used to try to make the reads efficient. A main() method provides a simple test for this class by reading from System.in and copying to System.out.
Example 3.3. The StreamCopier Class
package com.macfaq.io;
import java.io.*;

public class StreamCopier {

  public static void main(String[] args) {
    try {
      copy(System.in, System.out);
    }
    catch (IOException e) {System.err.println(e);}
  }

  public static void copy(InputStream in, OutputStream out) 
   throws IOException {

    // Do not allow other threads to read from the input
    // or write to the output while copying is taking place
    synchronized (in) {
      synchronized (out) {
        byte[] buffer = new byte[256];
        while (true) {
          int bytesRead = in.read(buffer);
          if (bytesRead == -1) break;
          out.write(buffer, 0, bytesRead);
        }
      }
    }
  }
}
Here's a simple test run:
D:\JAVA\ioexamples\03>java com.macfaq.io.StreamCopier
            
this is a test
this is a test
0987654321
0987654321
^Z
Input was not fed from the console (DOS prompt) to the StreamCopier program until the end of each line. Since I ran this in Windows, the end-of-stream character is Ctrl-Z. On Unix it would have been Ctrl-D.
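The same copying loop can also be exercised without a console by using in-memory streams. The sketch below repeats the loop from copy() in a hypothetical CopyDemo class so that it stands alone (the synchronization is left out for brevity); the roundTrip() method and the 1,000-byte test array are assumptions chosen so that the data is larger than the 256-byte buffer and several reads are needed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;

// Hypothetical demo class; not part of the book's examples.
class CopyDemo {

  // The read/write loop from StreamCopier.copy(), without the
  // synchronized blocks, repeated here so the sketch stands alone.
  static void copy(InputStream in, OutputStream out) throws IOException {
    byte[] buffer = new byte[256];
    while (true) {
      int bytesRead = in.read(buffer);
      if (bytesRead == -1) break;
      out.write(buffer, 0, bytesRead);
    }
  }

  // Copy a byte array through the streams and return what arrived.
  static byte[] roundTrip(byte[] input) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    copy(new ByteArrayInputStream(input), out);
    return out.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    byte[] input = new byte[1000];   // larger than the 256-byte buffer
    for (int i = 0; i < input.length; i++) input[i] = (byte) i;
    byte[] output = roundTrip(input);
    System.out.println(Arrays.equals(input, output)); // prints true
  }
}
```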
Chapter 4: File Streams
Until now, most of the examples in this book have used the streams System.in and System.out. These are convenient for examples, but in real life, you'll more commonly attach streams to data sources like files and network connections. To read and write files, you'll use the java.io.FileInputStream and java.io.FileOutputStream classes, which are concrete subclasses of java.io.InputStream and java.io.OutputStream. We'll discuss these classes in detail in this chapter; they provide the standard methods for reading and writing data. What they don't provide is a mechanism for file-specific operations, like finding out whether a file is readable or writable. For that, you may want to look forward to Chapter 12, which talks about the File class itself and the way Java works with files.
Reading Files
java.io.FileInputStream is a concrete subclass of java.io.InputStream. It provides an input stream connected to a particular file.
public class FileInputStream extends InputStream
FileInputStream has all the usual methods of input streams, such as read(), available(), skip(), and close(), which are used exactly as they are for any other input stream.
public native int read() throws IOException
public int read(byte[] data) throws IOException
public int read(byte[] data, int offset, int length) throws IOException
public native long skip(long n) throws IOException
public native int available() throws IOException
public native void close() throws IOException
These methods are all implemented in native code, except for the two multibyte read() methods. These, however, just pass their arguments on to a private native method called readBytes(), so effectively all these methods are implemented with native code. (In Java 2, read(byte[] data, int offset, int length) is a native method that read(byte[] data) invokes.)
There are three FileInputStream() constructors, which differ only in how the file to be read is specified:
public FileInputStream(String fileName) throws FileNotFoundException
public FileInputStream(File file) throws FileNotFoundException
public FileInputStream(FileDescriptor fdObj)
The first constructor uses a string containing the name of the file. The second constructor uses a java.io.File object. The third constructor uses a java.io.FileDescriptor object. Filenames are platform-dependent, so hardcoded file names should be avoided where possible. Using the first constructor violates Sun's rules for "100% Pure Java" immediately. Therefore, the latter two constructors are much preferred. Nonetheless, those two will have to wait until File objects and file descriptors are discussed in Chapter 12. For now, I will use only the first.
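Here is a minimal sketch of the first constructor in use. The class name FileReadDemo and the method countBytes() are hypothetical, and the file is a temporary one created by the program so that no platform-specific path has to be hardcoded. The File class appears only to create that temporary file; the stream itself is opened from a file name.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical demo class; not part of the book's examples.
class FileReadDemo {

  // Count the bytes in the named file by reading it in chunks.
  static long countBytes(String fileName) throws IOException {
    FileInputStream in = new FileInputStream(fileName);
    try {
      byte[] buffer = new byte[256];
      long total = 0;
      while (true) {
        int bytesRead = in.read(buffer);
        if (bytesRead == -1) break;   // end of file
        total += bytesRead;
      }
      return total;
    }
    finally {
      in.close();
    }
  }

  public static void main(String[] args) throws IOException {
    // Create a 600-byte temporary file to read back
    File temp = File.createTempFile("filereaddemo", ".txt");
    temp.deleteOnExit();
    FileOutputStream out = new FileOutputStream(temp);
    try { out.write(new byte[600]); } finally { out.close(); }
    System.out.println(countBytes(temp.getPath())); // prints 600
  }
}
```

If the named file does not exist, the constructor throws a FileNotFoundException, which the throws IOException clause on countBytes() also covers because FileNotFoundException is a subclass of IOException.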
Writing Files