The java.util.zip package contains classes you can use for data compression in streams or files. The classes in the java.util.zip package support two widespread compression formats: GZIP and ZIP. In this section, we’ll talk about how to use these classes. We’ll also present two useful example programs that build on what you have learned in this chapter. After that, we’ll talk about a higher-level way to work with ZIP archives (as filesystems), introduced with Java 7.
The java.util.zip package provides two filter streams for writing compressed data. The GZIPOutputStream is for writing data in GZIP compressed format. The ZipOutputStream is for writing compressed ZIP archives, which can contain one or many files. To write compressed data in the GZIP format, simply wrap a GZIPOutputStream around an underlying stream and write to it. The following is a complete example that shows how to compress a file using the GZIP format, but the stream could just as well be sent over a network connection or to any other type of stream destination. Our GZip example is a command-line utility that compresses a file.
import java.io.*;
import java.util.zip.*;

public class GZip {
    public static int sChunk = 8192;

    public static void main(String[] args) {
        if (args.length != 1) {
            System.out.println("Usage: GZip source");
            return;
        }
        // create output stream
        String zipname = args[0] + ".gz";
        GZIPOutputStream zipout;
        try {
            FileOutputStream out = new FileOutputStream(zipname);
            zipout = new GZIPOutputStream(out);
        }
        catch (IOException e) {
            System.out.println("Couldn't create " + zipname + ".");
            return;
        }
        byte[] buffer = new byte[sChunk];
        // compress the file
        try {
            FileInputStream in = new FileInputStream(args[0]);
            int length;
            while ((length = in.read(buffer, 0, sChunk)) != -1)
                zipout.write(buffer, 0, length);
            in.close();
        }
        catch (IOException e) {
            System.out.println("Couldn't compress " + args[0] + ".");
        }
        try { zipout.close(); }
        catch (IOException e) {}
    }
}
First, we check to make sure we have a command-line argument representing a filename. We then construct a GZIPOutputStream wrapped around a FileOutputStream representing the given filename, with the .gz suffix appended. With this in place, we open the source file. We read chunks of data and write them into the GZIPOutputStream. Finally, we clean up by closing our open streams.
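The same steps can also be written with try-with-resources, which closes the streams even when an exception interrupts the copy. The following is only a sketch of that variation, not a replacement for the example above; the class name GZipSketch and the helper method compress() are hypothetical names chosen for illustration.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical variation on the GZip example using try-with-resources.
class GZipSketch {
    static final int CHUNK = 8192;

    // Compresses the file named sourceName into targetName in GZIP format.
    static void compress(String sourceName, String targetName) throws IOException {
        try (InputStream in = new FileInputStream(sourceName);
             OutputStream out = new FileOutputStream(targetName);
             OutputStream zipout = new GZIPOutputStream(out)) {
            byte[] buffer = new byte[CHUNK];
            int length;
            // Copy chunks until read() signals end of file with -1
            while ((length = in.read(buffer)) != -1) {
                zipout.write(buffer, 0, length);
            }
        } // all three streams are closed here, in reverse order
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("Usage: GZipSketch source");
            return;
        }
        compress(args[0], args[0] + ".gz");
    }
}
```

Closing the GZIPOutputStream is what finishes the compressed data, so letting try-with-resources handle it removes one easy mistake.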
While GZIP is a simple compression format for a stream or file, a ZIP archive is a file that is actually a collection of files, some (or all) of which may be compressed. Writing data to a ZIP archive file is a little more involved than simply wrapping a stream, but not difficult. Each item in the ZIP file is represented by a ZipEntry object. When writing to a ZipOutputStream, you’ll need to call putNextEntry() before writing the data for each item. The following example shows how to create a ZipOutputStream. You’ll notice that it starts out with a stream wrapper, just as creating a GZIPOutputStream did:
ZipOutputStream zipout;
try {
    FileOutputStream out = new FileOutputStream("archive.zip");
    zipout = new ZipOutputStream(out);
}
catch (IOException e) {}
Let’s say we have two files we want to write into this archive. Before we begin writing, we need to call putNextEntry() to set the name of the file within the archive and initialize the stream to the correct position for it. Here we create a simple ZipEntry with just a filename. You can set other ZIP-format-specific fields in ZipEntry, but most of the time, you won’t need to bother with them.
try {
    ZipEntry entry = new ZipEntry("first.dat");
    zipout.putNextEntry(entry);
    zipout.write(...);  // Write data for first file
    entry = new ZipEntry("second.dat");  // reuse the variable for the next entry
    zipout.putNextEntry(entry);
    zipout.write(...);  // Write data for second file
    ...
    zipout.close();
}
catch (IOException e) {}
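To make the fragments above concrete, here is a minimal, self-contained sketch that writes two small entries into a ZIP archive. The class name ZipWriteSketch, the helper writeArchive(), and the sample strings used as file data are assumptions made for illustration; the explicit closeEntry() calls are optional here, because putNextEntry() closes any entry that is still open.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical example: writes two entries to any OutputStream in ZIP format.
class ZipWriteSketch {
    static void writeArchive(OutputStream out) throws IOException {
        try (ZipOutputStream zipout = new ZipOutputStream(out)) {
            // Each item starts with putNextEntry(), followed by its data
            zipout.putNextEntry(new ZipEntry("first.dat"));
            zipout.write("first file data".getBytes(StandardCharsets.UTF_8));
            zipout.closeEntry();

            zipout.putNextEntry(new ZipEntry("second.dat"));
            zipout.write("second file data".getBytes(StandardCharsets.UTF_8));
            zipout.closeEntry();
        } // closing the ZipOutputStream completes the archive
    }

    public static void main(String[] args) throws IOException {
        try (OutputStream out = new FileOutputStream("archive.zip")) {
            writeArchive(out);
        }
    }
}
```

Because writeArchive() accepts any OutputStream, the same code could target a file, a network connection, or an in-memory buffer.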
To decompress data in the GZIP format, simply wrap a GZIPInputStream around an underlying FileInputStream and read from it. The following example complements our earlier GZip example and shows how to decompress a GZIP file:
import java.io.*;
import java.util.zip.*;

public class GUnzip {
    public static int sChunk = 8192;

    public static void main(String[] args) {
        if (args.length != 1) {
            System.out.println("Usage: GUnzip source");
            return;
        }
        // create input stream
        String zipname, source;
        if (args[0].endsWith(".gz")) {
            zipname = args[0];
            source = args[0].substring(0, args[0].length() - 3);
        }
        else {
            zipname = args[0] + ".gz";
            source = args[0];
        }
        GZIPInputStream zipin;
        try {
            FileInputStream in = new FileInputStream(zipname);
            zipin = new GZIPInputStream(in);
        }
        catch (IOException e) {
            System.out.println("Couldn't open " + zipname + ".");
            return;
        }
        byte[] buffer = new byte[sChunk];
        // decompress the file
        try {
            FileOutputStream out = new FileOutputStream(source);
            int length;
            while ((length = zipin.read(buffer, 0, sChunk)) != -1)
                out.write(buffer, 0, length);
            out.close();
        }
        catch (IOException e) {
            System.out.println("Couldn't decompress " + args[0] + ".");
        }
        try { zipin.close(); }
        catch (IOException e) {}
    }
}
First, we check to make sure we have a command-line argument representing a filename. If the argument ends with .gz, we figure out what the filename for the uncompressed file should be. Otherwise, we use the given argument and assume the compressed file has the .gz suffix. Then we construct a GZIPInputStream wrapped around a FileInputStream that represents the compressed file. With this in place, we open the target file. We read chunks of data from the GZIPInputStream and write them into the target file. Finally, we clean up by closing our open streams.
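Since the GZIP streams are ordinary filter streams, they are not tied to files. As a small illustration, the sketch below compresses and decompresses a byte array entirely in memory; the class name GZipRoundTrip and its two helper methods are hypothetical names used only for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical example: GZIP round trip over in-memory streams.
class GZipRoundTrip {
    static byte[] compress(byte[] data) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream zipout = new GZIPOutputStream(bytes)) {
            zipout.write(data);
        } // closing finishes the GZIP stream before we read the bytes
        return bytes.toByteArray();
    }

    static byte[] decompress(byte[] compressed) throws IOException {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try (GZIPInputStream zipin =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[8192];
            int length;
            while ((length = zipin.read(buffer)) != -1) {
                result.write(buffer, 0, length);
            }
        }
        return result.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "Hello, GZIP! Hello, GZIP!".getBytes(StandardCharsets.UTF_8);
        byte[] restored = decompress(compress(original));
        System.out.println(new String(restored, StandardCharsets.UTF_8));
    }
}
```

Running main() prints the original string, which shows that decompression restores exactly what was compressed.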
Reading a ZIP archive mirrors writing one. When reading from a ZipInputStream, you should call getNextEntry() before reading each item. When getNextEntry() returns null, there are no more items to read. The following example shows how to create a ZipInputStream:
ZipInputStream zipin;
try {
    FileInputStream in = new FileInputStream("archive.zip");
    zipin = new ZipInputStream(in);
}
catch (IOException e) {}
Suppose we want to read two files from this archive. Before we begin reading, we need to call getNextEntry(). At the very least, the entry gives us the name of the item we are reading from the archive:
try {
    ZipEntry first = zipin.getNextEntry();
    zipin.read(...);  // Read the file data
}
catch (IOException e) {}
Now, you can read the contents of the first item in the archive. When you come to the end of the item, the read() method returns -1. At this point, you can call getNextEntry() again to read the second item from the archive. If you call getNextEntry() and it returns null, there are no more items and you have reached the end of the archive.
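Putting these rules together, reading a whole archive comes down to two nested loops: an outer loop that calls getNextEntry() until it returns null, and an inner loop that calls read() until it returns -1. The following sketch collects every entry into a map; the class name ZipReadSketch and the helper readArchive() are hypothetical, and holding all data in memory is an assumption that only suits small archives.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical example: reads every entry of a ZIP stream into memory.
class ZipReadSketch {
    // Returns a map of entry name to entry data, in archive order.
    static Map<String, byte[]> readArchive(InputStream in) throws IOException {
        Map<String, byte[]> contents = new LinkedHashMap<>();
        try (ZipInputStream zipin = new ZipInputStream(in)) {
            byte[] buffer = new byte[8192];
            ZipEntry entry;
            // Outer loop: null from getNextEntry() means end of archive
            while ((entry = zipin.getNextEntry()) != null) {
                ByteArrayOutputStream data = new ByteArrayOutputStream();
                int length;
                // Inner loop: -1 from read() means end of the current entry
                while ((length = zipin.read(buffer)) != -1) {
                    data.write(buffer, 0, length);
                }
                contents.put(entry.getName(), data.toByteArray());
            }
        }
        return contents;
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = new FileInputStream("archive.zip")) {
            for (Map.Entry<String, byte[]> item : readArchive(in).entrySet()) {
                System.out.println(item.getKey() + ": " + item.getValue().length + " bytes");
            }
        }
    }
}
```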
One of the benefits of the new java.nio.file package introduced with Java 7 is the ability to implement custom filesystems in Java. (We talked about the File API for the NIO file package earlier in this chapter, and we’ll return to the more general NIO facilities in the next section.) Java 7 ships with one such custom filesystem implementation bundled within it: the Zip Filesystem Provider.[35] Using the Zip Filesystem Provider, we can open a ZIP archive and treat it like a filesystem: reading, writing, copying, and renaming files using all of the standard java.nio.file APIs, except that all of these operations happen inside the ZIP archive file instead of on the host computer filesystem (as you might otherwise expect).
The key to making this possible is that the NIO File API starts with a FileSystem abstraction that serves as a factory for Path objects. In our previous discussion of the NIO File API, we always simply asked for the default filesystem using FileSystems.getDefault(). This time, we are going to target a particular custom filesystem type and destination by constructing a special URI for our ZIP archive. (As we’ll discuss in the networking chapters, a URI is kind of like a URL except that it can be more abstract.)
// Construct the URI pointing to the ZIP archive
URI zipURI = URI.create("jar:file:/Users/pat/tmp/MyArchive.zip");

// Open or create it and write a file
Map<String, String> env = new HashMap<>();
env.put("create", "true");
try (FileSystem zipfs = FileSystems.newFileSystem(zipURI, env)) {
    Path path = zipfs.getPath("/README.txt");
    OutputStream out = Files.newOutputStream(path);
    try (PrintWriter pw = new PrintWriter(new OutputStreamWriter(out))) {
        pw.println("Hello World!");
    }
}
In this snippet, we constructed a URI for our ZIP archive using the URI create() method and the special jar:file: prefix. (The Java JAR format is really just the ZIP format with some additional conventions.) We then used that URI with the FileSystems newFileSystem() method to create the right kind of filesystem reference for us. The FileSystem it returns will perform all of its operations on entries within the ZIP, but otherwise will behave just like we’ve seen previously. The other argument to the newFileSystem() method is a Map containing string properties that are specific to the provider. In this case, we pass in the value “create” as “true,” indicating that we want the ZIP filesystem provider to create the archive if it does not already exist. In order to know what properties can be passed, you’ll have to consult the documentation for the particular filesystem provider.
In our preceding snippet, we then create a Path for a file /README.txt at the root folder of the filesystem and write a string to it. Because we are using try-with-resources clauses to encapsulate opening the filesystem and writing to the file, the resources will be automatically closed for us when the operation is complete.
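Reading the file back works the same way in reverse. The sketch below wraps both directions in small helper methods so the round trip can be tried on any archive path. The names ZipFsSketch, zipUri(), writeReadme(), and readReadme() are hypothetical, and building the URI by prefixing jar: to Path.toUri() is an assumption of this example rather than something the text above prescribes.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Hypothetical example: write and read /README.txt inside a ZIP archive.
class ZipFsSketch {
    // Assumption: a jar: URI can be formed from the file: URI of the archive
    static URI zipUri(Path zipFile) {
        return URI.create("jar:" + zipFile.toUri());
    }

    static void writeReadme(Path zipFile, String text) throws IOException {
        Map<String, String> env = new HashMap<>();
        env.put("create", "true"); // create the archive if it does not exist
        try (FileSystem zipfs = FileSystems.newFileSystem(zipUri(zipFile), env)) {
            Path path = zipfs.getPath("/README.txt");
            Files.write(path, text.getBytes(StandardCharsets.UTF_8));
        }
    }

    static String readReadme(Path zipFile) throws IOException {
        Map<String, String> env = new HashMap<>(); // no special properties
        try (FileSystem zipfs = FileSystems.newFileSystem(zipUri(zipFile), env)) {
            Path path = zipfs.getPath("/README.txt");
            return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path zipFile = Paths.get(args[0]).toAbsolutePath();
        writeReadme(zipFile, "Hello World!");
        System.out.println(readReadme(zipFile));
    }
}
```

Note that the archive is only guaranteed to be written out when the FileSystem is closed, which is why each helper opens and closes its own filesystem.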
Other operations proceed just as with “normal” files. For example, we can move a file by creating a path for the existing file and a path for the new location and then using the standard Files move() method.
// Move the file
try (FileSystem zipfs = FileSystems.newFileSystem(zipURI, env)) {
    Path path = zipfs.getPath("/README.txt");
    Path toPath = zipfs.getPath("/README2.txt");
    Files.move(path, toPath);
}
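To confirm the result of a move, we can list the root of the ZIP filesystem with the same directory APIs used for ordinary folders. This is a minimal sketch under a few assumptions: the class name ZipFsMoveSketch and the helper moveAndList() are hypothetical, the archive already exists, and the jar: URI is built from Path.toUri().

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.DirectoryStream;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

// Hypothetical example: move a file inside a ZIP, then list the root folder.
class ZipFsMoveSketch {
    static List<String> moveAndList(Path zipFile, String from, String to)
            throws IOException {
        URI uri = URI.create("jar:" + zipFile.toUri());
        List<String> names = new ArrayList<>();
        try (FileSystem zipfs =
                 FileSystems.newFileSystem(uri, new HashMap<String, String>())) {
            Files.move(zipfs.getPath(from), zipfs.getPath(to));
            // List the entries at the root of the archive
            try (DirectoryStream<Path> entries =
                     Files.newDirectoryStream(zipfs.getPath("/"))) {
                for (Path entry : entries) {
                    names.add(entry.getFileName().toString());
                }
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        Path zipFile = Paths.get(args[0]).toAbsolutePath();
        System.out.println(moveAndList(zipFile, "/README.txt", "/README2.txt"));
    }
}
```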
[35] The Zip Filesystem Provider is also supplied as an example along with sample source code even though it’s unclear if Oracle intends it to be a standard. But at the time of this writing, it is bundled with the JDK and JRE of Java 7 on all platforms.