Credit: Mitch Chapman
Before overwriting an existing file, it
is often desirable to make a backup. Example 4-1
emulates the behavior of Emacs by saving versioned backups.
It’s also compatible with the
marshal
module, so you can use versioned output
files for output in marshal format. If you find other file-writing
modules that, like marshal
, type-test rather than
using file-like objects polymorphically, the class supplied here will
stand you in good stead.
When Emacs saves a file foo.txt
, it first checks
to see if foo.txt
already exists. If it does,
the current file contents are backed up. Emacs can be configured to
use versioned backup files, so, for example,
foo.txt
might be backed up to
foo.txt.~1~
. If other versioned backups of the
file already exist, Emacs saves to the next available version. For
example, if the largest existing version number is 19, Emacs will
save the new version to foo.txt.~20~
. Emacs can
also prompt you to delete old versions of your files. For example, if
you save a file that has six backups, Emacs can be configured to
delete all but the three newest backups.
Example 4-1 emulates the versioning backup behavior
of Emacs. It saves backups with version numbers (e.g., backing up
foo.txt
to foo.txt.~n~
when
the largest existing backup number is n-1. It
also lets you specify how many old versions of a file to save. A
value that is less than zero means not to delete any old versions.
The marshal
module lets you marshal an object to a
file by way of the dump
function, but
dump
insists that the file object you provide
actually be a Python file object, rather than any arbitrary object
that conforms to the file-object interface. The versioned output file
shown in this recipe provides an asFile
method for
compatibility with marshal.dump
. In many (but,
alas, far from all) cases, you can use this approach to use wrapped
objects when a module type-tests and thus needs the unwrapped object,
solving (or at least ameliorating) the type-testing issue mentioned
in Recipe 5.9. Note that Example 4-1 can be seen as one of many uses of the
automatic-delegation idiom mentioned there.
The only true solution to the problem of modules using type tests rather than Python’s smooth, seamless polymorphism is to change those errant modules, but this can be hard in the case of errant modules that you did not write (particularly ones in Python’s standard library).
Example 4-1. Saving backups when writing files
""" This module provides versioned output files. When you write to such a file, it saves a versioned backup of any existing file contents. """ import sys, os, glob, string, marshal class VersionedOutputFile: """ Like a file object opened for output, but with versioned backups of anything it might otherwise overwrite """ def _ _init_ _(self, pathname, numSavedVersions=3): """ Create a new output file. pathname is the name of the file to [over]write. numSavedVersions tells how many of the most recent versions of pathname to save. """ self._pathname = pathname self._tmpPathname = "%s.~new~" % self._pathname self._numSavedVersions = numSavedVersions self._outf = open(self._tmpPathname, "wb") def _ _del_ _(self): self.close( ) def close(self): if self._outf: self._outf.close( ) self._replaceCurrentFile( ) self._outf = None def asFile(self): """ Return self's shadowed file object, since marshal is pretty insistent on working with real file objects. """ return self._outf def _ _getattr_ _(self, attr): """ Delegate most operations to self's open file object. """ return getattr(self._outf, attr) def _replaceCurrentFile(self): """ Replace the current contents of self's named file. """ self._backupCurrentFile( ) os.rename(self._tmpPathname, self._pathname) def _backupCurrentFile(self): """ Save a numbered backup of self's named file. """ # If the file doesn't already exist, there's nothing to do if os.path.isfile(self._pathname): newName = self._versionedName(self._currentRevision( ) + 1) os.rename(self._pathname, newName) # Maybe get rid of old versions if ((self._numSavedVersions is not None) and (self._numSavedVersions > 0)): self._deleteOldRevisions( ) def _versionedName(self, revision): """ Get self's pathname with a revision number appended. """ return "%s.~%s~" % (self._pathname, revision) def _currentRevision(self): """ Get the revision number of self's largest existing backup. """ revisions = [0] + self._revisions( ) return max(revisions) def _revisions(self): """ Get the revision numbers of all of self's backups. """ revisions = [] backupNames = glob.glob("%s.~[0-9]*~" % (self._pathname)) for name in backupNames: try: revision = int(string.split(name, "~")[-2]) revisions.append(revision) except ValueError: # Some ~[0-9]*~ extensions may not be wholly numeric pass revisions.sort( ) return revisions def _deleteOldRevisions(self): """ Delete old versions of self's file, so that at most self._numSavedVersions versions are retained. """ revisions = self._revisions( ) revisionsToDelete = revisions[:-self._numSavedVersions] for revision in revisionsToDelete: pathname = self._versionedName(revision) if os.path.isfile(pathname): os.remove(pathname) def main( ): """ mainline module (for isolation testing) """ basename = "TestFile.txt" if os.path.exists(basename): os.remove(basename) for i in range(10): outf = VersionedOutputFile(basename) outf.write("This is version %s.\n" % i) outf.close( ) # Now there should be just four versions of TestFile.txt: expectedSuffixes = ["", ".~7~", ".~8~", ".~9~"] expectedVersions = [] for suffix in expectedSuffixes: expectedVersions.append("%s%s" % (basename, suffix)) expectedVersions.sort( ) matchingFiles = glob.glob("%s*" % basename) matchingFiles.sort( ) for filename in matchingFiles: if filename not in expectedVersions: sys.stderr.write("Found unexpected file %s.\n" % filename) else: # Unit tests should clean up after themselves: os.remove(filename) expectedVersions.remove(filename) if expectedVersions: sys.stderr.write("Not found expected file") for ev in expectedVersions: sys.sdterr.write(' '+ev) sys.stderr.write('\n') # Finally, here's an example of how to use versioned # output files in concert with marshal: import marshal outf = VersionedOutputFile("marshal.dat") # Marshal out a sequence: marshal.dump([1, 2, 3], outf.asFile( )) outf.close( ) os.remove("marshal.dat") if _ _name_ _ == "_ _main_ _": main( )
For a more lightweight, simpler approach to file versioning, see Recipe 4.26.
Recipe 4.26 and Recipe 5.9; documentation for the
marshal
module in the Library Reference.
Get Python Cookbook now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.