Subversion does a very good job of maintaining a file's history, perhaps too good a job. Over time the repository just grows and grows, filling up with older versions that will probably never need to see the light of day again.
To archive older unwanted revisions, first dump them to a compressed file that can be stored on removable media.
For example, to dump revisions 0 through 500 from a repository:
# Dump revisions 0 - 500
ARCHIVE_UPTO=500
REPONAME=/srv/svn/repo/asporting
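# The output is a bzip2-compressed dump stream (not a tar archive), named after the repository directory
svnadmin dump ${REPONAME} -r 0:${ARCHIVE_UPTO} --delta | bzip2 -9c > repo-dump-$(basename ${REPONAME})-r0-${ARCHIVE_UPTO}.dump.bz2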
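Before dumping, it may be worth checking that the chosen cut-off lies below the repository's youngest revision, since at least one revision has to remain for the later steps. The following is only a minimal sketch: the function name valid_archive_range is hypothetical, and svnlook youngest appears only in the commented usage line, where it supplies the current HEAD revision number.

```shell
# valid_archive_range UPTO YOUNGEST
# Succeeds only if both arguments are non-negative integers
# and UPTO is strictly below YOUNGEST.
valid_archive_range()
{
    # Reject missing arguments.
    [ -n "$1" ] && [ -n "$2" ] || return 1
    # Reject anything that contains a non-digit character.
    case "$1$2" in
        *[!0-9]*) return 1 ;;
    esac
    [ "$1" -lt "$2" ]
}

# Possible usage (assumes ARCHIVE_UPTO and REPONAME are set as above):
# valid_archive_range "$ARCHIVE_UPTO" "$(svnlook youngest "$REPONAME")" || echo "ARCHIVE_UPTO is out of range"
```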
Now we need to replace these dumped revisions with 'empty' revisions. To do this, I wrote a script called gen-empty-revs.sh:
#!/bin/bash
# Usage: gen-empty-revs.sh max-rev-no uuid
#
## Generate dump header
# $1 must contain the repository UUID
# The blank lines inside the here-document are part of the dump format
function dump_header()
{
cat <<-DUMP_HEADER
SVN-fs-dump-format-version: 2

UUID: $1

DUMP_HEADER
}
## Revision-number blocks
# $1 must contain a revision number
# A blank line separates the headers from the property block and ends the record
function dump_rev()
{
cat <<-REVNUM_BLOCKS
Revision-number: $1
Prop-content-length: 92
Content-length: 92

K 7
svn:log
V 18
Revision archived.
K 8
svn:date
V 27
2006-11-03T07:07:05.755934Z
PROPS-END

REVNUM_BLOCKS
}
# Both arguments are required: the highest revision number, then the repository UUID
[ $# -lt 2 ] && {
echo "Usage: $0 upto_rev_num repo_uuid"
exit 1
}
dump_header "$2"
for n in $(seq 0 "$1")
do
dump_rev "$n"
done
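The value 92 used for Prop-content-length and Content-length is the byte count of the property block, from the K 7 line through PROPS-END, so it has to be recomputed if the log message or the date is changed. Here is a minimal sketch of how that number could be derived. The function names props_block and props_length are hypothetical, and plain ASCII values are assumed so that character counts and byte counts agree.

```shell
# props_block LOG_MESSAGE DATE
# Emit the property block for one revision record.
props_block()
{
    printf 'K 7\nsvn:log\nV %d\n%s\n' "${#1}" "$1"
    printf 'K 8\nsvn:date\nV %d\n%s\n' "${#2}" "$2"
    printf 'PROPS-END\n'
}

# props_length LOG_MESSAGE DATE
# Print the byte length of that block, i.e. the value for Prop-content-length.
props_length()
{
    props_block "$1" "$2" | wc -c | tr -d ' '
}

# With the values used in gen-empty-revs.sh this prints 92:
props_length 'Revision archived.' '2006-11-03T07:07:05.755934Z'
```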
Redirect the output of the script to a file:
gen-empty-revs.sh 500 `svnlook uuid $REPONAME` >/tmp/gen-dump
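A quick sanity check on the generated file is to count its revision records: revisions 0 through N give N + 1 records. This is only a sketch, and the function name check_gen_dump is hypothetical.

```shell
# check_gen_dump FILE UPTO
# Succeeds if FILE contains exactly UPTO + 1 revision records (revisions 0..UPTO).
check_gen_dump()
{
    found=$(grep -c '^Revision-number: ' "$1")
    [ "$found" -eq $(( $2 + 1 )) ]
}

# Possible usage for the example above:
# check_gen_dump /tmp/gen-dump 500 || echo "unexpected number of revisions"
```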
Now create a new repository:
svnadmin create ${REPONAME}.tmp
We load the generated dump into the new repository:
svnadmin load ${REPONAME}.tmp --force-uuid </tmp/gen-dump
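Now take the original repository offline and load its remaining revisions into the new repository:
# ARCHIVE_FROM is the first revision to keep: one higher than ARCHIVE_UPTO
ARCHIVE_FROM=$((ARCHIVE_UPTO + 1))
mv $REPONAME ${REPONAME}.offline
svnadmin dump ${REPONAME}.offline -r $ARCHIVE_FROM:HEAD --delta | svnadmin load ${REPONAME}.tmp --force-uuid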
# Put the trimmed repository in place; ${REPONAME}.offline stays behind as a fallback
mv ${REPONAME}.tmp $REPONAME
Bring the repository online once again. While this may seem tedious, it isn't if you script it.
Also note that, while it is possible to create a new repository and load only the newer revisions from the existing repository directly into it, doing so renumbers all existing revisions. That breaks client working copies and forces every client to do a fresh checkout. Loading the 'empty' revisions from the generated script first keeps the revision numbers unchanged, so client working copies are not broken.
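Since the whole point is that revision numbers stay the same, one simple check is that the original and the rebuilt repository report the same youngest revision. This is a rough sketch that relies on svnlook youngest; the function name same_youngest and the variables OLD_REPO and NEW_REPO are hypothetical placeholders for the paths of the two repositories.

```shell
# same_youngest OLD_REPO NEW_REPO
# Succeeds if both repositories report the same youngest revision number.
same_youngest()
{
    old_head=$(svnlook youngest "$1") || return 1
    new_head=$(svnlook youngest "$2") || return 1
    [ "$old_head" -eq "$new_head" ]
}

# Possible usage:
# same_youngest "$OLD_REPO" "$NEW_REPO" && echo "revision numbers preserved"
```

Matching youngest revisions do not prove that the contents are identical, only that no renumbering took place.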
See also:
Version control with Subversion
Note: ARCHIVE_FROM must be set to one higher than ARCHIVE_UPTO, i.e. 501 when ARCHIVE_UPTO is 500.