---
title: How to Mirror WordPress SVN Repositories
date: 2024-10-01T06:40:00+00:00
modified: 2025-05-13T08:27:24+00:00
image: https://wpelevator.com/wp-content/uploads/sites/12/2024/10/how-to-mirror-wp-svn-repositories.png
permalink: https://wpelevator.com/guides/mirror-wordpress-svn-repositories
post_type: page
author:
  name: Kaspars
  avatar: https://secure.gravatar.com/avatar/92bfcd3a8c3a21a033a6484d32c25a40b113ec6891f674336081513d5c98ef76?s=96&d=robohash&r=g
---

# How to Mirror WordPress SVN Repositories

Creating a mirror of the SVN (Subversion) repositories hosting the source code of WordPress core, plugins and themes is the first step to [replicating the install and update APIs](https://wpelevator.com/guides/replace-wordpress-update-apis) used by WordPress itself.

There are several SVN repositories that need to be mirrored:

- WordPress core at [core.svn.wordpress.org](https://core.svn.wordpress.org)
- Plugins at [plugins.svn.wordpress.org](https://plugins.svn.wordpress.org)
- Themes at <https://themes.svn.wordpress.org>

After cloning the repositories, you can keep them in sync with the origin using [`svnsync`](https://svnbook.red-bean.com/en/1.7/svn.ref.svnsync.html), which pulls in just the latest changes each time it runs.

## Download SVN Dumps

I’ve gone through the steps outlined in this guide and created the SVN dumps of all three repositories and published them [on the Internet Archive](https://archive.org/details/@kasparsd/lists/1/wordpress.org-svn-repositories) so you can get right to local import and sync:

- [WordPress core](https://archive.org/details/wp-org-svn-core-dump) (1.9GB gzip, up until revision r58509)
- [Themes](https://archive.org/details/wp-org-svn-themes-dump) (198GB tar gzip)
- [Plugins](https://archive.org/details/wp-org-svn-plugins-dumps) (558GB tar gzip)

Be sure to use a download tool that can resume interrupted downloads, such as `wget`:

```
wget --continue https://archive.org/download/wp-org-svn-themes-dump/wp-org-svn-themes-dump.gz
```

or `curl`:

```
curl --location --remote-name --continue-at - https://archive.org/download/wp-org-svn-themes-dump/wp-org-svn-themes-dump.gz
```
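
Downloads of this size can be interrupted more than once. Below is a minimal sketch of a retry loop around the `wget --continue` command above. The function name `download_with_retry`, the default of 10 attempts and the 30 second delay are assumptions made for illustration:

```shell
# Hypothetical helper: retry a resumable download until it succeeds.
# Arguments: URL, max attempts (default 10), seconds between attempts (default 30).
download_with_retry() {
	local url="$1" max_attempts="${2:-10}" delay="${3:-30}" attempt=1

	while [ "$attempt" -le "$max_attempts" ]; do
		# --continue resumes from the partially downloaded file, if any.
		if wget --continue "$url"; then
			return 0
		fi
		echo "Attempt $attempt failed, retrying in ${delay}s..." >&2
		attempt=$((attempt + 1))
		sleep "$delay"
	done

	echo "Giving up after $max_attempts attempts: $url" >&2
	return 1
}
```

Call it with the same URL as above, for example `download_with_retry https://archive.org/download/wp-org-svn-themes-dump/wp-org-svn-themes-dump.gz`.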

### Extract the Dumps

All dumps are compressed with gzip and both themes and plugins are also packed as tar bundles since they contain multiple dumpstream files.

Use the following commands to extract the single core dump (remove the `--keep` flag if you don’t want to preserve the compressed original):

```
gunzip --keep wp-org-core-svn-dump.gz
```

and the following command for `tar.gz` archives of theme and plugin dumps:

```
tar --extract --gunzip --file wp-org-svn-themes-dump.gz
tar --extract --gunzip --file wp-org-svn-plugins-dumps.gz
```
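
Before extracting archives of this size, it may be worth checking that the download is complete. This is a small sketch that relies on the integrity test built into `gzip`; the helper name `verify_archive` is an assumption for illustration, and the check reads the whole file, so it takes a while for the larger archives:

```shell
# Hypothetical helper: check gzip integrity before extracting.
verify_archive() {
	local archive="$1"

	# -t (--test) decompresses in memory and verifies the checksum
	# without writing any output files.
	if gzip -t "$archive"; then
		echo "OK: $archive"
	else
		echo "Corrupt or incomplete: $archive" >&2
		return 1
	fi
}
```

For example, `verify_archive wp-org-svn-themes-dump.gz` prints `OK: wp-org-svn-themes-dump.gz` when the archive passes the check.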

The steps below show how to recreate the above dumps from the origin SVN repositories.

---

## How about a Checkout?

The first idea might be to use `svn checkout ...` for each of the repositories, but that doesn’t work for the following reasons:

1. Network errors and disconnects leave the local repository in a broken state and require `svn cleanup` before resuming the checkout, which often fails.
2. The SVN checkout allows only a single process and can’t be parallelized due to [locking of commits](https://svnbook.red-bean.com/en/1.8/svn.advanced.locking.html) to the same files.

## Alternatives to Checkout

Ideally, the process would:

1. download the whole SVN revision history as a single file,
2. or allow specifying a range of revisions to enable parallel downloads.

![SVN mirror workflow using svnrdump and svnadmin](https://wpelevator.com/wp-content/uploads/sites/12/2024/10/svn-mirror-workflow.png?strip=all&quality=90&resize=2100,800)

[`svnadmin dump`](https://svnbook.red-bean.com/en/1.7/svn.ref.svnadmin.c.dump.html) is a tool that creates [a dump stream](https://svnbook.red-bean.com/en/1.7/svn.reposadmin.maint.html#svn.reposadmin.maint.migrate) for specific ranges of revisions. Since it only works with local repositories, there is also [`svnrdump dump`](https://svnbook.red-bean.com/en/1.7/svn.ref.svnrdump.c.dump.html), which supports remote repositories.

So the final workflow is this:

1. Create a file with ranges of revisions like `XXX:YYY` to download in each process.
2. Use [`parallel`](https://linux.die.net/man/1/parallel) to call `svnrdump dump --incremental --revision {} > repo-{}.dump` where `{}` is replaced with one of the revision ranges.
3. Import individual dumps into a new local repository using `svnadmin load --file repo-XXX:YYY.dump repo-directory`

Run this in a `screen` session to ensure the processes keep running even if you log out of the computer.

## Requirements

Use [Homebrew](https://brew.sh) to install the required tooling:

```
brew install subversion parallel pv
```

while tools like `bash`, `grep`, `cut` and `gzip` are already included in macOS by default.

## Step 1: Define Revision Ranges

The examples below are for the WordPress core SVN repository. Replace `https://core.svn.wordpress.org` with other repository URLs as needed.

Save this bash script as `rev-ranges.sh`:

```
#!/usr/bin/env bash

if [ $# -ne 1 ]; then
	echo "Usage: $0 <repository-url>"
	exit 1
fi

# Set the number of revisions included in a single dump.
REV_STEP=10000

LATEST_REV=$(svn info "$1" | grep "Revision:" | cut -c11-)

for (( rev_start = 0; rev_start < $LATEST_REV; rev_start += $REV_STEP )); do
	if (( $rev_start + $REV_STEP < $LATEST_REV )); then
		echo "$rev_start:$(($rev_start + $REV_STEP - 1))"
	else
		echo "$rev_start:$LATEST_REV"
	fi
done
```

Customise `REV_STEP` as necessary, and mark the script as executable:

```
chmod +x rev-ranges.sh
```

Finally, run it to generate `core-revs.txt` with a revision range per line:

```
./rev-ranges.sh https://core.svn.wordpress.org > core-revs.txt
```

which produces the following contents:

```
0:9999
10000:19999
20000:29999
30000:39999
40000:49999
50000:58547
```
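
Since a gap or an overlap between ranges would only surface later during the import, a quick sanity check of the generated file can help. This is a sketch using `awk`; the helper name `check_ranges` is an assumption for illustration:

```shell
# Hypothetical helper: verify that a ranges file starts at revision 0
# and that each range begins right after the previous one ends.
check_ranges() {
	awk -F: '
		NR == 1 && $1 != 0 { print "First range must start at 0"; bad = 1 }
		NR > 1 && $1 != prev_end + 1 { print "Gap or overlap before " $0; bad = 1 }
		$2 < $1 { print "Reversed range " $0; bad = 1 }
		{ prev_end = $2 }
		END { exit bad + 0 }
	' "$1"
}
```

Running `check_ranges core-revs.txt` prints nothing and exits with status 0 when the ranges are contiguous.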

## Step 2: Dump Revision Ranges

Create a new directory to store the revision dumps:

```
mkdir core-dumps
```

Then, to start parallel downloads of the revision range dumps, pass the contents of `core-revs.txt` to `parallel` using `cat`, along with the `svnrdump dump` command from the workflow above:

```
cat core-revs.txt | parallel "svnrdump dump --revision {} --incremental https://core.svn.wordpress.org > core-dumps/core-{}.dump"
```

where:

- `--revision {}` specifies the revision range piped from the file,
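- `--incremental` makes the first revision of each dump a diff against the previous revision instead of a full copy of the tree, so the dumps can be loaded one after another.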

This starts one `svnrdump dump` process per CPU. Pass a `-j NN` flag to `parallel` to specify a custom job count.

Note that **there is no output to terminal while the commands are running** as all of the `stdout` is sent to the dump files. Use `watch "ls -lh core-dumps"` in another window to monitor the size of the individual dumps:

```
Every 2.0s: ls -lh core-dumps

total 734M
-rw-r--r-- 1 root root  28M Sep 28 15:24 core-0:9999.dump
-rw-r--r-- 1 root root  34M Sep 28 15:24 core-10000:19999.dump
-rw-r--r-- 1 root root  67M Sep 28 15:24 core-20000:29999.dump
-rw-r--r-- 1 root root  45M Sep 28 15:24 core-30000:39999.dump
-rw-r--r-- 1 root root 284M Sep 28 15:24 core-40000:49999.dump
-rw-r--r-- 1 root root 279M Sep 29 05:29 core-50000:58547.dump
```
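
In a listing like this, a dump left behind by a failed `svnrdump` process looks the same as a finished one. One possible way to tell them apart is to write each dump to a temporary file and rename it only on success. This is a sketch, not part of the original workflow; the helper name `dump_range` and the `.part` suffix are assumptions for illustration:

```shell
# Hypothetical helper: dump one revision range to a temporary ".part" file
# and rename it only when svnrdump exits successfully.
# Arguments: revision range, repository URL, output directory.
dump_range() {
	local range="$1" url="$2" out_dir="$3"
	local target="$out_dir/core-$range.dump"

	if svnrdump dump --revision "$range" --incremental "$url" > "$target.part"; then
		mv "$target.part" "$target"
	else
		echo "Dump of $range failed, partial data left in $target.part" >&2
		return 1
	fi
}
```

Assuming `parallel` runs its jobs in bash, the function can be exported with `export -f dump_range` and called as `parallel dump_range {} https://core.svn.wordpress.org core-dumps < core-revs.txt`.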

For reference, here is the [source code of svnrdump](https://svn.apache.org/repos/asf/subversion/trunk/subversion/svnrdump/svnrdump.c): the `dump_cmd` function invokes `replay_revisions`, which in turn calls [`svn_ra_replay_range`](https://subversion.apache.org/docs/api/latest/svn__ra_8h.html#a9fbcde06ba0b9ddb331631852f3277bd).

Each dump file should be anywhere from 20MB to 1.5GB depending on the repository. The combined size of all dumps for WP core is 730MB.

To save disk space, you can compress the dump stream with `gzip` before writing it to a file:

```
cat core-revs.txt | parallel "svnrdump dump --revision {} --incremental https://core.svn.wordpress.org | gzip > core-dumps/core-{}.dump.gz"
```

Remember to decompress the files when importing!

## Step 3: Import Dumps Locally

Unfortunately, the import process can’t be parallelized because each revision refers to earlier revisions, which *must exist* in the SVN database before the later ones can be inserted.

Therefore, we must ensure that `svnadmin load` is called sequentially with dump ranges from the lowest revisions to the highest. We use `sort --version-sort` to list the dump file names in the [natural order](https://en.wikipedia.org/wiki/Natural_sort_order).
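
A plain `sort` compares names character by character, so a file such as `core-100000:109999.dump` would be listed before `core-20000:29999.dump`. The sketch below isolates the listing step to show the difference; the helper name `list_dumps_in_order` is an assumption for illustration:

```shell
# Hypothetical helper: list dump files from the lowest to the highest
# revision range. Version sort compares the digit runs as numbers, so for
# the three files below the order is:
#   core-0:9999.dump
#   core-20000:29999.dump
#   core-100000:109999.dump
list_dumps_in_order() {
	ls "$1" | sort --version-sort
}
```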

Save this bash script as `load-dumps.sh` (adjust this if using gzipped dumps):

```
#!/usr/bin/env bash

if [ $# -ne 1 ]; then
	echo "Usage: $0 <dumps-source-dir>"
	exit 1
fi

DUMPS_DIR=$1
SVN_DIR="$DUMPS_DIR-svn"

if [ -d "$SVN_DIR" ]; then
	echo "SVN directory $SVN_DIR already exists. Not sure how to merge dumps."
	exit 1
fi

svnadmin create "$SVN_DIR"

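# Load the dumps one at a time, from the lowest to the highest revision range.
for dumpfile in $(ls "$DUMPS_DIR" | sort --version-sort); do
	echo "Loading $dumpfile:"
	pv "$DUMPS_DIR/$dumpfile" | svnadmin load --quiet --no-flush-to-disk --bypass-prop-validation --force-uuid --memory-cache-size 2048 "$SVN_DIR"
done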
```

where you must customize the value of the `--memory-cache-size` argument (in MB) based on the available RAM.

Make it executable:

```
chmod +x load-dumps.sh
```

and run it by specifying the source directory containing the dump files as the first argument:

```
./load-dumps.sh core-dumps
```

which produces the following output for each file:

```
Loading core-0:9999.dump:
3.12MiB 0:00:05 [ 422KiB/s] [>  ]  28% ETA 0:00:07
```
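
If the dumps were compressed with `gzip` in the previous step, they can be streamed into `svnadmin load` without unpacking them to disk first. This is a sketch of that adjustment; the helper name `load_gz_dump` is an assumption for illustration, and only `--quiet` is passed here to keep it short:

```shell
# Hypothetical variant of the loop body for gzipped dumps: decompress to
# stdout and stream into svnadmin load, leaving the .gz file untouched.
# Arguments: path to the .dump.gz file, path to the local repository.
load_gz_dump() {
	local dumpfile="$1" svn_dir="$2"

	# The exit status is the one of svnadmin load, the last command in the pipe.
	gzip -dc "$dumpfile" | svnadmin load --quiet "$svn_dir"
}
```

Inside the loop of `load-dumps.sh` this would replace the `pv ... | svnadmin load ...` line, with the remaining `svnadmin load` flags added back as needed.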

If one of the imports fails, you can restart that import manually and pass `--revision AAA:BBB` to `svnadmin load` to load only the remaining revision range. You might also need to run `svnadmin recover repo-directory` if the repo was left in a corrupt state.
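
To work out the remaining range, compare the youngest revision in the local repository (printed by `svnlook youngest repo-directory`) with the range of the dump that failed. This is a minimal sketch of that arithmetic; the helper name `remaining_range` is an assumption for illustration:

```shell
# Hypothetical helper: given the youngest revision already in the local
# repository and the range of the dump that failed, print the range that
# still needs to be loaded.
remaining_range() {
	local youngest="$1" dump_range="$2"
	local range_end="${dump_range#*:}"

	if [ "$youngest" -ge "$range_end" ]; then
		echo "Nothing left to load from $dump_range" >&2
		return 1
	fi

	echo "$((youngest + 1)):$range_end"
}
```

For example, `remaining_range 42137 40000:49999` prints `42138:49999`, which is the value to pass to `--revision`.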

### Import Performance

The best load performance I’ve seen is 20MB/s on average which leads to the following import times for each repository:

| Repository | Dump Size Uncompressed | Repository Size | Import Time |
| --- | --- | --- | --- |
| WordPress Core | 779 MB | 935 MB | 5 minutes |
| WordPress Themes | 353 GB | TBD | TBD |
| WordPress Plugins | 1258 GB | TBD | TBD |

## Syncing with Origin

After creating the repositories locally, use `svnsync` to keep them in sync with the remotes. It is important to note that this tool works with repository URLs so we must specify the local repository as `file:///Users/yourname/wp-svn/core-dumps-svn`. Run `pwd` to print the full path to the current directory and append it to `file://`.

First, set the contents of `hooks/pre-revprop-change` in the repository directory to do nothing:

```
#!/bin/sh
exit 0
```

and make it executable:

```
chmod +x core-dumps-svn/hooks/pre-revprop-change
```
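
When mirroring all three repositories, the two steps above can be combined into one helper. This is a sketch; the function name `install_revprop_hook` is an assumption for illustration:

```shell
# Hypothetical helper: write a no-op pre-revprop-change hook into a
# repository directory and make it executable.
install_revprop_hook() {
	local hook="$1/hooks/pre-revprop-change"

	printf '#!/bin/sh\nexit 0\n' > "$hook"
	chmod +x "$hook"
}
```

For example, `install_revprop_hook core-dumps-svn` creates the same hook as the manual steps.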

Then associate the local repository with a remote origin:

```
svnsync init --allow-non-empty file:///Users/yourname/wp-svn/core-dumps-svn https://core.svn.wordpress.org
```

which returns:

```
Copied properties for revision 58547.
```

And finally, run the actual sync:

```
svnsync sync file:///Users/yourname/wp-svn/core-dumps-svn
```

You can set up a cron job to run this at regular intervals.
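
A cron job could call a small wrapper like the one sketched below, which syncs each local mirror in turn and prints a timestamp per run. The function name `sync_mirrors` and the file paths in the usage note are assumptions for illustration:

```shell
# Hypothetical wrapper: sync every local mirror passed as an argument and
# prefix each run with a timestamp so the cron log is easier to read.
sync_mirrors() {
	local repo_url status=0

	for repo_url in "$@"; do
		echo "$(date '+%Y-%m-%d %H:%M:%S') syncing $repo_url"
		# Continue with the remaining mirrors even if one sync fails.
		svnsync sync "$repo_url" || status=1
	done

	return "$status"
}
```

Saved in a script that ends with `sync_mirrors file:///Users/yourname/wp-svn/core-dumps-svn`, it could be scheduled hourly with a crontab entry such as `0 * * * * /Users/yourname/wp-svn/sync-mirrors.sh >> /Users/yourname/wp-svn/sync.log 2>&1`.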

## Checkout Working Copy

The repository we’ve created only contains the revision history and the associated metadata. To get the actual files or a working copy, we run:

```
svn checkout file:///Users/yourname/wp-svn/core-dumps-svn core-svn-checkout
```

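where the last argument is the directory path for the working copy. If you omit it, `svn checkout` names the working copy after the last part of the URL (`core-dumps-svn`), which places the working files inside the SVN repository directory when run from its parent directory.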