Monday, 2 August 2010

Migrating a Subversion repository

In my current project, there are a number of development teams whose code repositories have grown, separately or together, in a somewhat haphazard fashion. As part of the initiative to introduce a more streamlined development process, therefore, I want to host all repositories on one server and reorganise them in a more uniform and logical way.

Using Subversion, it is relatively straightforward to move whole repositories, but extracting one branch from an existing repository and adding it to a new one (without losing the change history!) turns out to be surprisingly hard. After much experimentation, the best way I found was via Mercurial's excellent "convert" extension. Unfortunately, conversion from Hg to Subversion simply fails under Windows, so you need to use Linux.

This particular customer has a totally Windows-based infrastructure, as well as a particularly draconian firewall that uses Windows authentication exclusively. The following recipe therefore had to take that into account. Names have been blanked out where necessary to preserve client confidentiality.

  1. Install the VMware Player and a Linux appliance, such as the Ubuntu 10.04 LTS image with VMware Tools.

  2. Lightly edit the VMware virtual machine configuration file (Ubuntu.vmx) to ensure that it has at least 1024 MB of RAM. Start up the Linux VM and upgrade existing packages. This turns out to be a little more difficult than you might expect, because the Synaptic package manager (unlike Firefox) is unable to pass your Windows username and password as NTLM credentials to the network proxy server. Therefore you have to proceed as follows:

    1. Make sure that the VMware Player's network connection is set to "NAT".

    2. In System -> Preferences -> Network Proxy, set the manual HTTP proxy to the server and port that your Windows web browser uses. If the web browser's Internet settings nominate a proxy auto-configuration script, you may be able to use that instead. Don't forget to set your Windows username and password. Apply system-wide (you will have to supply your Linux user password twice).

    3. Open Firefox and navigate to http://sourceforge.net/projects/cntlm/ - download the .deb package and allow the package installer to install it. This is a local proxy server that logs onto the firewall on your behalf.

    4. Now configure the cntlm service as follows:
      • As superuser, edit /etc/cntlm.conf and set the following values (see the documentation):
        • Username (your Windows user ID)
        • Domain (your Windows domain)
        • Workstation (your PC name - not the full FQDN)
        • Proxy (proxy FQDN:proxy port)
        • Proxy (backup proxy FQDN:proxy port)
        • Listen 3128
      • Make sure that the permissions on cntlm.conf are -rw-r--r--
      • As a normal user, run the following command:
        cntlm -v -I -M http://test.com
      • Copy and paste the resulting profile (the lines between -------------- markers) into the cntlm.conf file - search for "#Auth" to find the insertion point
      • As the superuser, start up the cntlm service:
        /etc/init.d/cntlm start

    5. In System -> Preferences -> Network Proxy, set the manual HTTP and FTP proxy servers to localhost and port 3128. Username and password should still be your Windows credentials. Apply system-wide as before. Shut down Firefox and open it again to check that the proxy settings are correct.

    6. Open System -> Administration -> Synaptic Package Manager (it will request your Linux user password). Go to Settings and configure a manual proxy just as in System Preferences.

    7. Still in the package manager, click Reload. You should see the package files being downloaded successfully. Click "Mark All Upgrades".

    8. Still in the package manager, type "subversion" into the search box. Mark "subversion" (with its dependencies) and "python-subversion" for installation, then do the same for "mercurial". Finally, click the Apply button and sit back.

    9. Part way through the installation process, you may encounter a warning that the Linux system image cannot be safely installed without GRUB. However, the grub loader has been superseded by grub-pc, so you can safely ignore this (click the "go ahead without grub" checkbox).
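
For reference, the cntlm settings from sub-step 4 might end up looking roughly like the sketch below. Every value in it is an invented placeholder (jdoe, EXAMPLE, MYPC and the example.com proxy names are not from the real setup), and the Auth and Pass* lines have to come from the output of the cntlm -M run rather than from here. The script writes to a scratch file instead of /etc/cntlm.conf so that it can be tried safely.

```shell
#!/bin/sh
# Write an example cntlm configuration to a scratch file for review.
# All values are hypothetical placeholders; substitute your own.
conf="${TMPDIR:-/tmp}/cntlm.conf.example"

cat > "$conf" <<'EOF'
Username    jdoe
Domain      EXAMPLE
Workstation MYPC
Proxy       proxy1.example.com:8080
Proxy       proxy2.example.com:8080
Listen      3128
# Paste the profile printed by "cntlm -v -I -M http://test.com" below,
# i.e. the lines between the dashed markers.
EOF

# The recipe asks for -rw-r--r-- permissions on the real file.
chmod 644 "$conf"
echo "wrote $conf"
```

Once the values look right, the file can be copied over /etc/cntlm.conf as superuser before starting the service.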

  3. You can also upgrade the VMware Tools, which the player will claim are out of date. This is of unproven usefulness, but it doesn't take terribly long, so you may as well. After downloading the upgrade, VMware Player mounts it as a CD image under /media (run "df" to find out what it's called). The trick is to unpack it from there into /tmp or your home directory, because the mounted volume is read-only. Then run "sudo perl <install-script>" to upgrade the software.

  4. Assuming all has gone well, you should now be able to run Mercurial and Subversion from the command line. Test it by running the following commands:
    hg help
    svn help
    svnadmin help

  5. You need to cache the subversion authentication parameters in order that Mercurial can invoke subversion non-interactively. To do this, make sure you are logged in as an ordinary user (not root) and run:
    svn log -l 2 REPOSITORY_URL
    where REPOSITORY_URL is the address of the repository from which you wish to migrate a branch.
    When prompted for the "user" password, just hit RETURN. Now enter the username of a real user with access to that repository, and hit RETURN. Next enter that user's password and hit RETURN. You should be rewarded with two lines of history from the repository.

  6. You must enable the convert extension by uncommenting the line containing "hgext.convert =" in /etc/mercurial/hgrc.d/hgext.rc.

  7. Now you're all set to do the export. Create a folder (e.g. "repo_migration") and change directory into it. Create a file-mapping file such as myproject_mapping.txt:

    include branches/Development/myproject
    rename branches/Development/myproject trunk

    See the hg convert documentation for details of the mapping format.

  8. The following command does the business. It will probably process around 100 revisions per minute on average, so you've got time to make lots of cups of tea:
    hg convert -d svn --filemap myproject_mapping.txt -s svn REPOSITORY_URL NEW_REPOSITORY
    where REPOSITORY_URL is the address of the repository from which you are migrating, and NEW_REPOSITORY is the name of the local Subversion repository you want to create as the result.

  9. If the conversion starts to slow down, you can open a second terminal and, as superuser, set the nice level of the running hg process to -20, which should increase its priority to maximum. I'm not sure how much this helps, since the process seems mainly network-bound.

  10. Finally, copy the new local repository to a location from which your Subversion server can serve it, and Bob's your parent's brother.
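
Steps 7 and 8 can be rolled into one small script, sketched below. It assumes the convert extension is already enabled and the Subversion credentials are already cached, as described above. REPOSITORY_URL and NEW_REPOSITORY are the same placeholders used in the recipe; the WORKDIR variable and the fallback name myproject_repo are made up for illustration. The conversion itself only runs when a source URL has been supplied, so the script can be used to generate and inspect the filemap first.

```shell
#!/bin/sh
# Sketch of steps 7 and 8: write the filemap, then run the conversion.
# REPOSITORY_URL and NEW_REPOSITORY are placeholders, as in the recipe.
set -eu

workdir="${WORKDIR:-repo_migration}"
mkdir -p "$workdir"
cd "$workdir"

# Filemap: keep only one branch and move it to trunk.
cat > myproject_mapping.txt <<'EOF'
include branches/Development/myproject
rename branches/Development/myproject trunk
EOF

# Only convert when a source URL has been supplied and hg is available.
if [ -n "${REPOSITORY_URL:-}" ] && command -v hg >/dev/null 2>&1; then
    hg convert -s svn -d svn --filemap myproject_mapping.txt \
        "$REPOSITORY_URL" "${NEW_REPOSITORY:-myproject_repo}"
else
    echo "Set REPOSITORY_URL and install Mercurial to run the conversion."
fi
```

Saved under a name of your choosing, it would be run with REPOSITORY_URL and NEW_REPOSITORY set in the environment.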

It has just occurred to me that the reason this doesn't work under Windows is probably that we didn't have the python-subversion package installed. However, that is just a guess, because the diagnostics given (exit code 1) provided no clue.
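
On the Linux side, at least, it is easy to check whether the bindings are visible to Python. The sketch below assumes that Mercurial runs under whichever "python" is first on the PATH, which may not be true on every system, and it proves nothing about the Windows failure itself.

```shell
#!/bin/sh
# Check whether the Subversion Python bindings (python-subversion) can be
# imported. Assumes Mercurial uses the "python" found on the PATH.
if command -v python >/dev/null 2>&1 && python -c 'import svn.core' 2>/dev/null; then
    bindings="present"
else
    bindings="missing"
fi
echo "Subversion Python bindings: $bindings"
```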

Ideally, of course, you should just export the existing repository to Mercurial and let everyone use that instead of Subversion from now on. Unfortunately, some tools we want to use, such as JIRA, are not yet Mercurial-aware. Besides, we don't want to incur the overhead of training developers to work with a new SCM tool - it isn't that long since they even began to use Subversion.

1 comment:

Immo Hüneke said...

After much, much experimentation, it turns out that the only reliable way to migrate part of one Subversion repository with all its change history to another repository is using the old, pedestrian dump / filter / load cycle. It is described very well in Greg Ippolito's blog.
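
A dump / filter / load cycle of the kind the comment mentions might look roughly like the sketch below. It is not taken from the blog referred to. The function name migrate_branch and the example paths are invented, and both repositories are assumed to be reachable as absolute paths on the local disk, since svnadmin works on the filesystem rather than on URLs. svndumpfilter is also known to give up when the branch was copied from a path that has been filtered out, so treat this as a starting point only.

```shell
#!/bin/sh
# Sketch of a dump / filter / load cycle for extracting one branch.
# The function name and all paths are hypothetical.
migrate_branch() {
    old_repo=$1   # absolute path to the existing repository on disk
    new_repo=$2   # absolute path for the repository to be created
    branch=$3     # e.g. branches/Development/myproject

    svnadmin create "$new_repo" || return 1

    # The filtered dump does not contain the branch's parent directories,
    # so create them first or the load will fail.
    parent=$(dirname "$branch")
    if [ "$parent" != "." ]; then
        svn mkdir --parents -m "Create parent directories" \
            "file://$new_repo/$parent" || return 1
    fi

    # Dump all history, keep only the chosen branch, and load the result.
    svnadmin dump "$old_repo" \
        | svndumpfilter include --drop-empty-revs --renumber-revs "$branch" \
        | svnadmin load "$new_repo"
}

# Example invocation (not run here):
# migrate_branch /srv/svn/old /srv/svn/new branches/Development/myproject
```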