Tuesday, 26 January 2016

Use Vagrant to set up a Centos 7 VM in AWS EC2

Introduction

Everyone tells us that Infrastructure as Code is the way to go, right? So when I was recently asked to set up a continuous integration service for a development project, the obvious option was an approach that allows us to script the server setup and deployment and put the whole lot under version control, particularly since a colleague had already assembled a Vagrantfile that allowed him to deploy a Jenkins server in a VM on his workstation.

However, there were a number of gotchas, which caused me a couple of days' unexpected work. I'm trying to record these here in case anyone else benefits from my experience.

One of the requirements was to run the CI server under CentOS 7, the same as the target environment, so that all tests would run under near-identical production-environment conditions and deployment would be very simple using a tar file containing all the dependencies. But CentOS comes with enterprise-grade security features, which sometimes get in the way of what you want to do. Read on...

Set up Vagrant

To use Vagrant, set up a folder in your project (e.g. called "CI") and in it, create a Vagrantfile. If you have not used Vagrant before, please work through the quick Getting Started exercise to familiarise yourself with the concepts.

The Vagrantfile is basically a Ruby script, so it is common practice to prefix it with:

# -*- mode: ruby -*-
# vi: set ft=ruby :

To use Vagrant with AWS, first you have to locally install the provider and the AWS box:

vagrant plugin install vagrant-aws
vagrant box add dummy \
https://github.com/mitchellh/vagrant-aws/raw/master/dummy.box

The README for the Vagrant AWS provider helpfully provides a starter Vagrantfile for you to copy and extend. However, we need to add a number of parameters to the AWS provider configuration and change all of the ones it supplies.

aws.access_key_id = "**"
aws.secret_access_key = "****"
aws.session_token = "***"
aws.keypair_name = "*****"

aws.ami = "ami-e68f82fb" # CentOS 7 64-bit
aws.instance_type = "t2.medium"
aws.region = "eu-central-1" # Pick the appropriate region
aws.security_groups = [ 'sg-e430808d' ]
aws.block_device_mapping = [{ 'DeviceName' => '/dev/sda1', 'Ebs.VolumeSize' => 50 }]
aws.associate_public_ip = true
aws.subnet_id = "subnet-f7569b8c"
aws.ssh_host_attribute = :dns_name
aws.tags = { 'NAME' => 'Continuous Integration' }

aws.user_data = File.read("boothook.sh")
override.ssh.username = "centos"
override.ssh.pty = true
override.ssh.private_key_path = "~/.ec2/*****.pem"

The private key file for your instance (or cluster) will be generated for you by AWS EC2 when you launch an instance through the EC2 management console. The recommendation is to put this in ~/.ec2 along with any other EC2 private keys. Do not share it via version control, or it soon won't be secret any more.

The aws.tags hash allows you to set any tag values. NAME is a typical example, which allows you to identify the instance in the EC2 management console.

The aws.subnet_id should be set to the same value as for any EC2 instance launched manually into the same subnet.

aws.block_device_mapping is only needed if you want to allocate a non-default volume size. For CI purposes, the default volume size is probably too small if any significant amount of build history is to be kept.

The aws.security_groups parameter should list one or more security groups that you have created via the EC2 management console. Make sure that at least the SSH port and HTTP(S) are permitted inbound. I set up an nginx server as a reverse proxy on port 80 (see below) to allow client browsers to access multiple back-end services through the standard HTTP port.

The aws.instance_type allows you to choose the size of virtual machine. You may find that t2.small is sufficient for your needs, but if it runs out of CPU credits, it will be throttled severely (which can actually cause builds to fail due to timeouts). While trying to perfect the Vagrantfile, however, you may wish to specify t2.micro to minimise AWS usage charges.

Notice the "boothook.sh" reference. This is based on the answer to a commonly encountered issue with Vagrant and certain AWS AMIs. The contents of the file are:

#!/bin/bash
# CentOS 7 normally requires a TTY for sudo,
# which kills Vagrant's rsync command.
# By loading this sequence of commands into
# aws.user_data, that problem is defeated.
SUDOERS_FILE=/etc/sudoers.d/999-vagrant-cloud-init-requiretty
echo "Defaults:centos !requiretty" > $SUDOERS_FILE
echo "Defaults:root !requiretty" >> $SUDOERS_FILE
chmod 440 $SUDOERS_FILE

The Vagrantfile has a number of AWS access parameters that need to be configured (shown by asterisks above). Insert the name of your private key, which must be the one referenced by the override.ssh.private_key_path parameter. Then navigate to the Identity and Access Management (IAM) section of the AWS console, create an IAM user for your Vagrant execution, grant that user the AmazonEC2FullAccess permission and generate an access key. This allows Vagrant to provision the virtual machine.

Confusingly, this user's access key and secret are not inserted into the Vagrantfile at all. Instead, temporary credentials including a session token are required. This is how to obtain them:

  1. Download and install the Amazon Command Line Interface. On Mac OS X with Python and pip already installed, this turned out to consist simply of a one-line command:
    sudo pip install awscli
  2. Configure the command line interface:
    aws configure
    (Leave the default region name and default output format as "none")
  3. Request the session token with a duration of 36 hours (maximum):
    aws sts get-session-token --duration-seconds 129600

Update the Vagrantfile using the session token as well as the new key and secret key returned. After 36 hours, if you want to deploy using this Vagrantfile again, you will have to repeat the procedure.
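Rather than pasting the three values into the Vagrantfile by hand every 36 hours, one option is to have the Vagrantfile read them from the environment. This is a sketch of my own, not part of the original setup, and the variable names are illustrative:

```ruby
# Sketch: pick up the temporary credentials from environment variables
# (names are illustrative) instead of hard-coding them in the Vagrantfile.
aws.access_key_id     = ENV['AWS_ACCESS_KEY_ID']
aws.secret_access_key = ENV['AWS_SECRET_ACCESS_KEY']
aws.session_token     = ENV['AWS_SESSION_TOKEN']
```

Export the three values returned by get-session-token in your shell before running vagrant up.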

Following the AWS provider configuration in the Vagrantfile (just above the final "end" statement in the file), specify any further configuration steps required, e.g. synchronized folders and custom software installations (see next section).
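To make the overall layout concrete, here is a rough skeleton of the finished Vagrantfile (a sketch only; the here-document contents and provider parameters are abbreviated versions of those shown elsewhere in this post):

```ruby
# -*- mode: ruby -*-
# vi: set ft=ruby :

# Here documents go here, above the configuration block (abbreviated).
BaseConfig=<<EOF
sudo yum -y update
EOF

Vagrant.configure(2) do |config|
  config.vm.box = "dummy"

  config.vm.provider :aws do |aws, override|
    # aws.* parameters and override.ssh.* settings as shown above
  end

  # SSH settings, synced folders and provisioning steps go here,
  # just above the final "end":
  config.vm.provision "shell", inline: BaseConfig
end
```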

After this, deploying the box should be straightforward (make sure you are in the same current working directory as the Vagrantfile):

vagrant up --provider=aws

Note the public IP and host FQDN shown for the virtual machine in the AWS console. This is the address you will need to access your CI application from a browser. For example, if your machine FQDN is ec2-54-93-105-248.eu-central-1.compute.amazonaws.com, your Jenkins dashboard (assuming you went on to install Jenkins, as shown below) will be at http://ec2-54-93-105-248.eu-central-1.compute.amazonaws.com/jenkins/.

Continue configuring your CI server manually and add these configuration steps to the Vagrantfile if possible.

To terminate the machine, use

vagrant destroy

Synchronize Folders

Between the end of the config.vm.provider configuration and the end of the Vagrantfile, you can insert further configuration instructions. Configure SSH for folder synchronization using the same parameters as above:

config.ssh.username = "centos"
config.ssh.pty = true
config.ssh.private_key_path = "~/.ec2/*****.pem"

Then specify which folders you want to synchronize to the server. Because folder synchronization precedes any shell scripts run on the target VM, I find it best to synchronize mostly to /tmp subfolders on the target VM and then copy or move the contents from there during the subsequent software installation. For example:

config.vm.synced_folder "./nginx", "/tmp/nginx", \
type: "rsync", create: true, owner: 'root', group: 'root'

where the local folder "nginx" contains a subfolder "default.d", which in turn contains "jenkins.conf" to specify the reverse proxy configuration for Nginx to access the Jenkins server on port 8080 (see below). The contents of "default.d" are subsequently copied from /tmp/nginx to /etc/nginx/default.d once the Nginx software has been installed.

Install Software

Introduction

So-called "here documents" are a neat way to separate bits of installation script into identifiable blocks that can be invoked from the configuration section. However, there is a "gotcha" here too - any backslashes ("\") must be escaped ("\\"). This caught me out several times when developing sed or awk scripts in a shell and pasting them into the Vagrantfile. (In the pieces of Vagrantfile shown in this blog post, please interpret a single backslash at the end of a line to mean a soft line wrap. Join with the following line and delete the backslash after copying! And don't insert any spaces, particularly in the middle of sed or awk scripts!)

Place your here documents one after the other directly above the Vagrant.configure(2) block.
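To illustrate the backslash gotcha: a Vagrantfile here document behaves like a double-quoted Ruby string, so each double backslash in the source collapses to a single backslash in the script that reaches the shell. A quick standalone check (illustrative, not part of the Vagrantfile):

```ruby
# A double backslash in the here-document source...
script = <<EOF
sed 's/^\\(PARAMS=.\\)/\\1--prefix=\\/jenkins /'
EOF

# ...arrives at the shell as a single backslash:
puts script   # sed 's/^\(PARAMS=.\)/\1--prefix=\/jenkins /'
```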

Set up tools

The first thing is to install some tools that will be used by the subsequent installations. The time zone should of course be set to whatever is appropriate for you.

Here Document

BaseConfig=<<EOF
sudo yum -y update
sudo yum -y groupinstall 'Development Tools'
sudo yum -y install epel-release
sudo yum -y install nano byobu bzip2 wget
sudo yum -y install fontforge # Not required for production
sudo timedatectl set-timezone Europe/London
EOF

Invocation

config.vm.provision "shell", inline: BaseConfig

Set up Nginx

Note the copy command, which makes use of the folder synchronisation shown earlier.

The setsebool command is required to allow Nginx to proxy HTTP or HTTPS to local TCP sockets.

Here Document

Nginx=<<EOF
sudo yum -y install epel-release
sudo yum -y install nginx
sudo cp -r /tmp/nginx/default.d/*.conf /etc/nginx/default.d/
sudo setsebool -P httpd_can_network_connect 1
sudo systemctl enable nginx
sudo systemctl restart nginx
EOF

Invocation

config.vm.provision "shell", inline: Nginx

Set up Jenkins

This is slightly complicated by the fact that we need Jenkins to have a URL prefix - otherwise reverse-proxying becomes next to impossible (see the sed script below). The jobs folder is relocated to /home/jenkins and symbolically linked, which should make it easier to upgrade Jenkins later without losing build configurations and histories. The initial set of build configurations is stored in version control and synchronised to /tmp/jenkins/jobs by Vagrant before this installation occurs.

The installation of plugins is separated into a second block. You may of course require a different selection. The best way I have found to determine the names of the plugins to install is to install them manually once, using the "list-plugins" command before and afterwards to find which new plugin names have appeared. You can do this after the Vagrant machine has been deployed by means of the following commands:

vagrant ssh

...
wget http://localhost:8080/jenkins/jnlpJars/jenkins-cli.jar
java -jar jenkins-cli.jar -s http://localhost:8080/jenkins/ \
     list-plugins

NB if you have already enabled security on the Jenkins instance, you must log in to the Jenkins CLI before you can list the plugins.

java -jar jenkins-cli.jar -s http://localhost:8080/jenkins/ \
     login --username **** --password ****
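Finding the new plugin names is then just a set difference between the two listings. A trivial illustration (the plugin lists here are made up; in practice you would parse the first column of the saved list-plugins output):

```ruby
# Made-up short names standing in for parsed list-plugins output:
before = ["git", "git-client"]           # listing taken before installing
after  = ["git", "git-client", "nodejs"] # listing taken afterwards

new_plugins = after - before
puts new_plugins   # nodejs
```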

Here Document

# Install Java 8 update 65 because there's a bug
# in the Jenkins update module that makes
# signature checks fail in Java 8 > u65
Jenkins=<<EOF
sudo yum -y install java-1.8.0-openjdk-1.8.0.65
sudo curl -sLo /etc/yum.repos.d/jenkins.repo \
     http://pkg.jenkins-ci.org/redhat/jenkins.repo
sudo rpm --import \
     https://jenkins-ci.org/redhat/jenkins-ci.org.key
sudo yum -y install jenkins
sudo sed -i 's/^\\(PARAMS=.\\)/\\1--prefix=\\/jenkins /' \
     /etc/init.d/jenkins
# Enable Jenkins to upgrade itself automatically
sudo chgrp -R jenkins /usr/lib/jenkins
sudo chmod -R g+w /usr/lib/jenkins
sudo systemctl daemon-reload
sudo systemctl enable jenkins.service
sudo systemctl restart jenkins.service
sudo mv /tmp/jenkins /home/jenkins
sudo chown -R jenkins:jenkins /home/jenkins
sudo ln -s /home/jenkins/jobs /var/lib/jenkins/jobs
sudo chown jenkins:jenkins /var/lib/jenkins/jobs
EOF
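After the double backslashes collapse, the sed expression in the block above inserts "--prefix=/jenkins" immediately after the opening quote of the PARAMS line in /etc/init.d/jenkins. The same substitution can be sanity-checked in Ruby (the sample input line is illustrative, not the actual contents of the file):

```ruby
# Illustrative PARAMS line; the real one in /etc/init.d/jenkins differs.
line = 'PARAMS="--logfile=/var/log/jenkins/jenkins.log"'

# Ruby equivalent of: sed 's/^\(PARAMS=.\)/\1--prefix=\/jenkins /'
patched = line.sub(/^(PARAMS=.)/, '\1--prefix=/jenkins ')
puts patched   # PARAMS="--prefix=/jenkins --logfile=/var/log/jenkins/jenkins.log"
```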

JenkinsPlugins=<<SCRIPT
wget http://localhost:8080/jenkins/jnlpJars/jenkins-cli.jar
java -jar jenkins-cli.jar -s http://localhost:8080/jenkins/ \
     install-plugin authentication-tokens
java -jar jenkins-cli.jar -s http://localhost:8080/jenkins/ \
     install-plugin copyartifact
java -jar jenkins-cli.jar -s http://localhost:8080/jenkins/ \
     install-plugin ghprb
java -jar jenkins-cli.jar -s http://localhost:8080/jenkins/ \
     install-plugin git
java -jar jenkins-cli.jar -s http://localhost:8080/jenkins/ \
     install-plugin git-changelog
java -jar jenkins-cli.jar -s http://localhost:8080/jenkins/ \
     install-plugin git-client
java -jar jenkins-cli.jar -s http://localhost:8080/jenkins/ \
     install-plugin git-parameter
java -jar jenkins-cli.jar -s http://localhost:8080/jenkins/ \
     install-plugin git-tag-message
java -jar jenkins-cli.jar -s http://localhost:8080/jenkins/ \
     install-plugin github
java -jar jenkins-cli.jar -s http://localhost:8080/jenkins/ \
     install-plugin github-api
java -jar jenkins-cli.jar -s http://localhost:8080/jenkins/ \
     install-plugin github-pullrequest
java -jar jenkins-cli.jar -s http://localhost:8080/jenkins/ \
     install-plugin nodejs
java -jar jenkins-cli.jar -s http://localhost:8080/jenkins/ \
     restart
SCRIPT

Invocation

Insert some other software installations between these two in order to allow Jenkins to initialise itself before calling it via the CLI.

config.vm.provision "shell", inline: Jenkins
...
config.vm.provision "shell", inline: JenkinsPlugins

Set up NodeJS

The installation of Node.js under CentOS 7 is fairly straightforward, but I needed to specify the exact versions of node, grunt and bower in order to comply with the technical policy of the project. If you don't need to do that, just omit the version details (e.g. sudo yum install -y nodejs).

Here Document

Node423=<<EOF
curl -sL https://rpm.nodesource.com/setup_4.x | sudo -E bash -
sudo yum install -y nodejs-4.2.3-1nodesource.el7.centos.x86_64
sudo npm install -g grunt-cli@0.1.13 bower@1.7.0
EOF

Invocation

config.vm.provision "shell", inline: Node423

Set up PostgreSQL

Here again an exact version was needed; otherwise the installation could have been much more straightforward. Note the use of double backslashes in the awk and sed scripts.

Here Document

PostgreSQL=<<SCRIPT
sudo cp /etc/yum.repos.d/CentOS-Base.repo /tmp
sudo awk '{print}; $1 ~/\\[base\\]/ || $1 ~/\\[updates\\]/'\
' {print "exclude=postgresql*"}' /tmp/CentOS-Base.repo \
> /etc/yum.repos.d/CentOS-Base.repo
sudo yum -y localinstall \
'http://yum.postgresql.org/9.5/redhat/rhel-7-x86_64/'\
'pgdg-centos95-9.5-2.noarch.rpm'
sudo yum -y install postgresql95-server postgresql95-contrib
sudo /usr/pgsql-9.5/bin/postgresql95-setup initdb
sudo sed -i \
"s/#listen_addresses = 'localhost'/listen_addresses = '*'/" \
/var/lib/pgsql/9.5/data/postgresql.conf
sudo sed -i \
's/^host [^:]*$/host    all             all'\
'             all                     md5/' \
/var/lib/pgsql/9.5/data/pg_hba.conf
sudo systemctl restart postgresql-9.5.service
sudo systemctl enable postgresql-9.5.service
sudo su -c "createdb mydatabase" - postgres
sudo su -c "createdb mydatabase_test" - postgres
cat <<EOF | sudo su -c "psql mydatabase" - postgres
CREATE USER myapp WITH PASSWORD 'myapp';
CREATE USER myapp_test WITH PASSWORD 'myapp';
GRANT USAGE ON SCHEMA public TO myapp;
GRANT USAGE ON SCHEMA public TO myapp_test;
EOF
SCRIPT

Invocation

config.vm.provision "shell", inline: PostgreSQL

Set up Redis

Here Document

Redis=<<SCRIPT
sudo yum -y install redis
sudo systemctl enable redis.service
sudo systemctl restart redis.service
SCRIPT

Invocation

config.vm.provision "shell", inline: Redis

Conclusion

I hope this has helped you in your quest for DevOps Nirvana. Please drop me a line to tell me about your experiences. I can't promise to help, but I'll lend a sympathetic ear!

Tuesday, 17 November 2015

Practical GUI testing using SikuliX

Rejoice with me - after much banging of head against various walls, I have finally succeeded in building a GUI test out of Java, jUnit, Gradle and SikuliX 1.1.0. It starts a new game in the Mac OS Chess application, enters the opening move using drag/drop, and quits the application (answering correctly in the "save" dialog).

The secret was to give up on the idea of using the Python test script fragments created in the SikuliX IDE directly. Rather, I now translate these scripts line by line into Java; e.g. from
wait("xyz.png")
to
screen.wait("xyz.png");

Each short Python script is converted to a simple method on an object representing the application or individual screen under test. These methods can then be invoked as fixtures from any test framework, e.g. jUnit or Fit. The resulting test suite is built with Gradle and can be run as a standalone application to test the target application end-to-end.

One of the tricky bits was to set the image search path correctly so that the images embedded in the jar file can be located by the SikuliX API, whether you're running in the IDE or standalone. By storing a dummy SikuliX script within the src/main/resources folder (e.g. images.sikuli) you can use the SikuliX IDE directly to capture and fine-tune images for use in your tests.

Depending on the CI environment of the project, another tricky part may be to provide both the application under test and the test suite with a graphical pseudo display on which to run, but there is quite a lot of advice in the SikuliX documentation as well as on Stack Overflow about that.

Next steps:

  • Test this approach under Windows and Linux too
  • Enable the use of FIT in place of jUnit (more appropriate for whole-system tests)
  • Make use of the optional Tesseract library to read back values from the screen
  • Build and run test suite in various CI environments

Saturday, 14 February 2015

The end of FM Radio?

A matter which concerns me greatly at the moment is the impending switch from FM to DAB radio, which could happen in the UK as soon as 2016. In 2013, communications minister Ed Vaizey announced a postponement of the digital switchover to "after 2015", following public pressure.

Manufacturers of broadcast and receiving equipment are naturally keen to sell lots of shiny new kit. Ironically, there are technically superior alternatives to DAB already available. The technology is over 20 years old and was superseded by the incompatible DAB+ system in 2007. Canada has already abandoned DAB entirely. Here are some of the things wrong with it:
  • The level of coverage leaves a lot to be desired - currently it is thought that 9% of the country will not have adequate DAB reception by the end of 2016
  • The quality of DAB leaves a lot to be desired. We have a couple of DAB radios in the house and usually leave them tuned to FM because the quality is better
  • DAB is not suitable for reception in a moving car
  • There is a random digitisation delay, meaning that the "pips" are not accurate enough to set a watch by. If you have radios in different rooms tuned to the same station, there's a ghastly "echo" effect because each radio has a different delay factor
Most households in the country will have to junk hundreds of pounds' worth of perfectly functional equipment. That's in effect a stealth tax and very bad for the environment. Many radio stations, some of which are run on a shoestring, may be forced to close down as they cannot afford the investment in new equipment.

It was bad enough when analogue TV ended, but at least the digital alternative was of better quality and there were add-on Freeview receivers available at a reasonably low price to adapt existing equipment.

If there's a technology that is really superior to FM radio, it should succeed on its own merits in the marketplace and not by regulation. Let's put more pressure on the government to postpone the FM switch-off indefinitely. I cannot find any on-line campaigns or petitions currently running to keep FM radio, but it could become an issue in the May 2015 UK General Election if enough of us raise the subject.

Wednesday, 15 October 2014

Simple, secure way to share a git repository

The problem with Windows or NFS file shares is that the contents are not version-controlled and (unless you use the Windows "available off line" feature) not available while you're disconnected from the office network. However, it is very easy to set up a git repository for a team to share code and documents, if they all have at least intermittent access to the same file share.

First install yourself a git command-line client if you have not done so yet (see http://git-scm.com/downloads).

Creating the repository

Assuming that
  • You already have a local git repository called "test-repo", which you want to share
  • Your local git repositories are under %HOMEDRIVE%\%HOMEPATH%\git
  • Your shared git repositories are going to be under \\myserver\myshare\git

To add a remote copy of your own git repository to the share:

>%HOMEDRIVE%
>cd %HOMEPATH%\git\test-repo
>pushd "\\myserver\myshare\git"
>git init --bare test-repo.git
>popd
>git remote add origin "//myserver/myshare/git/test-repo.git"
>git push origin master

Note that the git commands require slashes to be forward - unlike normal Windows commands.

Verify that the file .git\config within your local git repository looks something like this:

[core]
    repositoryformatversion = 0
    filemode = false
    bare = false
    logallrefupdates = true
    symlinks = false
    ignorecase = true
    hideDotFiles = dotGitOnly
[remote "origin"]
    url = //myserver/myshare/git/test-repo.git
    fetch = +refs/heads/*:refs/remotes/origin/*
[branch "master"]
    remote = origin
    merge = refs/heads/master
    rebase = true


Ensure that the sections [remote "origin"] and [branch "master"] in particular are set up correctly.

Cloning the repository

Another team member can now share your code - you both push your local commits to the shared repository and pull other team members' commits from it to your local repository. Here's what each team member wishing to access the shared repository should do, assuming that they also have their local git repositories under %HOMEDRIVE%\%HOMEPATH%\git.

>%HOMEDRIVE%
>cd %HOMEPATH%\git
>git clone "//myserver/myshare/git/test-repo.git" test-repo
>cd test-repo
>git pull
>git status

The status should look like this:

# On branch master
nothing to commit (working directory clean)

Making cycling safer!

I have just been alerted to the existence of Collideoscope - a service that aims to collect data about cycling accidents and near misses directly from cyclists, and use it to press for improvements to road safety at accident black spots. What a brilliant idea. You can also use it to educate yourself about areas in which to take extra care when cycling.

Friday, 17 January 2014

Carbon - the EU's chance to shine

I am really concerned that we may be causing irreversible and harmful climate change.

While there is no conclusive proof that humankind is behind global warming, the evidence against us is stacking up more and more heavily.

The planet has some amazing mechanisms for keeping the environment stable, such as soaking CO2 up in the oceans, but these are not infinitely flexible and we may be approaching a tipping point. As an illustration of this, consider a glass filled with water and ice cubes, standing in a warm room. The ice/water mixture stays at 0 degrees as the ice gradually turns to water, but once all the ice has melted the temperature rapidly rises.

While we have not proved that our wasteful use of fossil energy is causing global warming, there is ample evidence that using a lot less won't cause any harm and may in fact stimulate the economy. So it would be logical to reduce our emissions of both greenhouse gases and waste heat as quickly as possible.

More research is needed to identify renewable energy sources that are much more efficient and less environmentally damaging than wind turbines or biofuels. Better insulation and more efficient heating, manufacturing, agriculture and transport are also needed.

Right now, the President of the EU Commission, Jose Manuel Barroso, is weighing a decision that could make or break our planet. His team have just days to pin down carbon emissions and renewable energy targets for Europe until 2030. It’s a very important deal that has been kept very quiet. Some EU countries, prompted by vested energy interests, are lobbying to set extremely unambitious targets. Barroso may opt to avoid a fight by playing it safe. He works in a bubble of politicians, officials and lobbyists, but he is very sensitive to what people think of him, especially now as he embarks on his last few months in the job and hopes to leave a shining legacy.

Avaaz is publishing an ad in the key Brussels paper today calling on Barroso to follow the science. If supported by tens of thousands of e-mail messages from across Europe, this could just jolt him out of the bubble. You can read the ad and send Barroso a message today that the future of the world is in his hands.

Monday, 8 April 2013

Music by Programmers - Launch Party

Jason Gorman and five of his programming chums have put together an album of original electronic music to raise money for educational programmes at Bletchley Park and The National Museum Of Computing. The charities get every penny of the profits, split 50/50 between them.

The album goes on sale in late April, and Jason's target is to sell 2,000 downloads and raise £10,000. The music industry isn't what it was, and selling music is much harder than it used to be - even for a good cause. Between now and the release date, they're going to need as much help as they can get to spread the word.
Please feel free to publicise their efforts by any means at your disposal.

A special listening party to launch the album on 23rd April is not to be missed. It's going to be small and intimate at a "jolly spiffy pub" near Holborn. Every penny of the £20 ticket price goes directly to Bletchley Park for maths workshops and a programming club at The National Museum Of Computing. And every ticket holder will get a very limited edition CD with bonus tracks!!! Only 100 of these special edition CDs will be produced, so it promises to be a veritable collector's item.