Anthony Goddard


PaaS in Under 10 Minutes

PaaS in under 10 minutes (or, “Let’s see if building a PaaS and deploying an app is faster than formatting a disk”).

We recently had some 4TB (RAID-0) external drives shipped to us to fill with BHL (biodiversitylibrary.org) content; we needed to format them as ext4 and then fill them up with books. Formatting the disks took a while, so I decided that for the next disk, I would see if I could set up a PaaS using Dokku, a cool little set of bash scripts that do some proxy, building & assorted magic on top of Docker, and then deploy an app to it. Here’s how it panned out.

start the ext4 format:

[16:10:41] [root@clustr-03 /root]# date;mkfs.ext4 /dev/sdz1
Thu Jun 13 16:12:35 EDT 2013

Writing inode tables:  1/29809

prep the server with dokku

$ wget -qO- https://raw.github.com/progrium/dokku/master/bootstrap.sh | sudo bash

We’ll let that bake for a little while

setup app

~/dev $ date;mkdir awyeah
Thu Jun 13 16:12:55 EDT 2013
~/dev $ cd awyeah/
~/dev/awyeah $ bundle init
~/dev/awyeah $ touch web.rb
~/dev/awyeah $ touch config.ru
~/dev/awyeah $ vim Gemfile
Gemfile
    source "https://rubygems.org"
    ruby "1.9.3"
    gem 'sinatra'
~/dev/awyeah $ vim web.rb
web.rb
require 'sinatra'

get '/' do
  "awwwwyeah!!!"
end

time check

~/dev/awyeah $ date;vim config.ru
Thu Jun 13 16:14:05 EDT 2013
config.ru
require './web.rb'
run Sinatra::Application
~/dev/awyeah $ git init .
Initialized empty Git repository in /Users/agoddard/dev/awyeah/.git/
~/dev/awyeah (master) $ git add .
~/dev/awyeah (master) $ git ci -am "awwyeah initial commit"
[master (root-commit) ade6ea2] awwyeah initial commit
 3 files changed, 10 insertions(+)
 create mode 100644 Gemfile
 create mode 100644 config.ru
 create mode 100644 web.rb
~/dev/awyeah (master) $ git remote add deploy git@deploy.eol.org:awwyeah
~/dev/awyeah (master) $ bundle
Fetching gem metadata from https://rubygems.org/...........
Fetching gem metadata from https://rubygems.org/..
Resolving dependencies...
Installing rack (1.5.2)
Installing rack-protection (1.5.0)
Installing tilt (1.4.1)
Installing sinatra (1.4.3)
Using bundler (1.3.5)
Your bundle is complete!
Use `bundle show [gemname]` to see where a bundled gem is installed.
~/dev/awyeah (master) $ git ci -am "added Gemfile.lock"
[master 787cdaa] added Gemfile.lock
 1 file changed, 17 insertions(+)
 create mode 100644 Gemfile.lock

back to server

[...]
nginx-reloader start/running, process 26946
Be sure to upload a public key for your user:
 cat ~/.ssh/id_rsa.pub | ssh root@deploy.eol.org "gitreceive upload-key ag"

looks good, let’s finish it off

cat ~/.ssh/id_rsa.pub | ssh root@deploy.eol.org "gitreceive upload-key ag"

ship it!

~/dev/awyeah (master) $ git push deploy master
Counting objects: 4, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 500 bytes | 0 bytes/s, done.
Total 3 (delta 0), reused 0 (delta 0)
remote: -----> Building awwyeah ...
remote:        Ruby/Rack app detected
remote: -----> Using Ruby version: ruby-1.9.3
remote: -----> Installing dependencies using Bundler version 1.3.2
remote:        Running: bundle install --without development:test --path vendor/bundle --binstubs vendor/bundle/bin --deployment
remote:        Fetching gem metadata from https://rubygems.org/..........
remote:        Fetching gem metadata from https://rubygems.org/..
remote:        Installing rack (1.5.2)
remote:        Installing rack-protection (1.5.0)
remote:        Installing tilt (1.4.1)
remote:        Installing sinatra (1.4.3)
remote:        Using bundler (1.3.2)
remote:        Your bundle is complete! It was installed into ./vendor/bundle
remote:        Cleaning up the bundler cache.
remote: -----> Discovering process types
remote:        Default process types for Ruby/Rack -> rake, console, web
remote: -----> Build complete!
remote: -----> Deploying awwyeah ...
remote: -----> Application deployed:
remote:        http://awwyeah.deploy.eol.org
remote:
To git@deploy.eol.org:awwyeah
   ade6ea2..787cdaa  master -> master

and let’s see if the site is online

~/dev/awyeah (master) $ curl http://awwyeah.deploy.eol.org
oh hell yeah

time?

~/dev/awyeah (master) $ date
Thu Jun 13 16:18:59 EDT 2013

let’s check on that formatting:

Writing inode tables:  4289/29809

plenty of time (also, ext4/4TB is kinda slow…)

Saving the World With DevOps

I had a great time this morning as a guest of the Food Fight podcast with Bryan Berry, Nathen Harvey, Brandon Burton and the always awesome James Cuff.

We started a discussion about the role of DevOps in research computing and solving ‘the big problems’ of the world, and I had a chance to give a brief look into some of the work we’re doing on the Encyclopedia of Life project. Hats off to Bryan and Nathen for bringing up the topic and dedicating a show to it; I think it was really great to get a feel for what issues are important to this community, even from such a small conversation. Climate change and cancer research were the two that immediately rose to the surface. Of course, within both are a myriad of research fields – genomics, microbiology, pharmacology, regenerative biology, cellular physiology, ecosystems, oceanography, biodiversity, meteorology, glaciology – and that’s just a start.

There are some really awesome intersections between devops and science, far more than there were between traditional dev and ops teams, in my opinion. The ability to automate works well for scientists, who generally need tools and compute capacity as fast as they can get them, with the lowest barrier to entry. While James spoke of his time setting up and running the Research Computing group at Harvard, he was also able to speak about his new gig at Cycle Computing, where they’ve really blown the traditional HPC world apart by leveraging massive automation and cloud infrastructure for HPC.

Brandon’s question about how our community can get involved stirred up some thoughts I’ve had over the past few months about exactly that, so I’ve decided to finally get something in the works and I’ll try to post about it soon.

It was my first time on the Food Fight show and it was a ton of fun, I hope the conversation continues and hopefully we’ll get an opportunity to record a sequel.

Here’s the show –

Demystifying OpenStack Folsom Quotas

With the changes introduced in OpenStack Folsom, there are a few areas where quotas can trip you up if you’re not careful, both in Nova and in Cinder.

Nova Quotas

When setting quotas for users or tenants in Folsom, you need to specify the UUID of the user/tenant.

agoddard@control1:~# keystone tenant-list
+----------------------------------+---------+---------+
|                id                |   name  | enabled |
+----------------------------------+---------+---------+
| 040113b0d7824477aacf4ce39df69344 | service |   True  |
| 9af32c41cee140779d50afb2bc93e322 |   ops   |   True  |
+----------------------------------+---------+---------+

agoddard@control1:~# nova quota-show 9af32c41cee140779d50afb2bc93e322
+-----------------------------+--------+
| Property                    | Value  |
+-----------------------------+--------+
| cores                       | 500    |
| floating_ips                | 100    |
| gigabytes                   | 50000  |
| injected_file_content_bytes | 10240  |
| injected_files              | 5      |
| instances                   | 500    |
| metadata_items              | 128    |
| ram                         | 200000 |
| volumes                     | 5000   |
+-----------------------------+--------+

The tricky part is that setting or viewing the quota using a name will appear to work but it won’t actually reference the tenant (or any tenant for that matter):

agoddard@control1:~# nova quota-show ops
+-----------------------------+-------+
| Property                    | Value |
+-----------------------------+-------+
| cores                       | 500   |
| floating_ips                | 10    |
| gigabytes                   | 50000 |
| injected_file_content_bytes | 10240 |
| injected_files              | 5     |
| instances                   | 500   |
| metadata_items              | 128   |
| ram                         | 51200 |
| volumes                     | 10    |
+-----------------------------+-------+

As you can see, floating IPs, RAM and volumes are all different when using the name rather than the UUID.

It gets a little stranger – what if we just make up a tenant:

agoddard@control1:~# nova quota-show nonexistent-tenant
+-----------------------------+-------+
| Property                    | Value |
+-----------------------------+-------+
| cores                       | 20    |
| floating_ips                | 10    |
| gigabytes                   | 1000  |
| injected_file_content_bytes | 10240 |
| injected_files              | 5     |
| instances                   | 10    |
| metadata_items              | 128   |
| ram                         | 51200 |
| volumes                     | 10    |
+-----------------------------+-------+
agoddard@control1:~# nova quota-update nonexistent-tenant --cores 100
agoddard@control1:~# nova quota-show nonexistent-tenant
+-----------------------------+-------+
| Property                    | Value |
+-----------------------------+-------+
| cores                       | 100   |
| floating_ips                | 10    |
| gigabytes                   | 1000  |
| injected_file_content_bytes | 10240 |
| injected_files              | 5     |
| instances                   | 10    |
| metadata_items              | 128   |
| ram                         | 51200 |
| volumes                     | 10    |
+-----------------------------+-------+

Nova created an entry for the tenant, even though the tenant doesn’t exist. This is what makes it appear that a change has been made when it hasn’t, and without an error being thrown, things can get pretty confusing.

Long story short, always use the UUID of a tenant when setting quotas.

Cinder Quotas

I recently found myself having to increase cinder volume quotas for a tenant, from 50TB to 60TB, and got caught in a situation similar to the one above; in this case, however, even the UUIDs wouldn’t work for me.

First, we can check the tenant’s quota, which is 50TB:

agoddard@control1:~# nova quota-show 9af32c41cee140779d50afb2bc93e322
+-----------------------------+--------+
| Property                    | Value  |
+-----------------------------+--------+
| cores                       | 500    |
| floating_ips                | 100    |
| gigabytes                   | 50000  |
| injected_file_content_bytes | 10240  |
| injected_files              | 5      |
| instances                   | 500    |
| metadata_items              | 128    |
| ram                         | 200000 |
| volumes                     | 5000   |
+-----------------------------+--------+

Now, we simply increase the quota, remembering to use the UUID so the change takes effect.

agoddard@control1:~# nova quota-update 9af32c41cee140779d50afb2bc93e322 --gigabytes 60000

And we can confirm the change:

agoddard@control1:~# nova quota-show 9af32c41cee140779d50afb2bc93e322
+-----------------------------+--------+
| Property                    | Value  |
+-----------------------------+--------+
| cores                       | 500    |
| floating_ips                | 100    |
| gigabytes                   | 60000  |
| injected_file_content_bytes | 10240  |
| injected_files              | 5      |
| instances                   | 500    |
| metadata_items              | 128    |
| ram                         | 200000 |
| volumes                     | 5000   |
+-----------------------------+--------+

It appears that everything is fine; however, when I try to add volume space above 50TB, I get an API error about exceeding my quota, and Horizon shows the quota as 50TB. By checking the quota with the cinder command, we can see that the quota update didn’t actually have any effect at all:

agoddard@control1:~# cinder quota-show 9af32c41cee140779d50afb2bc93e322
+-----------+-------+
|  Property | Value |
+-----------+-------+
| gigabytes | 50000 |
|  volumes  |   10  |
+-----------+-------+

By running cinder quota-update rather than nova quota-update, the correct quota is set:

agoddard@control1:~# cinder quota-update 9af32c41cee140779d50afb2bc93e322 --gigabytes=60000
agoddard@control1:~# cinder quota-show 9af32c41cee140779d50afb2bc93e322
+-----------+-------+
|  Property | Value |
+-----------+-------+
| gigabytes | 60000 |
|  volumes  |   10  |
+-----------+-------+

Long story short, when setting Cinder quotas, always use cinder quota-update, and not nova quota-update.

Both of the above issues seem to be artifacts from changes that Folsom brought about, such as separating volumes from nova. A bugfix for the wrong quota being shown in horizon has been accepted for the Grizzly-3 milestone, due on Feb 21.

OpenStack Local LVM Instance Storage

I’ve been playing with OpenStack on and off since it was released, but recently I had the opportunity to finally build a production cluster. One of our requirements was to keep our storage as fast as possible, and we already had a bunch of hosts with quick disks, so this meant keeping instance storage on local disk and using raw-disk-backed VMs rather than file-backed VMs. While it’s always been easy to attach local disk to VMs, doing it automatically through orchestration tools hasn’t been simple. As of the latest release (Folsom), OpenStack supports provisioning instance storage onto local LVM volumes, which is exactly what we needed. I’ve read a few different docs that describe how to configure local LVM storage for instances, but they seem to use different syntax; the following is what worked for me:

nova.conf on compute node
libvirt_images_type=lvm
libvirt_images_volume_group=nova_local

Any compute node you want to use local storage on requires those lines in the nova.conf file. You will also need to create a local LVM volume group called “nova_local” which can be done as follows.

# make sure /dev/sda1 is a free disk, formatted as "Linux LVM"
pvcreate /dev/sda1 #create an LVM physical volume from the disk
vgcreate nova_local /dev/sda1 #create the volume group

Running vgs should now show a “nova_local” volume group.

Nova will by default store disk image files in the /var/lib/nova/images directory, so these LVM volumes aren’t any more susceptible to local disk failures than the default configuration, but many people will mount shared storage at the nova images directory for high availability. In my case, I opted for high availability of persistent storage (through OpenStack Cinder) and performance on local storage. One of the main reasons for this is that the bulk of our infrastructure can be quickly rebuilt by Chef, so in this case day-to-day performance trumps high availability.

Update: If you’re running OpenStack on Ubuntu 12.04+ or CentOS 6.2, there’s a bug which prevents LVM volumes from being deleted when you attempt to delete an instance. Until a patch is released for OpenStack, the workaround is to patch /usr/lib/python2.7/dist-packages/nova/virt/libvirt/utils.py as follows:

/usr/lib/python2.7/dist-packages/nova/virt/libvirt/utils.py
- out, err = execute('lvs', '--noheadings', '-o', 'lv_path', vg,
+ out, err = execute('lvs', '--noheadings', '-o', 'lv_name', vg,

Update #2: There’s a security bug in the Folsom and Grizzly implementations where a volume being reallocated could potentially contain data from its original allocation. There’s a patch for both Folsom and Grizzly – details here

Vagrant

User Story:

  • As a sysadmin
    • I want to provision virtual machines quickly and in a repeatable fashion
    • so that I can setup test clusters in development and easily share these with others

As a sysadmin, I’m a big fan of the simplicity of the libvirt ecosystem. Installing and configuring KVM and libvirt is a straightforward experience, and when learning the API, it’s nice to know that knowledge will extend beyond the next release cycle, product licensing change or corporate buyout. Using vm-builder makes provisioning simple guests fast, efficient and flexible, with a ton of automated configuration options. On the dev side, provisioning virtual machines for testing and development has always been a bit of a pain. Creating multiple images for devs and customizing all their settings can be a tedious process, it can be slow to physically get the VMs to the devs, and multiple variations either take a bunch of handholding or multiple images and lots more disk space. Of course, libvirt is an option on workstations also, and it’s nice and simple to install these days thanks to brew, but it still requires a decent amount of configuration, especially in a dev environment where things Should Just Work™.

Enter Vagrant

Vagrant is a ruby gem which performs automated building and provisioning of VirtualBox machines. Vagrant takes care of all the behind the scenes work with VirtualBox, so while you need VirtualBox installed, you technically never even need to launch the app.

So what? Double clicking an app doesn’t take much time, and I get to use a nice GUI.

A ha, this is where the awesomeness begins. To see why Vagrant is so awesome, here’s a simple example of getting up and running (I’ve trimmed a bit of the verbosity from the responses, but this is honestly it):

[~/dev/vagrant-demo]$ gem install vagrant
Successfully installed vagrant-0.8.2
[~/dev/vagrant-demo]$ vagrant box add lucid64 http://files.vagrantup.com/lucid64.box
[vagrant] Downloading box: http://files.vagrantup.com/lucid64.box
[vagrant] Verifying box...
[~/dev/vagrant-demo]$ vagrant init lucid64
create Vagrantfile
[~/dev/vagrant-demo]$ vagrant up
[default] Importing base box 'lucid64'...
[default] Forwarding ports...
[default] -- ssh: 22 => 2222 (adapter 1)
[default] VM booted and ready for use!
[default] Mounting shared folders...
[default] -- v-root: /vagrant
[~/dev/vagrant-demo]$ vagrant ssh
Welcome to the Ubuntu Server!
vagrant@lucid64:~$ cat /etc/issue
Ubuntu 10.04.3 LTS

So what’s actually going on there? Basically, lucid64.box is a preconfigured template containing a base install of Ubuntu 10.04, a pre-defined vagrant user and some tools for VirtualBox. When initializing the machine, vagrant creates a Vagrantfile, which is a simple ruby configuration script. How simple, you ask? It doesn’t get much simpler than this:


Vagrant::Config.run do |config|
  config.vm.box = "lucid64"
end

Of course, simple doesn’t mean it’s lacking in ability – there’s a whole host of configuration options you can specify, from simple RAM settings to auto-provisioning with Chef, Puppet or even plain bash scripts. Here’s an example with a few more options thrown in:


Vagrant::Config.run do |config|
  config.vm.box = "lucid64"
  config.vm.memory_size = 4096
  config.vm.host_name = 'awesome'
  config.vm.network "33.33.33.105"
end
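
Provisioning is configured the same way. Here’s a hedged sketch in the Vagrant 0.7/0.8-era syntax (not from the original post; the box name and cookbook are just placeholders) showing the VM configuring itself with chef-solo as part of vagrant up:

Vagrant::Config.run do |config|
  config.vm.box = "lucid64"
  # run chef-solo against a local cookbooks directory once the VM boots
  config.vm.provision :chef_solo do |chef|
    chef.cookbooks_path = "cookbooks"   # assumes a ./cookbooks directory in the project
    chef.add_recipe "apache2"           # placeholder recipe
  end
end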

What about when you’re done? You can simply power off the VM, suspend it, or destroy it, deleting its disks. There really is a ton of customization you can run on these things, so before I get too carried away, I’ll suggest you check out the docs and see for yourself.

What about security? Who made this “box” file?

I hear you, but fear not. For those who want to build their own boxes, not only is this possible, but with a tool called veewee, @patrickdebois has made it insanely easy. I’ll cover that in a followup post, but rest assured, this is Not a Problem™.

But wait, there’s more…

You can have more than one VM in a Vagrantfile and when assigned IPs, such as in the example above, these hosts can communicate over a private, host-only network. This paves the way for setting up whole stacks / clusters on a single host, bringing dev/test-like-prod nirvana just one step closer.
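To give a rough idea, here’s a hedged sketch (again 0.7/0.8-era syntax, with made-up names and IPs) of a two-VM Vagrantfile where a web and a db box share a host-only network:

Vagrant::Config.run do |config|
  config.vm.define :web do |web_config|
    web_config.vm.box = "lucid64"
    web_config.vm.network "33.33.33.10"   # host-only IP, reachable from :db
  end

  config.vm.define :db do |db_config|
    db_config.vm.box = "lucid64"
    db_config.vm.network "33.33.33.11"
  end
end

With a multi-VM file, the usual commands take the VM name, e.g. vagrant up web or vagrant ssh db.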

Some Nice New Features in Chef 0.10.0

No sooner had I drafted my last post than Opscode released a huge update to Chef, 0.10.0. Being on vacation during the lead-up to the release, I missed a few of the pre-release announcements, so while I’d heard that some big features, such as environments, were coming, I hadn’t realized how many cool features were due for release in 0.10.0. I thought I’d mention a few of my favorite, and perhaps lesser known, new features here.

Knife Plugins

Coincidentally, the 0.10.0 release came out the same week that we started work on our first knife plugin, and the new architecture has made the process very straightforward. You don’t need to delve into the depths of your gems to extend knife; you can simply throw the plugin in a .chef/plugins/knife directory in your home folder or in your cookbook repo. There’s a quick guide to get you started on the Opscode blog.
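
As a minimal sketch (not from the Opscode guide; the plugin and module names here are made up), a file dropped into ~/.chef/plugins/knife/ just subclasses Chef::Knife:

# ~/.chef/plugins/knife/ping.rb -- hypothetical example plugin
require 'chef/knife'

module MyKnifePlugins
  class Ping < Chef::Knife
    banner "knife ping"

    def run
      ui.msg("pong from a knife plugin")
    end
  end
end

Once the file is in place, knife picks it up automatically and knife ping shows up alongside the built-in subcommands.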

Encrypted Data Bags

One of the first questions I get asked when talking about Chef generally relates to security, and the new release’s ability to encrypt the contents of data bags is a big step forward on this front. This means you can now encrypt sensitive values which are stored in your data bags, such as database passwords. To keep things secure, you can also configure the decryption keys at a node level so that only nodes that should have access to the data can see it. @lusis also added a patch to support storing decryption keys at a URL instead of a local file, a nice addition which should find its way into an upcoming release.
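
For a rough idea of what this looks like from a recipe, here’s a hedged sketch (the bag name, item name and key are placeholders, and it assumes the shared secret has already been distributed to the node at the path shown):

# load the shared secret and decrypt the "mysql" item from a "passwords" data bag
secret = Chef::EncryptedDataBagItem.load_secret("/etc/chef/encrypted_data_bag_secret")
creds  = Chef::EncryptedDataBagItem.load("passwords", "mysql", secret)
mysql_root_password = creds["password"]  # plain value, only readable on nodes holding the secret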

Chef Expander

This behind the scenes update will probably stay out of your way for the most part, but it’s cool all the same. Indexing is now taken care of by a new tool called chef-expander, which replaces the old chef-solr-indexer. The cool thing about chef-expander is the ability to setup a cluster of worker nodes to farm out the indexing process. Small installations are fine with just one process running, but it’s nice to know that this can scale horizontally as your infrastructure grows.

Cookbook Versioning

In my recent post, cookbooks as gems, I mentioned a few features I’d like to see in the cookbooks architecture, the main one being a straightforward way of managing different versions of cookbooks. Version 0.10 addresses this and more, including the ability to freeze cookbooks when uploading to the Chef server and the option to set cookbook version constraints in environments. Judging by the upgrades to the cookbook features in knife and their latest post on the community cookbook site, cookbook management is definitely a high priority in Chef, and I’m very excited to see how this evolves as Chef moves beyond 0.10.
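
As a hedged sketch of the environments side (the environment name and version here are made up), a 0.10 environment file can pin a cookbook like this:

# environments/production.rb
name "production"
description "Production environment"
cookbook "apache2", "= 1.0.2"   # nodes in this environment only see this apache2 release

Combined with knife cookbook upload --freeze, that pinned version can’t be silently overwritten later.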

Upgrading

If the above features aren’t a good enough argument to upgrade straight away, check out the release notes for a full account of what 0.10.0 brings to the table. The 0.10.0 server is compatible with 0.9.x clients, and the process for upgrading both the server and clients is trivial. Instructions can be found on the Opscode wiki.

Cookbooks as Gems

Update: Many of the questions below have been resolved in the 0.10.0 release of Chef and Opscode have also provided a great overview of the current state of the community cookbooks repository.

User Story:

  • As a sysadmin
    • I’d like to version cookbooks and download custom cookbooks
    • so that I can better manage cookbook dependencies.

I’ve been thinking a lot about what the future of the Opscode community cookbooks site / API might look like. Every way I think about it, I keep coming back to a model similar to that of Ruby gems, and I’m interested in knowing if this view is shared and to what extent this parallel makes sense. I think the cookbooks site as it stands is great and, in some senses, the cookbooks site is really the heart of chef. Without having such an easy path to ‘vendor’ the apache2 cookbook, for example, the experience of first time chef users might not be the wonderful experience it is today. There are, however, some cases which users might come across which don’t (at least obviously) have a solution in the current cookbooks site, and it would be great to see if the community can solve some of these. The scenarios that immediately come to mind are:

  • “The new Y cookbook broke my X cookbook, I need the X cookbook to be dependent on the old version of Y until I can fix it”
  • “My friend just sent me a cookbook she wrote, which depends on a version of the apache2 cookbook which she modified, I’d like to install her apache2 cookbook and use it alongside the default community apache2 cookbook”
  • “I’ve seen cookbooks on github, such as at 37Signals. Can I also use knife to install and manage these cookbooks?”

Rather than get into too much detail about those specific cases, I thought I’d provide an example of what a rubygems-esque cookbook management process might look like:

knife cookbook install apache2 # installs the apache2 cookbook along with dependencies 
knife cookbook install apache2 --version=1.02 # installs a legacy version of the apache2 cookbook 
knife cookbook install agoddard-apache2 # installs my fork of the apache2 cookbook along with dependencies 
knife cookbook install agoddard-custom_weird_app # installs my cookbook and dependencies for an obscure app that only I use but which I want to manage the same way 
knife cookbook install custom_weird_app # the above cookbook when the obscure app becomes more mainstream

…and how these might look in a recipe…

include_recipe "apache2"
include_recipe "agoddard-apache2"
include_recipe "apache2", "=1.02"
%w{ apache2 agoddard-apache2 apache2-1.02 }.each do |cb|
  depends cb
end

I’m sure there are things I’m missing here, but I’d love to see where this concept leads. Cookbooks already support versioning, but I’m not sure if there’s a simple way of maintaining two versions and having legacy cookbooks support the older one. It’s obviously currently possible to manually install any cookbook you like, though you then lose the added benefits of using the knife site vendor command, such as branching and dependency resolution.

Another area this might help is in keeping cookbooks up to date. Currently, if a user wants to update a cookbook, they will fork the cookbook, apply their changes and then send the maintainer a pull request. If the maintainer is unavailable, it’s hard for the changes to get back to the community. If, in this case, the user submitting the changes could simply upload their modified cookbook to the cookbooks site, prefixed with their username (to distinguish it from the main / official cookbook), then the cookbook would be available to the community while they wait for the official version to be patched, tested etc. This isn’t ideal, but it’s more ideal than an update to your chef server breaking an important cookbook which is reliant on a deprecated feature of the server, for example.

I’m sure there are many other approaches to this same issue and I’d love to know what others think. The cookbooks site is a fantastic feature and when I look at it, I feel like I’m looking at the beginnings of something big, like github or rubygems, and I’m excited to see where it goes.
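
For reference, the versioning that cookbooks already support shows up in a cookbook’s metadata.rb; here’s a hedged sketch (the cookbook name, maintainer and versions are placeholders) of pinning a dependency to a specific release:

# metadata.rb of a hypothetical cookbook that needs the legacy apache2
maintainer  "agoddard"
description "Installs an obscure app on top of a pinned apache2"
version     "0.1.0"
depends     "apache2", "= 1.0.2"   # constrain to the older apache2 release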

Updating Rubygems for Chef-client on Debian Lenny

User Story:

As a sysadmin I need the latest rubygems package so that chef-client and other tools that rely on rubygems will work.

We have a host which we needed to run chef-client on; unfortunately, the version of rubygems installed by apt was 1.2.0, which resulted in the following error:

# chef-client
[Thu, 20 Jan 2011 20:45:43 +0000] INFO: Starting Chef Run (Version 0.9.12)
[Thu, 20 Jan 2011 20:45:43 +0000] WARN: Missing gem 'mysql'
[Thu, 20 Jan 2011 20:45:44 +0000] ERROR: Running exception handlers
[Thu, 20 Jan 2011 20:45:44 +0000] ERROR: Exception handlers complete
/usr/lib/ruby/1.8/chef/provider/package/rubygems.rb:89:in `with_gem_sources': undefined method `sources=' for Gem:Module (NoMethodError)
from /usr/lib/ruby/1.8/chef/provider/package/rubygems.rb:206:in `candidate_version_from_remote'
from /usr/lib/ruby/1.8/chef/provider/package/rubygems.rb:373:in `candidate_version'
from /usr/lib/ruby/1.8/chef/provider/package.rb:44:in `action_install'
from /usr/lib/ruby/1.8/chef/resource.rb:395:in `send'
from /usr/lib/ruby/1.8/chef/resource.rb:395:in `run_action'
from /var/cache/chef/cookbooks/mysql/recipes/client.rb:62:in `from_file'
from /usr/lib/ruby/1.8/chef/cookbook_version.rb:472:in `load_recipe'
from /usr/lib/ruby/1.8/chef/mixin/language_include_recipe.rb:40:in `include_recipe'
from /usr/lib/ruby/1.8/chef/mixin/language_include_recipe.rb:27:in `each'
from /usr/lib/ruby/1.8/chef/mixin/language_include_recipe.rb:27:in `include_recipe'
from /var/cache/chef/cookbooks/mysql/recipes/server.rb:20:in `from_file'
from /usr/lib/ruby/1.8/chef/cookbook_version.rb:472:in `load_recipe'
from /usr/lib/ruby/1.8/chef/mixin/language_include_recipe.rb:40:in `include_recipe'
from /usr/lib/ruby/1.8/chef/mixin/language_include_recipe.rb:27:in `each'
from /usr/lib/ruby/1.8/chef/mixin/language_include_recipe.rb:27:in `include_recipe'
from /var/cache/chef/cookbooks/drupal/recipes/default.rb:23:in `from_file'
from /usr/lib/ruby/1.8/chef/cookbook_version.rb:472:in `load_recipe'
from /usr/lib/ruby/1.8/chef/mixin/language_include_recipe.rb:40:in `include_recipe'
from /usr/lib/ruby/1.8/chef/mixin/language_include_recipe.rb:27:in `each'
from /usr/lib/ruby/1.8/chef/mixin/language_include_recipe.rb:27:in `include_recipe'
from /usr/lib/ruby/1.8/chef/run_context.rb:94:in `load'
from /usr/lib/ruby/1.8/chef/run_context.rb:91:in `each'
from /usr/lib/ruby/1.8/chef/run_context.rb:91:in `load'
from /usr/lib/ruby/1.8/chef/run_context.rb:55:in `initialize'
from /usr/lib/ruby/1.8/chef/client.rb:166:in `new'
from /usr/lib/ruby/1.8/chef/client.rb:166:in `run'
from /usr/lib/ruby/1.8/chef/application/client.rb:222:in `run_application'
from /usr/lib/ruby/1.8/chef/application/client.rb:212:in `loop'
from /usr/lib/ruby/1.8/chef/application/client.rb:212:in `run_application'
from /usr/lib/ruby/1.8/chef/application.rb:62:in `run'
from /usr/bin/chef-client:26

A quick post to the opscode discussion forum (actually the wrong forum, but jtimberman was kind enough to help out) told us that the error was the result of the outdated rubygems package. The easy solution was to simply use rubygems’ own self-updating mechanism by running gem update --system. Easy enough; however, the version installed by apt doesn’t actually permit this:

ERROR:  While executing gem ... (RuntimeError)
    gem update --system is disabled on Debian. RubyGems can be updated using the official Debian repositories by aptitude or apt-get.

At this point we could have introduced new apt sources to see if they contained more recent versions, but we decided we really wanted rubygems to be able to update itself. Rather than reinstall rubygems from source, we found that there was a gem on rubygems.org for exactly this purpose: “rubygems-update”. Unfortunately, for some reason (perhaps an incompatibility between rubygems 1.2.0 and rubygems.org), even after adding rubygems.org as a gem source, gem complained it was unable to find the repository online. Downloading the gem and installing it locally provided the fix:

gem -v
#1.2.0
wget http://production.cf.rubygems.org/gems/rubygems-update-1.4.2.gem
gem install rubygems-update-1.4.2.gem --local
cd /var/lib/gems/1.8/bin
./update_rubygems
gem -v
# 1.4.2

Problem solved. With the new version of rubygems installed, we’re now able to run gem update --system to stay up to date.

I believe rubygems was originally installed as a dependency of the chef package in the opscode apt repo, and the rubygems version in there is 1.2.0. So it appears that this problem will always occur when bootstrapping a new Debian Lenny machine with chef-client using only the apt packages, and I’ll be switching to bootstrapping via the chef gems rather than using apt. I’ll pose this question to the smart folks at opscode, though, and see if there isn’t a simple answer to the package dependency issue.

Basic XenServer CLI Use

I recently found myself in a situation where I had to manage a XenServer resource pool over SSH. XenServer’s ‘xsconsole’ tool provided me with a lot of options, but none that would allow me to boot a guest which was powered off.

User Story:

As a sysadmin I need to start a XenServer guest from the command line so that I can start guests without having to use a GUI.

When using Resource Pools, there is no way within the console to boot machines which are powered off – attempting to view all machines results in the error message: “This feature is unavailable in Pools with more than 100 Virtual Machines”, even if you have less than 100 virtual machines.

Here’s where the XenServer command line applications come in handy. In order to boot a machine which is powered off, you can ssh to any machine in the pool and run xe vm-list:

[root@ubio-vmh07 ~]# xe vm-list
uuid ( RO): 0ebb9d7d-1743-f9a0-f5b8-692930cc3ad0
name-label ( RW): app10
power-state ( RO): halted
[root@ubio-vmh07 ~]#

This will show you all of your machines by name, state and uuid. Once you find the machine you want to boot, simply run xe vm-start and pass it the uuid you want to boot:

[root@ubio-vmh07 ~]# xe vm-start uuid=0ebb9d7d-1743-f9a0-f5b8-692930cc3ad0

and the machine will boot right up. You can then continue to manage the VM using xe or via the xsconsole on the host which is running the VM.

Running xe help will show you a list of available commands:

[root@ubio-vmh07 ~]# xe help --all

Querying Chef Using the REST API

Assumptions: You have a working knife configuration.

The initial user story that prompted me to figure out how to interact with Chef Server REST API was this:

  • As a System Administrator
  • I want to be able to see what IP addresses are being used
  • In order for me to correctly assign a free IP to a new machine.

In order to access the Chef Server programmatically I first had to figure out authentication. Unfortunately, I couldn’t find any good examples of how to actually do this. The Chef Server API wiki page has some basic requirements and concepts, but no concrete examples.

However, it gave me a good starting place: the Chef::REST library, which comes bundled with Chef in every version > 0.9.0. Let’s just make sure that we actually have that version:

$ knife --version
Chef: 0.9.12

Now that we know the version is ok we’re basically done. All we have to do is load the knife configuration file and then we can use the built in rest library. Here’s the code I ended up with:

require 'bundler/setup'
require 'chef'

# load the same settings knife uses, including client key and server URL
Chef::Config.from_file("/path/to/knife.rb")
rest = Chef::REST.new("http://host:port")

nodes = rest.get_rest("/nodes")

Now you can do something like:

nodes.keys.each do |key|
  puts rest.get_rest("/nodes/#{key}")[:name]
end

So now we just print each IP address that is being used and we can figure out the next free IP address in any given range. That completes this user story!
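
If your node names aren’t addresses themselves, a hedged variation (not from the original script) is to pull ohai’s ipaddress attribute from each node instead:

# list each node's primary IP as reported by ohai
nodes.keys.each do |key|
  node = rest.get_rest("/nodes/#{key}")
  puts "#{node.name}: #{node[:ipaddress]}"
end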

There was only one gotcha in the whole process. Trying to print a node just throws an ArgumentError:

puts node #=> ArgumentError: Attribute to_ary is not defined!