Ansible and EC2 Auto Scaling Groups: False-positive idempotency errors and a workaround

When using Ansible to deploy and manage EC2 auto scaling groups (ASGs) in AWS, you may encounter, like I have recently, an issue with idempotency errors that can be somewhat befuddling. Basically, when the ec2_asg module is called, one of its properties, vpc_zone_identifier, is used to define the subnets used by the ASG. A typical ASG configuration is to use two subnets, each one in a different availability zone, for a robust HA configuration, like so:

- name: "create auto scaling group"
  local_action:
    module: ec2_asg
    name: "{{ asg_name }}"
    desired_capacity: "{{ desired_capacity }}"
    launch_config_name: "{{ launch_config }}"
    min_size: 2
    max_size: 3
    desired_capacity: 2
    region: "{{ region }}"
    vpc_zone_identifier: "{{ subnet_ids }}"
    state: present

Upon subsequent Ansible plays, when ec2_asg is called, but no changes are made, you can still experience a changed=true result because of how Ansible is ordering the subnet-id’s used in vpc_zone_identifier versus how AWS is ordering them. This makes the play non-idempotent. How does this happen?

It turns out that Ansible’s ec2_asg module sorts the subnet-ids, while AWS does not when it returns those values. Here is the relevant code from the v2.3.0.0 version of ec2_asg.py, notice the sorting that happens in an attempt to match what AWS provides as an order:

 518 for attr in ASG_ATTRIBUTES:
 519     if module.params.get(attr, None) is not None:
 520         module_attr = module.params.get(attr)
 521         if attr == 'vpc_zone_identifier':
 522             module_attr = ','.join(module_attr)
 523         group_attr = getattr(as_group, attr)
 524         # we do this because AWS and the module may return the same list
 525         # sorted differently
 526         if attr != 'termination_policies':
 527             try:
 528                 module_attr.sort()
 529             except:
 530                 pass
 531             try:
 532                 group_attr.sort()
 533             except:
 534                 pass
 535         if group_attr != module_attr:
 536             changed = True
 537             setattr(as_group, attr, module_attr)
 538

While this is all well and good, AWS does not follow any specific ordering algorithm when it returns values for subnet-ids in the ASG context. So, when AWS returns its subnet-id list for the ec2_asg call, Ansible will sometimes have a different order in its ec2_asg configuration and then incorrectly interpret the difference between the two lists as a change and mark it thusly. If you are counting on your Ansible plays to be perfectly idempotent, this is problematic. There is now an open GitHub issue about this specific problem.

The good news is that the latest development version of ec2_asg, which is also written using boto3, does not exhibit this false-positive idempotency error issue. The devel version of ec2_asg (i.e., unreleased 2.4.0.0) is altogether different than what ships in current stable releases. So, these false-positive idempotency errors can occur in releases up to and including version 2.3.1.0 (I have found it in 2.2.1.0, 2.3.0.0, and 2.3.1.0).  Sometime soon, we should have a version of ec2_asg that behaves idempotently. But what to do until then?

One approach is to write a custom library in Python that you use instead of ec2_asg. While feasible, it would involve a lot of time spent verifying integration with both AWS and existing Ansible AWS modules.

Another approach, and one I took recently, is to simply ask AWS what it has for the order of subnet-ids to be in vpc_zone_identifier and then plug that ordering into what I pass to ec2_asg during each run.

Prior to running ec2_asg, I use the command module to run the AWSCLI autoscaling utility and query for the contents of VPCZoneIdentifier. Then I take those results and use them as the ordered list that I pass into ec2_asg afterward:

- name: "check for ASG subnet order due to idempotency failures with ec2_asg"
  command: 'aws autoscaling describe-auto-scaling-groups --region "{{ region }}" --auto-scaling-group-names "{{ asg_name }}" '
  register: describe_asg
  changed_when: false

- name: "parse the json input from aws describe-auto-scaling-groups"
  set_fact: asg="{{ describe_asg.stdout | from_json }}"

- name: "get vpc_zone_identifier and parse for subnet-id ordering"
  set_fact: asg_subnets="{{ asg.AutoScalingGroups[0].VPCZoneIdentifier.split(',') }}"
  when: asg.AutoScalingGroups

- name: "update subnet_ids on subsequent runs"
  set_fact: my_subnet_ids="{{ asg_subnets }}"
  when: asg.AutoScalingGroups

# now use the AWS-sorted list, my_subnet_ids, as the content of vpc_zone_identifier

- name: "create auto scaling group"
  local_action:
    module: ec2_asg
    name: "{{ asg_name }}"
    desired_capacity: "{{ desired_capacity }}"
    launch_config_name: "{{ launch_config }}"
    min_size: 2
    max_size: 3
    desired_capacity: 2
    region: "{{ region }}"
    vpc_zone_identifier: "{{ my_subnet_ids }}"
    state: present

On each run, the following happens:

  1. A command task runs the AWSCLI to describe the autoscaling group in question. If it’s the first run, an empty array is returned. The result is registered as asg_describe.
  2. The JSON data in asg_describe is copied into a new Ansible fact called “asg”
  3. The subnets in use by the ASG and how they are ordered is determined by extracting the VPCZoneIdentifier attribute from the AutoScalingGroup (asg fact). If it’s the first run, this step is skipped because of the when: clause which limits task execution to runs where the ASG already exists (runs 2 and later). It puts this list into the fact called “asg_subnets”
  4. Using the AWS-ordered list from step 3, Ansible sets a new fact called “my_subnet_ids”, which is then specified as the value to vpc_zone_identifier when ec2_asg is called.

I did a test on the idempotency of the play by running Ansible one hundred times after the ASG was created; at no point did I receive a false-positive change. Prior to this workaround, it would happen every run if I happened to be specifying subnet-ids ordered differently than from what AWS returned in terms of their order.

While this is admittedly somewhat kludgy, at least I can be confident that my plays involving AWS EC2 autoscaling groups will actually behave idempotently when they should. In the meantime, while we wait for the next update to Ansible’s ec2_asg module, this workaround can be used successfully to avoid false positive idempotency errors.

Until next time, have fun getting your Ansible on!

Managing pre-commit hooks in Git

Git comes with support for action sequences based on repository activity. Pushes, pulls, merges, and commits can all be configured to trigger specific custom actions responsively. Often, the custom actions are geared towards promoting intra-team communication, process-gating like enforcing commit log standards and best practices for code content/syntax. Of course, CI workflows already are a popular frontline of defense for bad code pushes by using linters and syntax checks as part of an initial test stage in a pipeline. However, those linting processes have a cost in terms of compute resources they utilize. In a large development environment, this can translate into real money costs pretty quickly. Why spend compute cycles on pipeline jobs that can be just as easily run on a developer’s workstation or laptop? So, let’s take a closer look at git hooks…

Git hooks are configured in a given repo via files located in /.git/hooks. A new repository will be automagically populated with these handy tools, which are inactive by default thanks to the file suffix “.sample”:

[rcrelia@fuji hooks (GIT_DIR!)]$ ls -latr
total 40
-rwxr-xr-x 1 rcrelia staff 3611 Dec 22 2014 update.sample
-rwxr-xr-x 1 rcrelia staff 1239 Dec 22 2014 prepare-commit-msg.sample
-rwxr-xr-x 1 rcrelia staff 4951 Dec 22 2014 pre-rebase.sample
-rwxr-xr-x 1 rcrelia staff 1356 Dec 22 2014 pre-push.sample
-rwxr-xr-x 1 rcrelia staff 1642 Dec 22 2014 pre-commit.sample
-rwxr-xr-x 1 rcrelia staff  398 Dec 22 2014 pre-applypatch.sample
-rwxr-xr-x 1 rcrelia staff  189 Dec 22 2014 post-update.sample
-rwxr-xr-x 1 rcrelia staff  896 Dec 22 2014 commit-msg.sample
-rwxr-xr-x 1 rcrelia staff  452 Dec 22 2014 applypatch-msg.sample
drwxr-xr-x 11 rcrelia staff 374 Dec 22 2014 .
drwxr-xr-x 15 rcrelia staff 510 Jan 4 2015 ..

Having individual hooks like these provides a powerful framework for customizing your repository usage to your specific needs and workflows. Each one is triggered at the stage of git workflow described by the filename (pre-commit, post-update, etc.)

Tools like linters fit nicely into the pre-commit action sequence. By configuring the pre-commit hook with a linter, you are delivering higher quality code to your pipelines which makes for a more efficient use of your compute resource budget.

Hook Management: Yelp’s pre-commit

I recently started using an open source utility released by Yelp’s engineers called simply, “pre-commit”. Essentially, it is a framework for managing the pre-commit hook in a git repository, using a single configuration file with multiple action sequences. This allows for a single pre-commit hook to do many different sorts of actions. It includes some basic linter capabilities as well as other code quality control features, but also is integrated with other projects (e.g., ansible-lint has a pre-commit hook available).

Setup is straightforward, as is the usage. Here’s how I did it:

pip install pre-commit
cd repodir
pre-commit install
# edit a new file called .pre-commit-config.yaml in the root of your repo
git add .pre-commit-config.yaml
git commit -m "turn on pre-commit hook"

Immediately you should see the pre-commit utility do its thing when you commit this or any other change to your repository.

Here is the current working version of my pre-commit config file (one per repo):

- repo: git://github.com/pre-commit/pre-commit-hooks
  sha: v0.7.1
  hooks:
   - id: trailing-whitespace
     files: \.(js|rb|md|py|sh|txt|yaml|yml)$
   - id: check-json
     files: \.(json|template)$
   - id: check-yaml
     files: \.(yml|yaml)$
   - id: detect-private-key
   - id: detect-aws-credentials
- repo: git://github.com/detailyang/pre-commit-shell
  sha: 6f03a87e054d25f8a229cef9005f39dd053a9fcb
  hooks:
   - id: shell-lint

So, I’m using some of pre-commit’s built-in handlers for whitespace cleanup, JSON linting, YAML linting, and checking to make sure I don’t include any private keys or AWS credentials in my commits. Also, I’ve integrated a third-party tool, pre-commit-shell, that is a wrapper to shellcheck for syntax checking and enforcing best practices in any shell scripts I might add to the repo.

And here is an example of a code commit that triggers pre-commit’s operation:

[rcrelia@fuji aws-mojo (master +=)]$ git commit -m "pre-commit"
[INFO] Installing environment for git://github.com/pre-commit/pre-commit-hooks.
[INFO] Once installed this environment will be reused.
[INFO] This may take a few minutes...
Trim Trailing Whitespace.................................................Passed
Check JSON...........................................(no files to check)Skipped
Check Yaml...............................................................Passed
Detect Private Key.......................................................Passed
Detect AWS Credentials...................................................Passed
Shell Syntax Check...................................(no files to check)Skipped
[master 7d837e7] pre-commit
 1 file changed, 15 insertions(+)
 create mode 100644 .pre-commit-config.yaml

While pre-commit doesn’t handle management of the other available Git hooks, it does a very good job with what it does control, with a robust plugin interface and the ability to write custom hooks.

If you find yourself in need of some automated linting of your code before you push to your remote repositories, I highly recommend the use of pre-commit for its ease of use and operational flexibility.

Happy coding!

Gitlab Repo Best Practices

I recently had to come up with some guidelines for others to use when it comes to using shared Gitlab repositories in a CI/CD configuration. Here is my take based on my experiences so far, if you have any more to share please drop me a line/comment here.

Note: Gitlab uses the term Merge Request for what is commonly referred to in other CI frameworks as Pull Requests… just a little FYI 🙂

Gitlab Repo Usage – Best Practices and Tips

  • Create MR’s when you are at a point where you want/need to see your changes in action (i.e., merged into master, tested, and deployed).
  • If you will be making more related changes later in the branch, do not opt to have the source branch removed from the repository when submitting your MR.
  • At a minimum, you should merge at least once per day, especially if others are working on the same codebase at the same time. This makes it easier to resolve merge conflicts, which occur when two developers change the same repository content/object in their own respective branches and one merges ahead of the other.
  • Merge conflicts happen. Don’t worry if you experience one. Try to troubleshoot on your own, but if you cannot resolve it by yourself, pull in the other developer(s) whose changes are affecting your merge attempt and work together to resolve them.
  • When creating a MR, indicate in the Title whether or not it is time-sensitive by adding ” – ASAP” to the end of the Title text. This helps reviewers prioritize their review requests with minimal disruption.
  • Do NOT approve your own MR if it involves a code change. The peer-review component of Merge Requests is an opportunity to communicate and share awareness of changes on the team. That said, here are some scenarios where it is ok to approve your own MR’s:
    • you are pushing non-operational changes (e.g., comments, documentation)
    • you are the only developer available and it’s an important change or if waiting for MR review blocks progress signficantly (use good judgment)
  • When adding to a branch, keep your commits as specific as possible when modifying code. Each commit should be understandable on its own, even if there are other commits in the branch
  • Not all MR’s need to hit a pipeline. Depending on the repo pipeline configuration, some branch name filters may exist to insure a certain type of branch gets tested while other types do not. This is especially true of non-code changes (e.g., updating a README)
  • When starting new development as opposed to modifying existing code, it may make sense to create a personal repo or to use a fork of a shared repo to do a lot of iteration quickly without having to do a formal MR process in a shared repo. Once you’ve got some code ready for sharing, you can migrate it manually (copy) into the shared repo and work off MR’s going forward. Not required at all, but it can allow for more rapid iteration especially on small teams.

GitHub Pages

I’m taking GitHub Pages for a spin with the ansible-mojo repo… seems like a nice way to personalize your GitHub presence and contributions.

githubpages

Atom (is) Smashing

Ask a developer or sysadmin about their favorite code editor and you’re likely to get a passionate reply, one that might involve several minutes of frank words and trash talk about any editor besides THE ONE. Up until recently, I was a diehard CLI coder, with vi being my editor of choice. With over twenty years of experience as a sysadmin, I grew up on vi-style text editing, to the point that as I would enter brief dalliances with GUI editors, I would make sure to get my vi-compatible key mapping in place. The  motor memory savings alone was worth the effort.

A few years ago, I switched to Sublime Text, which never felt right to me, despite it being quite usable, feature-rich,  and popular. The proprietary nature of the software always got under my craw given my FOSS roots. Then, one day about a year ago, I discovered GitHub’s Atom and I haven’t looked back since.

Atom is a near-clone of Sublime in terms of look, feel, and functionality and yet it’s open-source.  It has a vast and rich community of plugin development that is over 5,000 packages strong and growing. Package installation and management is done easily within the Atom UI but also customizable via configuration files. Here is a list of some of the installed packages in my current Atom installation:

  • atom-beautify
  • atom-json-color
  • autocomplete-json
  • autocomplete-modules
  • autocomplete-python
  • editor-stats
  • ex-mode
  • file-icons
  • git-plus
  • highlight-line
  • highlight-selected
  • linter
  • linter-jsonlint
  • merge-conflicts
  • minimap
  • monokai-json
  • pretty-json
  • project-manager
  • rulerz
  • Sublime-Style-Column-Selection
  • vim-mode
  • vim-surround

You’ll notice I have my vi keymapping support in there via ex-mode, vim-mode, and vim-surround. 😉

My favorite package in terms of productivity boost is git-plus. Git-plus allows you to execute git commands within the Atom UI as you edit files. I highly recommend it. So much so that I made this screencast to demonstrate how easily I was able to push changes to a GitHub repo of mine after making a quick edit to a README file.

Ansible Shenanigans: Part II – Sample Playbook Usage

In Part I, I talked about why Ansible and how to configure your own installation using Vagrant, VirtualBox, and Ansible. Now, let’s take a closer look at using Ansible along with the details of my demo playbook collection ansible-mojo.

Once Ansible is up and running, it is extremely useful for managing nodes using ad-hoc commands. However, it really shines once you start developing collections of commands, or “plays”, in the form of “playbooks” to manage nodes. Similar to recipes and cookbooks in Chef, Ansible’s plays and playbooks are the basis for a best-practice implementation of Ansible to manage your infrastructure in a consistent, flexible, and repeatable fashion.

For ansible-mojo, I wanted to create a set of simple playbooks that would be helpful in demonstrating how to configure nodes with some basic things like:

  • a dedicated user account “ansible” for deployment standardization,
  • installation of standard packages,
  • management of users and sudoers content

Initial Ansible Playbook Run

The anisible-mojo repo contains several files: playbooks, a variables file, and a couple of shell environment files. All playbook content is based on YAML-formatted text files that are easily understandable. I opted to have a single primary playbook (main.yml) that does some initial node configuration then includes other playbooks for specific configuration changes like configuring users (user-config.yml) and installing sysstat for SAR reporting (sysstat-config.yml).

Before I go into details on each of the playbooks, let’s go ahead and do an initial playbook run against our Ubuntu Vagrant box so that we can issue further commands using our dedicated deployment user account “ansible” instead of the “vagrant” user.

NOTE: Be sure that you change authorized_keys in ansible-mojo to contain the public key that you configured your ssh-agent to use for deployment as mentioned in Part I.

In this case, I am using a Vagrant machine called “myvm” and will specify the -e override for ansible_ssh_user to ignore the remote_user setting in ansible.cfg:

[rcrelia@fuji ansible]$ ansible-playbook main.yml -e ansible_ssh_user=vagrant

PLAY [all] *********************************************************************

TASK [setup] *******************************************************************
ok: [myvm]

TASK [Ensure ntpdate is installed] *********************************************
changed: [myvm]

TASK [Ensure ntp is installed] *************************************************
changed: [myvm]

TASK [Ensure ntp is running and enabled at boot] *******************************
ok: [myvm]

TASK [Ensure aptitude is installed] ********************************************
changed: [myvm]

TASK [Update apt package cache if older than one day] **************************
changed: [myvm]

TASK [Add user group] **********************************************************
changed: [myvm] => (item={u'user_uid': 2000, u'user_rc': u'bashrc', u'user_profile': u'bash_profile', u'sudoers': True, u'user_groups': u'users', u'user_gecos': u'ansible user', u'user_shell': u'/bin/bash', u'user_name': u'ansible'})
changed: [myvm] => (item={u'user_uid': 2001, u'user_rc': u'bashrc', u'user_profile': u'bash_profile', u'sudoers': False, u'user_groups': u'users', u'user_gecos': u'Bob Dobbs', u'user_shell': u'/bin/bash', u'user_name': u'bdobbs'})

... SNIP ...

TASK [Install sysstat] *********************************************************
changed: [myvm]

TASK [Configure sysstat] *******************************************************
changed: [myvm]

TASK [Restart sysstat] *********************************************************
changed: [myvm]

PLAY RECAP *********************************************************************
myvm : ok=17 changed=15 unreachable=0 failed=0

Success! Now, we should have the “ansible” user account provisioned on the Vagrant machine and we will perform all future Ansible plays using that account as specified in ansible.cfg in the remote_user setting.

A Closer Look at Playbooks

ansible-mojo contains several files, with all Ansible syntax included in the YAML files:

  • main.yml
  • vars.yml
  • reboot.yml
  • Playbooks nested below main.yml:
    • user-config.yml
      • ssh-config.yml
      • sudoers-config.yml
    • sysstat-config.yml

Each of these files with the exception of vars.yml is an Ansible playbook. I created a primary playbook called “main” which in turn references a file containing miscellaneous variables (another YAML file called vars.yml), along with two other playbooks, user-config (configures user accounts) and sysstat-config (configures SAR reporting). These latter two files are nested playbooks: their execution is dependent on syntax in the main playbook and the vars_file. Finally, user-config includes two playbooks, one for configuring SSH in user accounts and one for configuring sudo access.

At the beginning of the main playbook, we see that the plays are scoped to all hosts in Ansible’s inventory (hosts: all), that plays will be run as a privileged user on the nodes (become: yes), and that some variables have been stored outside of playbooks in a single location called vars.yml. This pattern of using vars_files allows you to have a single place for information that you may not want to distribute along with playbooks (e.g., user account details) for security reasons.

Next, Ansible tasks (or actions) are defined for installing and configuring ntpd on our nodes, along with aptitude, and a command to update the apt packages on a node if the last update was longer ago than 24 hours.

The nesting of playbooks is a pattern that supports reusability and portability of playbook content, provided you don’t hardcode variables in them. Let’s take a closer look at some of these nested playbooks.

Managing users: user-config.yml

Since user-config is a nested playbook, it consists of a sequence of tasks without any operating parameters like host-scoping or privilege/role settings. It does five things before calling its own nested playbooks at the end:

  1. Creates a user’s primary group using the user’s UID as the GID, via the Ansible group module
  2. Creates a user via the Ansible user module
  3. Creates a user’s .bash_profile via the Ansible copy module
  4. Creates a user’s .bashrc via the Ansible copy module
  5. Creates a user’s $HOME/bin directory via the Ansible file module

The syntax is pretty clear about what is happening if you have even the most basic sort of experience managing user accounts on a UNIX/Linux server. Isn’t Ansible awesome?

What may not be so clear is the syntax that uses the “item.” prefix in variable names. Basically, I designed the playbook to use with_items feature of Ansible so I could iterate through multiple users without duplicating a lot of syntax. The “{{ users }}” variable is a referencing a YAML list called users that is stored in the variables file vars.yml. Looking at that list, it becomes apparent that we are cycling through attributes of each user without hardcoding any user-specific variables in our playbook:

users list from vars.yml:

users:
 - user_name: ansible
 user_gecos: 'ansible user'
 user_groups: "users"
 user_uid: 2000
 user_shell: "/bin/bash"
 user_profile: bash_profile
 user_rc: bashrc
 sudoers: yes
 - user_name: bdobbs
 user_gecos: 'Bob Dobbs'
 user_groups: "users"
 user_uid: 2001
 user_shell: "/bin/bash"
 user_profile: bash_profile
 user_rc: bashrc
 sudoers: no

When you write playbooks in Ansible, you should design your plays as generically as possible so that you can re-use your playbooks across different projects and nodes.

Next, user-config includes the ssh-config playbook which has two tasks: Setup a user’s .ssh directory and the user’s authorized_keys content. In this case, each user is being configured to use the same authorized_keys data, which is probably not how you would configure things in an actual deployment from a security best-practices perspective.

Lastly, user-config includes the sudoers-config playbook which uses Ansible’s lineinfile module to specify sudoers syntax to allow for passwordless sudo invocation. We need this for our ansible account, which will be performing Ansible operations for us non-interactively. This play is special in that it is constrained to only be run when the user is supposed to be added to sudoers (via use of Ansible’s when clause). How is this controlled? Through the sudoers attribute from the users list in vars.html:

users:
 - user_name: ansible
 user_gecos: 'ansible user'
 user_groups: "users"
 user_uid: 2000
 user_shell: "/bin/bash"
 user_profile: bash_profile
 user_rc: bashrc
 sudoers: yes

Managing packages: sysstat-config.yml

One of the classic UNIX/Linux performance monitoring tools is sar/sadc. In the open-source world, sar is packaged within the sysstat tool. One of the first things I do on a new machine is to make sure sar is installed, configured, and operational. So, I created a playbook that installs and configures the sysstat package.

One neat tool in Ansible is the lineinfile module which is useful to make sure a specific line is included in a text file, or some pattern within a line is replaced via a back-referenced regular expression. In the case of sysstat, there is a config file on Ubuntu, /etc/default/sysstat, that ships with a default “off” configuration (i.e., ENABLED=”false”). I used the lineinfile module in sysstat-config.yml to change that line and activate sysstat:

---
 # Install sysstat for sar reporting
 - name: Install sysstat
 apt:
 name: sysstat
 state: present

 - name: Configure sysstat
 lineinfile:
 dest: /etc/default/sysstat
 regexp: '^ENABLED='
 line: 'ENABLED="true"'
 state: present

 - name: Start sysstat
 service:
 name: sysstat
 state: started
 enabled: yes

After the sysstat package is installed (task #1), and its configuration file modified (task #2), I tell Ansible to make sure sysstat is started and enabled to start on reboot via the service module (task #3).

BONUS Play: Interactive Ansible and Server Reboots

Everything you do with Ansible is typically designed to be non-interactive. However, there may be some things that it makes sense to have some sort of interactive processing for depending on your workflow. I thought it might be interesting if I could trigger a server reboot and pause an Ansible playbook until the server(s) all came back online. This is the purpose of the reboot.yml playbook. This playbook could be used after updating kernel packages on hosts, for example. It would need to be modified to add control logic if rebooting all hosts simultaneously in Ansible’s inventory is undesirable. If you want to constrain the run of this all-hosts scoped playbook to a single host in your inventory, you can use the –limit filter:

ansible-playbook --limit myvm reboot.yml

Summary

This wraps up my overview of ansible-mojo’s playbook content and organization. Hopefully by now, you recognize the power and value of Ansible and appreciate just how easy it is to use. In Part I, you learned how to arrange and use Vagrant, VirtualBox, and a source-based copy of Ansible to create a lab environment for your Ansible testing.

In Part II, you learned how to create and use a sequence of Ansible plays to achieve some very common systems deployment goals: creating a deployment user, managing users, distributing ssh authorizations, configuring sudo, and installing packages.

You’ve also learned how to nest playbooks and why you may want to consider stashing certain variables and configuration lists in a file separate from your playbooks.

By downloading ansible-mojo, you can start using Ansible on your own machine immediately, which was my goal for releasing it. I hope you find Ansible as much of a joy to work with as I do.

Future changes to ansible-mojo and accompanying blog posts may or may not include:

  • creating more distro-agnostic playbooks (e.g., plays that work for both CentOS and Ubuntu)
  • integration with Vagrant for local provisioning
  • development of Ansible roles for publishing to Galaxy

Until then, happy hacking and may Ansible make your world better! Cheers!!

 

Ansible Shenanigans: Part I – Initial Setup and Configuration

I’ve been spending time learning Ansible, the Python-based configuration management framework created by Michael DeHaan. There are two main features that make Ansible worth considering for your configuration management needs: ease of implementation via an agentless design (based on SSH), and a DSL that closely resembles traditional shell scripting command syntax. Ansible “plays” are very easily read and understood whether you are a sysadmin, developer, or technical manager. Having used both Puppet and Chef in the past, which require a client/agent installation, I truly appreciate how quickly one can deploy Ansible to manage servers with minimal overhead and a small learning curve.

One of the best resources I’ve found so far to aid in learning Ansible, in addition to the extensive and quality official Ansible documentation, is Jeff Geerling’s most excellent “Ansible for DevOps.” The author steps you through using Vagrant-based VM’s to explore the use of Ansible for both ad-hoc commands and more complex playbook and role-based management.

All of the work I’ve done with Ansible for this post is publicly available on GitHub, so feel free to clone my ansible-mojo repo and follow along.

Lab setup – Vagrant, VirtualBox, and Ansible

I use a mix of custom VirtualBox VM’s and Vagrant-based VM’s for all of my home devops lab work. For the purposes of this post, I am limiting myself to a Vagrant-based solution as it’s extremely simple and dovetails nicely with the approach in “Ansible for DevOps”. So let’s take a closer look…

I’m using Vagrant 1.8.6 and VirtualBox 5.1.6 (r110634) on my MacBook Pro running Yosemite (10.10.5 w/Python 2.7.11). Historically, most of my recent experience has been with CentOS and AmazonLinux, so I decided to refresh my knowledge of Ubuntu, choosing to use Ubuntu 16.04.1 LTS (Xenial Xerus) for my VM’s using the bento/ubuntu-16.04 image hosted at HashiCorp’s Atlas.

To get started, simply add the bento Ubuntu image to your Vagrant/VirtualBox installation. I store all my Vagrant machines in a directory off my home directory called “vagrant-boxes”:

mkdir ~/vagrant-boxes/bento_ubuntu
cd ~/vagrant-boxes/bento_ubuntu
vagrant init bento/ubuntu-16.04; vagrant up --provider virtualbox

At this point, you should have a working Vagrant machine running Ubuntu 16.04.1 LTS!

Note: I originally started this work using Canonical’s ubuntu/xenial64 official build images for Vagrant. However, I ran into an issue immediately that made provisioning with Ansible a bit wonky, namely the fact that the Canonical image does not ship with Python 2.x installed (Python 3.x is there but is not used for Ansible operations). Be advised of this as you setup your own Ansible sandbox with Vagrant.

Because I like to be able to SSH into my Vagrant machines from anywhere inside my home network, I modify the Vagrantfile to access the VM using a hardcoded IP address that I’ve reserved in my router’s DHCP table. The relevant line if you want to do something similar is:

config.vm.network "public_network", ip: "192.168.0.99", bridge: "en0: Wi-Fi (AirPort)"

I then use this IP address in my local hosts file, which allows me to use it via a hostname of my choosing within the Ansible hosts file.

Next, I had to install Ansible on my MacBook. I could have used the package found within Homebrew, but that version is currently 2.1.0 and I wanted to work from the most current stable release with is v2.2.0. So, I opted to clone down that repo from Ansible’s GitHub project and work from that source:

git clone git://github.com/ansible/ansible.git --recursive
cd ./ansible
source ./hacking/env-setup

The last step configures your machine to run Ansible out of the source directory from the cloned repo. You can integrate the environment settings it generates into your shell profile so that the pathing is always current to your installation. You should now have a working copy of Ansible v2.2.0:

[rcrelia@fuji ansible (stable-2.2=)]$ ansible --version
ansible 2.2.0.0 (stable-2.2 e9b7d42205) last updated 2016/10/20 10:00:56 (GMT -400)
 lib/ansible/modules/core: (detached HEAD 42a65f68b3) last updated 2016/10/20 10:00:59 (GMT -400)
 lib/ansible/modules/extras: (detached HEAD ddd36d7746) last updated 2016/10/20 10:01:02 (GMT -400)
 config file = /etc/ansible/ansible.cfg
 configured module search path = Default w/o overrides

Note: I keep my Ansible files in a directory under my $HOME location, including ansible.cfg, which is normally expected by default to be in /etc/ansible. While you can use environment variables to change the expected location, I decided to just symlink /etc/ansible to the relevant location in my $HOME directory. YMMV.

sudo ln -s /etc/ansible /Users/rcrelia/code/ansible

Using Ansible With Your Vagrant Machine

In order to use Ansible, a minimum of two configuration files need to be used in whatever location you are using for your work: ansible.cfg and hosts. All other content will depend on whatever playbooks, host config files, and roles you create. The ansible.cfg in my repo is minimal with the defaults removed. However, you can find a full version in there named ansible.full.cfg for reference. Additionally, you will want to make sure you have a working log file for Ansible operations, with the default being /var/log/ansible.log. The output from all issued Ansible commands are logged in ansible.log.

Since Ansible uses SSH to communicate with managed nodes, you will want to use an account with root-level sudo privileges that is configured for SSH access, and ideally one that is passwordless. I personally use a ssh-agent process to store credentials and make sure that I configure the nodes to allow access using that private key via authorized_hosts. Do whatever makes sense for your environment.

By default, the bento Vagrant machine ships with a sudo-capable user called “vagrant”, whose private SSH key can be used for the initial Ansible run. I added that key to my ssh-agent:

ssh-add ~/vagrant-boxes/bento_ubuntu/.vagrant/machines/default/virtualbox/private_key

At this point, I can now communicate with my Vagrant Ubuntu VM using Ansible over a passwordless SSH connection. Let’s test that with a simple check on the node using Ansible’s setup module:

[rcrelia@fuji ansible]$ ansible myvm -m setup -e ansible_ssh_user=vagrant|head -25
myvm | SUCCESS => {
 "ansible_facts": {
 "ansible_all_ipv4_addresses": [
 "10.0.2.15",
 "192.168.0.99"
 ],
 "ansible_all_ipv6_addresses": [
 "fe80::a00:27ff:feb5:5b5e",
 "fe80::a00:27ff:fe28:3f0"
 ],
 "ansible_architecture": "x86_64",
 "ansible_bios_date": "12/01/2006",
 "ansible_bios_version": "VirtualBox",
 "ansible_cmdline": {
 "BOOT_IMAGE": "/vmlinuz-4.4.0-38-generic",
 "quiet": true,
 "ro": true,
 "root": "/dev/mapper/vagrant--vg-root"
 },
 "ansible_date_time": {
 "date": "2016-10-26",
 "day": "26",
 "epoch": "1477494044",
 "hour": "15",
 "iso8601": "2016-10-26T15:00:44Z",
[rcrelia@fuji ansible]$

Note that I specify the -e option to specify the default Vagrant user for my Ansible session. This is an override option and is only required for the initial playbook run from ansible-mojo. Once we’ve applied our main playbook, which sets up a user called “ansible”, we can then use that user for Ansible operations going forward (as specified by our remote_user setting in ansible.cfg).

At this point, we have a working installation of Ansible with a single manageable Ubuntu XenialXerus node based on Vagrant. In Part II, I will cover the workings of ansible-mojo and discuss various details around playbook construction, layering of plays, etc.