Configuring git credentials for CodeCommit and other repositories

Working with AWS CodeCommit repositories in addition to other repositories from the same git configuration can be a challenge depending on your git configuration. I’d like to share an approach that works for me when using HTTPS (instead of SSH keys) and hopefully it will be helpful for you, too. I will be describing a solution that works for macOS and MSFT VisualStudio Code.

The solution I’m going to suggest works with using static CodeCommit credentials from the Credentials section of your AWS IAM user object. An alternative solution is to use IAM keys with CodeCommit, but it requires credential helper configuration with the AWS CLI in order to handle the dynamics of IAM session management. With static git credentials, there is no need for AWS CLI integration. Either solution also requires adding a section to your $HOME/.gitconfig file for the credential being used.

I have found that it helps to break out different repositories (CodeCommit, GitLab, GitHub, etc.) into separate .git-credentials files for HTTPS access. This is because the git-credential helper logic is sensitive to credential sorting by top-level domain. For example, I’ve tried using a single .git-credentials file for multiple repos at the same top level domain and it works when a single set of credentials is used for access to all the repositories. However, if I have different credentials (e.g., different personal access tokens in GitLab) for different repositories within the same top-level domain, problems arise.

After you’ve generated your CodeCommit credentials in the IAM console for your IAM user, you will need to configure a .git-credentials file in your home directory for the repository(ies) you want access to via git commands. Let’s say we are working with a repo called “awesome-microservice” in us-east-1. Here’s what the HTTPS git credential string looks like:

Next, store this string in a file, I might name it something like:


Now that you have your credential file, we need to tell the git binaries where to find the credential. Create/update your $HOME/.gitconfig file with a new credential section for your CodeCommit credentials:

[credential ""]
  helper = store --file /Users/jsmith/.git-credentials.awesome-microservice

At this point, you should be able to clone down your repo from CodeCommit without having to input a username/password. If you have other repos to access, in CodeCommit or someplace else, you can repeat these steps if you use HTTPS and static git credentials to connect to those repos.

Ansible and EC2 Auto Scaling Groups: False-positive idempotency errors and a workaround

When using Ansible to deploy and manage EC2 auto scaling groups (ASGs) in AWS, you may encounter, like I have recently, an issue with idempotency errors that can be somewhat befuddling. Basically, when the ec2_asg module is called, one of its properties, vpc_zone_identifier, is used to define the subnets used by the ASG. A typical ASG configuration is to use two subnets, each one in a different availability zone, for a robust HA configuration, like so:

- name: "create auto scaling group"
    module: ec2_asg
    name: "{{ asg_name }}"
    desired_capacity: "{{ desired_capacity }}"
    launch_config_name: "{{ launch_config }}"
    min_size: 2
    max_size: 3
    desired_capacity: 2
    region: "{{ region }}"
    vpc_zone_identifier: "{{ subnet_ids }}"
    state: present

Upon subsequent Ansible plays, when ec2_asg is called, but no changes are made, you can still experience a changed=true result because of how Ansible is ordering the subnet-id’s used in vpc_zone_identifier versus how AWS is ordering them. This makes the play non-idempotent. How does this happen?

It turns out that Ansible’s ec2_asg module sorts the subnet-ids, while AWS does not when it returns those values. Here is the relevant code from the v2.3.0.0 version of, notice the sorting that happens in an attempt to match what AWS provides as an order:

 518 for attr in ASG_ATTRIBUTES:
 519     if module.params.get(attr, None) is not None:
 520         module_attr = module.params.get(attr)
 521         if attr == 'vpc_zone_identifier':
 522             module_attr = ','.join(module_attr)
 523         group_attr = getattr(as_group, attr)
 524         # we do this because AWS and the module may return the same list
 525         # sorted differently
 526         if attr != 'termination_policies':
 527             try:
 528                 module_attr.sort()
 529             except:
 530                 pass
 531             try:
 532                 group_attr.sort()
 533             except:
 534                 pass
 535         if group_attr != module_attr:
 536             changed = True
 537             setattr(as_group, attr, module_attr)

While this is all well and good, AWS does not follow any specific ordering algorithm when it returns values for subnet-ids in the ASG context. So, when AWS returns its subnet-id list for the ec2_asg call, Ansible will sometimes have a different order in its ec2_asg configuration and then incorrectly interpret the difference between the two lists as a change and mark it thusly. If you are counting on your Ansible plays to be perfectly idempotent, this is problematic. There is now an open GitHub issue about this specific problem.

The good news is that the latest development version of ec2_asg, which is also written using boto3, does not exhibit this false-positive idempotency error issue. The devel version of ec2_asg (i.e., unreleased is altogether different than what ships in current stable releases. So, these false-positive idempotency errors can occur in releases up to and including version (I have found it in,, and  Sometime soon, we should have a version of ec2_asg that behaves idempotently. But what to do until then?

One approach is to write a custom library in Python that you use instead of ec2_asg. While feasible, it would involve a lot of time spent verifying integration with both AWS and existing Ansible AWS modules.

Another approach, and one I took recently, is to simply ask AWS what it has for the order of subnet-ids to be in vpc_zone_identifier and then plug that ordering into what I pass to ec2_asg during each run.

Prior to running ec2_asg, I use the command module to run the AWSCLI autoscaling utility and query for the contents of VPCZoneIdentifier. Then I take those results and use them as the ordered list that I pass into ec2_asg afterward:

- name: "check for ASG subnet order due to idempotency failures with ec2_asg"
  command: 'aws autoscaling describe-auto-scaling-groups --region "{{ region }}" --auto-scaling-group-names "{{ asg_name }}" '
  register: describe_asg
  changed_when: false

- name: "parse the json input from aws describe-auto-scaling-groups"
  set_fact: asg="{{ describe_asg.stdout | from_json }}"

- name: "get vpc_zone_identifier and parse for subnet-id ordering"
  set_fact: asg_subnets="{{ asg.AutoScalingGroups[0].VPCZoneIdentifier.split(',') }}"
  when: asg.AutoScalingGroups

- name: "update subnet_ids on subsequent runs"
  set_fact: my_subnet_ids="{{ asg_subnets }}"
  when: asg.AutoScalingGroups

# now use the AWS-sorted list, my_subnet_ids, as the content of vpc_zone_identifier

- name: "create auto scaling group"
    module: ec2_asg
    name: "{{ asg_name }}"
    desired_capacity: "{{ desired_capacity }}"
    launch_config_name: "{{ launch_config }}"
    min_size: 2
    max_size: 3
    desired_capacity: 2
    region: "{{ region }}"
    vpc_zone_identifier: "{{ my_subnet_ids }}"
    state: present

On each run, the following happens:

  1. A command task runs the AWSCLI to describe the autoscaling group in question. If it’s the first run, an empty array is returned. The result is registered as asg_describe.
  2. The JSON data in asg_describe is copied into a new Ansible fact called “asg”
  3. The subnets in use by the ASG and how they are ordered is determined by extracting the VPCZoneIdentifier attribute from the AutoScalingGroup (asg fact). If it’s the first run, this step is skipped because of the when: clause which limits task execution to runs where the ASG already exists (runs 2 and later). It puts this list into the fact called “asg_subnets”
  4. Using the AWS-ordered list from step 3, Ansible sets a new fact called “my_subnet_ids”, which is then specified as the value to vpc_zone_identifier when ec2_asg is called.

I did a test on the idempotency of the play by running Ansible one hundred times after the ASG was created; at no point did I receive a false-positive change. Prior to this workaround, it would happen every run if I happened to be specifying subnet-ids ordered differently than from what AWS returned in terms of their order.

While this is admittedly somewhat kludgy, at least I can be confident that my plays involving AWS EC2 autoscaling groups will actually behave idempotently when they should. In the meantime, while we wait for the next update to Ansible’s ec2_asg module, this workaround can be used successfully to avoid false positive idempotency errors.

Until next time, have fun getting your Ansible on!

Stupid Boto3 Tricks – get_aws_region()

For some use cases, it’s not feasible to rely on an EC2 instance having any boto or AWS configuration information available (e.g., you are using an instance profile/role instead of API keys). This is a problem when it comes to establishing client sessions with services and you need to set the default region as an attribute to the boto3.setup_default_session() module.

Here’s one way to solve this problem via pulling the availability-zone element out of EC2 instance metadata, and then filtering that to drop the AZ portion (e.g., us-east-1b -> us-east-1).

First, import the urllib2 module into your code (Python 2.x):

import urllib2

Then, create a function like so that returns the AWS region name to the calling program:

def get_aws_region():

    # still no equivalent of boto.utils in boto3, so I have to do this janky thing...
    myAz = urllib2.urlopen('').read()
    myRegion = myAz[:-1]
    return myRegion

Quick & easy AMI generator

I have been meaning to put together a Lambda function to create an AMI from a custom EC2 instance.  It’s a pretty typical scenario, but I haven’t taken the time to roll my own. Recently, I ran across an article on StackOverflow which provides a CloudFormation template that:

  • constructs an EC2 image,
  • creates a Lambda execution role for AMI building,
  • creates a Lambda function for constructing an AMI, and
  • uses a custom resource to make an AMI from the instance via the Lambda function.

The Lambda function is written in the JavaScript SDK (node.js), is short and sweet, and easy to modify.

So, I modified both the template and Lambda function to make it a little more generic and reusable. Also, I fixed a logic error in a the original Lambda.  Finally, I wanted to customize the name of both the image and AMI, so I created an InstanceName parameter. The only other parameter for the CF template is InstanceType, which I defaulted to t2.micro. Add your desired instance types to the list in that parameter’s AllowedValues attribute. The base AMI for the instance is a region-specific Amazon Linux image. Once the stack is deployed, simply update the template with your userdata changes to create new custom AMI’s. It’s a very helpful tool to have in your CloudFormation toolbox.

The template is available from my aws-mojo repo on GitHub in both JSON and YAML formats.


Update: Removal of route tables in aws-vpc-scenario2

In a previous post about my Ansible role for creating/removing a Scenario 2 VPC in AWS, I noted that I had been unable to get the ec2_vpc_route_table module to successfully delete route tables. Instead, I fell back to using the awscli to handle the deletion. This kludgy workaround didn’t sit right with me, so I finally dedicated some time this morning to troubleshooting it.

As it turns out, there is a parameter that must be specified when that module is invoked to delete route tables, and the documentation does not call out the necessity of that parameter when deleting route tables and using the route_table_id parameter.

So, this doesn’t work:

- name: Delete AZ1 private route table
    state: absent
    vpc_id: "{{ vpc_id }}"
    route_table_id: "{{ private_rt_az1_id }}"

Instead, you have to add the “lookup” parameter and specify “id” as the lookup type since we are using the rt_id:

- name: Delete AZ1 private route table
    state: absent
    vpc_id: "{{ vpc_id }}"
    route_table_id: "{{ private_rt_az1_id }}"
    lookup: "id"

I have submitted a Github issue requesting clarification of the generated documentation for the module to specify this requirement.

In the meantime, I’ve incorporated this change in the role and updated my repo for the role. Cheers!


Using Ansible Roles to Create a Scenario 2 VPC in AWS

In my last post, I talked about a set of CloudFormation templates I created to quickly and flexibly create/teardown a securely configured Scenario 2 VPC. As an experiment, I decided to see if I could create an Ansible role to do the same thing. The experiment was mostly successful but I ran into some complications.

Ansible’s cloud module support for AWS is pretty comprehensive for most use cases as of this writing (I am using version 2.2.1 that has been recently updated from the v2.2 origin). Still, I discovered a couple of gaps compared to my CloudFormation solution.

First of all, unlike CloudFormation, Ansible doesn’t allow for complete automated deprovisioning of VPC components. When you use Ansible to create AWS resources like a VPC, things are relatively straightforward. However, removal of those resources has to be orchestrated (unlike CloudFormation which handles the stack teardown in a completely automated fashion) just like the creation phase requires. Order of resource deprovisioning matters and you will go through some trial and error to figure out what works for your configuration.

One major issue I ran into with my VPC deployment is that the Ansible module for managing route tables, ec2_vpc_route_table, does not seem to remove route table objects after they’ve been created in a Scenario 2 VPC configuration. Typically, when you specify the attribute “state: absent” in an action, a module will decommission the resource. In this case, I found that I had to fall back to using the shell module to run awscli commands to delete the route tables. A quick perusal of the Ansible GitHub issue queue for extras modules didn’t suggest a workaround nor were there any pull requests related to the problem (note to self: file a bug report). Here’s the relevant section of my delete.yml playbook for VPC removal:

- name: Delete private subnet in AZ2
    state: absent
    vpc_id: "{{ vpc_id }}"
    cidr: "{{ private_subnet_az2_cidr }}"

# For some reason, ec2_vpc_route_table won't delete these
# but we'll keep them in here commented out for posterity.
#- name: Delete public route table
#  ec2_vpc_route_table:
#    state: absent
#    vpc_id: "{{ vpc_id }}"
#    route_table_id: "{{ public_rt_id }}"
#  ignore_errors: yes
#- name: Delete AZ1 private route table
#  ec2_vpc_route_table:
#    state: absent
#    vpc_id: "{{ vpc_id }}"
#    route_table_id: "{{ private_rt_az1_id }}"
#  ignore_errors: yes
#- name: Delete AZ2 private route table
#  ec2_vpc_route_table:
#    state: absent
#    vpc_id: "{{ vpc_id }}"
#    route_table_id: "{{ private_rt_az2_id }}"
#  ignore_errors: yes
# Instead, we will decomm route tables using awscli

- name: Delete public route table via awscli
  shell: aws ec2 delete-route-table --route-table-id "{{ public_rt_id }}"

- name: Delete AZ1 private route table via awscli
  shell: aws ec2 delete-route-table --route-table-id "{{ private_rt_az1_id }}"

- name: Delete AZ2 private route table via awscli
  shell: aws ec2 delete-route-table --route-table-id "{{ private_rt_az2_id }}"

- name: Delete Internet Gateway
    vpc_id: "{{ vpc_id }}"
    state: absent

ec2_vpc_route_table is a relatively new “extras” module, appearing in the v2.0.0 release so it is likely to be fixed down the road, but be aware of this issue, especially in the context of a Scenario 2 VPC deployment (I have a hunch the NAT Gateway may be related to the problem…).

Another issue I encountered, albeit a minor one, is lack of support for VPC Endpoints, which my Scenario 2 CloudFormation templates support. There is an open PR for this functionality, so it may be available soon.

Aside from these problems, I found using Ansible plays for generating a Scenario 2 VPC to be relatively simple (like anything you do with Ansible) and useful. You can download the role I created via Galaxy:

$ ansible-galaxy install

or clone the GitHub repository I created for the Galaxy integration:

$ git clone

Until next time, have fun getting your Ansible on!



Ansible Shenanigans: Part II – Sample Playbook Usage

In Part I, I talked about why Ansible and how to configure your own installation using Vagrant, VirtualBox, and Ansible. Now, let’s take a closer look at using Ansible along with the details of my demo playbook collection ansible-mojo.

Once Ansible is up and running, it is extremely useful for managing nodes using ad-hoc commands. However, it really shines once you start developing collections of commands, or “plays”, in the form of “playbooks” to manage nodes. Similar to recipes and cookbooks in Chef, Ansible’s plays and playbooks are the basis for a best-practice implementation of Ansible to manage your infrastructure in a consistent, flexible, and repeatable fashion.

For ansible-mojo, I wanted to create a set of simple playbooks that would be helpful in demonstrating how to configure nodes with some basic things like:

  • a dedicated user account “ansible” for deployment standardization,
  • installation of standard packages,
  • management of users and sudoers content

Initial Ansible Playbook Run

The anisible-mojo repo contains several files: playbooks, a variables file, and a couple of shell environment files. All playbook content is based on YAML-formatted text files that are easily understandable. I opted to have a single primary playbook (main.yml) that does some initial node configuration then includes other playbooks for specific configuration changes like configuring users (user-config.yml) and installing sysstat for SAR reporting (sysstat-config.yml).

Before I go into details on each of the playbooks, let’s go ahead and do an initial playbook run against our Ubuntu Vagrant box so that we can issue further commands using our dedicated deployment user account “ansible” instead of the “vagrant” user.

NOTE: Be sure that you change authorized_keys in ansible-mojo to contain the public key that you configured your ssh-agent to use for deployment as mentioned in Part I.

In this case, I am using a Vagrant machine called “myvm” and will specify the -e override for ansible_ssh_user to ignore the remote_user setting in ansible.cfg:

[rcrelia@fuji ansible]$ ansible-playbook main.yml -e ansible_ssh_user=vagrant

PLAY [all] *********************************************************************

TASK [setup] *******************************************************************
ok: [myvm]

TASK [Ensure ntpdate is installed] *********************************************
changed: [myvm]

TASK [Ensure ntp is installed] *************************************************
changed: [myvm]

TASK [Ensure ntp is running and enabled at boot] *******************************
ok: [myvm]

TASK [Ensure aptitude is installed] ********************************************
changed: [myvm]

TASK [Update apt package cache if older than one day] **************************
changed: [myvm]

TASK [Add user group] **********************************************************
changed: [myvm] => (item={u'user_uid': 2000, u'user_rc': u'bashrc', u'user_profile': u'bash_profile', u'sudoers': True, u'user_groups': u'users', u'user_gecos': u'ansible user', u'user_shell': u'/bin/bash', u'user_name': u'ansible'})
changed: [myvm] => (item={u'user_uid': 2001, u'user_rc': u'bashrc', u'user_profile': u'bash_profile', u'sudoers': False, u'user_groups': u'users', u'user_gecos': u'Bob Dobbs', u'user_shell': u'/bin/bash', u'user_name': u'bdobbs'})

... SNIP ...

TASK [Install sysstat] *********************************************************
changed: [myvm]

TASK [Configure sysstat] *******************************************************
changed: [myvm]

TASK [Restart sysstat] *********************************************************
changed: [myvm]

PLAY RECAP *********************************************************************
myvm : ok=17 changed=15 unreachable=0 failed=0

Success! Now, we should have the “ansible” user account provisioned on the Vagrant machine and we will perform all future Ansible plays using that account as specified in ansible.cfg in the remote_user setting.

A Closer Look at Playbooks

ansible-mojo contains several files, with all Ansible syntax included in the YAML files:

  • main.yml
  • vars.yml
  • reboot.yml
  • Playbooks nested below main.yml:
    • user-config.yml
      • ssh-config.yml
      • sudoers-config.yml
    • sysstat-config.yml

Each of these files with the exception of vars.yml is an Ansible playbook. I created a primary playbook called “main” which in turn references a file containing miscellaneous variables (another YAML file called vars.yml), along with two other playbooks, user-config (configures user accounts) and sysstat-config (configures SAR reporting). These latter two files are nested playbooks: their execution is dependent on syntax in the main playbook and the vars_file. Finally, user-config includes two playbooks, one for configuring SSH in user accounts and one for configuring sudo access.

At the beginning of the main playbook, we see that the plays are scoped to all hosts in Ansible’s inventory (hosts: all), that plays will be run as a privileged user on the nodes (become: yes), and that some variables have been stored outside of playbooks in a single location called vars.yml. This pattern of using vars_files allows you to have a single place for information that you may not want to distribute along with playbooks (e.g., user account details) for security reasons.

Next, Ansible tasks (or actions) are defined for installing and configuring ntpd on our nodes, along with aptitude, and a command to update the apt packages on a node if the last update was longer ago than 24 hours.

The nesting of playbooks is a pattern that supports reusability and portability of playbook content, provided you don’t hardcode variables in them. Let’s take a closer look at some of these nested playbooks.

Managing users: user-config.yml

Since user-config is a nested playbook, it consists of a sequence of tasks without any operating parameters like host-scoping or privilege/role settings. It does five things before calling its own nested playbooks at the end:

  1. Creates a user’s primary group using the user’s UID as the GID, via the Ansible group module
  2. Creates a user via the Ansible user module
  3. Creates a user’s .bash_profile via the Ansible copy module
  4. Creates a user’s .bashrc via the Ansible copy module
  5. Creates a user’s $HOME/bin directory via the Ansible file module

The syntax is pretty clear about what is happening if you have even the most basic sort of experience managing user accounts on a UNIX/Linux server. Isn’t Ansible awesome?

What may not be so clear is the syntax that uses the “item.” prefix in variable names. Basically, I designed the playbook to use with_items feature of Ansible so I could iterate through multiple users without duplicating a lot of syntax. The “{{ users }}” variable is a referencing a YAML list called users that is stored in the variables file vars.yml. Looking at that list, it becomes apparent that we are cycling through attributes of each user without hardcoding any user-specific variables in our playbook:

users list from vars.yml:

 - user_name: ansible
 user_gecos: 'ansible user'
 user_groups: "users"
 user_uid: 2000
 user_shell: "/bin/bash"
 user_profile: bash_profile
 user_rc: bashrc
 sudoers: yes
 - user_name: bdobbs
 user_gecos: 'Bob Dobbs'
 user_groups: "users"
 user_uid: 2001
 user_shell: "/bin/bash"
 user_profile: bash_profile
 user_rc: bashrc
 sudoers: no

When you write playbooks in Ansible, you should design your plays as generically as possible so that you can re-use your playbooks across different projects and nodes.

Next, user-config includes the ssh-config playbook which has two tasks: Setup a user’s .ssh directory and the user’s authorized_keys content. In this case, each user is being configured to use the same authorized_keys data, which is probably not how you would configure things in an actual deployment from a security best-practices perspective.

Lastly, user-config includes the sudoers-config playbook which uses Ansible’s lineinfile module to specify sudoers syntax to allow for passwordless sudo invocation. We need this for our ansible account, which will be performing Ansible operations for us non-interactively. This play is special in that it is constrained to only be run when the user is supposed to be added to sudoers (via use of Ansible’s when clause). How is this controlled? Through the sudoers attribute from the users list in vars.html:

 - user_name: ansible
 user_gecos: 'ansible user'
 user_groups: "users"
 user_uid: 2000
 user_shell: "/bin/bash"
 user_profile: bash_profile
 user_rc: bashrc
 sudoers: yes

Managing packages: sysstat-config.yml

One of the classic UNIX/Linux performance monitoring tools is sar/sadc. In the open-source world, sar is packaged within the sysstat tool. One of the first things I do on a new machine is to make sure sar is installed, configured, and operational. So, I created a playbook that installs and configures the sysstat package.

One neat tool in Ansible is the lineinfile module which is useful to make sure a specific line is included in a text file, or some pattern within a line is replaced via a back-referenced regular expression. In the case of sysstat, there is a config file on Ubuntu, /etc/default/sysstat, that ships with a default “off” configuration (i.e., ENABLED=”false”). I used the lineinfile module in sysstat-config.yml to change that line and activate sysstat:

 # Install sysstat for sar reporting
 - name: Install sysstat
 name: sysstat
 state: present

 - name: Configure sysstat
 dest: /etc/default/sysstat
 regexp: '^ENABLED='
 line: 'ENABLED="true"'
 state: present

 - name: Start sysstat
 name: sysstat
 state: started
 enabled: yes

After the sysstat package is installed (task #1), and its configuration file modified (task #2), I tell Ansible to make sure sysstat is started and enabled to start on reboot via the service module (task #3).

BONUS Play: Interactive Ansible and Server Reboots

Everything you do with Ansible is typically designed to be non-interactive. However, there may be some things that it makes sense to have some sort of interactive processing for depending on your workflow. I thought it might be interesting if I could trigger a server reboot and pause an Ansible playbook until the server(s) all came back online. This is the purpose of the reboot.yml playbook. This playbook could be used after updating kernel packages on hosts, for example. It would need to be modified to add control logic if rebooting all hosts simultaneously in Ansible’s inventory is undesirable. If you want to constrain the run of this all-hosts scoped playbook to a single host in your inventory, you can use the –limit filter:

ansible-playbook --limit myvm reboot.yml


This wraps up my overview of ansible-mojo’s playbook content and organization. Hopefully by now, you recognize the power and value of Ansible and appreciate just how easy it is to use. In Part I, you learned how to arrange and use Vagrant, VirtualBox, and a source-based copy of Ansible to create a lab environment for your Ansible testing.

In Part II, you learned how to create and use a sequence of Ansible plays to achieve some very common systems deployment goals: creating a deployment user, managing users, distributing ssh authorizations, configuring sudo, and installing packages.

You’ve also learned how to nest playbooks and why you may want to consider stashing certain variables and configuration lists in a file separate from your playbooks.

By downloading ansible-mojo, you can start using Ansible on your own machine immediately, which was my goal for releasing it. I hope you find Ansible as much of a joy to work with as I do.

Future changes to ansible-mojo and accompanying blog posts may or may not include:

  • creating more distro-agnostic playbooks (e.g., plays that work for both CentOS and Ubuntu)
  • integration with Vagrant for local provisioning
  • development of Ansible roles for publishing to Galaxy

Until then, happy hacking and may Ansible make your world better! Cheers!!


Ansible Shenanigans: Part I – Initial Setup and Configuration

I’ve been spending time learning Ansible, the Python-based configuration management framework created by Michael DeHaan. There are two main features that make Ansible worth considering for your configuration management needs: ease of implementation via an agentless design (based on SSH), and a DSL that closely resembles traditional shell scripting command syntax. Ansible “plays” are very easily read and understood whether you are a sysadmin, developer, or technical manager. Having used both Puppet and Chef in the past, which require a client/agent installation, I truly appreciate how quickly one can deploy Ansible to manage servers with minimal overhead and a small learning curve.

One of the best resources I’ve found so far to aid in learning Ansible, in addition to the extensive and quality official Ansible documentation, is Jeff Geerling’s most excellent “Ansible for DevOps.” The author steps you through using Vagrant-based VM’s to explore the use of Ansible for both ad-hoc commands and more complex playbook and role-based management.

All of the work I’ve done with Ansible for this post is publicly available on GitHub, so feel free to clone my ansible-mojo repo and follow along.

Lab setup – Vagrant, VirtualBox, and Ansible

I use a mix of custom VirtualBox VM’s and Vagrant-based VM’s for all of my home devops lab work. For the purposes of this post, I am limiting myself to a Vagrant-based solution as it’s extremely simple and dovetails nicely with the approach in “Ansible for DevOps”. So let’s take a closer look…

I’m using Vagrant 1.8.6 and VirtualBox 5.1.6 (r110634) on my MacBook Pro running Yosemite (10.10.5 w/Python 2.7.11). Historically, most of my recent experience has been with CentOS and AmazonLinux, so I decided to refresh my knowledge of Ubuntu, choosing to use Ubuntu 16.04.1 LTS (Xenial Xerus) for my VM’s using the bento/ubuntu-16.04 image hosted at HashiCorp’s Atlas.

To get started, simply add the bento Ubuntu image to your Vagrant/VirtualBox installation. I store all my Vagrant machines in a directory off my home directory called “vagrant-boxes”:

mkdir ~/vagrant-boxes/bento_ubuntu
cd ~/vagrant-boxes/bento_ubuntu
vagrant init bento/ubuntu-16.04; vagrant up --provider virtualbox

At this point, you should have a working Vagrant machine running Ubuntu 16.04.1 LTS!

Note: I originally started this work using Canonical’s ubuntu/xenial64 official build images for Vagrant. However, I ran into an issue immediately that made provisioning with Ansible a bit wonky, namely the fact that the Canonical image does not ship with Python 2.x installed (Python 3.x is there but is not used for Ansible operations). Be advised of this as you setup your own Ansible sandbox with Vagrant.

Because I like to be able to SSH into my Vagrant machines from anywhere inside my home network, I modify the Vagrantfile to access the VM using a hardcoded IP address that I’ve reserved in my router’s DHCP table. The relevant line if you want to do something similar is: "public_network", ip: "", bridge: "en0: Wi-Fi (AirPort)"

I then use this IP address in my local hosts file, which allows me to use it via a hostname of my choosing within the Ansible hosts file.

Next, I had to install Ansible on my MacBook. I could have used the package found within Homebrew, but that version is currently 2.1.0 and I wanted to work from the most current stable release with is v2.2.0. So, I opted to clone down that repo from Ansible’s GitHub project and work from that source:

git clone git:// --recursive
cd ./ansible
source ./hacking/env-setup

The last step configures your machine to run Ansible out of the source directory from the cloned repo. You can integrate the environment settings it generates into your shell profile so that the pathing is always current to your installation. You should now have a working copy of Ansible v2.2.0:

[rcrelia@fuji ansible (stable-2.2=)]$ ansible --version
ansible (stable-2.2 e9b7d42205) last updated 2016/10/20 10:00:56 (GMT -400)
 lib/ansible/modules/core: (detached HEAD 42a65f68b3) last updated 2016/10/20 10:00:59 (GMT -400)
 lib/ansible/modules/extras: (detached HEAD ddd36d7746) last updated 2016/10/20 10:01:02 (GMT -400)
 config file = /etc/ansible/ansible.cfg
 configured module search path = Default w/o overrides

Note: I keep my Ansible files in a directory under my $HOME location, including ansible.cfg, which is normally expected by default to be in /etc/ansible. While you can use environment variables to change the expected location, I decided to just symlink /etc/ansible to the relevant location in my $HOME directory. YMMV.

sudo ln -s /etc/ansible /Users/rcrelia/code/ansible

Using Ansible With Your Vagrant Machine

In order to use Ansible, a minimum of two configuration files need to be used in whatever location you are using for your work: ansible.cfg and hosts. All other content will depend on whatever playbooks, host config files, and roles you create. The ansible.cfg in my repo is minimal with the defaults removed. However, you can find a full version in there named ansible.full.cfg for reference. Additionally, you will want to make sure you have a working log file for Ansible operations, with the default being /var/log/ansible.log. The output from all issued Ansible commands are logged in ansible.log.

Since Ansible uses SSH to communicate with managed nodes, you will want to use an account with root-level sudo privileges that is configured for SSH access, and ideally one that is passwordless. I personally use a ssh-agent process to store credentials and make sure that I configure the nodes to allow access using that private key via authorized_hosts. Do whatever makes sense for your environment.

By default, the bento Vagrant machine ships with a sudo-capable user called “vagrant”, whose private SSH key can be used for the initial Ansible run. I added that key to my ssh-agent:

ssh-add ~/vagrant-boxes/bento_ubuntu/.vagrant/machines/default/virtualbox/private_key

At this point, I can now communicate with my Vagrant Ubuntu VM using Ansible over a passwordless SSH connection. Let’s test that with a simple check on the node using Ansible’s setup module:

[rcrelia@fuji ansible]$ ansible myvm -m setup -e ansible_ssh_user=vagrant|head -25
myvm | SUCCESS => {
 "ansible_facts": {
 "ansible_all_ipv4_addresses": [
 "ansible_all_ipv6_addresses": [
 "ansible_architecture": "x86_64",
 "ansible_bios_date": "12/01/2006",
 "ansible_bios_version": "VirtualBox",
 "ansible_cmdline": {
 "BOOT_IMAGE": "/vmlinuz-4.4.0-38-generic",
 "quiet": true,
 "ro": true,
 "root": "/dev/mapper/vagrant--vg-root"
 "ansible_date_time": {
 "date": "2016-10-26",
 "day": "26",
 "epoch": "1477494044",
 "hour": "15",
 "iso8601": "2016-10-26T15:00:44Z",
[rcrelia@fuji ansible]$

Note that I specify the -e option to specify the default Vagrant user for my Ansible session. This is an override option and is only required for the initial playbook run from ansible-mojo. Once we’ve applied our main playbook, which sets up a user called “ansible”, we can then use that user for Ansible operations going forward (as specified by our remote_user setting in ansible.cfg).

At this point, we have a working installation of Ansible with a single manageable Ubuntu XenialXerus node based on Vagrant. In Part II, I will cover the workings of ansible-mojo and discuss various details around playbook construction, layering of plays, etc.