Managing pre-commit hooks in Git

Git comes with support for action sequences based on repository activity. Pushes, pulls, merges, and commits can all be configured to trigger specific custom actions responsively. Often, the custom actions are geared towards promoting intra-team communication, process-gating like enforcing commit log standards and best practices for code content/syntax. Of course, CI workflows already are a popular frontline of defense for bad code pushes by using linters and syntax checks as part of an initial test stage in a pipeline. However, those linting processes have a cost in terms of compute resources they utilize. In a large development environment, this can translate into real money costs pretty quickly. Why spend compute cycles on pipeline jobs that can be just as easily run on a developer’s workstation or laptop? So, let’s take a closer look at git hooks…

Git hooks are configured in a given repo via files located in /.git/hooks. A new repository will be automagically populated with these handy tools, which are inactive by default thanks to the file suffix “.sample”:

[rcrelia@fuji hooks (GIT_DIR!)]$ ls -latr
total 40
-rwxr-xr-x 1 rcrelia staff 3611 Dec 22 2014 update.sample
-rwxr-xr-x 1 rcrelia staff 1239 Dec 22 2014 prepare-commit-msg.sample
-rwxr-xr-x 1 rcrelia staff 4951 Dec 22 2014 pre-rebase.sample
-rwxr-xr-x 1 rcrelia staff 1356 Dec 22 2014 pre-push.sample
-rwxr-xr-x 1 rcrelia staff 1642 Dec 22 2014 pre-commit.sample
-rwxr-xr-x 1 rcrelia staff  398 Dec 22 2014 pre-applypatch.sample
-rwxr-xr-x 1 rcrelia staff  189 Dec 22 2014 post-update.sample
-rwxr-xr-x 1 rcrelia staff  896 Dec 22 2014 commit-msg.sample
-rwxr-xr-x 1 rcrelia staff  452 Dec 22 2014 applypatch-msg.sample
drwxr-xr-x 11 rcrelia staff 374 Dec 22 2014 .
drwxr-xr-x 15 rcrelia staff 510 Jan 4 2015 ..

Having individual hooks like these provides a powerful framework for customizing your repository usage to your specific needs and workflows. Each one is triggered at the stage of git workflow described by the filename (pre-commit, post-update, etc.)

Tools like linters fit nicely into the pre-commit action sequence. By configuring the pre-commit hook with a linter, you are delivering higher quality code to your pipelines which makes for a more efficient use of your compute resource budget.

Hook Management: Yelp’s pre-commit

I recently started using an open source utility released by Yelp’s engineers called simply, “pre-commit”. Essentially, it is a framework for managing the pre-commit hook in a git repository, using a single configuration file with multiple action sequences. This allows for a single pre-commit hook to do many different sorts of actions. It includes some basic linter capabilities as well as other code quality control features, but also is integrated with other projects (e.g., ansible-lint has a pre-commit hook available).

Setup is straightforward, as is the usage. Here’s how I did it:

pip install pre-commit
cd repodir
pre-commit install
# edit a new file called .pre-commit-config.yaml in the root of your repo
git add .pre-commit-config.yaml
git commit -m "turn on pre-commit hook"

Immediately you should see the pre-commit utility do its thing when you commit this or any other change to your repository.

Here is the current working version of my pre-commit config file (one per repo):

- repo: git://github.com/pre-commit/pre-commit-hooks
  sha: v0.7.1
  hooks:
   - id: trailing-whitespace
     files: \.(js|rb|md|py|sh|txt|yaml|yml)$
   - id: check-json
     files: \.(json|template)$
   - id: check-yaml
     files: \.(yml|yaml)$
   - id: detect-private-key
   - id: detect-aws-credentials
- repo: git://github.com/detailyang/pre-commit-shell
  sha: 6f03a87e054d25f8a229cef9005f39dd053a9fcb
  hooks:
   - id: shell-lint

So, I’m using some of pre-commit’s built-in handlers for whitespace cleanup, JSON linting, YAML linting, and checking to make sure I don’t include any private keys or AWS credentials in my commits. Also, I’ve integrated a third-party tool, pre-commit-shell, that is a wrapper to shellcheck for syntax checking and enforcing best practices in any shell scripts I might add to the repo.

And here is an example of a code commit that triggers pre-commit’s operation:

[rcrelia@fuji aws-mojo (master +=)]$ git commit -m "pre-commit"
[INFO] Installing environment for git://github.com/pre-commit/pre-commit-hooks.
[INFO] Once installed this environment will be reused.
[INFO] This may take a few minutes...
Trim Trailing Whitespace.................................................Passed
Check JSON...........................................(no files to check)Skipped
Check Yaml...............................................................Passed
Detect Private Key.......................................................Passed
Detect AWS Credentials...................................................Passed
Shell Syntax Check...................................(no files to check)Skipped
[master 7d837e7] pre-commit
 1 file changed, 15 insertions(+)
 create mode 100644 .pre-commit-config.yaml

While pre-commit doesn’t handle management of the other available Git hooks, it does a very good job with what it does control, with a robust plugin interface and the ability to write custom hooks.

If you find yourself in need of some automated linting of your code before you push to your remote repositories, I highly recommend the use of pre-commit for its ease of use and operational flexibility.

Happy coding!

Gitlab Repo Best Practices

I recently had to come up with some guidelines for others to use when it comes to using shared Gitlab repositories in a CI/CD configuration. Here is my take based on my experiences so far, if you have any more to share please drop me a line/comment here.

Note: Gitlab uses the term Merge Request for what is commonly referred to in other CI frameworks as Pull Requests… just a little FYI 🙂

Gitlab Repo Usage – Best Practices and Tips

  • Create MR’s when you are at a point where you want/need to see your changes in action (i.e., merged into master, tested, and deployed).
  • If you will be making more related changes later in the branch, do not opt to have the source branch removed from the repository when submitting your MR.
  • At a minimum, you should merge at least once per day, especially if others are working on the same codebase at the same time. This makes it easier to resolve merge conflicts, which occur when two developers change the same repository content/object in their own respective branches and one merges ahead of the other.
  • Merge conflicts happen. Don’t worry if you experience one. Try to troubleshoot on your own, but if you cannot resolve it by yourself, pull in the other developer(s) whose changes are affecting your merge attempt and work together to resolve them.
  • When creating a MR, indicate in the Title whether or not it is time-sensitive by adding ” – ASAP” to the end of the Title text. This helps reviewers prioritize their review requests with minimal disruption.
  • Do NOT approve your own MR if it involves a code change. The peer-review component of Merge Requests is an opportunity to communicate and share awareness of changes on the team. That said, here are some scenarios where it is ok to approve your own MR’s:
    • you are pushing non-operational changes (e.g., comments, documentation)
    • you are the only developer available and it’s an important change or if waiting for MR review blocks progress signficantly (use good judgment)
  • When adding to a branch, keep your commits as specific as possible when modifying code. Each commit should be understandable on its own, even if there are other commits in the branch
  • Not all MR’s need to hit a pipeline. Depending on the repo pipeline configuration, some branch name filters may exist to insure a certain type of branch gets tested while other types do not. This is especially true of non-code changes (e.g., updating a README)
  • When starting new development as opposed to modifying existing code, it may make sense to create a personal repo or to use a fork of a shared repo to do a lot of iteration quickly without having to do a formal MR process in a shared repo. Once you’ve got some code ready for sharing, you can migrate it manually (copy) into the shared repo and work off MR’s going forward. Not required at all, but it can allow for more rapid iteration especially on small teams.