CFEngine Alternatives

I have been using CFEngine 2 (which is substantially different from version 3) for a great many years to manage various aspects of my home system environments, making use of such things as:

  • Copying files, to do simplistic backups where that works
  • Editing files to have particular content, such as SSH keys and cron jobs
  • Restarting processes that I want to keep running (syncthing, dropbox, …)
  • Running shell commands on particular hosts
    • To run backups
    • To run cleanup jobs
  • Setting up symlinks to configuration files, so that the authoritative configuration lives in a Git repository and the rc files in $HOME or $HOME/.config reference it
  • Ensuring ssh keys have appropriately non-revelatory permissions
  • Making sure new servers have my set of favorite directories

I had used CFEngine 2 to build system management tools with a “PostgreSQL flair”, where the point was to help manage database instances, doing things like:

  • Deploying PostgreSQL binaries and libraries (our custom builds included Slony-I, for instance)
  • Rotating database logs
  • Building out the filesystem environment for database clusters, namely:
    • Setting up needed directories for PGDATA
    • Setting up database log directories
    • Setting up symlinks for the latest binaries, alongside the above “deploying” of the binaries

Eventually, others took this over, ultimately replacing CFEngine with newer tools like Puppet and Ansible, so these uses fell out of my hands.

I never made the migration from CFEngine 2 to CFEngine 3; the latter is apparently a fair bit more featureful, but I was unhappy that the authors decided decently trackable logging should be a proprietary, extra-price extension.

Perhaps ten years later, I’m finding that builds of CFEngine 2 are getting scarce in Linux package management systems.

I started looking around at the sorts of systems that are considered successors to CFEngine. My encounters with Puppet have left me with no desire to take it on for systems I operate for myself; it seems slow-running and tedious. The plausible alternatives I found most interesting were Ansible and SaltStack. But as I poked further, I found that none of these actually reflected the ways in which I have been using CFEngine.

Systems like Puppet, Ansible, and SaltStack are intended for deploying services and applications, along with their configuration. That’s largely not what I’m doing. (Perhaps I should be looking at it more that way, but it certainly hasn’t been…)

It looks like none of these is what I need for my usual use cases. I am replacing some pieces with more modern bits of technology, but only partially migrating away from CFEngine 2.

Services

The situations where I was having CFEngine launch, and keep running, certain processes look, these days, like a job for systemd. I am not especially a lover of systemd, but neither am I one of the haters. I am unhappy with the steady scope creep it seems to undergo, but I do like the way that Unit files provide a declarative way of describing services, their semantics, and their relationships.

For the various services that I want operating, I have set up systemd user unit files. This has led to more CFEngine2 configuration, curiously enough:

  • I create Unit files for services in my favorite Git repo that manages my configuration
  • Configuration files for the service reside in that repo, too.
  • I added CFEngine link rules to point $(HOME)/.config/systemd/user/$SERVICE.service at the unit file in my Git repo, and, typically, more to point the service’s configuration under $(HOME)/.config at the repo as well
  • I added CFEngine process rules that check for service processes that should be running, and run /bin/systemctl --user start $SERVICE if they are not running
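
Such a unit file is short. Here is a minimal sketch of one for syncthing; the ExecStart path and flags are assumptions for illustration, not a copy of my actual file:

    [Unit]
    Description=Syncthing file synchronization

    [Service]
    # Binary path and flags assumed; adjust to the local installation.
    ExecStart=/usr/bin/syncthing serve --no-browser
    Restart=on-failure

    [Install]
    WantedBy=default.target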

It means there’s a few more CFEngine rules, but basically of just two sorts:

  • Process rules, to manage the service process (and it’s using systemd tooling, which is pretty “native,” no horrendous hackishness), and
  • Link rules, to link files in the Git repo into the places where they need to be deployed.
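
Together, that pair of rules can look roughly like this in CFEngine 2; the class name and repository layout are invented for the example:

    links:
        home::
            $(HOME)/.config/systemd/user/syncthing.service ->! $(HOME)/GitConfig/InitFiles/systemd/syncthing.service

    processes:
        home::
            "syncthing" restart "/bin/systemctl --user start syncthing"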

Links

A lot of what I now have left in CFEngine is a set of rules for establishing symlinks.

There has been a proliferation of tools for doing this sort of thing, or, more precisely, tools for managing “dotfiles”. There is an awesome-dotfiles repository linking to numerous tools built to help with this.

There are two that elicited the most interest from me:

  • dot-templater, a Rust-based tool with a system for customizing which files (and content) are exposed on each system
  • chezmoi, a more sophisticated system that has a “chezmoi” command for interactively attaching dotfiles to one’s configuration repository
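
For flavor, the chezmoi workflow runs along these lines (the dotfile named is just an example):

    chezmoi init             # create the local source-state repository
    chezmoi add ~/.bashrc    # bring an existing dotfile under management
    chezmoi diff             # preview what would change on this machine
    chezmoi apply            # write the managed files into place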

Sadly, they are all so much more sophisticated than symlinks that it has, thus far, seemed simpler just to add a few more link entries to my main CFEngine script.

The direction I am thinking of is to take my “hive” of CFEngine link lines, which are, in truth, decently terse and declarative, and write a little shell-based parser that can read and apply them. Actually, there are several approaches:

  • Read the link rules, and directly apply them (sketched just below)
  • Read the link rules, and generate commands for one or another of the “dotfile manager” tools to put the files under management
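
A minimal sketch of the first approach, assuming the link rules get boiled down to a little two-column file of link-name and target pairs; the file name and format are my own invention, not CFEngine syntax:

    #!/bin/sh
    # Ensure each "link target" pair from links.list exists as a symlink.
    while read -r link target; do
        case "${link}" in ""|"#"*) continue ;; esac   # skip blanks and comments
        mkdir -p "$(dirname "${link}")"
        if [ "$(readlink "${link}")" != "${target}" ]; then
            ln -sfn "${target}" "${link}"
            echo "linked ${link} -> ${target}"
        fi
    done < links.list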

Cron Jobs

My use of CFEngine has gone through various bits of evolution over time.

  • Originally, I set up shellcommand rules to run interesting processes periodically, so that my crontab would run cfengine some number of times per hour, and the shellcommand rules would invoke the processes.
    This is well and fine, but it means there are two sources of truth as to what is running, namely what is in the crontab and what is in my CFEngine script. Two sources of truth are not particularly fun.
  • As a part of the “Managing Database Servers” work, years back, I had recognized that the above was not nice, and so wrote a script that would capture one’s crontab into a file in a specific place, complete with history. It would check the current crontab against the previous version, capturing a new version any time there was a change. This is an output-only approach, but nevertheless very useful for tracking the history of a crontab over time.
    I had never applied this at home.
  • I determined that I needed to fix my “two sources of truth” problem, so took measures to Do Better.
    • A first step was to capture, on each host, the current contents of users’ crontabs, and, as a better thing than before, capturing this in a versioned fashion into a Git repository. This provides the history that Managing Database Servers had done, but as it resides, version-controlled within Git, even better.
      # Capture the current user's crontab into a Git-tracked file,
      # named for user and host, committing a new version on any change.
      CRONHOME=${HOME}/GitConfig/InitFiles/cron
      USERNAME=$(whoami)
      HOST=$(hostname)
      CRONTABOUTPUT=${USERNAME}.${HOST}
      pushd ${CRONHOME}
      echo "Saving crontab to ${CRONTABOUTPUT}"
      crontab -l > ${CRONHOME}/${CRONTABOUTPUT}
      git add ${CRONTABOUTPUT}
      git commit -m "Saving crontab for user ${USERNAME} on host ${HOST}" ${CRONTABOUTPUT}
      popd
  • The new, still better step was to use editfiles to compute what I wanted to have in my crontabs. This constructs new files, $(CRONTABS)/$(hostname).$(username).wanted, consisting of everything that my CFEngine script decides ought to be running on this host, for this user. Thus, the CFEngine script represents the Single Point Of Truth as to what is supposed to be in my crontab. (Sketches of this follow the list.)
    I ran this, and in the interest of some lack of trust ;-), did not immediately automate application of this as a new crontab.
    • I did a nice manual run across each of my hosts, comparing the dumped crontab output with what was thought wanted, namely $(CRONTABS)/$(hostname).$(username).wanted
    • There were discrepancies (and since it wasn’t automatically applied, no consternation!), so some modifications were done to rectify shortcomings
    • When I concluded that everything matched my desires, it was apropos to run crontab against $(CRONTABS)/$(hostname).$(username).wanted so that the wanted state was applied
    • Now we have a single source of truth, plus an audit trail:
      • The captured-in-git history files document actual states of crontab over time
      • If I want to add or remove jobs, that takes place by modifying the CFEngine code to add/remove editfiles rules.
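
For illustration, such an editfiles rule in CFEngine 2 can look roughly like this; the class name and the cron line are hypothetical:

    editfiles:
        backuphosts::
            { $(CRONTABS)/$(hostname).$(username).wanted
                AutoCreate
                AppendIfNoSuchLine "30 2 * * * $(HOME)/bin/run-backup.sh"
            }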
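
The compare-then-apply step is then small enough to do by hand; this sketch assumes a shell variable CRONTABS pointing at the same directory as the CFEngine variable:

    WANTED=${CRONTABS}/$(hostname).$(whoami).wanted
    crontab -l | diff - ${WANTED}    # inspect discrepancies first
    crontab ${WANTED}                # apply once satisfied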

This is not exactly a “migration away from CFEngine”, but it does make for a way better controlled set of cron jobs.

I am quite sure that I do not know what would be much better. I have looked into cron alternatives both small and large. At one point, we did a Proof of Concept at work looking at Dollar Universe (now a Computer Associates product), at the really sophisticated end. For my personal systems, that would be ridiculous overkill, but there are places where it’s going to be a good choice.

Cron has a number of weaknesses:

  • Not very easily auditable
  • Not good at handling “flow control” where a system may be getting overloaded by the set of cron jobs getting invoked
  • No in-system awareness of jobs that should be mutually exclusive or that should be ordered. (“Don’t run A and B simultaneously; make sure to only run B after having run A”)
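
On the mutual-exclusion point, a common partial workaround is to wrap jobs in flock from util-linux, though that is bolted onto the jobs rather than anything cron itself understands; the lock path and job names here are purely illustrative:

    # Both entries contend for one lock file; with -n, a job simply
    # skips its run if the other is still holding the lock.
    0 1 * * * flock -n /tmp/ab.lock $HOME/bin/job-a
    0 3 * * * flock -n /tmp/ab.lock $HOME/bin/job-b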

Nevertheless, for small-ish tasks, where exact timing isn’t too critical and where conflicts may be addressed by running jobs in separate hours of the day, it isn’t worth taking on a job scheduling system that is far more complex to manage and heavier-weight to run.

One would-be alternative to cron that looks somewhat interesting is pg_timetable, which has its data store backed by PostgreSQL and which has a notion of “task chains.”

At one point, I did a bit of work creating a “pg_cron,” which had loosely similar requirements. It never reached the point of working; the place where I was pointedly short on answers was how to establish the working environment for tasks. The environment needs to be “portable” in a number of ways; you’d want to be able to control tasks running on remote hosts, too. David Tilbrook’s QEF environment seemed to have relevance; it had ways of managing the launching of work agents with tight control over the environment they would receive. Unfortunately, time just hasn’t permitted experimenting more deeply with that.