I just wrote a puppet manifest that applies a patch to a puppet ruby file on the destination system, due to a bug in the version of augeas on the machine. Sidenote: who do I call to get my devops license revoked?

----

Regarding fleet/etcd/docker/confd: it seems very non-trivial to keep track of systemd service files, confd templates, and Dockerfiles. Definitely more trivial than puppet-based automation, but far less trivial than it is sold as. Is there something I'm missing about how it all works together? Are we expected to build or use a system built on top of this, such as flynn or deis?

----

My hunch about coreos/fleet was right. It is spelled out directly on the fleet github page... so less of a hunch and more of a direct and specific project design:

"fleet ties together systemd and etcd into a distributed init system. Think of it as an extension of systemd that operates at the cluster level instead of the machine level. This project is very low level and is designed as a foundation for higher order orchestration."

----

fleetd and systemd are still confusing me. TIL: there is more state to a systemd and/or fleetd unit than simply 'started' and 'stopped':

* systemdActiveState: active/inactive/???
* systemdLoadState: loaded/launched/???
* systemdSubState: dead/running/failed/???

From what I gathered, a **loaded/active/running** state is what happens when a service is started as a dependency of another service. **launched/active/running** is what happens when a service is started manually with fleetctl start. A **launched/failed/failed** unit can't simply be started; it must be loaded and then started.

I am still trying to figure out how this affects the systemd.service Restart directive, which I cannot get to restart anything on non-zero exit codes. I'm not 100% convinced many of the systemd features work alongside fleetd and docker. I was hopeful systemd would be my nanny process for ensuring crashed processes get restarted, but I think it is time to introduce a new service into my test bed before I get too frustrated at systemd.

EDIT: Looks like the Restart directive demands a timeout (RestartSec=), although I have not found the documentation that proves it.

----

This week I dived into the logging setup within systemd. Logging now feels like such a 'solved problem' within my systemd experiments: my sufficiently complex multi-machine vagrant setup has makeshift log aggregation using journalctl, netcat, and socat.

`log-server.sh`, running on my host machine:

```
socat -u tcp-l:8888,reuseaddr,fork STDOUT
```

`log-agent.sh`, running on each machine:

```
DATE="`date '+%Y-%m-%d %H:%M:%S'`"
journalctl -f --since="$DATE" | nc hostip 8888
```

This is, of course, not a solution specific to systemd or binary logging formats. However, journalctl seems to double as a simple logging agent itself, allowing streaming, filtering, cursors, and multiple output formats (json, single line, etc.).

We can further extend log-agent to include the hostname or other metadata:

```
DATE="`date '+%Y-%m-%d %H:%M:%S'`"
journalctl -f --since="$DATE" \
  | sed -u "s,^,[ `hostname` ] ,g" \
  | nc hostip 8888
```

Finally, I would love to replace nc with socat everywhere, using OPENSSL: and OPENSSL-LISTEN: to encrypt the logging traffic as it goes from log-agent to log-server.
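Something like the following should do it, though I have not tested it yet. It assumes each side has a PEM bundle with its own cert and key (`server.pem`, `client.pem`) and a copy of the peer's certificate for verification (`client.crt`, `server.crt`); those file names are placeholders of mine, not anything this setup already has:

```
# log-server.sh (untested sketch): same as before, but terminate TLS
# instead of listening on raw TCP; client.crt verifies connecting agents
socat -u OPENSSL-LISTEN:8888,reuseaddr,fork,cert=server.pem,cafile=client.crt STDOUT
```

```
# log-agent.sh (untested sketch): identical pipeline, nc swapped for socat
DATE="`date '+%Y-%m-%d %H:%M:%S'`"
journalctl -f --since="$DATE" \
  | sed -u "s,^,[ `hostname` ] ,g" \
  | socat -u STDIN OPENSSL:hostip:8888,cert=client.pem,cafile=server.crt
```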
----

I used to create [Automation Scripts](http://docs.ansible.com/ansible/playbooks.html) for my CI tooling. Now I find that using [Bitnami](https://bitnami.com) Stacks is just easier, with no need to worry about fiddling with details. They have really nicely configured, up-to-date releases that are dead easy to deploy.

----

Jelastic cloud isn't digging [phusion baseimage](https://github.com/phusion/baseimage-docker)... lots of fails in run.log. And Jelastic does *NOT* support restarts like docker-engine does. **Check mate.**

Or... maybe I can just remove the cron and syslog daemons from the image, rebuild baseimage, and deactivate cron. Or just maybe [DIY](https://www.sourcediver.org/blog/2014/11/17/using-runit-in-a-docker-container/) with runit, as sketched below.
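If I do go the DIY route, the linked article boils down to running runsvdir as the container's top-level process and giving each daemon its own service directory. A rough, untested sketch; `myapp` and its path are hypothetical stand-ins for the real service:

```
#!/bin/sh
# /etc/service/myapp/run -- runit re-runs this whenever it exits;
# 'myapp' is a hypothetical placeholder for the actual daemon
exec 2>&1
exec /usr/local/bin/myapp
```

```
# Dockerfile sketch: runsvdir as the top-level process, no cron/syslog baggage
FROM ubuntu:14.04
RUN apt-get update && apt-get install -y runit
COPY myapp.run /etc/service/myapp/run
RUN chmod +x /etc/service/myapp/run
CMD ["/usr/bin/runsvdir", "-P", "/etc/service"]
```

runit's runsv would then be the nanny process restarting myapp when it crashes, which sidesteps both the baseimage daemons Jelastic chokes on and the Restart= confusion above.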