Here in Football Radar's Engineering Department, we go by the name of the Systems Team, though to a certain degree all of the above can be used to cover what we do (maybe slightly fewer cables these days!). As we’ve been fairly quiet on the blog, it makes sense to give you an overview of what we do.
We underpin the ability of the other three Engineering teams to test, deploy and run code, and we aim to make that as easy as possible. We like it this way because it makes the engineers more productive and means that a systems blocker is rarely raised in standups. As Systems Engineers we do this using a few different technologies and platforms, which we have introduced and are responsible for maintaining.
Jenkins is our build server - it’s a pretty well known tool for the task. We use it for its excellent flexibility and the healthy ecosystem surrounding it. On it we build a combination of Scala, JS and C++, and test PHP, before packaging the results into a suitable distribution format. In our case these are Docker containers and fat JARs.
We introduced Docker to production in January 2015, having used it in testing for the preceding three months. This proved to be a great move for us, as it provides a consistent, repeatable platform on which to develop, test and deploy. So much has already been written about Docker on the web that we won’t go into detail here.
Mesos, Marathon and Chronos
The golden triangle of Mesos, with the Marathon and Chronos frameworks running on it, is where the vast majority of our workloads run. We introduced this into production at the same time as Docker. It’s a great combination that requires little attention from us on a day-to-day basis. It gives the engineers somewhere that is easy to deploy to and provides a simple dashboard for support purposes. We also use Marathon-LB, to which we have contributed code.
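To give a flavour of what deploying to Marathon involves, here is a minimal Python sketch that builds an app definition for a Docker container. It is an illustration rather than our actual configuration: the app id, image name, resource figures and `/health` path are all hypothetical, and the layout follows the older Marathon format in which `portMappings` sits inside the `docker` section.

```python
import json


def marathon_app(app_id, image, cpus=0.5, mem=256, instances=2, container_port=8080):
    """Build a Marathon app definition (as a dict) for a Docker container.

    All default values here are illustrative assumptions, not recommendations.
    """
    return {
        "id": app_id,
        "cpus": cpus,
        "mem": mem,  # memory in MiB
        "instances": instances,
        "container": {
            "type": "DOCKER",
            "docker": {
                "image": image,
                "network": "BRIDGE",
                # hostPort 0 asks Marathon to assign a host port dynamically.
                "portMappings": [{"containerPort": container_port, "hostPort": 0}],
            },
        },
        "healthChecks": [
            {
                "protocol": "HTTP",
                "path": "/health",  # hypothetical health endpoint
                "portIndex": 0,
                "gracePeriodSeconds": 30,
                "intervalSeconds": 10,
                "maxConsecutiveFailures": 3,
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical service name and registry, for illustration only.
    app = marathon_app("/example/service", "registry.example.com/example-service:1.0.0")
    print(json.dumps(app, indent=2))
```

The resulting JSON would typically be sent to Marathon's `/v2/apps` endpoint, after which Marathon takes care of starting the requested number of instances on the Mesos cluster and restarting any that fail their health checks.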
We use Pants to build the projects in our monorepo. As with Jenkins, we can build a wide range of targets and deploy to each of our environments simply. Pants is an open-source project to which we have contributed quite a bit of code, mainly to enable C++ builds for some recent projects that run on GPUs.
Keeping an eye on all of this would be impossible without some tools at our disposal. I'm sure you've read about our use of Prometheus. We also use Sensu and collectd to feed data into InfluxDB, where it is either graphed via Grafana or used to send alerts to Slack. This gives either the individual on call or the relevant team the information they need to resolve issues quickly at either the infrastructure or software level.
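As a rough illustration of the alerting end of that pipeline, the Python sketch below turns a metric reading into a message body of the kind a Slack incoming webhook accepts (a JSON object with a `text` field). It is a simplified stand-in, not our actual Sensu configuration: the function name, host, metric and threshold are all hypothetical.

```python
def build_alert(host, metric, value, threshold):
    """Return a Slack-style payload if value exceeds threshold, otherwise None.

    Hypothetical helper for illustration; real checks would live in the
    monitoring tool's own configuration.
    """
    if value <= threshold:
        return None  # reading is within limits, nothing to send
    return {
        "text": f":warning: {metric} on {host} is {value:.1f} (threshold {threshold:.1f})"
    }


if __name__ == "__main__":
    # Hypothetical reading: CPU at 95% against a 90% threshold.
    print(build_alert("web-1", "cpu_percent", 95.0, 90.0))
```

Keeping the check separate from the delivery step means the same payload could be posted to a channel, logged, or dropped, depending on who is on call.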
As for the DevOps bit of what we do, we’re all about driving improvement in our environment through testing new platforms and technologies as well as increasing automation. Our usual process is that, on finding something new, we make a quick evaluation of it as a team. If we think it may have legs, we create prototypes to demo to the wider engineering team before seeking a consensus on whether to bring it into production.
This approach has worked well for us when bringing in radically different approaches. For instance, when we implemented Mesos, Marathon and Chronos we were moving away from a set of pet servers and heading into the world of cattle-wrangling. As this was going to represent quite a change, we ran a series of interactive demos for the engineering team, providing a forum for questions to be asked and opinions to be sought before making the switchover. This made it a painless process that was well received and had the added benefit of reducing the fear factor for those who were new to support.
We’re less cable monkeys these days. Having said that, the overall office network does fall under our responsibility and we do have a few servers in the office. These are for Active Directory and for internal monitoring and maintenance, such as Observium and a PXE server that runs Clonezilla to simplify new machine setup. Because of this, we tend not to remove any malware or viruses that get past either the firewall or the antivirus; instead we find that reimaging the machines is actually quicker with Clonezilla running as a scripted environment.
Desktop support is outsourced, so isn’t something that we look after directly (though we have been known to help out if we’re quiet!). Helping out is good for relationships within the company as a whole.
If you think you’d like to be part of an environment where you have a team of people making things as easy as possible for you then we’re always on the lookout for new software engineers. Have a look at the roles we’re currently recruiting for.
Image courtesy of Tristan Schmurr