Sr. Site Reliability Engineer, Data & Delivery

Toronto, Vancouver (Remote, ON or BC only)

Our Client is searching for an experienced Site Reliability Engineer (SRE) to join their Data and Delivery team. As a member of this team, you will focus on enabling reliable and efficient service delivery across our Engineering organization. We partner closely with contributors responsible for our Kubernetes platform, our VMware-based infrastructure, and our Observability systems.


Your Role 

In this role, you’ll continuously improve the ability of engineers to develop, test, release, and maintain their production services. You’ll help manage the systems and processes that ferry contributions from our workstations to our production environments, including SCM, Package Managers, Build Servers, Test Automation, Release Orchestration, and ultimately, the tooling that enables Environment Automation. To be successful, you’ll need to work with teams across the Engineering organization to understand their needs, and work closely with our internal Platform and Infrastructure teams to build and maintain the services that meet those needs.


Who You Are

You’re an active participant in a culture of sharing and learning. You believe that we succeed or fail as a team, and you confront problems (not people) when things are difficult. You’re an experienced technologist with a passion for DevOps, and you’ve spent a few years dealing with complex automation problems in a Linux/Unix ecosystem. We expect experience with most of the tools and concepts outlined in the Your Skills section (or comparable ones), but we know that nobody knows everything, and you’re a growth-oriented engineer, right?


Your Skills

    • Linux/Unix user experience, particularly with programming tools to automate tasks

    • SCM (Git, GitHub, Mercurial)

    • Package and Artifact Management (Artifactory, apt, pip, gem, npm)

    • Build Automation (Jenkins, GitHub Actions)

    • Configuration Management (Puppet)

    • Release Orchestration (Ansible, Capistrano)

    • Programming with at least one OOP language (Python preferred; you will also encounter Erlang, Elixir, and Ruby)

    • Scripting (distinct from mid-sized software development: as an SRE, you won’t be able to avoid hacking on Bash)

    • Experience with distributed systems