Senior Site Reliability Engineer
EMEA, Remote
As a Site Reliability Engineer at GitLab, you are responsible for keeping all user-facing services and other GitLab production systems running smoothly. SREs are a blend of pragmatic operators and software craftspeople that apply sound engineering principles, operational discipline, and mature automation to our operating environments and the GitLab codebase.
SREs specialize in systems (operating systems, storage subsystems, networking), while implementing best practices for availability, reliability and scalability, with varied interests in algorithms and distributed systems.
GitLab.com is a unique site and brings with it unique challenges: it is the largest GitLab instance in existence (and in fact, one of the largest single-tenancy open-source SaaS sites on the Internet). The team’s experience feeds back into other Engineering groups within the company, as well as to GitLab customers running self-managed installations.
What you’ll do
Work embedded in the IDE team. You will report to the Practices team’s manager but will participate in the IDE team’s processes and meetings to make sure you are successful.
Effectively manage and maintain the infrastructure for remote development
Manage application deployments using Helm charts and the Kubernetes ecosystem.
Optimize the infrastructure on Google Cloud Platform using GKE, Terraform, Ansible, and other tools.
Track the health, performance, and availability of the remote development infrastructure using observability and monitoring tools
Participate in an on-call rotation
What you’ll need
A deep understanding of Kubernetes and its core concepts
Experience with the Kubernetes ecosystem
Google Cloud Platform expertise, specifically around networking, GKE configuration, and scaling
Experience with Terraform, Ansible, and other tools like Chef.
Experience with Observability and Monitoring tools
Previous experience in an SRE or related team, driving efforts and mentoring others.
Think about systems: edge cases, failure modes, behaviors, specific implementations.
Know your way around Linux and the Unix Shell.
Have an urge to collaborate and communicate asynchronously.
Bonus: Programming experience in Shell, and Ruby and/or Go
About the team
The Practices Team is a subgroup of the Reliability Team.
Our mission is to ensure the reliability, performance, and availability of GitLab.com by partnering with Stage Groups so that features and services are designed and implemented with reliability in mind. The team collaborates with Stage Groups to build, maintain, and improve services and to ensure that each service meets its SLO in line with GitLab.com's availability and performance goals.
How GitLab will support you
All remote, asynchronous work environment
Home office support
Please note that we welcome interest from candidates with varying levels of experience; many successful candidates do not meet every single requirement. Additionally, studies have shown that people from underrepresented groups are less likely to apply to a job unless they meet every single qualification. If you're excited about this role, please apply and allow our recruiters to assess your application.