Intermediate Site Reliability Engineer - Tenant Scale: Tenant Services

GitLab

Full-time
North America, Latin America, EMEA
engineer
aws
saas
cloud
kubernetes

An overview of this role

As a Site Reliability Engineer (SRE) at GitLab, you keep GitLab.com and other production systems running smoothly for millions of users by combining pragmatic operations with strong software engineering practices. You focus on the systems layer (operating systems, storage, networking), edge services, and Kubernetes workloads, designing and operating highly scalable, reliable, and secure infrastructure that supports one of the largest single-tenancy open source SaaS sites on the Internet. You’ll work across the Infrastructure organization to automate away toil, improve availability and performance, and respond to incidents during your local daytime hours as part of a globally distributed on-call rotation. In this role, you’ll help Tenant Services safeguard and scale customer data while increasing automation so GitLab can continue to grow with enterprise-level expectations for reliability and availability.

What you’ll do

  • Design and implement highly scalable infrastructure for GitLab.com to support current and future growth.

  • Collaborate with cross-functional teams across the Infrastructure organization to plan and deliver projects that shape GitLab’s platform direction.

  • Operate and improve edge services and Kubernetes workloads, acting as a subject matter expert within the infrastructure department.

  • Participate in a global on-call rotation during your local daytime hours, respond to production incidents, and contribute to clear, constructive incident reviews.

  • Reduce toil by automating operational tasks and building tools that improve reliability, availability, and scalability.

  • Apply infrastructure as code and configuration management practices to manage cloud resources and environments consistently.

  • Write and maintain production-quality code, preferably in Go or Ruby, to enhance our systems and automation toolchain.

What you’ll bring

  • Background working with the Kubernetes ecosystem, including tools such as Helm, and running production workloads.

  • Experience operating cloud infrastructure on platforms like Google Cloud Platform or Amazon Web Services, especially networking, hosted Kubernetes services, and scaling.

  • Hands-on practice with infrastructure as code and configuration management tools such as Ansible or Chef.

  • Strong programming skills in a modern language, preferably Go or Ruby, applied to automation and reliability problems.

  • Ability to clearly define problems, think beyond short-term fixes, and design solutions that improve systems over time.

  • Consistent focus on reducing toil through automation and thoughtful system design.

  • Independent, proactive working style with a bias for action and comfort operating as a “manager of one” in a distributed, asynchronous environment.

  • Clear written and verbal communication skills, with openness to candidates who bring transferable experience from related reliability, infrastructure, or platform roles.

About the team

Tenant Services is the team responsible for safeguarding and securing customer data stored by the GitLab application and for setting clear guidelines for how that data is accessed. The team runs the largest GitLab instance in existence, and one of the largest single-tenancy open source SaaS sites on the Internet, which means you’ll work on unique scale and reliability challenges that impact users every day. As an all-remote, globally distributed group, Tenant Services collaborates asynchronously across time zones and leans heavily on automation to meet enterprise expectations for reliability, availability, and data protection while continuing to scale. For more on how this team works, see our Team Handbook page.

About the job

Mid Level
Posted 4 hours ago


Working Nomads curates remote digital jobs from around the web.

© 2026 Working Nomads.