Staff Engineer, GitLab Delivery - Operate

GitLab

Full-time
North America, Latin America, EMEA
engineer
devops
ruby on rails
architecture
postgresql

An overview of this role

As a Staff Engineer within the GitLab Operate team, you will lead the technical direction for GitLab's self-managed deployment strategy, with a particular focus on solving zero-downtime upgrades and operational excellence at scale. This is a high-impact technical leadership role where you'll architect and implement the systems that enable thousands of organizations to deploy, upgrade, and operate GitLab reliably in their own infrastructure.

You'll be the technical anchor for our newly formed Operate team, driving the evolution of GitLab's deployment tooling from traditional packaging approaches toward cloud-native, operator-driven automation. Your work will directly impact GitLab's ability to deliver new features to self-managed customers faster while dramatically reducing operational complexity and upgrade friction.

The GitLab Operate team serves as a critical bridge between GitLab engineering and our self-managed customers, ensuring our products are easily deployable, secure, and scalable across a range of environments—from single-node VM deployments to large-scale Kubernetes clusters supporting tens of thousands of users.

What you'll do

Technical Leadership & Architecture

  • Define the technical vision for the future of GitLab's cloud-native deployments and upgrades, balancing operational simplicity, customer needs, and engineering constraints

  • Lead the design and implementation of the new tooling, including Operator(s), enabling automated lifecycle management and zero-downtime upgrades

  • Architect upgrade orchestration systems that safely coordinate complex multi-component upgrades across databases, application services, and auxiliary components

  • Establish operational maturity standards and guidance for new services being integrated into GitLab's deployment tooling, and empower development teams to own their components end to end

  • Drive technical decisions around service integration patterns, deployment models, and operational interfaces

  • Lead complex initiatives spanning multiple groups, and be the technical leadership voice that sets the direction and drives technical decisions

Platform Engineering & Development

  • Design production-grade Kubernetes Operators with reliable reconciliation logic for complex stateful applications

  • Design and implement upgrade orchestration that handles database migrations, rolling deployments, compatibility checks, and rollback capabilities

  • Develop tooling and automation to reduce the operational complexity of running GitLab at scale

  • Create integration frameworks that enable development teams to ship new services with standardized deployment patterns

  • Maintain and evolve GitLab Helm Charts to support both simple and complex deployment topologies

Database & Application Lifecycle Management

  • Contribute to safe database migration strategies for zero-downtime upgrades across PostgreSQL and other stateful components

  • Implement compatibility layers that enable incremental upgrades without requiring simultaneous updates across all components

  • Design and contribute to build validation and pre-flight check systems that detect potential upgrade issues before they impact production

Cross-Functional Collaboration & Enablement

  • Partner with development teams to define integration requirements for new services and features

  • Collaborate with GitLab Dedicated and GitLab.com SRE teams to align deployment patterns and operational practices

  • Work with Product Management to translate customer needs into technical requirements

  • Mentor and guide other engineers on the team, establishing technical standards and best practices

  • Create technical documentation and runbooks that enable customer success and support teams

Production Operations & Reliability

  • Define and implement observability standards for self-managed deployments, including metrics, logging, and alerting

  • Build automated testing frameworks that validate deployment and upgrade scenarios across reference architectures

  • Establish performance benchmarks and capacity planning guidance for different deployment scales

  • Design resilience patterns for handling failures during upgrades and operations

  • Contribute to incident response and post-mortems for self-managed deployment issues

What you'll bring

Required Experience & Skills

  • 8+ years of software engineering experience, with at least 3 years in platform engineering or infrastructure roles

  • Expert-level Go proficiency (Ruby and Rails are a plus) with demonstrated ability to work in large, complex codebases

  • Production Kubernetes experience, including:

    • Building and maintaining Kubernetes Operators

    • Designing Helm charts for complex stateful applications

    • Understanding of Custom Resource Definitions (CRDs), admission controllers, and controller patterns

    • Experience with stateful workloads, persistent volumes, and storage classes

  • Cloud-native architecture experience, including service mesh, observability stacks, and infrastructure as code

  • Experience shipping production software that customers install and operate in their own infrastructure

  • Understanding of Linux systems, including package management, systemd, and system-level debugging

Highly Valued Experience

  • Experience building or maintaining Operators for complex stateful applications (databases, message queues, etc.)

  • Ruby on Rails expertise and understanding of Rails application architecture

  • Infrastructure automation using Terraform, Ansible, or similar tools

  • Background in Site Reliability Engineering or DevOps with production on-call experience

  • Understanding of compliance and security requirements for enterprise software deployments

  • Experience with observability platforms

  • Open source contribution history, particularly in infrastructure or deployment tooling

Technical Leadership Qualities

  • Technical influence and communication: Ability to design holistic solutions balancing multiple constraints, write clear technical proposals and documentation, and work across teams, influencing without direct authority

  • Team development and execution: Track record of mentoring and elevating team capabilities through teaching and code review, combined with pragmatic decision-making and a bias for action when facing incomplete information

What Makes You Stand Out

  • You've built Kubernetes Operators in production and dealt with the operational complexities of stateful workload management

  • You have deep PostgreSQL expertise, including schema design, migration strategies, replication, backup and recovery, and handling database upgrades with minimal downtime

  • You have deep experience with database migrations at scale and understand the tradeoffs between downtime and complexity

  • You've shipped software that customers install on-premises and have felt the pain of upgrade friction firsthand

  • You contribute to open source infrastructure projects and understand community dynamics

  • You can explain complex technical concepts clearly to both technical and non-technical audiences

  • You have experience with zero-downtime deployment strategies for monolithic applications transitioning to microservices

  • You've been on-call for production systems and understand what makes software operable

About the team

The Operate team is part of GitLab Delivery and focuses on delivering GitLab to self-managed users through supported and validated tooling. This includes maintaining and evolving the GitLab Omnibus package, Helm Charts, GitLab Operator, and the GitLab Environment Toolkit (GET).

We partner with SRE, Release, Security, and Development teams to ensure GitLab is easily deployable, supportable, and production-ready in diverse environments—from small single-node deployments to large enterprise-scale Kubernetes clusters.

Current challenges we're tackling:

  • Zero-downtime upgrades: Enabling self-managed customers to upgrade GitLab without service interruption

  • Operational complexity: Reducing the burden of managing GitLab at scale while expanding our service architecture

  • Cloud-native transition: Building the next generation of deployment tooling while supporting existing customers

  • Upgrade velocity: Cutting the time it takes for 80% of self-managed customers to adopt new releases, from 7.8 months to 4 months

Team structure:

You'll be joining a newly consolidated Operate team that is building the capability to deliver GitLab's expanding service architecture to self-managed customers. As a Staff Engineer, you'll work closely with the engineering manager and product manager to define technical direction while mentoring other engineers on the team.


About the job

Posted 17 hours ago
Working Nomads curates remote digital jobs from around the web.

© 2025 Working Nomads.