Staff Site Reliability Engineer
Job Summary
We’re seeking a Staff Site Reliability Engineer to serve as a technical leader within our infrastructure organization. In this role, you’ll help shape the reliability strategy across our engineering teams, drive adoption of best practices, and tackle our most complex infrastructure challenges. You’ll be part of an international, highly engaged and technical group that is well-versed in building enterprise-ready and extremely secure software systems. Our core values of “simple is strong, respect is king, build it like you own it and think like a hacker” should resonate with you.
Essential Duties and Responsibilities
Define and drive the technical vision for infrastructure reliability across the organization
Architect large-scale, fault-tolerant systems on AWS using Terraform
Lead cross-functional initiatives to improve system reliability, scalability, and efficiency
Establish standards for infrastructure-as-code, CI/CD, and deployment practices
Design and implement solutions for our most complex operational challenges
Lead incident response for critical outages and drive systemic improvements
Mentor senior engineers and help grow the SRE team’s capabilities
Evaluate and introduce new technologies that improve operational excellence
Influence engineering culture around reliability, observability, and operational maturity
Education, Experience, Skills, & Abilities
5+ years of experience in SRE, DevOps, or systems engineering, with demonstrated technical leadership
Expert-level knowledge of Terraform, including module design, state management, and scaling IaC across teams
Deep expertise in AWS architecture and services at scale, with strong focus on ECS
Proven experience designing and operating containerized workloads on ECS, including capacity planning, service scaling, and task placement strategies
Strong experience designing and implementing CI/CD systems with GitHub Actions or similar tools
Track record of leading complex, cross-team technical initiatives
Advanced proficiency in Python, Ruby, JavaScript, or similar languages
Strong understanding of distributed systems principles
Excellent written and verbal communication skills
Proven ability to balance long-term technical strategy with immediate operational needs
Preferred Experience
Experience building internal developer platforms or self-service infrastructure tooling
Knowledge of FedRAMP
Background in cost optimization and FinOps practices
Contributions to open-source infrastructure projects
Experience scaling infrastructure organizations and processes
Experience defining and implementing SLO frameworks
Working Conditions
The ideal candidate must be able to complete all physical requirements of the job with or without reasonable accommodation.
Sitting and/or standing - Must be able to remain in a stationary position 50% of the time.
Carrying and/or lifting - Must be able to carry/move a laptop as needed throughout the workday.
Environment - remote, work-from-home 100% of the time.
ADA Statement
Bugcrowd is committed to the full inclusion of all qualified individuals. In keeping with our commitment, Bugcrowd will take steps to ensure that people with disabilities are provided reasonable accommodations. Accordingly, if reasonable accommodation is required to fully participate in the job application or interview process, to perform the essential functions of the position, and/or to receive all other benefits and privileges of employment, please contact HR at ada@bugcrowd.com.
Pay Range Disclosure
At Bugcrowd, we strive for fairness and equality, and to create an environment that allows our people to perform at their very best. Our compensation philosophy is to foster a collaborative community that rewards, attracts, and retains the best possible talent. The provided salary details are based on US national averages, and we retain the flexibility to tailor them to the needs of the business.
The national estimate for the current base range for the position of Staff Site Reliability Engineer is: $151,040 - $188,800.
This position may also be eligible to participate in a discretionary bonus program or commission plan, subject to the rules governing the program, whereby an award, if any, depends on various factors, including, without limitation, individual and organizational performance.
