Reliability Architect

Twilio

Full-time
USA, Canada
devops
aws
architecture
saas
cloud

See yourself at Twilio

Join the team as Twilio’s next Reliability Architect.

About the job

As an Architect in SRE, you will drive the technical strategy, vision and outcomes for Twilio’s Reliability Engineering organization. You will define and lead solutions and initiatives that ensure Twilio products are reliable worldwide, and you will define standards and guide engineering teams on best practices for designing, building, and operating resilient systems. This role is pivotal to Twilio’s commitment to operational excellence, scalability, and pragmatic, large-scale systems design in the cloud.

Responsibilities

In this role, you’ll:

  • Partner with senior technical leaders across Twilio to set and communicate the reliability strategy, translating business goals into measurable outcomes.

  • Influence company-wide architectural decisions while balancing long-term vision with near-term and compliance needs.

  • Lead the design, implementation, and operation of scalable solutions and paved roads that enable reliable, high-traffic services.

  • Influence company-wide architectural decisions to focus on availability, performance, resilience, and cost efficiency using Kubernetes, AWS, Terraform, and modern observability.

  • Ensure integrity and quality across the service lifecycle; design fault-tolerant architectures, incident response, disaster recovery, and capacity/cost management.

  • Collaborate with product and cross-functional teams to identify reliability risks and convert them into actionable designs, programs, and tooling.

  • Establish and champion reliability practices and drive systemic improvements.

  • Mentor and grow engineers and technical leaders.

  • Track and apply emerging SRE, cloud, and large-scale systems best practices; introduce pragmatic innovations that improve reliability at scale.

Qualifications 

Twilio values diverse experiences from all kinds of industries, and we encourage everyone who meets the required qualifications to apply. If your career is just starting or hasn't followed a traditional path, don't let that stop you from considering Twilio. We are always looking for people who will bring something new to the table!

Required:

  • Excellent generalist knowledge of software engineering and delivery.

  • In-depth understanding of the role of Reliability Engineering in a large and diverse SaaS organization.

  • Previous experience driving cross-org technical architecture outcomes.

  • Knowledge of cloud architecture, DevOps practices, and large-scale systems design with microservices.

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field (or equivalent experience).

  • 10+ years of experience in Reliability Engineering, DevOps, or Software Engineering roles with a focus on infrastructure, backend systems, and reliability.

  • Strong production experience, including operational management, scaling, partitioning strategies, and tuning for performance and reliability in high-scale environments.

  • Hands-on experience with Kubernetes (e.g., EKS), deploying and managing stateful services, and cloud services like AWS.

  • Proficiency in infrastructure-as-code tools such as Terraform or CloudFormation for automating infrastructure.

  • Expertise in observability tools (e.g., Prometheus, Grafana, Datadog) for monitoring distributed systems and setting up alerting.

  • Proficient in at least one programming language (e.g., Go, Python, Java) for building automation and tooling.

  • Experience designing incident response processes, SLOs/SLIs, runbooks, and participating in on-call rotations.

  • Experience running cross-functional post-incident reviews and driving improvements.

  • Strong understanding of distributed systems principles, including consensus, durability, throughput, and availability tradeoffs.

  • Proven track record of leading reliability improvements in data-intensive or mission-critical systems and collaborating with engineering teams.

  • Excellent problem-solving, analytical, verbal, and written communication skills, with the ability to work in cross-functional and distributed environments.

  • Demonstrated leadership in mentoring teams, influencing decisions, and balancing long-term objectives with short-term needs.

  • Excellent written and verbal communication skills.

  • Ability to influence and build effective working relationships with all levels of the organization.

Desired:

  • Specific experience owning and operating large AWS footprints.

  • Knowledge of Kubernetes architecture and concepts.

  • Experience with data technologies like Apache Kafka, AWS MSK, or similar for reliable streaming.

  • Passion for building reliable products, with prior projects in high-availability systems.

Location

This role will be remote and based in Ireland.

Travel 

We prioritize connection and opportunities to build relationships with our customers and each other. For this role, you may be required to travel occasionally to participate in project or team in-person meetings.

What We Offer

Working at Twilio offers many benefits, including competitive pay, generous time off, ample parental and wellness leave, healthcare, a retirement savings program, and much more. Offerings vary by location.

Posted 5 hours ago
Working Nomads curates remote digital jobs from around the web.

© 2025 Working Nomads.