Senior Site Reliability Engineer
The Role
You'll be given lots of responsibility and the opportunity to have true ownership as we build out the product. This is a unique opportunity to use your engineering powers to make a direct impact on people's lives. We need a Site Reliability Engineer who is enthusiastic about building reliable, scalable, and flexible systems to support our growing team, product, and user base. You'll work with other engineers to reliably release and maintain services, and help define and meet internal and customer-facing SLAs and SLOs.
This position is not eligible to be performed in Hawaii.
What You’ll Do
Manage and orchestrate Cloud Resource (AWS) configuration using Infrastructure As Code (Terraform) to empower engineering staff to embrace a DevOps culture of Self Service Ownership
Develop and govern Observability (Datadog) best practices for tracking platform performance and health trends to meet customer SLAs and lead technical decisions with strong supporting evidence
Create solutions that scale dynamically with demand and are flexible enough to pivot with fast-changing project requirements, while maintaining a balance of good versus perfect
Provide clear, consistent updates on technical progress and blockers to keep stakeholders informed, and document technical designs to spread knowledge and reduce information silos
Participate in the 24/7 on-call rotation, respond to critical alerts, and follow documented incident investigation procedures to restore customer-facing feature availability
Maintain HIPAA, GDPR, and SOC 2 compliance and general security through best practice implementation
Who You Are
3+ years of experience in software engineering, with 2 years of experience in DevOps
Cloud Provider (AWS, GCP, Azure) experience managing resources through Infrastructure As Code (Terraform)
Container Orchestration (ECS or K8s) experience to confidently build, test, and release containerized applications for multiple environments and regions.
Knowledge of Observability best practices across common cloud resources (EC2, ECS, RDS, DynamoDB, S3, SQS, EventBridge) with experience rolling out enhancements across a distributed platform with scale in mind.
Experience with shell scripting for *nix systems
Experience with Networking for web applications
Effective at communicating ideas through writing and diagramming
Comfortable working with a distributed development and ops team
Familiarity with: AWS (ECS and cloud hosting); GitLab (CI/CD); Python (Django, Flask, aiohttp); Bash; Data (PostgreSQL, Redis); Monitoring (Datadog and Sentry); IaC (Terraform, Packer)
Benefits
Fundamentals:
Medical / Dental / Vision / Disability / Life Insurance
High Deductible Health Plan with Health Savings Account (HSA) option
Flexible Spending Account (FSA)
Access to coaches and therapists through Modern Health's platform
Generous Time Off
Company-wide Collective Pause Days
Family Support:
Parental Leave Policy
Family Forming Benefit through Carrot
Family Assistance Benefit through UrbanSitter
Professional Development:
Professional Development Stipend
Financial Wellness:
401k
Financial Planning Benefit through Origin
But wait there’s more…!
Annual Wellness Stipend to use on items that promote your overall well-being
New Hire Stipend to help cover work-from-home setup costs
ModSquad Community: Virtual events like active ERGs, holiday-themed activities, team-building events, and more
Monthly Cell Phone Reimbursement