Site Reliability Engineer

AffiniPay

Full-time

USA

$120k-$185k per year

Posted 1 year ago

The job listing has expired. Unfortunately, the hiring company is no longer accepting new applications.

AffiniPay is looking for a Site Reliability Engineer to help us build and maintain our next-generation platform. You will be a member of the newly created Platform Engineering Organization, which is responsible for software delivery and infrastructure management, with a focus on availability, performance, and security, and for the core service layers supporting our core payments platform and practice management software. We serve over 200,000 legal professionals and process over $18 billion in payments every year! You will work closely with our development team to ensure they have the tools needed to build, test, and deploy code with ease, and to ensure developer and QA engineer success. You will engineer and maintain configuration management environments, software delivery pipelines, IaC processes, technical governance, and observability tools, among other areas of platform availability.

What You’ll Do:

Automate deployment, monitoring, management, and incident response for 100% cloud-based infrastructure
Monitor site availability, stability, and performance; resolve site issues; and work with the observability team to fine-tune alerting, monitors, and metrics
Scale infrastructure to meet rapidly growing demand with attention to cost and customer usage patterns
Monitor and evolve system performance, using observability metrics to inform decisions and build the case for needed changes
Participate in the design and implementation of robust disaster recovery solutions supporting company-defined restore point and time objectives
Share in new feature design as the cloud SME supporting container/k8s-first designs
Collaborate with developers to bring new features and services into production while maintaining structure and standards for introducing new tech
Develop and improve operational best practices and procedures
Participate in an on-call rotation and provide emergency support outside of normally scheduled hours as needed
Design, provision, and manage cloud infrastructure in AWS
Propose standard methodologies for continuous integration and deployment, performance and health monitoring, and alerting.
Develop, implement and own internal tooling and automation as your product

About You:

5+ years of professional technical experience in software engineering, cloud operations or a related function.
3+ years of experience within a DevOps or SRE role
Documented experience provisioning and managing infrastructure in a public cloud environment such as AWS (preferred), Google Cloud Platform, or Azure
Experience with Linux container technologies, such as Docker, OCI, and LXC, and the ability to administer, build, and deploy images in an automated manner.
Solid understanding of how a Kubernetes Platform operates (service discovery, deployments, monitoring, scheduling, load balancing)
Proficiency in at least one, and ideally more, of the following programming or scripting languages: Ruby, Python, Java, JavaScript (Node.js), Bash
Extensive experience utilizing Infrastructure as Code to provision and maintain infrastructure: Terraform (preferred), CloudFormation
Experience with relational database systems such as MySQL (preferred) and PostgreSQL, to include understanding and optimizing SQL queries
Experience with Kafka on an MSK environment is a strong plus
Experience with Elasticsearch is a plus
Experience implementing and maintaining CI pipelines using CircleCI (preferred), Jenkins, GitHub Actions, GitLab, Azure DevOps, etc.
Experience with Windows Server is a plus
Solid understanding of networking concepts and how they are applied and controlled in a cloud environment
Experience managing resources in a PCI-controlled environment (highly preferred)
An understanding of and passion for developing highly secure and highly available systems
Bachelor's degree in Computer Engineering, Computer Science, or a related field (or equivalent experience).

Additional Information:

The base pay range for this position is $120,000 to $185,000 USD annually. The salary range for performing this role outside of the US / Austin / California may differ. AffiniPay is committed to offering competitive, fair, and commensurate compensation and has provided an estimated pay range for this role. Actual compensation may vary based on job-related knowledge, skills, experience, and education.

If you're located near our San Diego or Austin offices, we offer hybrid work options.
