About the job

Full-time

USA

Posted 3 years ago

Skills: devops, python, docker, azure, cloud, golang, rust, cplusplus

Sr. Site Reliability Engineer

MobileCoin

The job listing has expired. Unfortunately, the hiring company is no longer accepting new applications.

Company: MobileCoin
HQ: San Francisco
Location: Fully Remote anywhere in the US or Canada
Position: Senior SRE
Comp: Base salary $160K-$190K plus coins and equity; total compensation is north of $300K

Responsibilities

Maintain, monitor and improve our Kubernetes clusters.

Maintain, improve, scale and secure our Azure infrastructure and Ubuntu Linux systems.

Assist our development teams in running, packaging, deploying and troubleshooting applications.

Work with developers on streamlining deployment processes with Jenkins and other tooling.

Be responsible for maintaining and improving multiple internal services, such as Kubernetes, Prometheus, and logging.

Monitor, triage and respond to alerts in our 24/7/365 environment.

Participate in design and code reviews, and ensure that the foundation for our services is best in class.

Evaluate new technologies, and design and implement them as appropriate.

Identify automation opportunities and implement them with custom or off-the-shelf solutions.

Requirements

You have extensive experience working in cloud-based systems operations.

You're very comfortable with the Linux command line.

You have extensive experience with Docker (building and running containers) and container orchestration (Kubernetes preferred).

You have experience with Prometheus and Grafana (preferred), or other monitoring systems (InfluxDB, StatsD, Graphite, etc.).

You have experience with CI pipelines and Jenkins (preferred).

You are security minded and follow standard security best practices (least privilege, common attack defenses, etc.).

You have a good understanding of computer networking, TCP/IP, load balancing, distributed computing, web services, and the fundamental protocols used by the internet (HTTP, HTTPS, DNS, etc.).

You have experience supporting production workloads and are familiar with monitoring concepts and tooling.

You're highly proficient in at least one programming or scripting language (Python, Go, Rust, Bash, etc.).

You're enthusiastic about working in a small, growing team. You are open and empathetic, and you care about putting the best ideas forward in a collaborative and helpful manner.

Nice to Have

Experience with Azure

Experience with Rust and/or C/C++

Experience with advanced CPU features in a container environment (SGX, GPU, etc.)
