Platform Engineer II

Kraken

Full-time

USA

$120k-$170k per year

platform engineer

engineer

devops

python

docker

Help us use technology to make a big green dent in the universe!

Kraken powers some of the most innovative global developments in energy.

We’re a technology company focused on creating a smart, sustainable energy system. From optimising renewable generation and creating a more intelligent grid to enabling utilities to provide excellent customer experiences, our operating system for energy is transforming the industry around the world in a way that benefits everyone.

It’s a really exciting time in energy. Help us make a real impact on shaping a better, more sustainable future.

What we do: build the most AI-driven, innovative, forward-thinking platform for energy management. From optimizing resources to delivering cost-effective, exceptional customer experiences through advanced Customer Information Systems (CIS), billing, meter data management, CRM, and AI-driven communications, Kraken is powering the next wave of innovation in the energy industry. We're an innovative and customer-focussed company, helping to drag the utilities industry into the 21st century.

Why we do it: future energy will not look like energy as we know it today. We need to not just think about our future, but build for it. Now.

Our Global Platform Engineering Reliability team is responsible for architecting, developing, and maintaining the resilient and scalable infrastructure that powers and supports our platform.

As a Platform Engineer within the Reliability team, you'll play a crucial role in ensuring the availability, performance, and scalability of our platform. Your proficiency in supporting platforms that serve millions of customers across many clients and timezones will ensure stability and high performance for our brands and clients.

Staying abreast of the latest trends and best practices in architecting applications at scale will be vital. Your analytical skills and attention to detail will be indispensable as you pinpoint areas for enhancement, ensure optimal platform performance, and continuously improve our reliability and efficiency.

What you'll do:

Proactively monitor and ensure the reliability and performance of our platform services, including overseeing the daily operation of the deployment pipeline and addressing any emergent issues
Contribute to the building of initiatives that enhance scalability and system resilience, and evolve application deployment architecture to align with business growth
Collaborate with product teams to support the release of new features and services, ensuring adherence to reliability and performance standards
Support product teams in improving the performance and availability of their systems
Actively participate in incident response calls to mitigate, recover from, and resolve incidents
Guide product teams in designing systems for resilience and graceful failure under heavy load
Assist application teams with post-incident tasks and follow-ups, and contribute to the creation and review of post-mortem documentation
Analyse incident metrics to identify trends and potential improvements, communicating these insights to the development team
Help solve interesting and difficult problems. There’s a great opportunity for disruption in the global energy market

What you'll have:

Proficiency with AWS; we use a lot of different AWS services, not just the standard few
Good expertise in several of the following areas:
PostgreSQL, or a similar RDBMS, particularly in Amazon RDS at scale
Docker and Kubernetes; we use Amazon EKS in production
Python; particularly with Django, the Django ORM and Celery
Datadog, or a similar logging/monitoring tool
Message queues, event-driven async processing or similar technologies; we use RabbitMQ
Terraform, or a similar infrastructure-as-code tool
Experience with a Linux distribution
Previous experience working in small, highly-autonomous teams
Good communication, with a focus on doing this asynchronously across multiple timezones and countries

What will help:

Experience working on SaaS platforms, including engaging product teams to ensure up-skilling and knowledge sharing across teams
Experience managing and supporting a large-scale, internet-facing service
Previous experience working in a remote-first, asynchronous global team providing follow-the-sun support
Experience in responding to incidents and outages, writing technical incident reports and organising incident retrospectives
Experience managing very large relational databases
Experience in using service level objectives to improve application performance
A proactive, innovative mindset

Why you'll love it here:

Great medical, dental, and vision insurance options including FSAs.
Paid time off: we know working hard also means being able to recharge as needed. We trust our employees to get the work done and take the time they need.
401(k) plan with employer match.
Parental leave. Biological, adoptive and foster parents are all eligible.
Pre-tax commuter benefits.
Flexible working environment: need to shift your schedule around? You do you. We genuinely believe in work/life balance.
Equity Options: every Kraken employee owns part of the business. We’re a team, working together towards huge goals. Every person is crucial to our success, and you should be rewarded as such.
Modern office or co-working spaces depending on location.
The salary for this role ranges on average from $120,000 to $170,000 (with some flexibility) depending on relevant experience, location, role alignment, and technical/client management expertise demonstrated throughout the interview process. While the broad salary range is listed, not all candidates will be placed at the top of the range; placement will be determined by the overall fit for the position. If you have questions about this, just ask! Our recruiters are happy to provide more context.

If this sounds like you then we'd love to hear from you.

Are you ready for a career with us? We want to ensure you have all the tools and environment you need to unleash your potential. Whether you require specific accommodations or have a unique preference, let us know, and we'll do what we can to customise your interview process for comfort and maximum magic!

Studies have shown that some groups of people, like women, are less likely to apply to a role unless they meet 100% of the job requirements. Whoever you are, if you like one of our jobs, we encourage you to apply as you might just be the candidate we hire. Across Octopus, we're looking for genuinely decent people who are honest and empathetic. Our people are our strongest asset and the unique skills and perspectives people bring to the team are the driving force of our success. As an equal opportunity employer, we do not discriminate on the basis of any protected attribute. Our commitment is to provide equal opportunities, an inclusive work environment, and fairness for everyone.
