Staff Software Engineer - Observability Knowledge Graph Backend
This is a full-time remote position, and we're seeking candidates in the US or Canada.
The Opportunity
The Grafana Observability (O11y) group builds end-to-end observability solutions that span Application, Infrastructure, Database, Browser and Mobile observability. At the core of this experience is the Grafana Knowledge Graph—the platform that powers true full-stack observability across Grafana Cloud.
The Knowledge Graph is a suite of distributed, multi-tenant microservices that connects signals across the stack and delivers automated Root Cause Insights by actively analyzing Metrics, Traces, and Logs stored in Grafana Cloud. These services process and store data across SQL, graph, and time-series databases and are designed for high availability and scale.
Used by thousands of self-service customers and trusted by some of the world’s largest organizations to monitor mission-critical infrastructure, the Knowledge Graph is a foundational platform at Grafana. As adoption continues to grow, our focus is on continuously improving performance, increasing reliability, and scaling efficiently—without compromising on quality.
At Grafana, we value collective creativity and diverse perspectives. Every team member is encouraged to bring ideas forward and help shape a platform that our users depend on every day.
We’re scaling fast and staying true to what makes us different: an open-source legacy, a global collaborative culture, and a passion for meaningful work. Our team thrives in an innovation-driven environment where transparency, autonomy, and trust fuel everything we do.
You may not meet every requirement, and that’s okay. If this role excites you, we’d love you to raise your hand for what could be a truly career-defining opportunity.
What will you be doing?
Work with your team to build and roll out new features, then use the results to iterate and improve.
Drive projects from initial ideation all the way to operations once they are in the hands of customers.
Take on complex challenges and break them down to achieve short feedback loops: analyze, design, and build modular solutions, deliver MVPs, gather data and feedback, and then iterate.
Maintain critical systems, and own their reliability, performance, and availability.
Be a part of your team’s on-call rotations and take ownership of the services you’re running.
Mentor and support other team members, participate in design discussions, and collaborate with the team.
Learn new skills by gaining a deeper understanding of our cloud product and our customers and getting to know the codebase.
Take an active role in influencing our roadmap and your own career objectives.
What Makes You a Great Fit?
You are a motivated self-starter with a bias towards action.
You have strong coding skills and operational experience; you were responsible for operating the software you have built.
You have worked on a SaaS platform and dealt with common distributed systems problems (e.g. scalability, multi-tenancy, data isolation, HA, …).
You have excellent written and spoken communication skills. You’ll be working with your teammates in a fully remote setup, so good communication is a must.
You are willing to work across teams. Your work has to be aligned with the needs of other squads and external stakeholders. You make your plans transparent, bring stakeholders on board, and are open to feedback and suggestions.
You are pragmatic; you prioritize progress over perfection; you can handle ambiguity.
We use Java on the backend and deploy on AWS/Azure/GCP Clouds using Kubernetes. You must have programming experience in Java and experience with Kubernetes and any one of the cloud platforms.
You are customer-focused. We build everything with our users in mind.
Compensation & Rewards:
In Canada, the base compensation range for this role is CAD 186,368 to CAD 223,642. Actual compensation may vary based on level, experience, and skillset as assessed in the interview process. Benefits include equity, a bonus (if applicable), and other benefits listed here.
All of our roles include Restricted Stock Units (RSUs), giving every team member ownership in Grafana Labs' success. We believe in shared outcomes—RSUs help us stay aligned and invested as we scale globally.
*Compensation ranges are country-specific. If you are applying for this role from a location other than the one listed above, your recruiter will discuss your specific market’s defined pay range & benefits at the beginning of the process.
