
Research Engineer - Evaluations

Canva

Full-time
Austria
engineer
machine learning
artificial intelligence
cloud
analytics

Company Description

Join the team redefining how the world experiences design.

Servus, hey, g'day, mabuhay, kia ora, 你好, hallo, vítejte!

Thanks for stopping by. We know job hunting can be a little time-consuming and you're probably keen to find out what's on offer, so we'll get straight to the point.

Where and how you can work

Our flagship campus is in Sydney, Australia, but Austria is home to part of our European operations. You have a choice in where and how you work: we trust our Canvanauts to choose the balance that empowers them and their team to achieve their goals.

Fun fact: a big part of our Austrian operations is developing the AI product within Canva to help reimagine how artificial intelligence can be used in design. Pretty cool, huh?

Job Description

At Canva, our mission is to empower the world to design. To ensure our generative AI models are truly helpful, we are seeking a talented Research/Machine Learning Engineer to build our next-generation evaluation system by leveraging automatic evaluations.

About the role:

You will engineer sophisticated AI agents that can automatically assess the quality and human alignment of our generative design models. This high-impact role focuses on building the practical systems that make cutting-edge research effective, to provide a rapid feedback loop that guides the future of design generation at Canva, ultimately empowering millions of users to create.

At the moment, this role is focused on:

  • Agentic Evaluation Systems: Engineering autonomous AI agents that use Multimodal Large Language Models (MLLMs) to evaluate the quality, relevance, and human alignment of generated designs.

  • Inference-Time Alignment: Mastering techniques that improve model outputs without full retraining, using inference-based methods such as prompt engineering, in-context learning (ICL), and Retrieval-Augmented Generation (RAG).

  • Model Benchmarking & Analysis: Building a rigorous framework to systematically benchmark internal and external quality-understanding models, delivering clear, data-driven insights on human alignment.

Primary Responsibilities:

  • Design, build, and optimize the infrastructure for an 'MLLM-as-a-Judge' evaluation system for scalable, automated feedback.

  • Implement and experiment with inference-time alignment techniques (Prompt Engineering, RAG, ICL) to directly improve model output quality.

  • Establish and manage a comprehensive benchmarking process to compare various foundation models on design-centric tasks.

  • Analyze evaluation data to identify model failure modes and provide actionable recommendations to the research team.

  • Collaborate with research scientists and ML engineers to integrate the agentic judge system into the model development lifecycle.

  • Translate the latest research in LLM evaluation and agentic AI into practical, production-ready engineering solutions.

You’re probably a match if you:

  • Have a strong understanding of generative AI models (e.g., Diffusion Models, GANs, Transformers) and their architectures, with practical experience that informs robust evaluation strategies.

  • Excel at creating data-driven evaluation methodologies, turning user analytics into clear, actionable insights.

  • Have successfully managed or optimized large-scale distributed model training across hundreds of GPUs.

  • Have a solid understanding of machine learning, have worked with PyTorch, and know how to optimize such code for speed.

  • Have disciplined coding practices and are experienced with code reviews and pull requests.

  • Have experience working in cloud environments, ideally AWS.

Nice to Have:

  • Familiarity with evaluation libraries and frameworks.

  • Experience building or working with agentic AI systems or multi-agent coordination.

  • Knowledge of data visualization tools to communicate findings effectively.

  • A background or interest in human-computer interaction, design principles, or AI ethics.

Additional Information

What's in it for you?

Achieving our crazy big goals motivates us to work hard - and we do - but you'll experience lots of moments of magic, connectivity and fun woven throughout life at Canva, too. We also offer a stack of benefits to set you up for every success in and outside of work.

Here's a taste of what's on offer:

  • Equity packages - we want our success to be yours too

  • Inclusive parental leave policy that supports all parents & carers

  • An annual Vibe & Thrive allowance to support your wellbeing, social connection, home office setup & more

  • Flexible leave options that empower you to be a force for good, take time to recharge and support you personally

Check out lifeatcanva.com for more info.

Other stuff to know

We make hiring decisions based on your experience, skills and passion, as well as how you can enhance Canva and our culture. When you apply, please tell us the pronouns you use and any reasonable adjustments you may need during the interview process.

Please note that interviews are predominantly conducted virtually. 

About the job

Full-time
Austria
Posted 18 hours ago

Working Nomads curates remote digital jobs from around the web.

© 2025 Working Nomads.