Senior Research Engineer
About the Role
We are seeking a highly skilled Senior Research Engineer to collaborate closely with both Research and Engineering teams. The role involves diagnosing and resolving bottlenecks across large-scale distributed training, data processing, and inference systems, while also driving optimizations for existing high-performance pipelines.
The ideal candidate possesses a deep understanding of modern deep learning systems, combined with strong engineering expertise in areas such as layer-level optimization, large-scale distributed training, streaming, low-latency and asynchronous inference, inference compilers, and advanced parallelization techniques.
This is a cross-functional role requiring strong technical rigor, attention to detail, intellectual curiosity, and excellent communication skills. Embedded within the Research team, the position develops and refines the technical foundation that enables cutting-edge research and translates its outcomes into production.
What You'll Do
Investigate and mitigate performance bottlenecks in large-scale distributed training and inference systems.
Develop and implement both low-level (operator/kernel) and high-level (system/architecture) optimization strategies.
Translate research models and prototypes into highly optimized, production-ready inference systems.
Explore and integrate inference compilers and runtimes such as TensorRT, ONNX Runtime, AWS Neuron (for Inferentia), or similar technologies.
Design, test, and deploy scalable solutions for parallel and distributed workloads on heterogeneous hardware.
Facilitate knowledge transfer and bidirectional support between Research and Engineering teams, ensuring alignment of priorities and solutions.
What You'll Need
Strong expertise in the Python ecosystem and major ML frameworks (PyTorch, JAX).
Experience with lower-level programming (C++ or Rust preferred).
Deep understanding of GPU acceleration (CUDA, profiling, kernel-level optimization); TPU experience is a strong plus.
Proven ability to accelerate deep learning workloads using compiler frameworks, graph optimizations, and parallelization strategies.
Solid understanding of the deep learning lifecycle: model design, large-scale training, data processing pipelines, and inference deployment.
Strong debugging, profiling, and optimization skills in large-scale distributed environments.
Excellent communication and collaboration skills, with the ability to clearly prioritize and articulate impact-driven technical solutions.
Pay Transparency:
AssemblyAI strives to recruit and retain exceptional talent from diverse backgrounds while ensuring pay equity across our team. Our salary ranges are set to be competitive for our size, stage, and industry, and reflect just one component of the full compensation, benefits, and rewards we offer.
Salary determinations consider a variety of factors, including relevant experience, technical depth, skills demonstrated during the interview process, and maintaining internal equity with peers on the team. The range shared below represents a general expectation for the posted position. However, we are open to considering candidates who may fall above or below the outlined experience level—in those cases, we will communicate any adjustments to the expected salary range.
The range provided applies to candidates located in the United States. For candidates outside of the U.S., compensation ranges may differ; any adjustments will be communicated throughout the interview process.
Salary range: $210,000 - $309,000
The expected base compensation for this role is listed above. Our total compensation package includes competitive equity grants, 100% employer-paid benefits, and the flexibility of being fully remote.