AI Operations Engineer

Full-time
Latin America
Mid Level
Posted 1 hour ago

Why You'll Love This Role: 

We're looking for an experienced ML Ops Engineer to join the ML/AI team at Newsela. This team works on projects ranging from classical machine learning to generative AI pipelines. This is a hands-on role. You'll work closely with ML/AI, data, and site reliability engineers to take models from prototype to production, build robust data pipelines, and keep our services running smoothly as we continue to scale.

What You'll Be Doing:

  • Design and maintain CI/CD pipelines for ML model training, packaging, and deployment across our microservices.

  • Manage containerized services on AWS ECS, optimizing for cost, latency, and availability.

  • Automate infrastructure provisioning and service configuration with Terraform.

  • Maintain and scale services that rely on third-party LLM providers.

  • Build and improve data pipelines that feed models from BigQuery, S3, and DynamoDB into training and inference workflows.

  • Instrument services with observability tooling (Datadog, OpenTelemetry, Langfuse) and establish SLOs for model-serving endpoints.

  • Collaborate with ML engineers to productionize new models using BentoML, FastAPI, and container-based serving.

About You:

  • 2-3 years in ML Ops supporting ML/AI features, systems, and workflows, plus 3-4 years of prior experience in DevOps, CloudOps, or SRE.

  • Strong proficiency in Python.

  • Hands-on experience with Docker containerization and container orchestration.

  • Solid understanding of CI/CD for ML workflows in an enterprise production environment.

  • Experience with Infrastructure as Code, preferably Terraform.

  • Familiarity with cloud platforms — specifically AWS (ECS, ECR, S3, DynamoDB, CloudWatch) and GCP (BigQuery, Vertex AI).

  • Experience with LLM integration and observability (OpenAI API, Google GenAI, Langfuse tracing).

  • Experience building and maintaining data pipelines for ML training and feature engineering.

  • Familiarity with ML modeling workflows: training, evaluation, experiment tracking (e.g., MLflow, Weights & Biases), and model versioning.

  • Experience monitoring and flagging model drift over time.

  • Exposure to NLP/NLU models and frameworks such as Hugging Face Transformers, spaCy, or sentence-transformers.

  • Knowledge of vector databases (LanceDB, FAISS) and embedding-based retrieval systems.

  • Experience scaling and maintaining deep learning frameworks (TensorFlow, PyTorch) in production settings.

  • Familiarity with classical ML libraries (scikit-learn, XGBoost, LightGBM) and model explainability tools (SHAP).

  • Working knowledge of ML serving frameworks such as BentoML or similar.

  • Comfort working with FastAPI or similar async Python web frameworks.

About Newsela:

Newsela takes authentic, real-world content from trusted sources and makes it instruction-ready for K-12 classrooms. Each text is published at five reading levels, so content is accessible to every learner. Today, over 3.3 million teachers and 40 million students have registered with Newsela for content that's personalized to student interests, accessible to everyone, aligned to instructional standards, and attached to activities and reporting that hold teachers accountable for instruction and students accountable for their work. With over 15,000 texts on our platform and multiple new texts published every day across 20+ genres, Newsela enables educators to go deep on any subject they choose.

#LI-Remote
