Search - Workchat - Principal Data Scientist
What is The Role:
The Search Data Science team is responsible for developing and integrating statistical tools and machine learning models within the Search domain in support of semantic search, RAG, agentic search, and chat applications. As a Data Scientist in this area, you will work closely with our Product teams to lead the innovation, incubation, and prototyping phases that evolve and transform our AI/ML-driven Search experiences and solutions, with a focus on quickly bringing new ideas to production and into the hands of our customers. Your primary focus will be driving research and development to improve semantic search with proprietary models and customized open source models, developing techniques and models for query and document understanding, implementing RAG and LLM-driven search experiences, and developing tooling to help customers design and implement successful end-to-end RAG systems. You’ll also investigate aspects of modern agentic search, including reasoning engines, prompt engineering techniques, query understanding, and more. Doing this requires exploring and benchmarking new open source models and existing proprietary Elastic models while keeping up to date with the latest major advancements in the fields of NLP and information retrieval.
What You Will Be Doing:
Explore, select and benchmark open source and Elastic proprietary models
Implement RAG and other LLM-based search experiences
Design evaluation protocols for semantic search, tool selection, and generation in LLM-based search experiences
Keep up to date with the most significant recent developments in the fields of NLP and information retrieval
Engage with the NLP and information retrieval communities (blogs, documentation, Python examples, conference talks, academic papers, etc.)
Collaborate with cross-functional teams of data scientists, engineers, and product managers
Promote knowledge sharing and collaboration in a distributed team
What You Will Bring:
8+ years of proven experience building and applying NLP to production use cases
8+ years of professional software development experience in Python
Experience in Generative AI, Retrieval Augmented Generation, and information retrieval
Experience with libraries and frameworks such as PyTorch, transformers, and Pandas
Experience using collaborative notebook-based workflows (e.g. Jupyter) for prototyping and knowledge sharing
Expertise in AI/ML quality evaluation and improvement, including balancing tuning techniques with cost/benefit tradeoffs
Self-motivated, with a collaborative style, open communication, and experience in a distributed team
Good attention to detail and highly organized
Real passion for data, analysis and achieving excellence
Experience with Elasticsearch is useful
An academic background in the domain is also a plus
If this sounds interesting, we would love to hear from you! Please include whatever info you believe is relevant: resume, GitHub profile, code samples, blog posts and writing samples, links to personal projects, etc.