In this interview during theCUBE’s coverage of AWS re:Invent, Han Xiao, vice president of AI at Elastic and former chief executive officer of Jina AI, joins theCUBE’s Rob Strechay to unpack how Jina AI’s technology is reshaping the Elastic ecosystem. Xiao explains how Jina’s search foundation models – specifically embeddings, rerankers and small language models – serve as the "brain" behind Elastic’s orchestration framework. The integration aims to solidify Elastic as the essential computational layer for search, enabling developers to build highly accurate, multimodal and multilingual systems that are critical for powering the next generation of agentic AI.

The conversation delves into the nuances of "context engineering," which Xiao describes as the art of optimizing the information fed to large language models (LLMs). He details how small language models are increasingly used to compress context and rerank passages within massive token windows, ensuring LLMs receive the most relevant data without unnecessary noise.

Xiao also highlights that Jina AI will become the default model provider for the Elastic Inference Service, streamlining the developer experience by providing immediate access to state-of-the-art tools for building robust search and retrieval workflows.
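To make the "context engineering" idea concrete, the sketch below shows a minimal retrieve–rerank–compress loop: passages are reordered by relevance to the query, then concatenated until a token budget is reached. The scoring function here is a toy lexical-overlap stand-in for a real reranker model (such as the small language models Xiao describes); the function names and the token budget are illustrative assumptions, not Elastic or Jina AI APIs.

```python
# Illustrative context-engineering sketch: rerank passages, then pack the
# best ones into a limited context window. The scorer is a toy stand-in
# for a real reranker model; all names and budgets here are hypothetical.

def rerank(query: str, passages: list[str], top_k: int = 2) -> list[str]:
    """Order passages by relevance to the query and keep the top_k."""
    q_terms = set(query.lower().split())

    def score(passage: str) -> float:
        p_terms = set(passage.lower().split())
        # Fraction of query terms that appear in the passage.
        return len(q_terms & p_terms) / max(len(q_terms), 1)

    return sorted(passages, key=score, reverse=True)[:top_k]


def build_context(query: str, passages: list[str], token_budget: int = 50) -> str:
    """Concatenate top-ranked passages until a rough token budget is hit."""
    context, used = [], 0
    for p in rerank(query, passages, top_k=len(passages)):
        tokens = len(p.split())  # crude whitespace token count
        if used + tokens > token_budget:
            break  # stop packing: the LLM's window stays free of low-value noise
        context.append(p)
        used += tokens
    return "\n".join(context)


if __name__ == "__main__":
    docs = [
        "Elastic provides search and analytics over large datasets.",
        "Rerankers reorder retrieved passages by relevance to the query.",
        "The weather in Berlin is mild in spring.",
    ]
    print(build_context("how do rerankers order passages by relevance", docs))
```

In a production pipeline, the lexical scorer would be replaced by a trained reranker, and the token budget would match the downstream LLM's context window; the control flow, retrieve broadly, rerank precisely, compress to fit, stays the same.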