In this interview during theCUBE's coverage of AWS re:Invent, Christine Yen, chief executive officer of Honeycomb.io, sits down with theCUBE’s Dave Vellante to discuss why observability is emerging as the critical "trust fabric" for AI-driven software development. Yen explains that as AI coding assistants accelerate development velocity and increase the volume of code in production, the potential for instability and "unknown unknowns" rises significantly. She argues that traditional monitoring, which often minimizes data collection to control costs, cannot keep up with the distributed dependencies responsible for nearly 70% of outages. Instead, high-fidelity telemetry is required to create the necessary feedback loops that allow engineering teams to validate agentic behavior and maintain system reliability.
The conversation also highlights Honeycomb’s latest strategic announcements designed to meet these challenges, including the launch of Honeycomb Private Cloud for organizations with strict governance needs. Yen details the company’s full embrace of OpenTelemetry standards for metrics and the general availability of Honeycomb Canvas, a natural language interface that simplifies complex querying. Yen and Vellante further explore the misconception that AI will reduce the need for oversight, with Yen positioning observability as the "seatbelt" for AI – allowing teams to move fast while retaining the ability to detect and resolve issues in real time.
Han Xiao, Elastic
In this interview during theCUBE's coverage of AWS re:Invent, Han Xiao, vice president of AI at Elastic and former chief executive officer of Jina AI, joins theCUBE’s Rob Strechay to unpack how Jina AI’s technology is reshaping the Elastic ecosystem. Xiao explains how Jina’s search foundation models – specifically embeddings, rerankers and small language models – serve as the "brain" behind Elastic’s orchestration framework. This integration aims to solidify Elastic as the essential computational layer for search, enabling developers to build highly accurate…
>> Hello and welcome to theCUBE's coverage of AWS re:Invent 2025, where we're talking about all things agentic and AI. To help me unpack this a little bit more, I've got Han Xiao, who is the Vice President of AI at Elastic and also the Founder and former CEO of Jina AI. Welcome on board, Han.
Han Xiao
>> Thanks, Rob.
Rob Strechay
>> So before we dive in deep here, help us understand Jina AI: what your goals were before the acquisition by Elastic and what it's really about.
Han Xiao
>> Yeah, so Jina AI was founded in 2020, and our one goal has been to build world-class search models, what we call search foundation models. That particularly includes the embeddings, rerankers and small language models that people can use to build better search systems: high-quality, high-relevance search systems. Over the last five years, we have been working extensively on building world-class models, making sure they work on multilingual and multimodal data and that they can be used as foundational building blocks when people build highly scalable, highly accurate search systems.
Rob Strechay
>> Yeah, I think that makes a lot of sense when you look at what's going on with AI and this entire realm; it's so busy. What did the acquisition really mean to Elastic, and what does it underscore for Elastic as well?
Han Xiao
>> Yeah. One of the observations we've made is that when people try to build a very high-quality search system, they typically need a lot of building blocks, and they also need an orchestration layer that connects all the dots. Elastic, and in particular Elasticsearch, is one of the most downloaded and most widely used frameworks for developers and businesses building production-ready search systems. We want to be the computational layer behind those search systems. We want to use our embedding models to represent all of this document data, image data and multimodal data as vectors, so that people can leverage semantic search and a vector database to query that data more accurately. Then we also have a reranker model, which serves as another computational block that runs after first-stage retrieval and boosts accuracy and precision even further. Finally, when you think about today's agentic AI, there is always a large language model at the last stage that generates answers and refines the search results. For that, we also provide small language models that help the agent better understand the context and output more meaningful, more human-readable search results.
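The pipeline Han describes, first-stage vector retrieval followed by a reranker, can be sketched in a few lines. Everything below is a toy stand-in: the bag-of-words "embedding" and the term-overlap "reranker" are hypothetical placeholders for real trained models (such as Jina's embedding and reranker models), used only to show where each stage sits.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; a real system would call a trained
    # dense embedding model here instead.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def first_stage_retrieve(query, docs, k=3):
    # Stage 1: cheap, scalable retrieval over the whole corpus.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def rerank(query, candidates, n=1):
    # Stage 2: a stand-in for a cross-encoder reranker that rescores
    # only the small candidate set to boost precision.
    q_terms = set(query.lower().split())
    return sorted(candidates,
                  key=lambda d: len(q_terms & set(d.lower().split())),
                  reverse=True)[:n]

docs = [
    "Elasticsearch is a distributed search engine",
    "Embeddings map text into vectors for semantic search",
    "Rerankers boost precision after first-stage retrieval",
    "The weather today is sunny",
]
candidates = first_stage_retrieve("semantic search with embeddings", docs, k=2)
best = rerank("semantic search with embeddings", candidates, n=1)
```

In a production system, the reranked passages would then be handed to the large language model for the final answer-generation stage Han describes.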
Rob Strechay
>> So Han, help me understand how bringing Jina to Elastic is really helping out the ecosystem.
Han Xiao
>> Right now, for developers and businesses to build a high-quality search system, you need not only the orchestration layer, not only the framework, but also the brain behind that framework. Who is going to provide that brain? At Jina, we have spent years of research and development building the best search models that can be plugged into any search system. This is particularly useful in today's world, where you have to handle multimodal and multilingual data and a lot of things cannot be implemented with traditional keyword-based search. That's where Jina's models, and in particular the deep neural network-based models, are really helpful: they can represent that multimodal data in a searchable format.
Rob Strechay
>> Yeah. That to me is so important to organizations as they look at having that control plane for their agentic operating system, which is these applications that are multimodal in many ways. But you also talk about context engineering. Help us understand where Jina AI's team has really been pushing in that whole space as well.
Han Xiao
>> Yeah. Context engineering is a very hot topic today. A lot of people talk about context engineering and how important it is in search and in any agentic system. To me, context engineering is all about cherry-picking the best words to send to the LLM before it emits its output. It sounds very simple, right? You basically try to selectively copy context into the LLM. But in reality, there are a lot of technical details. For example, how do you preserve the information while reducing the total number of tokens in the context? How do you rank different search results in the context so that the LLM can recognize which snippet matters most for the answer? And how do you mask out personally sensitive information before sending it to the LLM API? Those kinds of things fall into the context engineering category. For that, we have a lot of small language models that can be very useful for context engineering. For example, you can use an embedding model to compress the context while preserving the information overall. You can use a reranker to rerank passages, compute the optimal ones and put them at the top of the context so that the LLM gives them more emphasis in its answers. In general, this is a very exciting time for small language models to shine in the context engineering domain. One observation I've made over the last year is that, originally, those embedding and reranker models were built for large-scale purposes: they were built to batch-process billions of documents and to serve as the core retriever behind a search system. But since 2025, what I see is a lot of embedding and reranker usage happening inside the context, inside the 1-million-token context window of large language models. I believe this trend will continue into 2026.
People will look for stronger, smarter small language models that can optimize the context in a very aggressive way, so that the large language model can give much, much better results.
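The three concerns Han lists, masking sensitive information, ranking snippets and staying inside a token budget, can be sketched as a minimal context-building step. The email regex, the term-overlap ranking and the whitespace token count below are simplifying assumptions standing in for real PII detectors, reranker models and tokenizers.

```python
import re

def mask_pii(text):
    # Redact email addresses before the text leaves our boundary;
    # real pipelines cover many more PII categories than this.
    return re.sub(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b", "[EMAIL]", text)

def rank_snippets(query, snippets):
    # Stand-in for a reranker model: order snippets by term overlap
    # with the query so the LLM sees the most relevant ones first.
    q = set(query.lower().split())
    return sorted(snippets,
                  key=lambda s: len(q & set(s.lower().split())),
                  reverse=True)

def build_context(query, snippets, token_budget=20):
    # Greedy packing: take ranked snippets until the (whitespace-token)
    # budget is spent, keeping the context small for the LLM.
    picked, used = [], 0
    for s in rank_snippets(query, [mask_pii(x) for x in snippets]):
        cost = len(s.split())
        if used + cost > token_budget:
            break
        picked.append(s)
        used += cost
    return "\n".join(picked)

snippets = [
    "Contact alice@example.com for billing questions",
    "Reranking puts the most relevant passage on top of the context",
    "The context window of the model is limited",
]
ctx = build_context("relevant passage ranking context", snippets, token_budget=15)
```

The string handed to the LLM contains only the highest-ranked snippets, already redacted and trimmed to budget.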
Rob Strechay
>> Totally agree. I think that, again, speaks to the specificity of a small language model and how it can really help with particular tasks as you build things out. Help us understand the context of Jina with ELSER and the Elastic Inference Service, and what it means going forward for Elastic.
Han Xiao
>> The Elastic Inference Service is right now the default inference service behind all Elasticsearch systems. When developers use embedding services, reranker services or any small language model, underneath they are calling ELSER, or what we call the Elastic Inference Service. From now on, Jina is going to be the default model provider for ELSER, so developers get a very handy experience and access to all the top models from Jina AI. But ELSER will also stay open: not only Jina AI models but also other very strong embedding and reranker models will appear on ELSER. What we want to provide is the best developer experience for all businesses, making sure they have immediate access every time there's a new embedding model, new reranker model or new small language model that can be used to build a search system.
Rob Strechay
>> That makes total sense, and I think it fits in very well with what everybody is trying to do at re:Invent this year as they look toward building out these agentic workflows and how they can bring different types of data and different types of actions together. So hey, Han, thank you for coming on board. This has been great. Really appreciate it.
Han Xiao
>> Thank you, Rob. Yeah.
Rob Strechay
>> And thank you for watching our coverage of AWS re:Invent 2025 on theCUBE, the leader in analysis and news. Stay tuned for more.