Elastic.ON Amsterdam 2024
This week, I attended Elastic.ON in Amsterdam. I still remember the first Elastic events with just a handful of attendees, and later the big ones in San Francisco. It has been a few years since my last Elastic.ON, so I was happy to go again this year.
It was a good day with interesting content, and it inspired me to try out a few of the new features. In this post, I share the news I picked up during the conference. Not everything is brand new, but these were the things that excited me:
- Search AI Platform
- OpenTelemetry for shipping data
- LogsDB index mode
- Semantic text fields
- Run Elastic stack on your local machine
- ES|QL
- AI Playground
The Search AI Platform
Yes, the name is marketing, but the platform is maturing: it keeps gaining features and delivers everything you need to build AI-driven search applications. New features include integrated embeddings, inference, and RAG, and some features were rebuilt from the ground up. You can read more about them in the remainder of this blog.
OpenTelemetry for shipping data
OpenTelemetry is an open standard for sending structured observability data to an endpoint. Elastic is investing heavily in the project, aiming to become the best shipper of telemetry data and logs; Elasticsearch, of course, is meant to be the endpoint that stores the data. The cool part is that Elastic is pushing the OpenTelemetry project forward for all of us.
LogsDB index mode
The Logs data stream introduces the logsdb index mode. The goal is efficiency: the index size decreases by roughly a factor of 2.5. One optimisation is that the _source is no longer stored; when you ask for it, Elasticsearch reconstructs it from the indexed data (synthetic _source). Be sure to give your own index template a priority higher than 100 to prevent the default logs index template from kicking in.
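As a sketch of how you could try it yourself: the index template below enables the logsdb index mode for a data stream. The index pattern logs-myapp-* and the priority value 200 are my own example choices, not from the talk.
PUT _index_template/myapp-logs
{
  "index_patterns": ["logs-myapp-*"],
  "priority": 200, // anything above 100 wins over the default logs template
  "data_stream": {},
  "template": {
    "settings": {
      "index.mode": "logsdb"
    }
  }
}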
Semantic text fields
With semantic_text fields, Elastic brings semantic search without the hassle: you do not have to create your own embeddings, and you get easy access to hybrid search, combining lexical and vector search (a sketch follows at the end of this section).
Elastic provides its own embedding model, ELSER, which is an interesting approach to running inference on the platform itself. Running it in my local Docker setup is a challenge, however, so I use the OpenAI integration to create the embeddings.
Below is the command to create the inference endpoint. No, I do not include the API key.
PUT _inference/text_embedding/openai
{
  "service": "openai",
  "service_settings": {
    "model_id": "text-embedding-3-small",
    "api_key": "PASTE YOUR API KEY HERE"
  }
}
The response to creating the inference endpoint is shown below. Notice the default chunking strategy that Elastic chooses. Yes, you can configure another chunking strategy, as shown after the response.
{
  "inference_id": "openai",
  "task_type": "text_embedding",
  "service": "openai",
  "service_settings": {
    "model_id": "text-embedding-3-small",
    "similarity": "dot_product",
    "dimensions": 1536,
    "rate_limit": {
      "requests_per_minute": 3000
    }
  },
  "chunking_settings": {
    "strategy": "sentence",
    "max_chunk_size": 250,
    "sentence_overlap": 1
  }
}
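For example, to switch to word-based chunks, you can create the endpoint with an explicit chunking_settings block. This is a sketch; the endpoint name openai_words and the chunk sizes are my own choices.
PUT _inference/text_embedding/openai_words
{
  "service": "openai",
  "service_settings": {
    "model_id": "text-embedding-3-small",
    "api_key": "PASTE YOUR API KEY HERE"
  },
  "chunking_settings": {
    // word strategy: overlap must be at most half of max_chunk_size
    "strategy": "word",
    "max_chunk_size": 120,
    "overlap": 40
  }
}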
The next step is to create an index that uses the field type semantic_text.
PUT blogs
{
  "mappings": {
    "properties": {
      "title": {
        "type": "semantic_text",
        "inference_id": "openai"
      }
    }
  }
}
Next, you can load some data. The example uses blog titles.
POST blogs/_bulk
{"index":{}}
{"title": "The Evolution of Creativity: Thriving in the Age of AI"}
{"index":{}}
{"title": "Build an Agent using Amazon Bedrock."}
{"index":{}}
{"title": "RAG: splitter chain for proper chunks."}
{"index":{}}
{"title": "What? A synonyms API for Elasticsearch?"}
{"index":{}}
{"title": "RAG optimisation: use an LLM to chunk your text semantically."}
{"index":{}}
{"title": "Introducing Rag4p GUI"}
{"index":{}}
{"title": "Getting the proper context for RAG is choosing your chunking and retrieval strategy."}
{"index":{}}
{"title": "LLM size does matter"}
{"index":{}}
{"title": "Bringing Lexical search to Python Pandas using SearchArray"}
{"index":{}}
{"title": "Set your (search) metrics, and live by them."}
And now, what you have all been waiting for: ask a question to find the most relevant items.
GET blogs/_search
{
  "query": {
    "semantic": {
      "field": "title",
      "query": "What title is about measurements?"
    }
  },
  "size": 1
}
The response for this request, with the embedding array omitted:
{
  "_index": "blogs",
  "_id": "-lEWaZMBHmjklNVmAUcy",
  "_score": 0.6609547,
  "_source": {
    "title": {
      "text": "Set your (search) metrics, and live by them.",
      "inference": {
        "inference_id": "openai",
        "model_settings": {
          "task_type": "text_embedding",
          "dimensions": 1536,
          "similarity": "dot_product",
          "element_type": "float"
        },
        "chunks": [
          {
            "text": "Set your (search) metrics, and live by them.",
            "embeddings": []
          }
        ]
      }
    }
  }
}
Notice that the answer does not contain the words you are searching for, but it does contain semantically similar content.
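To give an impression of the hybrid search mentioned earlier, here is a sketch of a query that combines lexical and semantic results with reciprocal rank fusion. It assumes the mapping also contains a plain text copy of the title in a hypothetical title_text field; the blogs index above only has the semantic_text field.
GET blogs/_search
{
  "retriever": {
    "rrf": {
      "retrievers": [
        {
          // lexical part: assumes a hypothetical title_text text field
          "standard": {
            "query": { "match": { "title_text": "metrics" } }
          }
        },
        {
          // semantic part: the semantic_text field from the mapping above
          "standard": {
            "query": {
              "semantic": {
                "field": "title",
                "query": "What title is about measurements?"
              }
            }
          }
        }
      ]
    }
  }
}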
Run Elastic stack on your local machine
If you are like me, you have many potential Elastic clusters on your machine. Each customer requires specific components, and sometimes you need a cluster to try something out quickly. Starting a cluster has become much easier. Check out this command.
curl -fsSL https://elastic.co/start-local | sh
As a result, you get the username, password, and API key to work with the fresh cluster.
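A quick way to verify the cluster is up is a request with that API key. The environment variable name below is my own; paste in whatever the script printed.
# Hypothetical variable name; use the API key from the script output.
export ES_LOCAL_API_KEY="<the api key from the script output>"
curl -H "Authorization: ApiKey ${ES_LOCAL_API_KEY}" http://localhost:9200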
ES|QL
The Elasticsearch Query Language (ES|QL) is a powerful language for querying, filtering, analysing, and plotting data. The | in the name is there for a reason: it tells you that you can pipe commands together to work with your data. What better way to show what it can do than with an example?
For the example, I generated a few access log lines using ChatGPT. Each line looks like this.
127.0.0.1 - frank [10/Oct/2024:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326
I used a bash script to extract the data, create an index, and insert it. The data and the script are in the Gist mentioned below.
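Not the actual script from the Gist, but a minimal sketch of the idea: create an index with a single message field and bulk-insert the raw log lines.
PUT access_logs
{
  "mappings": {
    "properties": {
      "message": { "type": "keyword" }
    }
  }
}

POST access_logs/_bulk
{"index":{}}
{"message": "127.0.0.1 - frank [10/Oct/2024:13:55:36 -0700] \"GET /apache_pb.gif HTTP/1.0\" 200 2326"}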
The first step is to find the data. The ES|QL query is:
FROM access_logs
Now, we want to extract data using the DISSECT command.
FROM access_logs |
DISSECT message "%{my_ip} - %{my_user} [%{my_date}] \"%{my_method} %{my_url} %{my_protocol}\" %{my_status_code} %{my_size}"
In the last part, we want to create a chart for the requests per user.
FROM access_logs |
DISSECT message "%{my_ip} - %{my_user} [%{my_date}] \"%{my_method} %{my_url} %{my_protocol}\" %{my_status_code} %{my_size}" |
STATS COUNT(*) BY my_user
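You can take this further by piping in more commands. As a sketch, the query below converts the extracted size to a number, sums it per user, and sorts the result; the column names are the ones extracted above.
FROM access_logs |
DISSECT message "%{my_ip} - %{my_user} [%{my_date}] \"%{my_method} %{my_url} %{my_protocol}\" %{my_status_code} %{my_size}" |
EVAL my_size = TO_INTEGER(my_size) |
STATS total_bytes = SUM(my_size) BY my_user |
SORT total_bytes DESC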
It is an exciting query language for learning more about your data and plotting basic aggregations. More tools are becoming available for ES|QL, so now is an excellent time to start using it. It is available in 8.16, so it is easy to play with.
AI Playground
As mentioned at the beginning of this blog, Elastic is betting big on AI. Search systems are very important for Retrieval-Augmented Generation (RAG), a pattern in which a query is used to obtain information or context that the LLM uses to generate an answer to the question.
The Playground feature lets you play with a RAG system. It is available in Kibana as a technical preview. You need a connection to an LLM. Currently, you can choose OpenAI, Amazon Bedrock, or Google Gemini. In the example, I use a connection to OpenAI.
Below is a screenshot of the Playground. First, you have to select the data sources to extract data from: click the Data button in the top right corner and select the indexes you want to use. I selected the index with the blog titles from before. Next, configure the query used to find documents related to the question; I use the default query, which you can find via the Query toggle in the top middle of the screen. Then go back to the chat.
Now you can ask a question. Beware that the content I am using is limited to ten blog posts. Still, I want to know whether the Rag4p framework has a user interface.
As someone who works a lot on RAG, I know it is far from complete. Still, I like what I see. It is a good start to working with RAG from Kibana on your Elastic platform. I’ll look into it in the coming weeks to see what else you can do with it.