Elasticon 2016 Day 1
This week we are attending Elasticon 2016. If you want to follow along, you can check our Twitter feeds. If you would rather read a summary, read this and the following two blog posts. Our Twitter handles are:
After some very nice days in San Francisco, the conference finally started today. After registration it was time to catch up with some of our former colleagues. We had a chat with Noi, Uri, Steven, Elissa, Sejal, Martijn and of course Luca from Elastic. It is so much fun to travel the world to speak to people who live close to us. After some hours of mingling and talking to some of the sponsors, the conference began with the keynote. It is always fun to see Steven on stage; it makes me feel proud that we used to work together.
Ok, now to the real content. What was announced during the keynote?
Some numbers: 1800+ attendees and 50 million downloads.
Elastic product versions will be streamlined. Starting with the next major release bonanza, all products will become 5.0. The first alpha release is expected within a few weeks. Next to that, the logos of all the products have been streamlined as well. So there are fresh new logos for Elasticsearch, Logstash, Kibana, Beats and some new ones.
They now want to make it easier to install extensions across different kinds of products. Think of a Kibana extension together with a custom Beat. You can now install all of these together using packs. Elastic is going to use this approach for its own commercial plugins, introduced as X-Pack. This is a bundled set of features for security, alerting, monitoring and more to come. X-Pack introduces tight Shield integration into Kibana, and it also makes it possible to create a login form in Kibana to secure everything you show there.
Elastic told us last year that they acquired Found, a hosted Elasticsearch platform. They invested heavily in it, but Found was hard to find. Therefore they renamed Found to Elastic Cloud. To top it off, they are making Found available to buy as a product. This way you can host your own private Elastic Cloud on your own hardware or cloud provider. This is branded as Elastic Cloud Enterprise. The beta will become available shortly.
What’s Evolving in Elasticsearch
(Clinton Gormley and Simon Willnauer)
Multiple improvements have been made to heap usage. Among them is a store that differs from the inverted index: a columnar store, used for things like aggregations and sorting. In Elasticsearch this is called doc values. It is one of the many improvements in heap usage, which is very important for the performance of Elasticsearch.
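To make the difference between the two stores a bit more concrete, here is a minimal sketch in Python. It is a toy model of the idea only, not how Lucene or Elasticsearch implement it; the documents and the field names `city` and `price` are made up for illustration.

```python
from collections import defaultdict

# Three toy documents, keyed by document id (hypothetical data).
docs = {
    0: {"city": "amsterdam", "price": 30},
    1: {"city": "utrecht", "price": 10},
    2: {"city": "amsterdam", "price": 20},
}

# Inverted index: term -> ids of the documents containing it.
# This layout answers "which documents contain this term?".
inverted = defaultdict(list)
for doc_id, doc in docs.items():
    inverted[doc["city"]].append(doc_id)

# Columnar store: one value per document, addressed by document id.
# This layout answers "what is the value of this field for document X?".
price_column = [docs[doc_id]["price"] for doc_id in sorted(docs)]

# Searching uses the inverted index to find the matching documents ...
hits = inverted["amsterdam"]

# ... while sorting and aggregating read the column by document id.
hits_by_price = sorted(hits, key=lambda doc_id: price_column[doc_id])
total_price = sum(price_column[doc_id] for doc_id in hits)

print(hits, hits_by_price, total_price)
```

The point of the sketch is that sorting and aggregating need a lookup from document to value, which is the opposite direction of what the inverted index offers.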
One important feature of Elasticsearch is support for geo locations. Most of the functionality is backed by Lucene. Recently geo points v2 was introduced, with a 10 times performance increase and a 50% smaller index. Based on this new geo support, Elastic can create cool new features. One of them is the geo centroid aggregation, which determines the center of a set of geo points instead of a grid of data points.
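As a rough illustration of what a centroid is, the sketch below averages a handful of coordinates in Python and shows what a request body for the geo centroid aggregation could look like. The plain average is a simplification that we assume is good enough for points that lie close together; the coordinates, the aggregation name `center` and the field name `location` are hypothetical.

```python
def centroid(points):
    """Return the arithmetic mean (lat, lon) of a list of (lat, lon) pairs."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return sum(lats) / len(points), sum(lons) / len(points)


# Hypothetical coordinates, roughly in the Netherlands.
points = [(52.0, 4.0), (52.5, 5.0), (53.0, 6.0)]
center = centroid(points)

# Sketch of a search body asking Elasticsearch for the centroid of all
# matching documents; "location" is an assumed geo_point field.
geo_centroid_request = {
    "size": 0,
    "aggs": {"center": {"geo_centroid": {"field": "location"}}},
}

print(center)
```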
For us, one of the most interesting features is the reindex API. This enables you to re-index your index into a new one with new mappings or a different number of shards. Next to the reindex API some other new APIs were introduced, among them the _update_by_query and task management APIs.
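A minimal sketch of what a reindex request body looks like, built in Python. The index names `blog_v1` and `blog_v2` and the helper `build_reindex_body` are made up for illustration. We assume here that the destination index, with its new mappings or shard count, is created before the reindex is started.

```python
import json


def build_reindex_body(source_index, dest_index):
    """Build the JSON body for a request to the _reindex endpoint."""
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
    }


# Hypothetical index names: copy everything from the old index to the new one.
body = build_reindex_body("blog_v1", "blog_v2")

print("POST /_reindex")
print(json.dumps(body, indent=2))
```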
Scripting has been an issue from 1.2 onwards. Do we support scripting? Which scripting language is safe? To make scripting possible, Elastic needed another approach: not a general purpose scripting language, but one specific to Elasticsearch. It is called “Painless” and it supports only what Elasticsearch needs.
New field types were introduced: string is replaced by text and keyword.
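As we understand it, a text field is analyzed for full-text search, while a keyword field keeps the value as one exact term. The sketch below shows that difference with a very rough stand-in for an analyzer; the `analyze` function and the field names `title` and `tags` are our own illustration, not the real analysis chain.

```python
def analyze(value):
    """Very rough stand-in for an analyzer: lowercase and split on whitespace."""
    return value.lower().split()


title = "Elasticon 2016 Day 1"

# A text field is broken up into terms, so a search for "day" can match.
text_terms = analyze(title)

# A keyword field keeps the whole value as a single term, which suits
# exact matching, sorting and aggregations.
keyword_terms = [title]

# Sketch of the properties part of a mapping using both types
# (field names are hypothetical).
properties = {
    "title": {"type": "text"},
    "tags": {"type": "keyword"},
}

print(text_terms, keyword_terms)
```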
A new feature is search after, which adds support for deep pagination. Asking Elasticsearch for page 200 or so has always been a problem. With search after, the next page is requested by passing the sort values of the last hit of the previous page.
From 5.x onwards a new Java client will be introduced. It is a wrapper around the Java HTTP client, the same approach as for all the other supported language drivers. This driver decouples the server and client, which makes it possible to upgrade the server but leave the client behind. The client also needs far fewer dependencies.
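The idea behind search after can be sketched in plain Python: instead of skipping over all previous pages, each request continues after the sort values of the last hit that was returned. The `search` function below is a toy stand-in for the server, and the fields `date` and `id` are hypothetical; the unique `id` acts as a tiebreaker so that the order is stable.

```python
def search(docs, size, search_after=None):
    """Return the next `size` docs in (date, id) order after the given sort values."""
    ordered = sorted(docs, key=lambda d: (d["date"], d["id"]))
    if search_after is not None:
        ordered = [d for d in ordered if (d["date"], d["id"]) > tuple(search_after)]
    return ordered[:size]


# Ten toy documents spread over three dates (hypothetical data).
docs = [{"id": i, "date": 20160217 + i % 3} for i in range(10)]

pages = []
last_sort = None
while True:
    page = search(docs, size=4, search_after=last_sort)
    if not page:
        break
    pages.append([d["id"] for d in page])
    # Remember the sort values of the last hit; they start the next page.
    last_sort = (page[-1]["date"], page[-1]["id"])

# Sketch of a request body for the page after the first one, assuming the
# same two sort fields exist in the index.
next_request = {
    "size": 4,
    "sort": [{"date": "asc"}, {"id": "asc"}],
    "search_after": [20160217, 9],
}

print(pages)
```

Because every request only needs the last sort values, the cost of fetching page 200 is about the same as fetching page 2.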
What’s Cookin’ in Kibana
(Chris Cowan and Rashid Khan)
This talk first covered some features in the Kibana 4 interface and then gave a preview of the new 5.x interface. With 4.x it is now possible to customize the axis labels and the colours of the different bars on charts. Kibana re-introduces the dark side theme, which was received with much enthusiasm.
You can use field formatters to show images instead of a value, for instance a coloured icon based on the actual data. You can also write your own formatter as a plugin.
It is now easy to export everything from Kibana. You can even export a visualization, generate similar ones from it and then re-import them.
Kibana 5.x introduces the new interface. It removes a lot of clutter, which gives you around 20% more space to show your data.
A lot has been done on supporting plugins. Everything you see in Kibana is already a plugin, and you can easily create your own plugin using the Yeoman generator for Node.js. If you want to extend Kibana in more depth, hacks are the way to go. They make it possible to do almost everything you need to do. Make a hammer to hit every nail.
Examples of really cool Kibana hacks can be found on GitHub:
What’s Brewing in Beats
(Monica Sarbu and Tudor Golubenco)
Beats are about importing data into Elasticsearch. There are a lot of Beats out of the box and they all use the same library, libbeat. Examples are Filebeat, Topbeat and Packetbeat. With Packetbeat you can decode network traffic for known protocols. Some of these are HTTP, DNS, MySQL and PostgreSQL.
Besides the Beats supported by Elastic, there are also more than 10 community-supported Beats. Some examples are Beats for Nginx and Docker.
Starting with version 5.x a new Beat is introduced: Metricbeat, which collects metrics from other systems into Elasticsearch using the libbeat library. Metricbeat can be used out of the box with a number of systems, but it can also be used as a library to include in your own Beat. Other new features for 5.x: Filebeat will contain Redis/Kafka output options out of the box, and each Beat will include generic filtering, for instance to drop certain events or even fields. That way you could filter out all 200 OK messages in an HTTP Beat.
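To show what generic filtering amounts to, here is a small sketch in Python. It only illustrates the concept of dropping events and fields before they are shipped; it is not the Beats configuration syntax, and the function names, event fields and values are all made up.

```python
def drop_events(events, condition):
    """Keep only the events for which the condition does not hold."""
    return [event for event in events if not condition(event)]


def drop_fields(events, fields):
    """Remove the given fields from every event."""
    return [{k: v for k, v in event.items() if k not in fields} for event in events]


# Hypothetical HTTP events as a Beat might collect them.
events = [
    {"path": "/", "status": 200, "agent": "curl"},
    {"path": "/missing", "status": 404, "agent": "curl"},
    {"path": "/admin", "status": 500, "agent": "browser"},
]

# Drop all 200 OK events, then strip a field we do not want to store.
filtered = drop_events(events, lambda event: event["status"] == 200)
filtered = drop_fields(filtered, {"agent"})

print(filtered)
```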
Hunting the Hackers: How Cisco Talos is Leveling Up Security
(Kate Nolan and Samir Sapra)
Cisco’s Talos provides protection before and after cybersecurity threats. During this talk Kate and Samir gave a rough overview of what their system does and how they use Elasticsearch to leverage data to detect the bad guys. They run a 10 node cluster with approximately 3TB of data and about 100k events per day to do dynamic malware analysis. Simple automated Elasticsearch queries help with finding (new) exploit kits by searching for common patterns within their event data. Samir dove deeper into how the system helped them take down a hacker group called SSHPsychos/Group 93 and explained how they were able to shut these hackers down.
TAP(ping) Out Security Threats at FireEye
With 3.6 petabytes of raw storage, 700 billion events on 400+ nodes and 300k events per second, FireEye’s Threat Analytics Platform (TAP) uses Elasticsearch for analytics and sub-second search in the hunt for cybersecurity events. This talk mainly focussed on how FireEye stores its data, manages its clusters and prevents users from breaking the system by running overly complex queries. One example given was a regex query searching for credit card data. This query kept 1080 CPU cores spinning at 100% for 83 minutes, rendering the system inoperable. To avoid this in the future, they’re actively working on several levels (hardware, Lucene, Elasticsearch) to improve the speed with which Elasticsearch can handle the queries FireEye’s users fire at their system.