Monitoring our Devcon app backend with Elastic Beats
Last week we held Devcon 2017, our Luminis conference. This is the third time we have organised the conference, which draws around 500 attendees and 20 speakers.
This year we wanted a mobile app with the conference program and information about the speakers. The app was created with the Ionic framework and the backend is a Spring Boot application. Before and during the conference we wanted to monitor the servers: the log files, hardware statistics and uptime. I used a number of beats to collect the data, stored the data in Elasticsearch and built dashboards with Kibana. In this blog post I’ll explain the different beats, show you how to set up a system like this yourself, and show some of the interesting charts that you can create from the collected data.
Beats is the platform for single-purpose data shippers. They install as lightweight agents and send data from hundreds or thousands of machines to Logstash or Elasticsearch. ~ Elastic homepage
Beats is a library that makes it easier to build single-purpose data shippers. Elastic ships a number of beats itself, and the community has been creating its own beats as well. I have experience with four beats; for our monitoring we needed three of them.
- Filebeat – Monitors files, can deal with multiple files in one directory, and has modules for files in well-known formats like nginx, Apache httpd, etc.
- Metricbeat – Monitors the resources of our hardware: used memory, used CPU, disk space, etc.
- Heartbeat – Checks endpoints for availability. It can measure the time it took to connect and check whether the remote system is up.
- Packetbeat – Takes a deep dive into the packets going over the wire. It can sniff a number of protocols, like HTTP, DNS and AMQP. It also understands the packets being sent to applications like MySQL, MongoDB and Redis.
The idea behind a beat is that it has some input, such as a file or an endpoint, and an output. The output can be Elasticsearch or Logstash, but also a file. All beats come with default index templates that tell Elasticsearch to create indexes with the right mapping. They also come with predefined Kibana dashboards that are easy to install. Have a look at the image below for an example.
That should give you an idea of what beats are all about. In the next section I’ll show you the setup of our environment.
The backend and the monitoring servers
First, the architecture of our application. It is a basic Spring Boot application with things like security, caching and spa enabled. We use MySQL as a datastore. The app consists of two main parts. The first is an API for the mobile app. The second is a graphical user interface created with Thymeleaf, which lets us edit the conference properties like the speakers, the talks, the Twitter tags used, etc. We installed Filebeat and Metricbeat on the backend server. A second server was running Elasticsearch and is also the host for Heartbeat. The next image shows an overview of the platform.
All beats were installed using the Debian packages, following the installation guides on the Elastic website. As the documentation is thorough I won’t go into details here. I do want to show the configurations that I used for the different beats.
Filebeat was used to monitor the nginx access log files. Filebeat makes use of modules with predefined templates. In our case we use the nginx module. Below is the configuration that we used.
- module: nginx
As we have been doing with Logstash for a while, we want to enrich the log lines with things like browser extraction and geo enhancements. These days Elasticsearch offers ingest for this purpose. More information about how to install these ingest components can be found here.
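Enrichments such as browser or geo lookups operate on fields that have first been split out of each raw log line. As a rough illustration of that first step, here is a minimal Python sketch that parses one line in nginx's default "combined" format. It is not how Filebeat or the ingest components are implemented; the regular expression, the function name and the field names are assumptions made for this example, and the sample line is invented.

```python
import re

# Pattern for nginx's default "combined" log format (an assumption: a custom
# log_format in nginx.conf would need a different pattern).
COMBINED = re.compile(
    r'(?P<remote_ip>\S+) \S+ (?P<user_name>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) HTTP/(?P<http_version>[\d.]+)" '
    r'(?P<response_code>\d{3}) (?P<body_sent_bytes>\d+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_access_line(line):
    """Return a dict of named fields, or None when the line does not match."""
    match = COMBINED.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Numeric fields are converted so they can be aggregated later on.
    fields["response_code"] = int(fields["response_code"])
    fields["body_sent_bytes"] = int(fields["body_sent_bytes"])
    return fields


# Invented sample line, only for illustration.
sample = ('203.0.113.7 - - [06/Apr/2017:09:15:02 +0200] '
          '"GET /api/conferences/1 HTTP/1.1" 200 5120 "-" "Mozilla/5.0"')
event = parse_access_line(sample)
```

The `agent` and `remote_ip` fields are the kind of input a browser or geo enrichment step would then work on.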
An example of the filebeat dashboard in Kibana is below.
The next step is monitoring the CPU, the load, and the memory usage per process. Installing Metricbeat is easy when using the Debian package. Below is the configuration I used on the server.
- module: system
- module: mysql
As you can see, this one is bigger than the Filebeat config. I configure the metrics to measure and how often to measure them. In this example we measure every 30 seconds. We have two modules, the system module and the mysql module. With the mysql module we get specific metrics about the MySQL process. Below is an idea of the available metrics. It is interesting to see the number of commands and threads.
Heartbeat can be used to monitor the availability of other services. I used it to monitor the availability of our backend as well as our homepage. Configuration is as easy as this.
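To make "which metrics, and how often" concrete, here is a small Python sketch of periodic sampling that uses only the standard library. It is not Metricbeat's implementation and collects far fewer metrics; the function names, the dictionary keys and the choice of disk usage and load average are assumptions for this example. Only the 30 second period comes from the setup described above.

```python
import os
import shutil
import time

PERIOD_SECONDS = 30  # the sampling period mentioned in the text


def sample_system_metrics(path="/"):
    """Collect one snapshot of disk usage and, where available, system load."""
    disk = shutil.disk_usage(path)
    used_pct = 100.0 * disk.used / disk.total if disk.total else 0.0
    snapshot = {"timestamp": time.time(), "disk_used_pct": round(used_pct, 1)}
    # os.getloadavg only exists on Unix-like systems and may raise OSError.
    if hasattr(os, "getloadavg"):
        try:
            load1, load5, load15 = os.getloadavg()
            snapshot["load"] = {"1": load1, "5": load5, "15": load15}
        except OSError:
            pass
    return snapshot


def collect(count, period=PERIOD_SECONDS, sleep=time.sleep):
    """Take `count` snapshots, waiting `period` seconds between them."""
    snapshots = []
    for index in range(count):
        snapshots.append(sample_system_metrics())
        if index < count - 1:
            sleep(period)
    return snapshots
```

Passing the `sleep` function as a parameter keeps the loop easy to try out without actually waiting 30 seconds between samples.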
# Configure monitors
- type: http
  schedule: '@every 60s'
- type: http
  schedule: '@every 30s'
You see both monitors: the first checks a url every 60 seconds, the second every 30 seconds. With this beat you can explicitly enable the Kibana dashboards in the configuration. An example dashboard is presented in the image below. It is not so nice to see that our website was unavailable for a moment. Luckily this was not our mobile app backend.
Of course you can use all the prefabricated dashboards. You can however still create your own dashboards and combine the available views that you need to analyse your platform. In my case I want to create a new view: an indication of the usage of the urls of the API over time. Kibana 5.3 comes with a new chart type called the heatmap chart. In the next few images I step through the creation of this chart. At the end I’ll also analyse the chart for the conference day. Our heatmap shows the number of calls to a specific url. The darker the block, the more hits. First we create a new visualisation and choose the heatmap.
Next we need to choose the index to take the documents from. In our case we use the filebeat index.
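What such an HTTP check boils down to can be sketched in a few lines of Python. This is an illustration only, not Heartbeat's implementation: the function name, the returned keys and the rule that any status below 400 counts as "up" are assumptions made for this example.

```python
import time
import urllib.error
import urllib.request


def check_http(url, timeout=5.0, opener=urllib.request.urlopen):
    """Request `url` once and report whether it is up and how long it took."""
    started = time.monotonic()
    status = None
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (4xx or 5xx).
        status = err.code
    except (urllib.error.URLError, OSError):
        # No answer at all: DNS failure, connection refused, timeout.
        status = None
    duration_ms = (time.monotonic() - started) * 1000.0
    return {
        "url": url,
        # Assumption for this sketch: any status below 400 counts as "up".
        "up": status is not None and status < 400,
        "status": status,
        "duration_ms": duration_ms,
    }
```

Calling `check_http` on a schedule, for example every 30 seconds, and storing each result as a document gives the kind of data that an uptime dashboard can be built on. The `opener` parameter exists only so the function can be tried without a real network connection.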
Next choose the x-axis.
We create buckets for each time period, so a date histogram is what we need.
Now we need a sub aggregation to divide the histogram buckets per url. So we add a sub aggregation in the interface and choose the y-axis.
The sub aggregation can be a terms aggregation on the field nginx.access.url.
Have a look at the image above: we see a number of urls we are not interested in. They have very few hits or are not interesting for our analysis. We can do two things to filter out specific urls, and both are found under the advanced options. Open them by clicking, bottom left. Then we can enter a minimum doc count of 2. We can also exclude a number of urls with the exclude pattern.
Finally, let us have a better look at a larger version of this chart. We can see a few things. First of all there is the url /monitoring/ping. It is steady throughout the whole time period, which is no surprise as we call this url every 30 seconds using Heartbeat. The second row, with url /api/conferences/1, is a complete JSON representation of the program, with session details as well as speaker information. It was used most at the beginning of the conference and stopped around 17:00, when the last sessions started. At the beginning of the day people marked their favourite sessions. People were not consistent in rating sessions, as can be seen in the last row. Then there is the url /icon-256.png. This is an image used by Android devices when a push notification is received. The dark green blocks were exactly the moments when we sent out a push notification.
The test with using beats for monitoring felt really good. I like how easy they are to set up and the information you can obtain using the different kinds of beats. I will definitely use them more in the future.
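For readers who prefer to see the query rather than the clicks, the steps above roughly correspond to a date histogram with a terms sub aggregation. The Python sketch below builds such a request body as a plain dictionary. It is an approximation of what the visualisation asks Elasticsearch for, not a copy of the request Kibana generates; the function name, the aggregation names, the 30 minute interval and the exclude pattern are assumptions for this example. The `interval` key matches the Elasticsearch 5.x era of this post; newer versions use `fixed_interval` or `calendar_interval` instead.

```python
import json


def heatmap_query(interval="30m", min_doc_count=2, exclude=r".*\.(css|js)"):
    """Build a search body: time buckets on the x-axis, urls on the y-axis."""
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "aggs": {
            "over_time": {
                # x-axis: one bucket per time period
                "date_histogram": {"field": "@timestamp", "interval": interval},
                "aggs": {
                    "per_url": {
                        # y-axis: one bucket per url within each time bucket
                        "terms": {
                            "field": "nginx.access.url",
                            "min_doc_count": min_doc_count,
                            # hypothetical pattern, a regular expression
                            "exclude": exclude,
                        }
                    }
                },
            }
        },
    }


body = json.dumps(heatmap_query(), indent=2)
```

The resulting JSON can be sent as the body of a search request against the filebeat index. Each combination of time bucket and url bucket then corresponds to one cell of the heatmap, with the document count deciding how dark the cell is.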