<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Search Nuggets &#187; Kibana</title>
	<atom:link href="http://blog.comperiosearch.com/blog/tag/kibana/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.comperiosearch.com</link>
	<description>A blog about Search as THE solution</description>
	<lastBuildDate>Mon, 13 Jun 2016 08:59:45 +0000</lastBuildDate>
	<language>en-US</language>
		<sy:updatePeriod>hourly</sy:updatePeriod>
		<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=3.9.40</generator>
	<item>
		<title>ELK stack deployment with Ansible</title>
		<link>http://blog.comperiosearch.com/blog/2015/11/26/elk-stack-deployment-with-ansible/</link>
		<comments>http://blog.comperiosearch.com/blog/2015/11/26/elk-stack-deployment-with-ansible/#comments</comments>
		<pubDate>Thu, 26 Nov 2015 09:59:38 +0000</pubDate>
		<dc:creator><![CDATA[Christoffer Vig]]></dc:creator>
				<category><![CDATA[English]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[ansible]]></category>
		<category><![CDATA[deployment]]></category>
		<category><![CDATA[Elasticsearch]]></category>
		<category><![CDATA[elk]]></category>
		<category><![CDATA[Kibana]]></category>
		<category><![CDATA[logstash]]></category>

		<guid isPermaLink="false">http://blog.comperiosearch.com/?p=3999</guid>
		<description><![CDATA[As human beings, we like to believe that each and every one of us is a special individual, and not easily replaceable. That may be fine, but please, don’t fall into the habit of treating your computer the same way. Ansible is a free software platform for configuring and managing computers, and I’ve been using [...]]]></description>
				<content:encoded><![CDATA[<p><img class="alignright" src="http://www.ansible.com/hs-fs/hub/330046/file-767051897-png/Official_Logos/ansible_circleA_red.png?t=1448391213471" alt="" width="251" height="251" />As human beings, we like to believe that each and every one of us is a special individual, and not easily replaceable. That may be fine, but please, don’t fall into the habit of treating your computer the same way.</p>
<p><span id="more-3999"></span></p>
<p><a href="https://en.wikipedia.org/wiki/Ansible_(software)"><b>Ansible</b></a> is a <a href="https://en.wikipedia.org/wiki/Free_software">free software</a> platform for configuring and managing computers, and I&#8217;ve been using it a lot lately to manage the ELK stack: Elasticsearch, Logstash and Kibana.</p>
<p>I can define a list of servers I want to manage in a plain config file &#8211; the so-called inventory:</p><pre class="crayon-plain-tag">[elasticsearch-master]
es-master1.mydomain.com
es-master2.mydomain.com
es-master3.mydomain.com

[elasticsearch-data]
elk-data1.mydomain.com
elk-data2.mydomain.com
elk-data3.mydomain.com

[logstash]
logstash.mydomain.com

[kibana]
kibana.mydomain.com</pre><p>And define the roles for the servers in another YAML config file &#8211; the so-called playbook:</p><pre class="crayon-plain-tag">- hosts: elasticsearch-master
  roles:
    - ansible-elasticsearch

- hosts: elasticsearch-data
  roles:
    - ansible-elasticsearch

- hosts: logstash
  roles:
    - ansible-logstash

- hosts: kibana
  roles:
    - ansible-kibana</pre><p>&nbsp;</p>
<p>Each group of servers may have its own files containing configuration variables:</p><pre class="crayon-plain-tag">elasticsearch_version: 2.1.0
elasticsearch_node_master: false
elasticsearch_heap_size: 1g</pre><p>&nbsp;</p>
<p>Ansible is used for configuring the ELK stack vagrant box at <a href="https://github.com/comperiosearch/vagrant-elk-box-ansible">https://github.com/comperiosearch/vagrant-elk-box-ansible</a>, which was recently upgraded with Elasticsearch 2.1, Kibana 4.3 and Logstash 2.1.</p>
<p>The same set of Ansible roles can be applied when the configuration needs to move into production, by applying another set of variable files with modified host names, certificates and the like. There are several possible ways to do this.</p>
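<p>For example, a production override file might look like this (the file name and values below are invented for illustration; Ansible group variables are commonly kept under a <code>group_vars/</code> directory):</p>

```yaml
# group_vars/elasticsearch-data.yml - hypothetical production overrides
elasticsearch_version: 2.1.0
elasticsearch_node_master: false
elasticsearch_heap_size: 4g
```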
<p><b>How does it work?</b></p>
<p>Ansible is agent-less: you do not install anything on the machines you control. Ansible only needs to be installed on the controlling machine (Linux/OS X), and it connects to the managed machines over SSH (there is even some support for Windows). The only requirement on the managed machines is Python.</p>
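<p>To see the agent-less model in action, an ad-hoc ping of every host in the inventory is a good first test (assuming the inventory above is saved as <code>inventory</code> and the playbook as <code>site.yml</code>; both file names are just examples):</p>

```
# SSH to every host in the inventory and check that Ansible can reach it
ansible all -i inventory -m ping

# Apply the playbook, i.e. push the roles out to all the groups
ansible-playbook -i inventory site.yml
```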
<p>Happy ansibling!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.comperiosearch.com/blog/2015/11/26/elk-stack-deployment-with-ansible/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Elasticsearch: Shield protected Kibana with Active Directory</title>
		<link>http://blog.comperiosearch.com/blog/2015/08/21/elasticsearch-security-shield/</link>
		<comments>http://blog.comperiosearch.com/blog/2015/08/21/elasticsearch-security-shield/#comments</comments>
		<pubDate>Fri, 21 Aug 2015 14:26:45 +0000</pubDate>
		<dc:creator><![CDATA[Christoffer Vig]]></dc:creator>
				<category><![CDATA[English]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Elasticsearch]]></category>
		<category><![CDATA[enterprise]]></category>
		<category><![CDATA[Kibana]]></category>
		<category><![CDATA[security]]></category>

		<guid isPermaLink="false">http://blog.comperiosearch.com/?p=3245</guid>
		<description><![CDATA[Elasticsearch easily stores terabytes of data, but how can you make sure users only see the data they should? This post will explore how to use Shield, a plugin for Elasticsearch, to authenticate users with Active Directory. Elasticsearch will by default allow anyone access to all data. The Shield plugin allows locking down Elasticsearch using authentication [...]]]></description>
				<content:encoded><![CDATA[<p>Elasticsearch easily stores terabytes of data, but how can you make sure users only see the data they should? This post will explore how to use Shield, a plugin for Elasticsearch, to authenticate users with Active Directory.</p>
<p><span id="more-3245"></span><br />
<a title="NO TRESPASSING" href="https://www.flickr.com/photos/mike2099/2058021162/in/photolist-48RTZu-4ttdcn-4YPqqU-5WbRAP-8rYugF-XsCao-ftZ1hL-dpmFB-dqyeUE-bjV3VY-bEMba3-bEMb6w-84YCqg-rf5Yk1-8Yjaj3-chg68s-4KDN1M-4KDMWF-5MfWjA-tCJt6J-8nxBiZ-6YsUyh-KfDRK-54uLmy-bv1Pv-oChdLk-pL3X8t-4RTTjd-dhfUPn-cEkCFY-czjXiE-m1zThD-dzESFD-oj2KUM-c16MV-72dTxS-g4Yky4-kK9YR-p6DYnY-5HJvrX-8aovPQ-dhfVkP-bwB8c-gFzTXk-7zd9iF-eua6KC-2gzEc-8nxtcH-2gzEb-fnp3zH" data-flickr-embed="true"><img src="https://farm3.staticflickr.com/2059/2058021162_ed7b6e8d72_b.jpg" alt="NO TRESPASSING" width="600" /></a><script src="//embedr.flickr.com/assets/client-code.js" async="" charset="utf-8"></script></p>
<p>Elasticsearch will by default allow anyone access to all data. The <a href="https://www.elastic.co/guide/en/shield/current/introduction.html">Shield</a> plugin allows locking down Elasticsearch using authentication from the internal esusers realm, Active Directory (AD)  or LDAP . Using AD, you can map groups defined in your Windows domain to roles in Elasticsearch. For instance, you can allow people in the Fishery department access only to  fish-indexes, and give complete control to anyone in the IT department.</p>
<p>To use Shield in production, you have to buy an Elasticsearch subscription; however, you get a 30-day trial when installing the license manager. So let&#8217;s hurry up and see how this works out in Kibana.</p>
<p>&nbsp;</p>
<p>In this post, we will install Shield and connect to Active Directory (AD) for authentication. After having made sure we can authenticate with AD, we will add SSL encryption everywhere possible. We will add authentication for the Kibana server using the built-in authentication realm esusers, and if time allows, at the end we will create two user groups, each with access to its own index, and check how it all looks when accessed in Kibana 4.</p>
<p>&nbsp;</p>
<h3>Prerequisites</h3>
<p>You will need a previously installed Elasticsearch and Kibana. The most recent versions should work; I have used Elasticsearch 1.7 and Kibana 4.1.1. If you need a machine to test on, I can personally recommend the vagrant-elk-box you can find <a href="https://github.com/comperiosearch/vagrant-elk-box-ansible">here</a>. <strong>The following guide assumes the file locations of the vagrant-elk-box</strong>; if you install differently, you will probably know where to look. Ask an adult for help.</p>
<p>For Active Directory, you need to be on a domain that uses Active Directory. That would probably mean some kind of Windows work environment.</p>
<p>&nbsp;</p>
<h4>Installing Shield</h4>
<p>If you&#8217;re on the vagrant box you should begin the lesson by entering the vagrant box using the commands</p><pre class="crayon-plain-tag">vagrant up
vagrant ssh</pre><p>&nbsp;</p>
<p>Install the license manager</p><pre class="crayon-plain-tag"> sudo /usr/share/elasticsearch/bin/plugin -i elasticsearch/license/latest</pre><p>Install Shield</p><pre class="crayon-plain-tag"> sudo /usr/share/elasticsearch/bin/plugin -i elasticsearch/shield/latest</pre><p>Restart Elasticsearch (<code>service elasticsearch restart</code>).</p>
<p>Check out the logs; you should find some information about when your Shield license will expire (logfile location: /var/log/elasticsearch/vagrant-es.log).</p>
<h4>Integrating Active Directory</h4>
<p>The next step involves figuring out a thing or two about your Active Directory configuration. First of all, you need to know its address. On your Windows machine, open cmd.exe and type</p><pre class="crayon-plain-tag">set LOGONSERVER</pre><p>The name of your AD should pop back. Add a section similar to the following to the elasticsearch.yml file (at /etc/elasticsearch/elasticsearch.yml)</p><pre class="crayon-plain-tag">shield.authc.realms:
  active_directory:
    type: active_directory
    domain_name: superdomain.com
    unmapped_groups_as_roles: true
    url: ldap://ad.superdomain.com</pre><p>Type the address of your AD into the url: field (where it says url: ldap://ad.superdomain.com); if your logon server is ad.cnn.com, use url: ldap://ad.cnn.com.</p>
<p>Also, you need to figure out your domain name and type it in correctly.</p>
<p>NB: Be careful with the indentation! Elasticsearch cares a lot about correct indentation, and may even refuse to start without telling you why if you make a mistake.</p>
<h5>Finding the correct name for the Active Directory group</h5>
<p>The next step involves figuring out the name of the group you wish to grant access to. You may have called your group &#8220;Fishermen&#8221;, but that is probably not exactly what it&#8217;s called in AD.</p>
<p>Microsoft has a very simple and nice tool called <a href="https://technet.microsoft.com/en-us/library/bb963907.aspx">Active Directory Explorer</a>. Open the tool and enter the address you just found from the LOGONSERVER (remember? it&#8217;s only 10 lines above).</p>
<p>You may have to click and explore a little to find the group you want. Once you find it, you need the value of the &#8220;distinguishedName&#8221; attribute. Double-click the attribute and copy the value out of the &#8220;Object&#8221; dialog.</p>
<p>This is an example from my AD</p><pre class="crayon-plain-tag">CN=Rolle IT,OU=Groups,OU=Oslo,OU=Comperiosearch,DC=comperiosearch,DC=com</pre><p>This value represents a group which we want to map to a role in Elasticsearch.</p>
<p>Open the file /etc/elasticsearch/shield/role_mapping.yml. It should look similar to this</p><pre class="crayon-plain-tag"># Role mapping configuration file which has elasticsearch roles as keys
# that map to one or more user or group distinguished names

#roleA:   this is an elasticsearch role
#  - groupA-DN  this is a group distinguished name
#  - groupB-DN
#  - user1-DN   this is the full user distinguished name
power_user:
  - "CN=Rolle IT,OU=Groups,OU=Oslo,OU=Comperiosearch,DC=comperiosearch,DC=com"
#user:
# - "cn=admins,dc=example,dc=com" 
# - "cn=John Doe,cn=other users,dc=example,dc=com"</pre><p>I have uncommented the line with &#8220;power_user:&#8221; and added a line below containing the distinguishedName from above.</p>
<p>After restarting Elasticsearch, anyone in the &#8220;Rolle IT&#8221; group should be able to log in (and nobody else, yet).</p>
<p>To test it out, open <a href="http://localhost:9200">http://localhost:9200</a> in your browser. You should be presented with a login box where you can type in your username/password. In case of failure, check out the elasticsearch logs (at /var/log/elasticsearch/vagrant-es.log).</p>
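<p>Behind the scenes, that login box is plain HTTP Basic authentication: the client base64-encodes <code>username:password</code> and sends it in the Authorization header. As a minimal sketch (with made-up credentials, not anything from this setup), this is how a script could build the same request against Elasticsearch:</p>

```python
import base64
import urllib.request

def basic_auth_header(username, password):
    """Build the Authorization header value a browser sends after the login box."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return "Basic " + token

# Hypothetical credentials - substitute a user from your mapped AD group.
req = urllib.request.Request(
    "http://localhost:9200/",
    headers={"Authorization": basic_auth_header("jdoe", "secret")},
)
print(req.get_header("Authorization"))
```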
<p>If you were able to log in, that means Active Directory authentication works. Congratulations! You deserve a refreshment. Some strong coffee will go down well with the next sections, where we add encrypted communication everywhere we can.</p>
<h3>SSL &#8211; Elasticsearch</h3>
<p>Authentication and encrypted communication go hand in hand. Without SSL, username and password are transferred in plaintext on the wire. For this demo we will use self-signed certificates. Keytool comes with Java, and is used to handle certificates for Elasticsearch. The following command will generate a self-signed certificate and put it in a JKS keystore named self-signed.jks (swap out $password for your preferred password):</p><pre class="crayon-plain-tag">keytool -genkey -keyalg RSA -alias selfsigned -keystore self-signed.jks -keypass $password -storepass $password -validity 360 -keysize 2048 -dname "CN=localhost, OU=orgUnit, O=org, L=city, S=state, C=NO"</pre><p>Copy the keystore into /etc/elasticsearch/</p>
<p>Modify  /etc/elasticsearch/elasticsearch.yml by adding the following lines:</p><pre class="crayon-plain-tag">shield.ssl.keystore.path: /etc/elasticsearch/self-signed.jks
shield.ssl.keystore.password: $password
shield.ssl.hostname_verification: false
shield.transport.ssl: true
shield.http.ssl: true</pre><p>(use the same password as you used when creating the self-signed certificate)</p>
<p>Restart Elasticsearch again, and watch the logs for failures.</p>
<p>Try to open https://localhost:9200 in your browser (NB: httpS not http)</p>
<div id="attachment_3905" style="width: 310px" class="wp-caption alignright"><img class="wp-image-3905 size-medium" src="http://blog.comperiosearch.com/wp-content/uploads/2015/08/your-connection-is-not-private-e1440146932126-300x181.png" alt="your connection is not private" width="300" height="181" /><p class="wp-caption-text">https://localhost:9200</p></div>
<p>You should see a screen warning you that something is wrong with the connection. This is a good sign! It means your certificate is actually working! For production use you could use your own CA or buy a proper certificate, either of which will avoid the ugly warning screen.</p>
<h4>SSL &#8211; Active Directory</h4>
<p>Our current method of connecting to Active Directory is unencrypted &#8211; we need to enable SSL for the AD connections.</p>
<p>1. Fetch the certificate from your Active Directory server (replace ldap.example.com with the LOGONSERVER from above; 636 is the standard LDAPS port)</p><pre class="crayon-plain-tag">echo | openssl s_client -connect ldap.example.com:636 2&gt;/dev/null | openssl x509 &gt; ldap.crt</pre><p>2. Import the certificate into your keystore (located at /etc/elasticsearch/)</p><pre class="crayon-plain-tag">keytool -import -alias ldap -keystore self-signed.jks -file ldap.crt</pre><p>&nbsp;</p>
<p>3. Modify AD url in elasticsearch.yml<br />
change the line</p><pre class="crayon-plain-tag">url: ldap://ad.superdomain.com</pre><p>to</p><pre class="crayon-plain-tag">url: ldaps://ad.superdomain.com</pre><p>Restart Elasticsearch and check the logs for failures.</p>
<h4>Kibana authentication with esusers</h4>
<p>With Elasticsearch locked down by Shield, no services can search or post data either &#8211; including Kibana and Logstash.</p>
<p>Active Directory is great, but I&#8217;m not sure I want to use it for letting the Kibana server talk to Elasticsearch. We can use Shield&#8217;s built-in user management system, esusers. Elasticsearch comes with a set of predefined roles, including roles for Logstash, the Kibana4 server and Kibana4 users (see /etc/elasticsearch/shield/roles.yml on the vagrant-elk box if you&#8217;re still on that one).</p>
<p>Add a new kibana4_server user, granting it the role kibana4_server, using this command:</p><pre class="crayon-plain-tag">cd /usr/share/elasticsearch/bin/shield  
./esusers useradd kibana4_server -p secret -r kibana4_server</pre>
<h4>Adding esusers realm</h4>
<p>The esusers realm is the default one, and does not need to be configured if it is the only realm you use. But since we added the Active Directory realm, we must add another section to the elasticsearch.yml file from above.</p>
<p>It should end up looking like this</p><pre class="crayon-plain-tag">shield.authc.realms:
  esusers:
    type: esusers
    order: 0
  active_directory:
    order: 1
    type: active_directory
    domain_name: superdomain.com
    unmapped_groups_as_roles: true
    url: ldap://ad.superdomain.com</pre><p>The order parameter defines the order in which Elasticsearch tries the various authentication realms.</p>
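<p>Conceptually (this is a toy model, not Shield&#8217;s actual implementation), ordered realms behave like a chain that is consulted realm by realm until one accepts the credentials:</p>

```python
# Toy model of ordered authentication realms: the realm with the lowest
# 'order' value is consulted first, and the first realm that accepts
# the credentials wins.
def authenticate(realms, username, password):
    for name, realm in sorted(realms.items(), key=lambda kv: kv[1]["order"]):
        if realm["users"].get(username) == password:
            return name  # this realm authenticated the user
    return None  # every realm rejected the credentials

# Hypothetical users, mirroring the esusers + Active Directory setup above.
realms = {
    "esusers": {"order": 0, "users": {"kibana4_server": "secret"}},
    "active_directory": {"order": 1, "users": {"jdoe": "hunter2"}},
}

print(authenticate(realms, "kibana4_server", "secret"))
```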
<h4>Allowing Kibana to access Elasticsearch</h4>
<p>Kibana must be informed of the new user we just created. You will find the Kibana configuration file at /opt/kibana/config/kibana.yml.</p>
<p>Add the username and password you just created. You also need to change the Elasticsearch address to use https:</p><pre class="crayon-plain-tag"># The Elasticsearch instance to use for all your queries.
elasticsearch_url: "https://localhost:9200"

# If your Elasticsearch is protected with basic auth, this is the user credentials
# used by the Kibana server to perform maintenance on the kibana_index at startup. Your Kibana
# users will still need to authenticate with Elasticsearch (which is proxied through
# the Kibana server)
kibana_elasticsearch_username: kibana4_server
kibana_elasticsearch_password: secret</pre><p>Restart Kibana and Elasticsearch, and watch the logs for any errors. Try opening Kibana at http://localhost:5601 and type in your username and password. Provided you&#8217;re in the group you gave access to earlier, you should be able to log in.</p>
<h4>Creating SSL for Kibana</h4>
<p>Once you have enabled authorization for Elasticsearch, you really need to set up SSL certificates for Kibana as well. This is also configured in kibana.yml:</p><pre class="crayon-plain-tag">verify_ssl: false
# SSL for outgoing requests from the Kibana Server (PEM formatted)
ssl_key_file: "kibana_ssl_key_file"
ssl_cert_file: "kibana_ssl_cert_file"</pre><p>You can create a self-signed key and cert file for kibana using the following command:</p><pre class="crayon-plain-tag">openssl req -x509 -newkey rsa:2048 -keyout key.pem -out cert.pem -days 365 -nodes</pre><p>&nbsp;</p>
<p><a href="http://blog.comperiosearch.com/wp-content/uploads/2015/08/kibana-auth.png"><img class="alignright size-medium wp-image-3920" src="http://blog.comperiosearch.com/wp-content/uploads/2015/08/kibana-auth-300x200.png" alt="kibana auth" width="300" height="200" /></a></p>
<h4>Configuring AD groups for Kibana access</h4>
<p>Unfortunately, this part of the post is going to be very sketchy, as we are desperately running out of time. This post is much too long already.</p>
<p>Elasticsearch already comes with a list of predefined roles, among which you can find the kibana4 role. The kibana4 role allows read/write access to the .kibana index, in addition to search and read access to all indexes. We want to limit access to just one index for each AD group: the fishery group shall only access the fishery index, and the finance group shall only access the finance index. We can create roles that limit access to one index by copying the kibana4 role, giving it an appropriate name and changing the index: &#8216;*&#8217; section to map to only the preferred index.</p>
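<p>As a rough sketch of such a role (the exact privilege names differ between Shield versions, so treat this as the shape of the change rather than copy-paste configuration):</p>

```yaml
# Hypothetical narrowed copy of the kibana4 role in roles.yml:
# the '*' index pattern is replaced with just the fishery index.
fishery_kibana4:
  cluster:
    - cluster:monitor/nodes/info
    - cluster:monitor/health
  indices:
    'fishery':
      - indices:data/read/search
      - indices:data/read/get
    '.kibana':
      - all
```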
<p>The final step involves mapping an AD group to the Elasticsearch role. This is done in the role_mapping.yml file, as mentioned above.</p>
<p>Only joking, of course &#8211; that wasn&#8217;t the last step. The last step is restarting Elasticsearch, and checking the logs for failures as you try to log in.</p>
<p>&nbsp;</p>
<h3>Securing Elasticsearch</h3>
<p>Shield brings enterprise authentication to Elasticsearch. You can easily manage access to various parts of Elasticsearch management and data by using Active Directory groups.</p>
<p>This has been a short dive into the possibilities &#8211; make sure to contact Comperio if you need help creating a solution with Elasticsearch and Shield.</p>
<p>&nbsp;</p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.comperiosearch.com/blog/2015/08/21/elasticsearch-security-shield/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Analyzing web server logs with Elasticsearch in the cloud</title>
		<link>http://blog.comperiosearch.com/blog/2015/05/26/analyzing-weblogs-with-elasticsearch-in-the-cloud/</link>
		<comments>http://blog.comperiosearch.com/blog/2015/05/26/analyzing-weblogs-with-elasticsearch-in-the-cloud/#comments</comments>
		<pubDate>Tue, 26 May 2015 21:12:34 +0000</pubDate>
		<dc:creator><![CDATA[Christoffer Vig]]></dc:creator>
				<category><![CDATA[Elasticsearch]]></category>
		<category><![CDATA[English]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[found by elastic]]></category>
		<category><![CDATA[Kibana]]></category>
		<category><![CDATA[logstash]]></category>

		<guid isPermaLink="false">http://blog.comperiosearch.com/?p=3702</guid>
		<description><![CDATA[Using Logstash and Kibana on Found by Elastic, Part 1 This is part one of a two post blog series, aiming to demonstrate how to feed logs from IIS into Elasticsearch and Kibana via Logstash, using the hosted services provided by Found by Elastic. This post will deal with setting up the basic functionality and [...]]]></description>
				<content:encoded><![CDATA[<h2>Using Logstash and Kibana on Found by Elastic, Part 1</h2>
<p>This is part one of a two-post blog series, aiming to demonstrate how to feed logs from IIS into Elasticsearch and Kibana via Logstash, using the hosted services provided by Found by Elastic. This post will deal with setting up the basic functionality and securing connections. Part 2 will show how to configure Logstash to read from IIS log files, and how to use Kibana 4 to visualize web traffic. Originally published on the <a href="https://www.found.no/foundation/analyzing-weblogs-with-elasticsearch/">Elastic Blog</a>.<br />
<span id="more-3702"></span></p>
<h4>Getting the Bits</h4>
<p>For this demo I will be running Logstash and Kibana from my Windows laptop.<br />
If you want to follow along, download and extract Logstash 1.5.RC4 or later, and Kibana 4.0.2 or later from <a href="https://www.elastic.co/downloads">https://www.elastic.co/downloads</a>.</p>
<h4>Creating an Elasticsearch Cluster</h4>
<p>Creating a new trial cluster in Found is just a matter of logging in and pressing a button. It takes a few seconds until the cluster is ready, and a screen with some basic information on how to connect pops up. We need the address for the HTTPS endpoint, so copy that out.</p>
<h4>Configuring Logstash</h4>
<p>Now, with the brand new SSL connection option in Logstash, connecting to Found is as simple as this Logstash configuration:</p><pre class="crayon-plain-tag">input { stdin{} }

output {
  elasticsearch {
    protocol =&gt; http
    host =&gt; REPLACE_WITH_FOUND_CLUSTER_HOSTNAME
    port =&gt; "9243" # Check the port also
    ssl =&gt; true
  }

  stdout { codec =&gt; rubydebug }
}</pre><p>&nbsp;</p>
<p>Save the file as found.conf</p>
<p>Start up Logstash using</p><pre class="crayon-plain-tag">bin\logstash.bat agent --verbose -f found.conf</pre><p>You should see a message similar to</p><pre class="crayon-plain-tag">Create client to elasticsearch server on `https://....foundcluster.com:9243`: {:level=&gt;:info}</pre><p>Once you see &#8220;Logstash startup completed&#8221; type in your favorite test term on the terminal. Mine is &#8220;fisk&#8221; so I type that.<br />
You should see output on your screen showing what Logstash intends to pass on to elasticsearch.</p>
<p>We want to make sure this actually hits the cloud, so open a browser window and paste the HTTPS link from before, append <code>/_search</code> to the URL and hit enter.<br />
You should now see the search results from your newly created Elasticsearch cluster, containing the favorite term you just typed in. We have a functioning connection from Logstash on our machine to Elasticsearch in the cloud! Congratulations!</p>
<h4>Configuring Kibana 4</h4>
<p>Kibana 4 comes with a built-in webserver. The configuration is done in a kibana.yml file in the config directory. Connecting to Elasticsearch in the cloud comes down to inserting the address of the Elasticsearch instance.</p><pre class="crayon-plain-tag"># The Elasticsearch instance to use for all your queries.
elasticsearch_url: "https://....foundcluster.com:9243"</pre><p>Of course, we need to verify that this really works, so we open up Kibana at <a href="http://localhost:5601">http://localhost:5601</a>, select the Logstash index template, with the @timestamp data field as suggested, and open the Discover panel. Now, if less than 15 minutes have passed since you inserted your favorite test term in Logstash (previous step), you should see it already. Otherwise, change the date range by clicking the selector in the top right corner.</p>
<p><img class="alignleft" src="https://raw.githubusercontent.com/babadofar/MyOwnRepo/master/images/kibanatest.png" alt="Kibana test" width="1090"  /></p>
<h4>Locking it down</h4>
<p>Found by Elastic has worked hard to make the previous steps easy. We created an Elasticsearch cluster, fed data into it and displayed it in Kibana in less than 5 minutes. We must have forgotten something!? And yes, of course: something about security. We made sure to use secure connections with SSL, and the address generated for our cluster contains a 32-character, randomly generated string, which is pretty hard to guess. Should, however, the address slip out of our hands, hackers could easily delete our entire cluster. And we don’t want that to happen. So let’s see how we can make everything work when we add some basic security measures.</p>
<h4>Access Control Lists</h4>
<p>Found by Elastic has support for access control lists, where you can set up lists of usernames and passwords, with rules that deny/allow access to various paths within Elasticsearch. This makes it easy to create a &#8220;read only&#8221; user, for instance, by creating a user with a rule that only allows access to the <code>/_search</code> path. Found by Elastic has a sample configuration with the users searchonly and readwrite. We will use these as a starting point, but first we need to figure out what Kibana needs.</p>
<h4>Kibana 4 Security</h4>
<p>Kibana 4 stores its configuration in a special index, by default named &#8220;.kibana&#8221;. The Kibana webserver needs write access to this index. In addition, all Kibana users need write access to this index, for storing dashboards, visualizations and searches, and read access to all the indices they will query. More details about the access demands of Kibana 4 can be found in the <a href="http://www.elastic.co/guide/en/shield/current/_shield_with_kibana_4.html">Elastic documentation</a>.</p>
<p>For this demo, we will simply copy the “readwrite” user from the sample twice, naming one kibanaserver, the other kibanauser. Set the access control list in Found:</p><pre class="crayon-plain-tag"># Allow everything for the readwrite user, kibanauser and kibanaserver
- paths: ['.*']
  conditions:
    - basic_auth:
        users:
          - readwrite
          - kibanauser
          - kibanaserver
    - ssl:
        require: true
  action: allow</pre><p>Press save and the changes are immediately effective. Try to reload Kibana at <a href="http://localhost:5601">http://localhost:5601</a>; you should be denied access.</p>
<p>Open up the kibana.yml file from before and modify it:</p><pre class="crayon-plain-tag"># If your Elasticsearch is protected with basic auth, this is the user credentials
# used by the Kibana server to perform maintenance on the kibana_index at startup. Your Kibana
# users will still need to authenticate with Elasticsearch (which is proxied through
# the Kibana server)
kibana_elasticsearch_username: kibanaserver
kibana_elasticsearch_password: `KIBANASERVER_USER_PASSWORD`</pre><p>Stop and start Kibana to apply the settings.<br />
Now when Kibana starts up, you will be presented with a login box for HTTP authentication.<br />
Type in kibanauser as the username, along with its password. You should now again be presented with the Discover screen, showing the previously entered favorite test term. Again, you may have to expand the time range to see your entry.</p>
<h4>Logstash Security</h4>
<p>Logstash will also need to supply credentials when connecting to Found by Elastic. We reuse the permissions of the readwrite user once again, this time under the name &#8220;logstash&#8221;.<br />
It is simply a matter of supplying the username and password in the configuration file.</p><pre class="crayon-plain-tag">output {
  elasticsearch {
    ….
    user =&gt; "logstash"
    password =&gt; "LOGSTASH_USER_PASSWORD"
  }
}</pre>
<h4>Wrapping it up</h4>
<p>This has been a short dive into Logstash and Kibana with Found by Elastic. The recent changes made to support the Shield plugin in Elasticsearch, Logstash and Kibana make it very easy to use the security features of Found by Elastic. In the next post we will look into feeding logs from IIS into Elasticsearch via Logstash, and visualizing the most used query terms in Kibana.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.comperiosearch.com/blog/2015/05/26/analyzing-weblogs-with-elasticsearch-in-the-cloud/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>3 steg til Big Data</title>
		<link>http://blog.comperiosearch.com/blog/2015/04/28/3-steg-til-big-data/</link>
		<comments>http://blog.comperiosearch.com/blog/2015/04/28/3-steg-til-big-data/#comments</comments>
		<pubDate>Tue, 28 Apr 2015 13:00:09 +0000</pubDate>
		<dc:creator><![CDATA[Christoffer Vig]]></dc:creator>
				<category><![CDATA[Business]]></category>
		<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Elasticsearch]]></category>
		<category><![CDATA[Kibana]]></category>
		<category><![CDATA[log]]></category>
		<category><![CDATA[søk]]></category>

		<guid isPermaLink="false">http://blog.comperiosearch.com/?p=3609</guid>
		<description><![CDATA[Big data is the third-hottest buzzword of our time, but not everyone knows what it is, where to find it, or what to do with it. Big Data is emerging under the feet of most of us. The digital universe doubles every two years. The internet, mobile devices and, not least, the Internet of [...]]]></description>
				<content:encoded><![CDATA[<p><strong>Big data</strong> is the <a href="http://www.languagemonitor.com/words-of-the-year-woty/the-top-business-buzzwords-of-global-english-for-2014">third-hottest buzzword</a> of our time, but not everyone knows what it is, where to find it, or what to do with it. Big Data is emerging right under the feet of most of us. The digital universe doubles every two years. The internet, mobile devices and, not least, the Internet of Things generate ever more information.</p>
<p>Skal du lykkes i forretningslivet i dag, er du avhengig av å kjenne brukernes bevegelser og kunne tilpasse løsningen din etter dette. Du kan velge å stole på maktene, som Snåsamannen eller Märtha, eller du kan ta makten i din egen hånd og høste innsikten som ligger begravet i virksomhetens og brukernes logger.</p>
<h3><strong>3 steps</strong></h3>
<p>We assume that you have a website and that you can get hold of its logs. In addition you need a computer and a data-savvy person, preferably one with developer skills.</p>
<p><strong>How to get started:</strong></p>
<ol>
<li><strong>Identify 3 measurable KPIs</strong>.<br />
Suggestions: page views per day, most used query terms, response time per page</li>
<li><strong>Feed the logs into ELK</strong>.<br />
Find the log data and a developer. The developer will figure this out easily.</li>
<li><strong>Visualise the KPIs</strong>.<br />
Hold on to the developer while you look at the data in Kibana together and find a suitable graphical presentation.<br/></li>
</ol>
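<p>The three KPIs suggested in step 1 boil down to simple counting and averaging over parsed log records. A minimal Python sketch of the idea (the field names and sample records are invented for illustration, not taken from any real log):</p>

```python
from collections import Counter, defaultdict

# Hypothetical parsed log records, as Logstash might hand them to you
records = [
    {"date": "2015-04-27", "query": "elk", "response_ms": 120, "page": "/search"},
    {"date": "2015-04-27", "query": "kibana", "response_ms": 80, "page": "/search"},
    {"date": "2015-04-28", "query": "elk", "response_ms": 95, "page": "/search"},
]

# KPI 1: page views per day
views_per_day = Counter(r["date"] for r in records)

# KPI 2: most used query terms
top_queries = Counter(r["query"] for r in records).most_common(2)

# KPI 3: average response time per page
times = defaultdict(list)
for r in records:
    times[r["page"]].append(r["response_ms"])
avg_response = {page: sum(ms) / len(ms) for page, ms in times.items()}
```

<p>In practice Kibana computes these aggregations for you; the point is only that each KPI is a one-line aggregation once the logs are parsed.</p>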
<div id="attachment_3606" style="width: 310px" class="wp-caption alignnone"><a href="http://blog.comperiosearch.com/wp-content/uploads/2015/04/Comperio_bigdata.png"><img class="wp-image-3606 size-medium" src="http://blog.comperiosearch.com/wp-content/uploads/2015/04/Comperio_bigdata-300x203.png" alt="Comperio_bigdata" width="300" height="203" /></a><p class="wp-caption-text">Example of a Kibana dashboard</p></div>
<p><strong>KPI</strong></p>
<p>The suggested KPIs are standard metrics for websites. These are numbers that any web analytics tool, such as Google Analytics, can give you today. The difference is that now you put the graphs together and build the tools yourself, the data belongs to you, and how you choose to combine the information to create insight is entirely up to you. Again: the purpose here is to demonstrate a technique and show off a tool, not to tell you which KPIs you should care about.</p>
<p><strong>ELK</strong></p>
<p><a href="https://www.elastic.co/"><strong>ELK</strong></a>, or the so-called &#8220;ELK stack&#8221;, offers a complete Big Data storage, search and analytics toolkit. ELK stands for Elasticsearch, Logstash and Kibana, a collection of open source products developed by the technology company Elastic. The search engine Elasticsearch is the core of the stack, with a focus on developer friendliness and scalability. Logstash feeds data into Elasticsearch, while Kibana offers ad-hoc data analysis and beautiful visualisations and graphs.</p>
<p>Netflix, GitHub and Microsoft are examples of giant enterprises that use Elasticsearch at the core of their business.</p>
<p>The platform&#8217;s popularity stems from it being easy to get started with while delivering unrivalled search and analytics capabilities. The ELK stack is often mentioned in the same breath as Big Data, since it handles very large volumes of data.</p>
<p>&nbsp;</p>
<h3><strong>A start</strong></h3>
<p>Your website&#8217;s logs probably don&#8217;t quite qualify as Big Data. The point is that with the toolbox we introduce here, you are equipped for bigger tasks.</p>
<p>You can start taking control of your company&#8217;s data logs without it requiring major resources. The plan can be laid along the way, and simple access to the raw data alone can create both new insight and new questions and needs.</p>
<p>The same toolbox handles search and analysis of large data sets such as transaction logs, network traffic, firewall logs, and large-scale internet activity like Twitter, IRC and websites.</p>
<p>The Norwegian search technology company <a href="http://www.comperio.no">Comperio</a> is a partner of Elastic, and has many developers who can help you through these three steps. Comperio has worked with search since 2004 and is one of the world&#8217;s leading companies in search technology.</p>
<p><strong>Don&#8217;t let the Big Data ship sail on its own; take your place at the helm and set course for your own Big Data horizon now!</strong></p>
<p>&nbsp;</p>
<p><em>Read about Comperio&#8217;s breakfast seminar <a href="https://www.eventbrite.com/e/comperio-frokost-sk-og-jakten-pa-den-gode-vinen-tickets-16052734160">on how to understand your customers better</a>.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.comperiosearch.com/blog/2015/04/28/3-steg-til-big-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Replacing FAST ESP with Elasticsearch at Posten</title>
		<link>http://blog.comperiosearch.com/blog/2015/03/20/elasticsearch-at-posten/</link>
		<comments>http://blog.comperiosearch.com/blog/2015/03/20/elasticsearch-at-posten/#comments</comments>
		<pubDate>Fri, 20 Mar 2015 10:00:52 +0000</pubDate>
		<dc:creator><![CDATA[Seb Muller]]></dc:creator>
				<category><![CDATA[Elasticsearch]]></category>
		<category><![CDATA[English]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Comperio]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[elastic]]></category>
		<category><![CDATA[fast]]></category>
		<category><![CDATA[geosearch]]></category>
		<category><![CDATA[Kibana]]></category>
		<category><![CDATA[logstash]]></category>
		<category><![CDATA[posten]]></category>
		<category><![CDATA[tilbudssok]]></category>

		<guid isPermaLink="false">http://blog.comperiosearch.com/?p=3364</guid>
		<description><![CDATA[First, some background A few years ago Comperio launched a nifty service for Posten Norge, Norway&#8217;s postal service. Through the service, retail companies can upload their catalogues and seasonal flyers to make the products listed within searchable. Although the catalogue handling and processing is also very interesting, we&#8217;re going to focus on the search side [...]]]></description>
				<content:encoded><![CDATA[<h2>First, some background</h2>
<p>A few years ago Comperio launched a nifty service for <a title="Posten Norge" href="http://www.posten.no/">Posten Norge</a>, Norway&#8217;s postal service. Through the service, retail companies can upload their catalogues and seasonal flyers to make the products listed within searchable. Although the catalogue handling and processing is also very interesting, we&#8217;re going to focus on the search side of things in this post. As Comperio has a long relationship and a great deal of experience with <a title="FAST ESP" href="http://blog.comperiosearch.com/blog/2012/07/30/comperio-still-likes-fast-esp/">FAST ESP</a>, this first iteration of Posten&#8217;s <a title="Tilbudssok" href="http://tilbudssok.posten.no/">Tilbudssok</a> used it as the search backend. It also incorporated Comperio Front, our search middleware product, which recently <a title="Comperio Front" href="http://blog.comperiosearch.com/blog/2015/02/16/front-5-released/">had a big release</a>.</p>
<h2>Newer is better</h2>
<p>Unfortunately, FAST ESP is getting on a bit and as a result Tilbudssok has been limited by what we can coax out of it. To ensure we provide the best possible search solution we decided it was time to upgrade and chose <a title="Elasticsearch" href="https://www.elastic.co/products">Elasticsearch</a> as the best candidate. If you are unfamiliar with Elasticsearch, take a moment to browse our other <a title="Elasticsearch blog posts" href="http://blog.comperiosearch.com/blog/tag/elasticsearch/">blog posts</a> on the subject. The resulting project had three main requirements:</p>
<ul>
<li>Replace FAST ESP with Elasticsearch while otherwise maintaining as much of the existing architecture as possible</li>
<li>Add geodata to products such that a user could find the nearest store where they were available</li>
<li>Set up sexy log analysis with <a title="Logstash" href="https://www.elastic.co/products/logstash">Logstash</a> and <a title="Kibana" href="https://www.elastic.co/products/kibana">Kibana</a></li>
</ul>
<h2>Data Sources, Ingestion and Processing</h2>
<p>The data source for the search system is a MySQL database populated with catalogue and product data. A separate Comperio system generates this data when Posten&#8217;s customers upload PDFs of their brochures i.e. we also fully own the entire data generation process.</p>
<p>The FAST ESP based solution made use of FAST&#8217;s JDBC connector to feed data directly to the search index. Inspired by <a title="Elasticsearch: Indexing SQL databases. The easy way." href="http://blog.comperiosearch.com/blog/2014/01/30/elasticsearch-indexing-sql-databases-the-easy-way/">Christoffer&#8217;s blog post</a>, we made use of the <a title="Elasticsearch JDBC River Plugin" href="https://github.com/jprante/elasticsearch-river-jdbc">JDBC plugin for Elasticsearch</a>. This allowed us to use the same SQL statements to feed Elasticsearch. It took us no more than a couple of hours, including some time wrestling with field mappings, to populate our Elasticsearch index with the same data as the FAST one.</p>
<p>We then needed to add store geodata to the index. As mentioned earlier, we completely own the data flow, so we simply extended our existing catalogue/product uploader system to include a store uploader service. Google&#8217;s <a title="Google Geocoder" href="https://code.google.com/p/geocoder-java/">geocoder</a> handled converting addresses to coordinates for use with Elasticsearch&#8217;s geo distance sorting. We now had store data in our database. An extra JDBC river and another round of mapping wrestling got that same data into the Elasticsearch index.</p>
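<p>Geo distance sorting essentially ranks documents by great-circle distance from the user&#8217;s coordinates. The idea can be illustrated in a few lines of Python using the haversine formula (the store names and coordinates below are made up for the example; Elasticsearch does this internally):</p>

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    # Great-circle distance between two (lat, lon) points in kilometres
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

# Hypothetical geocoded stores (near Oslo and Bergen) and a user near Oslo
stores = [("Store A", 59.9139, 10.7522), ("Store B", 60.3913, 5.3221)]
user = (59.95, 10.75)

# "Geo distance sort": order stores by distance from the user
nearest = min(stores, key=lambda s: haversine_km(user[0], user[1], s[1], s[2]))
```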
<h2>Our approach</h2>
<p>Before the conversion to Elasticsearch, the Posten system architecture was typical of most Comperio projects. Users interact with a Java based frontend web application. This in turn sends queries to Comperio&#8217;s search abstraction layer, <a title="Comperio Front" href="http://blog.comperiosearch.com/blog/2015/02/16/front-5-released/">Comperio Front</a>. This formats requests such that the system&#8217;s search engine, in our case FAST ESP, can understand them. Upon receiving a response from the search engine, Front then formats it into a frontend friendly format i.e. JSON or XML depending on developer preference.</p>
<p>&nbsp;</p>
<p><a href="http://blog.comperiosearch.com/wp-content/uploads/2015/03/tilbudssok_architecture.png"><img class="size-medium wp-image-3422 aligncenter" src="http://blog.comperiosearch.com/wp-content/uploads/2015/03/tilbudssok_architecture-300x145.png" alt="Generic Search Architecture" width="300" height="145" /></a></p>
<p>Unfortunately, when we started the project, Front&#8217;s Elasticsearch adapter was still a bit immature. It also felt like overkill to include it when Elasticsearch already has such a <a href="http://www.elastic.co/guide/en/elasticsearch/client/java-api/current/">robust Java API</a>. I saw an opportunity to reduce the system&#8217;s complexity and learn more about interacting with Elasticsearch&#8217;s Java API, and took it. With what I learnt, we could later beef up Front&#8217;s Elasticsearch adapter for future projects.</p>
<p>As a side note, we briefly flirted with the idea of replacing the entire frontend with a <a href="http://blog.comperiosearch.com/blog/2013/10/24/instant-search-with-angularjs-and-elasticsearch/">hipstery Javascript/Node.js ecosystem</a>. It was trivial to throw together a working system very quickly but in the interest of maintaining existing architecture and trying to keep project run time down we opted to stick with the existing Java based MVC framework.</p>
<p>After a few rounds of Googling, struggling with documentation and finally simply diving into the code, I was able to piece together the bits of the Elasticsearch Java API puzzle. It is a joy to work with! There are builder classes for pretty much everything. All of our queries start with a basic SearchRequestBuilder. Depending on the scenario, we can then modify this SRB with various flavours of QueryBuilders, FilterBuilders, SortBuilders and AggregationBuilders to handle every potential use case. Here is a greatly simplified example of a filtered search with aggregates:</p>
<script src="https://gist.github.com/92772945f5281df54c3b.js?file=SRBExample"></script>
<h2>Logstash and Kibana</h2>
<p>With our Elasticsearch-based system up and ready to roll, the next step was to fulfil our sexy query logging project requirement. This raised an interesting question: where are the query logs? As it turns out (please contact us if we&#8217;re wrong), the only query logging available is something called <a title="Slow Log" href="http://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-slowlog.html">slow logging</a>. It is a shard level log where you can set thresholds for the query or fetch phase of the execution. We found this log severely lacking in basic details such as hit count and actual query parameters. It seemed like we could only track query time and the query string.</p>
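<p>For completeness: the slow log is enabled by setting per-index thresholds, for example in elasticsearch.yml (the threshold values below are arbitrary examples, not recommendations):</p>

```yaml
# Queries/fetches slower than these thresholds are logged at that level
index.search.slowlog.threshold.query.warn: 10s
index.search.slowlog.threshold.query.info: 5s
index.search.slowlog.threshold.fetch.warn: 1s
index.search.slowlog.threshold.fetch.info: 800ms
```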
<p>Rather than fight with this slow log, we implemented our own custom logger in our web app to log salient parts of the search request and response. To make our lives easier everything is logged as JSON. This makes hooking up with <a title="Logstash" href="http://logstash.net/">Logstash</a> trivial, as our logstash config reveals:</p>
<script src="https://gist.github.com/43e3603bd75fd549a582.js?file=logstashconf"></script>
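<p>The custom logger itself needs nothing fancy: write one JSON object per line with whatever request and response fields you care about, and Logstash&#8217;s json codec can consume it directly. A Python sketch of the idea (the field names and sample values are hypothetical, not our actual logger&#8217;s):</p>

```python
import json
from datetime import datetime, timezone

def log_query(query, hit_count, took_ms):
    # One JSON object per line: trivial for Logstash's json codec to parse
    entry = {
        "@timestamp": datetime.now(timezone.utc).isoformat(),
        "query": query,
        "hits": hit_count,
        "took_ms": took_ms,
    }
    return json.dumps(entry)

line = log_query("tilbud oslo", 42, 17)
parsed = json.loads(line)
```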
<p><a title="Kibana 4" href="http://blog.comperiosearch.com/blog/2015/02/09/kibana-4-beer-analytics-engine/">Kibana 4</a>, the latest version of Elastic&#8217;s log visualisation suite, was released in February, around the same time as we were wrapping up our logging logic. We had been planning on using Kibana 3, but this was a perfect opportunity to learn how to use version 4 and create some awesome dashboards for our customer:</p>
<p><a href="http://blog.comperiosearch.com/wp-content/uploads/2015/03/kibana_query.png"><img class="aligncenter size-medium wp-image-3444" src="http://blog.comperiosearch.com/wp-content/uploads/2015/03/kibana_query-300x169.png" alt="kibana_query" width="300" height="169" /></a></p>
<p><a href="http://blog.comperiosearch.com/wp-content/uploads/2015/03/kibana_ams.png"><img class="aligncenter size-medium wp-image-3443" src="http://blog.comperiosearch.com/wp-content/uploads/2015/03/kibana_ams-300x135.png" alt="kibana_ams" width="300" height="135" /></a></p>
<p>Kibana 4 is wonderful to work with and will generate so much extra value for Posten and their customers.</p>
<h2>Conclusion</h2>
<ul>
<li>Although the Elasticsearch Java API itself is well rounded and complete, its documentation can be a bit frustrating. But this is why we write blog posts to share our experiences!</li>
<li>Once we got past the initial learning curve, we were able to create an awesome Elasticsearch Java API toolbox</li>
<li>We were severely disappointed with the built in query logging. I hope to extract our custom logger and make it more generic so everyone else can use it too.</li>
<li>The Google Maps API is fun and super easy to work with</li>
</ul>
<p>Rivers as a data ingestion tool have long been marked for deprecation. When we next want to upgrade our Elasticsearch version we will need to replace them entirely with some other tool. Although Logstash is touted as Elasticsearch&#8217;s main equivalent of a connector framework, it currently lacks classic Enterprise search data source connectors. <a title="Apache Manifold" href="http://manifoldcf.apache.org/">Apache Manifold</a> is a mature open source connector framework that would cover our needs. The latest release has not been tested with the latest version of Elasticsearch, but it supports versions 1.1-3.</p>
<p>Once the solution goes live, during April, Kibana will really come into its own as we get more and more data.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.comperiosearch.com/blog/2015/03/20/elasticsearch-at-posten/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Elastic{ON}15: Day one</title>
		<link>http://blog.comperiosearch.com/blog/2015/03/11/elasticon15-day-one/</link>
		<comments>http://blog.comperiosearch.com/blog/2015/03/11/elasticon15-day-one/#comments</comments>
		<pubDate>Wed, 11 Mar 2015 16:07:48 +0000</pubDate>
		<dc:creator><![CDATA[Christoffer Vig]]></dc:creator>
				<category><![CDATA[Elasticsearch]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[.net]]></category>
		<category><![CDATA[aggregations]]></category>
		<category><![CDATA[Elasticon]]></category>
		<category><![CDATA[found]]></category>
		<category><![CDATA[Kibana]]></category>
		<category><![CDATA[logstash]]></category>
		<category><![CDATA[san francisco]]></category>

		<guid isPermaLink="false">http://blog.comperiosearch.com/?p=3393</guid>
		<description><![CDATA[March 10, 2015 At Comperio we have been speculating for a while now that Elasticsearch might just drop search from their name. With Elasticsearch spearheading the expansion of search into analytics and all sorts of content and data driven applications such a change made sense to us. What the name would be we had no [...]]]></description>
				<content:encoded><![CDATA[<h6>March 10, 2015<br />
<a href="http://blog.comperiosearch.com/wp-content/uploads/2015/03/IMG_20150310_1112452cropped.jpg"><img class="alignright size-medium wp-image-3396" src="http://blog.comperiosearch.com/wp-content/uploads/2015/03/IMG_20150310_1112452cropped-300x140.jpg" alt="IMG_20150310_111245~2cropped" width="300" height="140" /></a></h6>
<p>At Comperio we have been speculating for a while now that Elasticsearch might just drop search from their name. With Elasticsearch spearheading the expansion of search into analytics and all sorts of content and data driven applications, such a change made sense to us. What the name would be, however, we had no idea &#8211; ElasticStash, KibanElastic, StashElasticLog &#8211; none of these really rolled off the tongue like a proper brand.</p>
<p>More surprising was Elasticsearch&#8217;s move into the cloud space by acquiring Found. A big and heartfelt congratulations to our Norwegian colleagues from us at Comperio. Found has built and delivered an innovative and solid product and we look forward to seeing them build something even better as a part of Elastic.</p>
<p>Elasticsearch is renamed to Elastic, and Found is no longer just Found, but Found by Elastic. The opening keynote, held by CEO Steven Schuurman and Shay Banon, was a triumphant tour through the history of Elastic, detailing how the company has grown organically into what it is today. Kibana and Logstash started as separate projects but were soon integrated into Elastic. Shay and Steven explained how old roadmaps for the development of Elastic included plans to create CloudES, search as a cloud service. CloudES was never created, due to all the other pressing issues. Meanwhile, the Norwegian company Found made great strides with their cloud search offering, and an acquisition became a very natural fit.</p>
<p>Elastic{ON} is the first conference devoted entirely to the Elastic family of products. The sessions consist, on the one hand, of presentations by developers and employees of Elastic; on the other, &#8220;ELK in the wild&#8221; sessions showcase customer use cases, including Verizon, GitHub, Facebook and more.</p>
<p>On day one the sessions about core elasticsearch, Lucene, Kibana and Logstash were of particular interest to us.</p>
<h4><strong>Elasticsearch</strong></h4>
<p>The session about &#8220;Recent developments in elasticsearch 2.0&#8221; held by Clinton Gormley and Simon Willnauer revealed a host of interesting new features in the upcoming 2.0 release. There is a strong focus on stability and on making sure that releases ship without bugs. To illustrate this, Clinton showed graphs comparing the number of lines of code to the number of lines of tests, the latter rising sharply in the latest releases. It was also interesting to note that the number of lines of code has recently been reduced thanks to refactoring and other improvements to the code base.</p>
<p>Among the interesting new features are a new &#8220;reducer&#8221; step for aggregations, allowing calculations on top of aggregated results, and a Changes API, which helps manage changes to the index. The Changes API will be central to other planned features such as update by query: a typical use case is logging search results, where the Changes API will allow click activity to be recorded in the same log entry as the one containing the query.</p>
<p>There will also be a Reindex API that simplifies the development cycle when you have to refeed an entire index because you need to change a mapping or field type.</p>
<h4>Kibana</h4>
<p>Rashid Khan went through the motivations behind the development of Kibana 4: support for aggregations, and making the product easier to work with and to extend, turning it into a fitting platform for building data visualization tools. This was followed by &#8220;The Contributor&#8217;s Guide to the Kibana Galaxy&#8221; by Spencer Alger, who demoed how to set up the development environment for Kibana 4 using npm, grunt and bower &#8211; the standard web development toolset of today (or was it yesterday?)</p>
<h4>Logstash</h4>
<p>Logstash creator Jordan Sissel presented the new features of Logstash 1.5 and what to expect in future versions. 1.5 introduces a new plugin system, and to the great relief of all Windows users out there, the issues with file locking on rolling log files have been resolved! The roadmap also aims to vastly improve the reliability of Logstash: no more losing documents in planned or unplanned outages. In addition there are plans to add event persistence and various API management tools. With the river technology being deprecated, Logstash will take on the role of document processing framework that those of us who come from FAST ESP have missed for some time now. In effect, all rivers (including JDBC) will be ported to Logstash.</p>
<h4>Aggregations</h4>
<p>Mark Harwood presented a novel take on optimizing index creation for aggregations in the session &#8220;Building Entity Centric Indexes&#8221;. You may have tried to run some fancy aggregations, only to have elasticsearch die with out-of-memory errors. Avoiding this often takes some insight into the architecture in order to structure your aggregations in the best possible manner. Mark essentially showed how to move some of the aggregation work to indexing time rather than query time. The original use case was a customer who needed to know the average session length for the users of his website. Figuring that out involved running through the whole index, sorting by session id, and subtracting the timestamp of the first item in each session from the last &#8211; a lot of operations with an enormous consumption of resources. Mark approaches problems in a creative and mathematical manner, and it is always inspiring to attend his presentations. It will be interesting to see whether the Changes API mentioned above will deliver functionality that can be used to improve aggregated data.</p>
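<p>The entity-centric trick can be illustrated without Elasticsearch at all: instead of scanning and sorting every event at query time, fold each session&#8217;s events into a single session document when indexing, so the expensive pass happens once. A toy Python sketch (the event data is invented for the example):</p>

```python
from collections import defaultdict

# Raw click events: (session_id, timestamp_in_seconds)
events = [("s1", 100), ("s2", 50), ("s1", 160), ("s2", 110), ("s1", 130)]

# "Index time": fold events into one entity-centric document per session
sessions = defaultdict(list)
for sid, ts in events:
    sessions[sid].append(ts)
session_docs = {sid: {"length": max(t) - min(t)} for sid, t in sessions.items()}

# "Query time": the aggregation is now a cheap average over small documents
avg_length = sum(d["length"] for d in session_docs.values()) / len(session_docs)
```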
<h4>.NET</h4>
<p>A deep dive into the .NET clients with Martijn Laarman showed how to use a strongly typed language such as C# with elasticsearch. Yes, it is actually possible, and it looked very good. There is a low-level client that just connects to the API, where you have to do all the parsing yourself, and a high-level client called NEST built on top of it, offering a strongly typed query DSL with an almost 1-to-1 mapping to the elasticsearch DSL. Particularly nifty was the covariant result handling, where you can specify the types of results you need back, considering a search result from elasticsearch can contain many types.</p>
<p>Looking forward to day 2!<br />
<a href="http://blog.comperiosearch.com/wp-content/uploads/2015/03/IMG_20150310_213606.jpg"><img class="alignright size-medium wp-image-3391" src="http://blog.comperiosearch.com/wp-content/uploads/2015/03/IMG_20150310_213606-300x222.jpg" alt="IMG_20150310_213606" width="300" height="222" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.comperiosearch.com/blog/2015/03/11/elasticon15-day-one/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>ELK in one (Vagrant) box</title>
		<link>http://blog.comperiosearch.com/blog/2014/08/14/elk-one-vagrant-box/</link>
		<comments>http://blog.comperiosearch.com/blog/2014/08/14/elk-one-vagrant-box/#comments</comments>
		<pubDate>Thu, 14 Aug 2014 14:06:18 +0000</pubDate>
		<dc:creator><![CDATA[Murhaf Fares]]></dc:creator>
				<category><![CDATA[English]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Elasticsearch]]></category>
		<category><![CDATA[elk]]></category>
		<category><![CDATA[Kibana]]></category>
		<category><![CDATA[logstash]]></category>
		<category><![CDATA[puppet]]></category>
		<category><![CDATA[vagrant]]></category>

		<guid isPermaLink="false">http://blog.comperiosearch.com/?p=2813</guid>
		<description><![CDATA[In this blog post we introduce a Vagrant box to easily create configurable and reproducible development environments for ELK (Elasticsearch, Logstash and Kibana). At Comperio, we mainly use this box for query log analysis using the ELK stack. In case you don’t know, Vagrant is a free and open-source software that combines VirtualBox (a virtualization [...]]]></description>
				<content:encoded><![CDATA[<p><img class="alignright size-medium wp-image-2828" src="http://blog.comperiosearch.com/wp-content/uploads/2014/08/elk_vagrant_chilling1-300x300.png" alt="elk_vagrant_chilling" width="300" height="300" /></p>
<p>In this blog post we introduce a Vagrant box to easily create configurable and reproducible development environments for ELK (Elasticsearch, Logstash and Kibana). At Comperio, we mainly use this box for query log analysis using the ELK stack.<br />
In case you don’t know, <a href="http://www.vagrantup.com/">Vagrant</a> is free, open-source software that combines VirtualBox (a virtualization tool) with configuration management software such as Puppet and Chef.</p>
<p><strong>ELK stack up and running in two commands</strong></p>
<blockquote><p>$ git clone https://github.com/comperiosearch/vagrant-elk-box.git<br />
$ vagrant up</p></blockquote>
<p>By cloning this <a href="https://github.com/comperiosearch/vagrant-elk-box">GitHub repo</a> and then typing “vagrant up”, you will install elasticsearch, logstash, kibana and nginx (the latter used to serve kibana).</p>
<p>Elasticsearch will be running on port 9200, as usual, which is forwarded to the host machine. As for Kibana, it will be served on port 5601 (also accessible from the host OS).</p>
<p><strong>How does it work?</strong><br />
As mentioned above, Vagrant is a wrapper around VirtualBox and some configuration management software. In our box, we use pure shell scripting and Puppet to configure the ELK stack.<br />
There are two essential configuration files in this box: <a href="https://github.com/comperiosearch/vagrant-elk-box/blob/master/Vagrantfile">Vagrantfile</a> and the Puppet manifest <a href="https://github.com/comperiosearch/vagrant-elk-box/blob/master/manifests/default.pp">default.pp</a>.<br />
Vagrantfile includes the settings of the virtual box such as operating system, memory size, number of CPUs, forwarded ports, etc…<br />
<script src="https://gist.github.com/0ba6fa7ecece4fdac1ff.js?file=Vagrantfile"></script></p>
<p>Vagrantfile also includes a shell script that installs, among other things, the official Puppet modules for <a href="https://github.com/elasticsearch/puppet-elasticsearch">elasticsearch</a> and <a href="https://github.com/elasticsearch/puppet-logstash">logstash</a>. By using that shell script we stay away from git submodules which were used in <a href="https://github.com/comperiosearch/vagrant-elasticsearch-box">another Vagrant image</a> we made earlier for elasticsearch.<br />
<script src="https://gist.github.com/fb50e0cfcdee2e14898a.js?file=Vagrantfile"></script></p>
<p>In the Puppet manifest, default.pp, we define what version of elasticsearch to install and make sure that it is running as a service.<br />
<script src="https://gist.github.com/3abbe1b3aee8ecbe1b9e.js?file=default.pp"></script></p>
<p>We do the same for logstash and additionally link the default logstash configuration file to <a href="https://github.com/comperiosearch/vagrant-elk-box/blob/master/confs/logstash/logstash.conf">this file</a> under /Vagrant/confs/logstash which is shared with the host OS. Finally, we install nginx and Kibana, and configure Kibana to run on port 5601 (by linking the nginx conf file to <a href="https://github.com/comperiosearch/vagrant-elk-box/blob/master/confs/nginx/default">this file</a> in the Vagrant directory also).</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.comperiosearch.com/blog/2014/08/14/elk-one-vagrant-box/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>SharePoint ULS log analysis using ELK</title>
		<link>http://blog.comperiosearch.com/blog/2014/08/01/sharepoint-log-analysis-using-elk/</link>
		<comments>http://blog.comperiosearch.com/blog/2014/08/01/sharepoint-log-analysis-using-elk/#comments</comments>
		<pubDate>Fri, 01 Aug 2014 11:31:06 +0000</pubDate>
		<dc:creator><![CDATA[Madalina Rogoz]]></dc:creator>
				<category><![CDATA[English]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Elasticsearch]]></category>
		<category><![CDATA[Kibana]]></category>
		<category><![CDATA[log analysis]]></category>
		<category><![CDATA[logstash]]></category>
		<category><![CDATA[sharepoint]]></category>

		<guid isPermaLink="false">http://blog.comperiosearch.com/?p=2775</guid>
		<description><![CDATA[E is for Elasticsearch Elasticsearch is an open source search and analytics engine that extends the limits of full-text search through a robust set of APIs and DSLs, to deliver a flexible and almost limitless search experience. L is for Logstash One of the most popular open source log parser solutions on the market, Logstash has the possibility of reading any data source [...]]]></description>
				<content:encoded><![CDATA[<h3>E is for Elasticsearch</h3>
<p><a href="http://www.elasticsearch.org/">Elasticsearch</a> is an open source search and analytics engine that extends the limits of full-text search through a robust set of APIs and DSLs, to deliver a flexible and almost limitless search experience.</p>
<h3>L is for Logstash</h3>
<p>One of the most popular open source log parsing solutions on the market, <a href="http://logstash.net/">Logstash</a> can read practically any data source and extract the data in JSON format; it is easy to use and up and running in minutes.</p>
<h3>K is for Kibana</h3>
<p>A data visualization engine, <a href="http://www.elasticsearch.org/overview/kibana/">Kibana</a> allows the user to create custom dashboards and to analyze Elasticsearch data on-the-fly and in real-time.</p>
<h3>Getting set up</h3>
<p>To start using this technology, you just need to <a href="http://www.elasticsearch.org/overview/elkdownloads/">install</a> the three above mentioned components, which actually means downloading and unzipping three archive files.</p>
<p>The data flow is this: the log files are text files residing in a folder. Logstash will use a configuration file to read from the logs and parse all the entries. The parsed data will be sent to Elasticsearch for storing. Once here, it can be easily read and displayed by Kibana.</p>
<p><img class="alignnone size-full wp-image-2779" src="http://blog.comperiosearch.com/wp-content/uploads/2014/08/elk004.jpg" alt="elk004" width="608" height="107" /></p>
<h3>Parsing SharePoint ULS log files with Logstash</h3>
<p>We will now focus on the simplest and most straightforward way of getting this to work, without any additional configuration or settings. Our goal is to open Kibana and be able to configure some charts that will help us visualize and explore what type of entries we have in the SharePoint ULS logs, and to be able to search the logs for interesting entries.</p>
<p>To begin, we need some ULS log files from SharePoint that will be placed in a folder on the server (I am working on a Windows Server virtual environment) where we are testing the ELK stack. My ULS logs are located here: C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14\LOGS</p>
<p>As an example, the first line in one of my log files looks like this:</p><pre class="crayon-plain-tag">05/06/2014 10:20:20.85 wsstracing.exe (0x0900)                 0x0928SharePoint Foundation         Tracing Controller Service    5152InformationTracing Service started.</pre><p>The next step is to build the configuration file. This is a text file with a .config extension, located by default in the Logstash folder. The starting point for the content of this file would be:</p><pre class="crayon-plain-tag">input {
  file {
    type =&gt; "sharepointlog"
    path =&gt; ["[folder where the logs reside]/*.log"]
    start_position =&gt; "beginning"
    codec =&gt; "plain"
  }
}
filter {
}
output {
  elasticsearch {
    embedded =&gt; true
  }
}</pre><p>The Input section defines the location of the logs and some reading parameters, like the starting position where Logstash will begin parsing the files. The Output section defines the destination of the parsed data, in our case the Elasticsearch instance installed on the same server.</p>
<p>Now for the important part, the Filter section. The Filter section contains one or more GROK patterns that are used by Logstash for identifying the format of the log entries. There are many types of entries, but we are focusing on the event type and message, so we have to parse all the parameters up to the message part in order to get what we need.</p>
<p>The documentation is pretty detailed when it comes to GROK and a <a href="http://grokdebug.herokuapp.com/">pattern debugger website</a> with a GROK testing engine is available online, so you can develop and test your patterns before actually running them in Logstash.</p>
<p>So this is what I came up with for the SharePoint ULS logs:</p><pre class="crayon-plain-tag">filter {
  if [type] == "sharepointlog" {
    grok {
      match =&gt; [ "message",
        "(?&lt;parsedtime&gt;%{MONTHNUM}/%{MONTHDAY}/%{YEAR} %{HOUR}:%{MINUTE}:%{SECOND}) \t%{DATA:process} \(%{DATA:processcode}\)(\s*)\t%{DATA:tid}(\s*)\t(?&lt;area&gt;.*)(\s*)\t(?&lt;category&gt;.*)(\s*)\t%{WORD:eventID}(\s*)\t%{WORD:level}(\s*)\t%{DATA:eventmessage}\t%{UUID:CorrelationID}"]
      match =&gt; [ "message",
        "(?&lt;parsedtime&gt;%{MONTHNUM}/%{MONTHDAY}/%{YEAR} %{HOUR}:%{MINUTE}:%{SECOND}) \t%{DATA:process} \(%{DATA:processcode}\)(\s*)\t%{DATA:tid}(\s*)\t(?&lt;area&gt;.*)(\s*)\t(?&lt;category&gt;.*)(\s*)\t%{WORD:eventID}(\s*)\t%{WORD:level}(\s*)\t%{DATA:eventmessage}"]
      match =&gt; [ "message",
        "(?&lt;parsedtime&gt;%{MONTHNUM}/%{MONTHDAY}/%{YEAR} %{HOUR}:%{MINUTE}:%{SECOND})%{GREEDYDATA}\t%{DATA:process} \(%{DATA:processcode}\)(\s*)\t%{DATA:tid}(\s*)\t(?&lt;area&gt;.*)(\s*)\t(?&lt;category&gt;.*)(\s*)\t%{WORD:eventID}(\s*)\t%{WORD:level}(\s*)\t%{DATA:eventmessage}"]
    }
    date {
      match =&gt; ["parsedtime","MM/dd/YYYY HH:mm:ss.SSS"]
    }
  }
}</pre>
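<p>Before feeding real logs to Logstash, it helps to sanity check the structure of the pattern outside of GROK. As a rough sketch, here is the core of the first pattern, a timestamp followed by tab-separated columns, reproduced in plain Python; the sample line is rebuilt with explicit tab characters, since the tabs are not visible in the excerpt above:</p><pre class="crayon-plain-tag">import re

# A single ULS entry, rebuilt with explicit tabs; the variable names
# mirror the fields captured by the grok pattern above.
line = ("05/06/2014 10:20:20.85 \twsstracing.exe (0x0900)\t0x0928\t"
        "SharePoint Foundation\tTracing Controller Service\t5152\t"
        "Information\tTracing Service started.")

# Timestamp first, then everything else as tab-separated columns.
m = re.match(r"(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}\.\d+)\s*\t(.*)", line)
parsedtime = m.group(1)
process, tid, area, category, event_id, level, message = m.group(2).split("\t")

print(parsedtime)  # 05/06/2014 10:20:20.85
print(level)       # Information</pre><p>If the split fails for a given line, that line needs one of the fallback patterns, which is exactly why the grok block above carries three match entries.</p>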
<h3>Logstash in action</h3>
<p>All that&#8217;s left to do is to get Logstash going and see what comes out. Run the following on the command line:</p><pre class="crayon-plain-tag">logstash.bat agent -f "sharepoint.conf"</pre><p>This runs Logstash as an agent, so it will monitor the file or folder you specify in the input section of the config for changes. If you are indexing a folder where files appear periodically, you don&#8217;t need to worry about restarting the process; it will continue on its own.</p>
<h3>Kibana time</h3>
<p>Now let&#8217;s create a new dashboard in Kibana and see what was indexed. The most straightforward panel type is Histogram. Make no changes to the default settings of this panel (Chart value = count, Time field = @timestamp) and you should see something similar to this:</p>
<p><a href="http://blog.comperiosearch.com/wp-content/uploads/2014/08/elk005.jpg"><img class="alignnone wp-image-2780 size-medium" src="http://blog.comperiosearch.com/wp-content/uploads/2014/08/elk005-300x125.jpg" alt="elk005" width="300" height="125" /></a></p>
<p>To get some more relevant information, we can add some pie charts and let them display other properties that we have mapped, for example &#8216;process&#8217; or &#8216;area&#8217;.</p>
<p><a href="http://blog.comperiosearch.com/wp-content/uploads/2014/08/elk001.jpg"><img class="alignnone size-medium wp-image-2776" src="http://blog.comperiosearch.com/wp-content/uploads/2014/08/elk001-300x68.jpg" alt="elk001" width="300" height="68" /></a></p>
<p>Now let&#8217;s turn this up a notch: through Kibana we can take a look at the errors in the SharePoint logs. Create a pie chart that displays the &#8220;level&#8221; field. By clicking on the &#8220;Unexpected&#8221; slice in this chart, you will filter the whole dashboard on this value.</p>
<p><a href="http://blog.comperiosearch.com/wp-content/uploads/2014/08/elk002.jpg"><img class="alignnone size-medium wp-image-2777" src="http://blog.comperiosearch.com/wp-content/uploads/2014/08/elk002-300x151.jpg" alt="elk002" width="300" height="151" /></a></p>
<p>Kibana will automatically refresh the page; the filter itself will be displayed in the &#8220;Filter&#8221; row, and all you will see are the &#8220;Unexpected&#8221; events. Time to turn to the help of a Table chart: by displaying the columns you select in the Fields section of this chart, you can view and sort the log entries for a more detailed analysis of the unexpected events.</p>
<p><a href="http://blog.comperiosearch.com/wp-content/uploads/2014/08/elk003.jpg"><img class="alignnone size-medium wp-image-2778" src="http://blog.comperiosearch.com/wp-content/uploads/2014/08/elk003-300x74.jpg" alt="elk003" width="300" height="74" /></a></p>
<p>As the Logstash process runs as an agent, you can monitor the SharePoint events in real-time! So there you have it, SharePoint log analysis using ELK.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.comperiosearch.com/blog/2014/08/01/sharepoint-log-analysis-using-elk/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Elasticsearch Visits Comperio</title>
		<link>http://blog.comperiosearch.com/blog/2014/04/04/elasticsearch-visits-comperio/</link>
		<comments>http://blog.comperiosearch.com/blog/2014/04/04/elasticsearch-visits-comperio/#comments</comments>
		<pubDate>Fri, 04 Apr 2014 08:28:04 +0000</pubDate>
		<dc:creator><![CDATA[Fergus McDowall]]></dc:creator>
				<category><![CDATA[Business]]></category>
		<category><![CDATA[English]]></category>
		<category><![CDATA[Elasticsearch]]></category>
		<category><![CDATA[Kibana]]></category>
		<category><![CDATA[logstash]]></category>

		<guid isPermaLink="false">http://blog.comperiosearch.com/?p=2171</guid>
		<description><![CDATA[Yesterday the legendary Shay Banon, inventor of Elasticsearch and Arie Chapman dropped into to Comperio’s Oslo office on their way to the Oslo Elasticsearch Meetup to talk about whats hot in Elasticsearch v1.x. Shay gave the team the lowdown on the latest functionality, and Arie outlined interesting cutomers and use-cases. Shay also talked about how [...]]]></description>
				<content:encoded><![CDATA[<p><a href="http://blog.comperiosearch.com/wp-content/uploads/2014/04/rsz_bilde4.jpg"><img class="alignright size-medium wp-image-2181" title="rsz_bilde" src="http://blog.comperiosearch.com/wp-content/uploads/2014/04/rsz_bilde4-300x225.jpg" alt="" width="300" height="225" /></a>Yesterday the legendary <a href="https://twitter.com/kimchy">Shay Banon</a>, inventor of Elasticsearch, and <a href="https://twitter.com/ArieChapman">Arie Chapman</a> dropped in to Comperio’s Oslo office on their way to the Oslo Elasticsearch Meetup to talk about <a href="http://www.elasticsearch.org/">what&#8217;s hot in Elasticsearch v1.x.</a></p>
<p>Shay gave the team the lowdown on the latest functionality, and Arie outlined interesting customers and use-cases. Shay also talked about how <a href="http://www.elasticsearch.org/overview/kibana/">Kibana</a> and <a href="http://logstash.net/">Logstash</a> can make pushing data in and out of indexes easier.</p>
<p>Comperio is really interested in the opportunities that Elasticsearch opens up for visualizing large datasets, particularly those generated by large distributed electronic systems. We will definitely be following up these opportunities with our customers, and hope to bounce some ideas off the Elasticsearch guys again soon.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.comperiosearch.com/blog/2014/04/04/elasticsearch-visits-comperio/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Query log analysis – using logstash, elasticsearch and kibana</title>
		<link>http://blog.comperiosearch.com/blog/2013/12/02/query-log-analysis-using-logstash-elasticsearch-and-kibana/</link>
		<comments>http://blog.comperiosearch.com/blog/2013/12/02/query-log-analysis-using-logstash-elasticsearch-and-kibana/#comments</comments>
		<pubDate>Mon, 02 Dec 2013 14:00:21 +0000</pubDate>
		<dc:creator><![CDATA[Niels Henrik Hagen]]></dc:creator>
				<category><![CDATA[English]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Elasticsearch]]></category>
		<category><![CDATA[Kibana]]></category>
		<category><![CDATA[logstash]]></category>
		<category><![CDATA[query log]]></category>

		<guid isPermaLink="false">http://blog.comperiosearch.com/?p=1850</guid>
		<description><![CDATA[As a search consultant I need to understand how a search application is used with the end goal of providing a better search experience for the end user. That story can come from many places and part of that story can be found in the query logs. Analyzing query logs brings insight into how a search application [...]]]></description>
				<content:encoded><![CDATA[<p>As a search consultant I need to understand how a search application is used with the end goal of providing a better search experience for the end user. That story can come from many places and part of that story can be found in the query logs.</p>
<blockquote><p>Analyzing query logs brings insight into how a search application is used; use that insight to improve the next version of the search application.</p></blockquote>
<h2>What is a query log?</h2>
<p>A query log consists of information about the queries run against a search application or search engine. Entries often contain information about when the query was executed, the query text, the context, applied facets, pagination, hit counts and so on.</p>
<h2>The source data is a mess</h2>
<div id="attachment_1856" style="width: 310px" class="wp-caption alignleft"><a href="http://blog.comperiosearch.com/wp-content/uploads/2013/12/query-log-json.png"><img class="size-medium wp-image-1856" src="http://blog.comperiosearch.com/wp-content/uploads/2013/12/query-log-json-300x225.png" alt="JSON query log" width="300" height="225" /></a><p class="wp-caption-text">JSON query log</p></div>
<p>The problem, however, is that these logs are often plain text files located on multiple servers, and they are created by the search engine, not by the search application that &#8220;knows&#8221; the user, or at least the context. To solve these two issues I added logging to the search application and wrote each line in the log file as JSON. This gave me a couple of improvements: I now have one entry in the log file per user query (no matter whether the application executes parallel queries against the search engine), and the content is now structured.</p>
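<p>To make this concrete, a single JSON log entry could look like the line below. The field names are purely illustrative, chosen to match the information listed earlier (timestamp, query text, context, facets, pagination and hit count):</p><pre class="crayon-plain-tag">{"timestamp": "2013-11-28T09:41:00.123Z", "searchMode": "documents", "query": "annual report 2013", "facets": ["author:smith"], "page": 1, "hitCount": 1337, "latencyMs": 42}</pre><p>One such self-contained line per user query is what makes the downstream parsing trivial.</p>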
<div id="attachment_1857" style="width: 310px" class="wp-caption alignright"><a href="http://blog.comperiosearch.com/wp-content/uploads/2013/12/querylog_analysis_03.png"><img class="size-medium wp-image-1857 " style="background: white" src="http://blog.comperiosearch.com/wp-content/uploads/2013/12/querylog_analysis_03-300x225.png" alt="Get the query log from the search application, Graphics: [Espen Klem](http://lab.klemespen.com/)" width="300" height="225" /></a><p class="wp-caption-text">Get the query log from the search application, Graphics: Espen Klem</p></div>
<h2>Logstash, ElasticSearch and Kibana to the rescue</h2>
<p>These log files are still hard to use on their own. They are made by a computer for a computer, and I am a human trying to understand other humans. Logstash, ElasticSearch and Kibana to the rescue! Logstash is a tool for managing events and logs. You can use it to collect logs, parse them, and store them for later use (statement humbly borrowed from <a title="http://logstash.net" href="http://logstash.net" target="_blank">http://logstash.net</a>). ElasticSearch is a distributed restful search and analytics engine (yet again borrowed, but from <a title="http://www.elasticsearch.org/overview/" href="http://www.elasticsearch.org/overview/" target="_blank">http://www.elasticsearch.org/overview/</a>). Kibana is a GUI tool to visualize logs and time-stamped data in realtime (yet again borrowed, but from <a title="http://www.elasticsearch.org/overview/kibana/" href="http://www.elasticsearch.org/overview/kibana/" target="_blank">http://www.elasticsearch.org/overview/kibana/</a>). These three tools make up a pretty good toolkit for creating some graphs and dashboards.</p>
<h2>Logstash</h2>
<p>Since I have already structured my log files in the search application, Logstash does not have to do that for me, but there are other features of Logstash that are quite useful for dealing with event data. First of all, it can read my log files and send the data off to a central queue, solving the issue of my log files living on multiple servers. After that, another instance of Logstash picks up each event from the queue and forwards it to ElasticSearch, creating one new index per day. And since Logstash is a powerful content processing framework, should I need to do some kind of pre-processing of my data before indexing, I have somewhere to do it.</p>
<h2>ElasticSearch &amp; Kibana</h2>
<p>As mentioned, Logstash puts the events into one ElasticSearch index per day. Kibana retrieves relevant data from ElasticSearch using a set of configured queries and facets. Since ElasticSearch by default tokenizes all string properties in an indexed document, I had to disable that for the query text property, but I wanted to make sure that the content was lowercased so that the casing of the user&#8217;s input did not matter (since that was the case for the search application too). Here is a snippet of how this configuration could look in an index template.</p>
<script src="https://gist.github.com/nhhagen/7749063.js?file=logstash-index-template.json"></script>
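<p>The essential idea is a custom analyzer combining the keyword tokenizer with a lowercase token filter, so that the whole query text is indexed as a single lowercased token. A sketch along those lines (the template pattern and the query field name here are assumptions, not the exact contents of the gist):</p><pre class="crayon-plain-tag">{
  "template": "logstash-*",
  "settings": {
    "analysis": {
      "analyzer": {
        "lowercase_keyword": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "_default_": {
      "properties": {
        "query": { "type": "string", "analyzer": "lowercase_keyword" }
      }
    }
  }
}</pre>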
<h2>The end result</h2>
<div id="attachment_1854" style="width: 310px" class="wp-caption alignleft"><a href="http://blog.comperiosearch.com/wp-content/uploads/2013/12/query-log.png"><img class="size-medium wp-image-1854" src="http://blog.comperiosearch.com/wp-content/uploads/2013/12/query-log-300x225.png" alt="Quick overview of search application usage" width="300" height="225" /></a><p class="wp-caption-text">Quick overview of search application usage</p></div>
<p>The overview dashboard gives a quick overview of what the users are searching for and what facets they are using. There is also a graph displaying the query load on the application and one for the average query load. The last histogram contains information about facet use over time. Clicking on the bars in the &#8220;Top Search Modes&#8221; graph allows for focusing all the graphs on a single search mode from the application.</p>
<div id="attachment_1855" style="width: 310px" class="wp-caption alignright"><a href="http://blog.comperiosearch.com/wp-content/uploads/2013/12/query-log-performance.png"><img class="size-medium wp-image-1855" src="http://blog.comperiosearch.com/wp-content/uploads/2013/12/query-log-performance-300x225.png" alt="Search application performance information" width="300" height="225" /></a><p class="wp-caption-text">Search application performance information</p></div>
<p>Performance information is provided by the second dashboard. The main goal of this board is to answer the question &#8220;are the users getting their results back quickly?&#8221;, or in technical terms, &#8220;is search latency low (good) or high (bad)?&#8221;. The query log is broken into blocks of interesting latency ranges, giving a quick overview of whether most of the queries are &#8220;in the green&#8221;, meaning that they are executed in under 50ms (defined as good performance in this case).</p>
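<p>The bucketing logic itself is simple to sketch in Python. Only the 50ms &#8220;good&#8221; threshold comes from the dashboard above; the other boundaries are made up for illustration:</p><pre class="crayon-plain-tag">import bisect

# Bucket boundaries in milliseconds. Only the 50 ms "good" threshold is
# taken from the dashboard; the remaining boundaries are illustrative.
BOUNDS = [50, 200, 1000]
LABELS = ["0-50 ms (good)", "50-200 ms", "200-1000 ms", "1000+ ms (bad)"]

def latency_bucket(ms):
    # bisect_right counts how many boundaries the value has reached.
    return LABELS[bisect.bisect_right(BOUNDS, ms)]

print(latency_bucket(42))   # 0-50 ms (good)
print(latency_bucket(350))  # 200-1000 ms</pre><p>Indexing a label like this alongside each event gives exactly the kind of &#8220;blocks of latency&#8221; view described above.</p>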
<h2>What is missing?</h2>
<p>This is all a good start, but there are pieces of the puzzle I am missing. I would like to collect information about the hits users are choosing and link that to the queries they are executing; this would allow for &#8220;calculation&#8221;/analysis of how relevant the search results presented by the application are. Next, it would be nice to have information about a user session to provide an understanding of the user&#8217;s query chain (he/she first queried for term1, clicked on hit no. 3, then queried for term2 and clicked hit no. 1). I looked into using Google Analytics, <a title="http://piwik.org/" href="http://piwik.org/" target="_blank">Piwik</a> or some other non-custom tool, but I chose this approach because I felt the need for something that I could customize fully and that was extensible enough to deal with any kind of log files. It did not hurt either that all the technologies used are open source.</p>
<h2>References</h2>
<ul>
<li><a title="http://www.elasticsearch.org" href="http://www.elasticsearch.org" target="_blank">http://www.elasticsearch.org</a></li>
<li><a title="http://logstash.net" href="http://logstash.net" target="_blank">http://logstash.net</a></li>
<li>various posts on <a title="http://stackoverflow.com" href="http://stackoverflow.com" target="_blank">http://stackoverflow.com</a></li>
</ul>
<p>Big thanks to all the people on the #elasticsearch and #logstash IRC channels on freenode who helped me figure out the quirks.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.comperiosearch.com/blog/2013/12/02/query-log-analysis-using-logstash-elasticsearch-and-kibana/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
