<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Search Nuggets &#187; norwegian</title>
	<atom:link href="http://blog.comperiosearch.com/blog/tag/norwegian-2/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.comperiosearch.com</link>
	<description>A blog about Search as THE solution</description>
	<lastBuildDate>Mon, 13 Jun 2016 08:59:45 +0000</lastBuildDate>
	<language>en-US</language>
		<sy:updatePeriod>hourly</sy:updatePeriod>
		<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=3.9.40</generator>
	<item>
		<title>Beer and searching at Elasticon</title>
		<link>http://blog.comperiosearch.com/blog/2015/03/09/beer-searching-elasticon/</link>
		<comments>http://blog.comperiosearch.com/blog/2015/03/09/beer-searching-elasticon/#comments</comments>
		<pubDate>Sun, 08 Mar 2015 22:58:16 +0000</pubDate>
		<dc:creator><![CDATA[André Lynum]]></dc:creator>
				<category><![CDATA[Technology]]></category>
		<category><![CDATA[Chinatown]]></category>
		<category><![CDATA[Elasticon]]></category>
		<category><![CDATA[home grown hops and rare yeasts]]></category>
		<category><![CDATA[Liars Dice]]></category>
		<category><![CDATA[norwegian]]></category>
		<category><![CDATA[Nøgne Ø IPA]]></category>
		<category><![CDATA[Sierra Nevada pale ale]]></category>

		<guid isPermaLink="false">http://blog.comperiosearch.com/?p=3379</guid>
		<description><![CDATA[Christoffer was pacing angrily back and forth, Nøgne Ø IPA in his left hand, phone in the other. I was looking at the long list of cancellations, including our connecting flight to Arlanda on the way to SF. The Norwegian strike was hitting hard, with nearly no planes flying in Europe. Arlanda airport is [...]]]></description>
				<content:encoded><![CDATA[<p><a href="http://blog.comperiosearch.com/wp-content/uploads/2015/03/IMG_20150307_005509-3.jpg"><img src="http://blog.comperiosearch.com/wp-content/uploads/2015/03/IMG_20150307_005509-3-300x219.jpg" alt="IMG_20150307_005509 (3)" width="300" height="219" class="alignnone size-medium wp-image-3380" /></a></p>
<p>Christoffer was pacing angrily back and forth, Nøgne Ø IPA in his left hand, phone in the other. I was looking at the long list of cancellations, including our connecting flight to Arlanda on the way to SF. The Norwegian strike was hitting hard, with nearly no planes flying in Europe.</p>
<p>Arlanda airport is fairly described as the butthole of the world: filled with angry Swedes and featuring the world’s slowest transfer security gate. We had booked the flight fully aware of the pain and the risks, and now the plane wasn’t even going to land there.</p>
<p>&#8220;Listen up,&#8221; Christoffer was growling between gulps of strong IPA. &#8220;There’s no way we’ll be planted in your crap airport for over 12 hours, bullshit strike or not.&#8221;</p>
<p>&#8220;Calm down.&#8221; I looked up at the furious bearded giant. &#8220;You’re rattling my nerves.&#8221;</p>
<p>&#8220;Besides, we still have the CEO’s credit card. Get some other plane and upgrade our tickets to business class while you’re at it. I’ll need some rest and proper legroom after all this crap.&#8221;</p>
<p>&#8230;</p>
<p>Out of breath after a mad dash to the airport service desk with Christoffer’s 60-pound America suitcase in tow, we were swaying dangerously with Nøgne Ø beer heavy on our breath.</p>
<p>&#8220;We need tickets for the next plane to Arlanda. It’s of the utmost importance that we reach Arlanda as quickly as possible!&#8221;</p>
<p>The man at the counter tried his best to ignore us. Maybe we skipped the line, I don’t know.</p>
<p>&#8220;Listen up, we are going to Elasticon in SF. The most degenerate collection of search professionals in the world. It&#8217;s a nasty assignment but somebody has to do it. We need to catch our flight from Arlanda and all this strike bullshit has left me with no patience!&#8221;</p>
<p>The man at the desk was sensing an ugly scene: two mean IT bums with their CEO’s card and shot nerves. He ignored the shouting and commotion behind us and got to work.</p>
<p>&#8230;</p>
<p>Feet up in business-class seats on our way to SF, we finally got some rest. Christoffer was scanning the business-class microbrewery selection with a critical eye. &#8220;I need an imperial stout! In case of turbulence, you know, need the extra weight. Maybe two or three even.&#8221; &#8220;Suit yourself,&#8221; I said, nipping a light lager while Christoffer was waving and hollering at the flight attendants. &#8220;Turbulence can be heavy shit for sure.&#8221;</p>
<p>&#8230;</p>
<p>Coming down in SF we knew we had to tighten up in front of the nasty customs procedure awaiting us. Christoffer’s America suitcase was another concern, but we were banking on his large collection of home grown hops and rare yeasts not triggering any bomb sensors or something. The trick is to stub your toe on something right before approaching the customs official. This will give you the steely stare and tight grimace needed to pass muster in front of the customs official asking why you’d ever come here and whether you have the means to butt yourself out before you become too much of a nuisance. &#8220;It’s a conference,&#8221; I wheezed through my gritted teeth. &#8220;But it might as well be a madhouse. These people are serious business.&#8221; The man behind the counter wasn’t really satisfied but didn’t press any further, and we hobbled on to pick up Christoffer’s gigantic trunk at arrivals.</p>
<p>&#8230;</p>
<p>Eleven hours of dry recycled air and flat beers can break any man, and stumbling out of Oakland airport we knew we were in particularly bad shape. After his half a dozen heavy stouts Christoffer had entered a nasty beer-induced coma and snored like an annoyed elephant for nine consecutive hours. Now he was wide awake, of course, ready for action, while the rest of us hadn’t slept for 28 hours or so. Jetlag was about to do a double trick on us and I knew we had to take action.</p>
<p>There are only two remedies available to a man facing the disorientation and confusion of serious jetlag: either desperately stay awake long enough, or induce a serious comatose sleep by any means necessary.<br />
&#8220;If we hurry we can get into some dive before it’s too late. Load you up and then get you back to the hotel so I can get some proper rest,&#8221; I said. A plan I deeply regretted while trying to keep myself together in a dingy Liars Dice den in Chinatown, watching Christoffer enjoy a Sierra Nevada pale ale. The sharp rattling of dice cups among huge piles of dollar bills on the bar counter was jangling my nerves and leaving me with a sense that we had embarked on something quite different from what we had signed up for &#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.comperiosearch.com/blog/2015/03/09/beer-searching-elasticon/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Searching for &#8220;miljø&#8221; inside of &#8220;arbeidsmiljø&#8221; using Elasticsearch and the ngram tokenizer</title>
		<link>http://blog.comperiosearch.com/blog/2014/06/11/searching-miljo-inside-arbeidsmiljo-using-elasticsearch-ngram-tokenizer/</link>
		<comments>http://blog.comperiosearch.com/blog/2014/06/11/searching-miljo-inside-arbeidsmiljo-using-elasticsearch-ngram-tokenizer/#comments</comments>
		<pubDate>Tue, 10 Jun 2014 22:59:58 +0000</pubDate>
		<dc:creator><![CDATA[Christoffer Vig]]></dc:creator>
				<category><![CDATA[English]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[analyzers]]></category>
		<category><![CDATA[Elasticsearch]]></category>
		<category><![CDATA[language]]></category>
		<category><![CDATA[ngram]]></category>
		<category><![CDATA[norwegian]]></category>
		<category><![CDATA[underbukser]]></category>

		<guid isPermaLink="false">http://blog.comperiosearch.com/?p=2495</guid>
		<description><![CDATA[Compound words are a big problem for Norwegians. The young don&#8217;t know how to use them, and search engines struggle with them as well. Elasticsearch and the ngram tokenizer offer one possible solution. There is a Facebook group dedicated to the task of spreading the knowledge, using images showing the difference between, for instance, &#8220;underbukser&#8221; (underwear) and &#8220;under [...]]]></description>
				<content:encoded><![CDATA[<p>Compound words are a big problem for Norwegians. The young don&#8217;t know how to use them, and search engines struggle with them as well. Elasticsearch and the <a href="http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-ngram-tokenizer.html">ngram tokenizer</a> offer one possible solution.<br />
<span id="more-2495"></span></p>
<p>There is a <a href="https://www.facebook.com/photo.php?fbid=687044034678144&amp;set=a.499093253473224.1073741826.499091746806708&amp;type=1&amp;theater">Facebook</a> group dedicated to the task of spreading the knowledge, using images that show the difference between, for instance, &#8220;underbukser&#8221; (underwear) and &#8220;under bukser&#8221; (positioned below trousers).</p>
<div style="width: 301px" class="wp-caption alignright"><a href="https://www.facebook.com/photo.php?fbid=687044034678144&amp;set=a.499093253473224.1073741826.499091746806708&amp;type=1&amp;theater"><img src="https://fbcdn-sphotos-a-a.akamaihd.net/hphotos-ak-xpa1/t1.0-9/q71/s720x720/10294335_687044034678144_6234252359337064096_n.jpg" alt="" width="291" height="206" /></a><p class="wp-caption-text">Underwear or under wear. Not the same thing! <br />Photo: André Ulveseter</p></div>
<p>&nbsp;</p>
<p>Elasticsearch offers a wide range of <a href="http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis.html">analysis options</a>. The <a href="http://en.wikipedia.org/wiki/N-gram">ngram</a> tokenizer splits a string into overlapping runs of consecutive characters. For instance, with an ngram size of two, &#8220;underbukser&#8221; is split into &#8220;un&#8221; &#8220;nd&#8221; &#8220;de&#8221; &#8220;er&#8221; &#8220;rb&#8221; &#8220;bu&#8221; &#8220;uk&#8221; &#8220;ks&#8221; &#8220;se&#8221; &#8220;er&#8221;. By default, Elasticsearch will <a href="http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/analysis-intro.html">use the same analyzer</a> when querying the field, so if we search for &#8220;bukser&#8221; it will be split into &#8220;bu&#8221;, &#8220;uk&#8221;, and so on, and these ngrams match the ones indexed for &#8220;underbukser&#8221;.</p>
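<p>The splitting described above can be illustrated with a few lines of plain Python. This is only a sketch of the idea, not Elasticsearch code; the helper name <code>char_ngrams</code> is made up for this example.</p>

```python
def char_ngrams(text, size):
    """Return every run of `size` consecutive characters in `text`, left to right."""
    return [text[i:i + size] for i in range(len(text) - size + 1)]


# Bigrams stored for the indexed word and produced for the query word.
index_grams = char_ngrams("underbukser", 2)
query_grams = char_ngrams("bukser", 2)

# Every bigram of the query also occurs among the indexed bigrams.
shared = [gram for gram in query_grams if gram in index_grams]

print(index_grams)  # ['un', 'nd', 'de', 'er', 'rb', 'bu', 'uk', 'ks', 'se', 'er']
print(shared)       # ['bu', 'uk', 'ks', 'se', 'er']
```

<p>Because all five bigrams of &#8220;bukser&#8221; are present in the indexed word, a query for &#8220;bukser&#8221; finds &#8220;underbukser&#8221;.</p>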
<p>Well, enough chit chat, time for some code. Using the excellent <a href="https://found.no/play/">Play</a> tool created by the Elasticsearch experts at found.no, we can even test it all out in the browser. There is no need for a server: you can do it at home on your iPad, Chromebook, or even on a Windows Phone.</p>
<div id="attachment_2508" style="width: 212px" class="wp-caption alignright"><img class="wp-image-2508 size-medium" src="http://blog.comperiosearch.com/wp-content/uploads/2014/06/myAnalyzer-in-Action-202x300.png" alt="myAnalyzer in Action" width="202" height="300" /><p class="wp-caption-text">myAnalyzer in Action, showing how the word is split into ngrams</p></div>
<p>Here is <a href="https://found.no/play/gist/4ec22e1e67c9c5f9bcc0#">a simple demo showing</a> the ngram tokenizer in action.<br />
The demo has three documents indexed, all containing the field &#8220;foo&#8221;, with the respective values &#8220;arbeidsmiljø&#8221;, &#8220;arbeid&#8221;, and &#8220;arbeidsmiljøloven&#8221;.<br />
It uses an analyzer aptly called &#8220;myAnalyzer&#8221;. This analyzer uses a custom tokenizer called &#8220;my_toknizer&#8221;, where the actual ngram tokenization takes place. The ngrams for this sample are created with sizes ranging from 2 to 3. Testing it out on found.no/play, you can see how the various stages of the analyzer modify the text. Neat!</p>
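<p>For readers who cannot open the demo, here is a rough sketch of what such analysis settings might look like, written as a Python dictionary that serializes to the JSON body of an index-creation request. The names &#8220;myAnalyzer&#8221; and &#8220;my_toknizer&#8221; and the sizes 2 to 3 come from the description above; everything else is an assumption, including the tokenizer type spelling <code>ngram</code> (older releases have also used <code>nGram</code>) and the absence of any token filters. The helper <code>ngrams</code> is hypothetical and only imitates the kind of output to expect.</p>

```python
import json

# Assumed shape of the analysis settings; the demo's real settings may differ.
index_settings = {
    "settings": {
        "analysis": {
            "tokenizer": {
                "my_toknizer": {"type": "ngram", "min_gram": 2, "max_gram": 3}
            },
            "analyzer": {
                "myAnalyzer": {"type": "custom", "tokenizer": "my_toknizer"}
            },
        }
    }
}


def ngrams(text, min_gram=2, max_gram=3):
    """Imitate an ngram tokenizer: at each position, emit grams of every allowed size."""
    grams = []
    for start in range(len(text)):
        for size in range(min_gram, max_gram + 1):
            if start + size <= len(text):
                grams.append(text[start:start + size])
    return grams


print(json.dumps(index_settings, indent=2))
print(ngrams("arbeid"))
# ['ar', 'arb', 'rb', 'rbe', 'be', 'bei', 'ei', 'eid', 'id']
```

<p>Keeping the gap between the smallest and largest gram narrow keeps the number of tokens per word manageable: a six-letter word already yields nine tokens here.</p>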
<p>The mapping enables &#8220;myAnalyzer&#8221; to be used for the &#8220;foo&#8221; field. Finally, I create a query for the term &#8220;miljø&#8221;, which I expect to be found inside the documents containing &#8220;arbeidsmiljø&#8221; and &#8220;arbeidsmiljøloven&#8221;. Pressing the &#8220;run&#8221; button executes the setup and displays the search results at the bottom right.</p>
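<p>The mapping and query might look roughly like the two dictionaries below, followed by a small simulation of which documents such a query would return. The request bodies are assumptions, not copied from the demo: the field type <code>text</code> reflects recent Elasticsearch versions, and a <code>match</code> query is assumed. The simulation is plain Python with a hypothetical <code>ngrams</code> helper, and it treats a document as a hit when it shares at least one ngram with the query, which is how a <code>match</code> query behaves with its default &#8220;or&#8221; operator.</p>

```python
# Assumed request bodies; field type and query type are illustrative guesses.
mapping = {"properties": {"foo": {"type": "text", "analyzer": "myAnalyzer"}}}
query = {"query": {"match": {"foo": "miljø"}}}


def ngrams(text, min_gram=2, max_gram=3):
    """Imitate an ngram tokenizer: at each position, emit grams of every allowed size."""
    return [
        text[start:start + size]
        for start in range(len(text))
        for size in range(min_gram, max_gram + 1)
        if start + size <= len(text)
    ]


documents = ["arbeidsmiljø", "arbeid", "arbeidsmiljøloven"]
query_grams = set(ngrams(query["query"]["match"]["foo"]))

# A document is a hit when it shares at least one ngram with the query.
hits = [doc for doc in documents if query_grams & set(ngrams(doc))]

print(hits)  # ['arbeidsmiljø', 'arbeidsmiljøloven']
```

<p>The document &#8220;arbeid&#8221; shares no ngram with &#8220;miljø&#8221; and is left out. Note that sharing any single ngram is a loose criterion, so on larger collections this kind of matching can return more than expected.</p>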
<p>If you are really interested in learning more about analyzers, try the <a href="http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/languages.html">Elasticsearch guide on languages</a>, which is getting better day by day, or the articles on <a href="http://blog.qbox.io/multi-field-partial-word-autocomplete-in-elasticsearch-using-ngrams">qbox on autocomplete using ngrams</a> or <a href="http://www.found.no/foundation/">found.no on language analyzers</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.comperiosearch.com/blog/2014/06/11/searching-miljo-inside-arbeidsmiljo-using-elasticsearch-ngram-tokenizer/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
