Making Synonyms Visible in SharePoint 2013 Search Results
SharePoint 2013 Search has built-in support for thesaurus enrichment of queries.
However, synonyms are often not visible in the search results.
This post will show you how you can modify the synonym weight using the Ceres shell.
The internal workings of SharePoint 2013 Search can be controlled using the Ceres shell, a set of PowerShell cmdlets. Using the shell, we can inspect and modify a whole lot of stuff that was probably never meant to be touched by end users. Modifying the flow configurations can potentially ruin your SharePoint installation. Comperio Search will take no responsibility for any damage caused by actions taken based on what you read in this blog.
The thesaurus lookup is performed at query time, and the dictionary can be set up with support for various languages. The thesaurus must be deployed as a CSV file using PowerShell. It has columns for key, synonym, and an optional language. Both the key and the synonym can be phrases; for example, “go fishing” can be a synonym for “hunt for fish”. To provide several synonyms for a word, add one entry per synonym, repeating the key each time. To make a synonym work in both directions, add a second entry with the key and synonym swapped. (See Microsoft's documentation for further details.)
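The two-way rule described above can be scripted rather than typed by hand. The following is a minimal sketch, not part of the original setup: `New-ThesaurusLines` is a hypothetical helper name, the sample pair is only an illustration, and writing the file as UTF-8 is an assumption about what the import expects.

```powershell
# Sketch: build thesaurus lines with an entry in each direction.
# New-ThesaurusLines is a hypothetical helper, not a SharePoint cmdlet.
function New-ThesaurusLines {
    param([hashtable[]]$Pairs)

    # Header row; the optional Language column is left out of the data rows.
    $lines = @('Key,Synonym,Language')
    foreach ($pair in $Pairs) {
        $lines += '{0},{1}' -f $pair.Key, $pair.Synonym   # key -> synonym
        $lines += '{0},{1}' -f $pair.Synonym, $pair.Key   # synonym -> key
    }
    return $lines
}

# Illustrative pair; replace with your own terms.
$lines = New-ThesaurusLines -Pairs @(@{ Key = 'go fishing'; Synonym = 'hunt for fish' })

# Assumption: the import expects a UTF-8 encoded file.
Set-Content -Path .\thesaurus.csv -Value $lines -Encoding UTF8
```

Generating both directions from a single list of pairs keeps the forward and reverse entries from drifting apart as the thesaurus grows.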
The thesaurus is simple, but it works. Or does it really? I tested synonyms fetched from the internet on a SharePoint search index populated with US court records crawled from theinfo.org. “AKA” is a common legal term synonymous with “also known as”, so let’s try that out.
Searching for “aka” yields some hundred results, searching for “also known as” yields a different set of results of roughly the same size. By adding synonyms we would expect the search results to combine the hits for both queries into one, so to speak.
So I create a thesaurus containing:
    Key,Synonym,Language
    aka,also known as
And upload it with the PowerShell command:

    $searchApp = Get-SPEnterpriseSearchServiceApplication
    Import-SPEnterpriseSearchThesaurus -SearchApplication $searchApp -Filename \\spbox\temp\thesaurus.csv
I wait for a few seconds, and search for “aka”.
Now, I would expect to find hits containing “also known as”. But where are they? I have to scroll and page down to the bottom of page 3 before I find one.
Why? Could the ULS logs provide any clues? I turn on verbose logging for Search Query Processing and search again. Now the ULS logs contain entries with “After thesaurus tree modification” (the indentation is mine, to make it a little clearer, and I have abbreviated it somewhat).
    Microsoft.Office.Server.Search.Query.Pipeline.Executors.LinguisticQueryProcessingExecutor : After thesaurus tree modification:
    'AndNode(FirstChild=StringNode
    (FirstChild=WordsNode(FirstChild=TokenNode(FirstChild=null,NextSibling=OnearNode(FirstChild=TokenNode(FirstChild=null,NextSibling=TokenNode
    (FirstChild=null,NextSibling=TokenNode
    (FirstChild=null,NextSibling=null,Length=1,Linguistics=True,Token=as,Weight=1),
    Length=1,Linguistics=True,Token=known,Weight=1),
    Length=1,Linguistics=True,Token=also,Weight=1),NextSibling=null
    ,ExtraTermsAllowed=0,Weight=0.2),
    Length=1,Linguistics=True,Token=aka,Weight=1),NextSibling=null),
The synonyms are given a weight of 0.2. The original term has a weight of 1, which in theory means the synonym carries 20% of the weight of the original term. Perhaps we could make the synonyms show up by increasing the weight. So, how can we do that? Apparently, there is no way. Not unless we open up the magic box of the Ceres shell.
We begin by connecting to the InterActionEngine:
    Add-PsSnapin Microsoft.SharePoint.Powershell
    & "C:\Program Files\Microsoft Office Servers\15.0\Search\Scripts\ceresshell.ps1"
    Connect-System -Uri (Get-SPEnterpriseSearchServiceApplication).SystemManagerLocations[0] -ServiceIdentity (Get-SPEnterpriseSearchService).ProcessIdentity
    Connect-Engine -NodeTypes InterActionEngine
Now, let’s try to rip out the configurations of the SharePointSearchProvider flow:
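    # Quote the flow name so PowerShell treats it as a string, not a command
    $flowname = "Microsoft.SharePointSearchProviderFlow"
    # Quote the file name so ".txt" is not parsed as a property of $flowname
    Get-Flow $flowname > "$flowname.txt"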
Reading the flow configurations for the SharePointSearchProviderFlow, we find an option named synonymWeight.
    <Operator name="Linguistics" type="LinguisticQueryProcessing">
      <Property name="querySpellingCorrectionTokenLimit" value="10" />
      ...
      <Property name="stemWeight" value="0.2" />
      <Property name="synonymWeight" value="0.2" />
      </Properties>
    </Operator>
Strangely, it has the weight 0.2. Ring any bells, anyone? It is the same weight we saw in the ULS. Now, let us try to see what happens if we increase the weight here.
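If you would rather not hand-edit the exported file, the change can be sketched as a small text substitution. This is only an illustration under stated assumptions: `Set-SynonymWeight` is a hypothetical helper name (not a Ceres shell cmdlet), and it assumes the property appears in the file exactly in the `name="synonymWeight" value="..."` form shown above.

```powershell
# Sketch: change the synonymWeight value in flow XML held as a string.
# Set-SynonymWeight is a hypothetical helper, not a Ceres shell cmdlet.
function Set-SynonymWeight {
    param([string]$FlowXml, [string]$Weight)

    # Match only the value attribute of the synonymWeight property.
    $pattern = '(<Property\s+name="synonymWeight"\s+value=")[^"]*(")'
    # ${1} and ${2} put the surrounding markup back unchanged.
    return [regex]::Replace($FlowXml, $pattern, ('${1}' + $Weight + '${2}'))
}

# Illustrative input, modelled on the excerpt above.
$sample = '<Property name="stemWeight" value="0.2" /><Property name="synonymWeight" value="0.2" />'
Set-SynonymWeight -FlowXml $sample -Weight '1'
```

To apply it to the real export, read the saved flow file into a single string (for example with `Get-Content` piped to `Out-String`), pass it through the helper, and write the result back with `Set-Content`. Keep a copy of the original export so the flow can be restored if something goes wrong.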
So, we set the synonymWeight to 1, and upload the file.
    Remove-Flow $flowname
    # Quote the file name so ".txt" is not parsed as a property of $flowname
    Get-Content "$flowname.txt" | Out-String | Add-Flow $flowname
    Stop-Flow -FlowName $flowname -ForceAll
Now, when searching for “aka”, we get hits containing “also known as” on the first search results page.
Voila!
Hi, great article.
I’m trying to get the synonym into a separate keyword or managed property. How can I do that?
Christoffer,
Excellent post and exactly what is needed to treat synonyms equally.
Have you also modified the stemWeight property?
It is the same .2 aka 20% by default. In my testing, stemmed words appeared to behave the same way, plurals were not ranked the equivalent of the root word.
I tested modifying stemWeight up to “1” using the same method. Voila, it appears to behave exactly as expected. I am curious if you’ve done similar tests.
On a separate note, I’ve run your powershell in several SP2013 environments without problems. However at one customer (all 3 ENVs) the powershell keeps failing on the Connect-System. Have you run into this?
thanks.
> Connect-System -Uri (Get-SPEnterpriseSearchServiceApplication).SystemManagerLocations[0] -ServiceIdentity (Get-SPEnterpriseSearchService).ProcessIdentity
Connect-System : Could not connect to system: initial URI: net.tcp://servername/3E9C46/AdminComponent1/Management, tried:
net.tcp://servername/3E9C46/AdminComponent1/Management
At line:1 char:1
+ Connect-System -Uri (Get-SPEnterpriseSearchServiceApplication).SystemManagerLoc …
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (:) [Connect-System], ControllerException
+ FullyQualifiedErrorId : Microsoft.Ceres.CoreServices.Tools.Management.SystemController.ControllerException,Microsoft.Ceres.CoreServices.Tools.Management.Cmdlets.ConnectSystem
We have imported a thesaurus for a customer employee search site in SharePoint 2013. The synonyms work fine in standard search, but when using the people search scope, the synonyms do not apply. Does anyone have an explanation for this? :)
Hi Terje A T,
I asked around here, but we didn’t have a good answer to your question, sorry.
I am also facing the issue Richard describes when running the PowerShell command. I am getting this error:
“> Connect-System -Uri (Get-SPEnterpriseSearchServiceApplication).SystemManagerLocations[0] -ServiceIdentity (Get-SPEnterpriseSearchService).ProcessIdentity
Connect-System : Could not connect to system: initial URI: ”
Has anyone else run into this?