SharePoint 2013 Search internals: The Ceres shell
SharePoint 2013 continues assimilating the FAST ESP search engine. In FAST Search Server 2010 for SharePoint, the remains of ESP were still visible, and in part available for modification. In SharePoint 2013, you must search hard to find any mentions of FAST or ESP. Most options for modifying the internal operations of search are locked down, much to the chagrin of search solution developers, for whom the ability to tune and improve is bread and butter.
When Microsoft bought FAST, FAST was already developing improvements to its search solution, code-named Mars. Among the improvements were graphical interfaces for the flow engines for query (IMS) and content processing (CTS). These were packaged and sold as FSIS, FAST Search for Internet Sites. SharePoint 2013 has integrated the pipeline workflow from CTS and IMS and uses it internally. There is, however, no (apparent) option for end users to configure the flows, and of course there are no nice graphical wizards for drawing up the pipeline steps as in FSIS.
Buried deep down in the SharePoint folders, there is a file called ceresshell.ps1. Ceres is a dwarf planet orbiting between Mars and Jupiter. Once Mars was reached, the team behind SharePoint search apparently kept pushing into space until they reached solid ground on the next stop, Ceres. As it happens, we can use the Ceres shell to gain access to the configurations of the internal SharePoint flows.
So what is the Ceres shell, and what can it do? Looking at the contents of the file, it is a PowerShell script that loads some snap-ins.
Add-PSSnapin hostcontrollerpssnapin
Add-PSSnapin junopssnapin
Add-PSSnapin searchcorepssnapin
Add-PSSnapin enginepssnapin
Add-PSSnapin analysisenginepssnapin
So, we start up a SharePoint PowerShell session as farm administrator and load the Ceres shell.
PS> & "C:\Program Files\Microsoft Office Servers\15.0\Search\Scripts\ceresshell.ps1"
Now the Ceres cmdlets are loaded and ready to use.
Before we can have any fun with this, we need to connect to the “system”.
Connect-System -ServiceIdentity (Get-SPEnterpriseSearchService).ProcessIdentity
Once connected to the system, we need to connect to the “engine”.
Connect-Engine
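The steps so far can be collected into one small function. This is only a sketch: `Enter-CeresSession` is a hypothetical helper name, not something that ships with SharePoint, and the default path is simply the one used in this post. It uses only the commands already shown, plus a check that the script exists before running it.

```powershell
# Hypothetical helper (not part of SharePoint): wraps the steps shown above.
function Enter-CeresSession {
    param(
        # Path taken from this post; adjust if SharePoint is installed elsewhere.
        [string]$ScriptPath = 'C:\Program Files\Microsoft Office Servers\15.0\Search\Scripts\ceresshell.ps1'
    )

    # Fail early with a clear message if the script is not where we expect it.
    if (-not (Test-Path -LiteralPath $ScriptPath)) {
        throw "ceresshell.ps1 not found at '$ScriptPath'"
    }

    # Load the Ceres snap-ins into the current session.
    & $ScriptPath

    # Connect to the "system" first, then to the "engine".
    Connect-System -ServiceIdentity (Get-SPEnterpriseSearchService).ProcessIdentity
    Connect-Engine
}
```

Run `Enter-CeresSession` from a SharePoint PowerShell session started as farm administrator; it does nothing useful on a machine without SharePoint search installed.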
Now we are ready. To get a list of all flows available for inspection, run:
Get-Flow
To show the flow configuration for a single flow:
Get-Flow Microsoft.CrawlerIndexingSubFlow
Here is part of the output:
<OperatorGraph name="Microsoft.CrawlerIndexingSubFlow" xmlns="http://schemas.microsoft.com/ceres/studio/2009/10/flow">
  <Parameters />
  <Operators>
    <Operator name="MarsWriter" type="Microsoft.Ceres.ContentEngine.Operators.BuiltIn.MarsWriter">
      <Properties>
        <Property name="callbackType" value="&quot;Completed&quot;" />
        <Property name="callbackWarningField" value="&quot;ParsingErrors&quot;" />
        <Property name="commitInterval" value="1" />
        <Property name="crawledPropertyBuckets" value="[&quot;content&quot;]" />
        <Property name="defaultMaxIndexSize" value="524288" />
        <Property name="defaultMaxResultSize" value="16384" />
        <Property name="idField" value="&quot;externalId&quot;" />
        <Property name="managedPropertiesListName" value="&quot;ManagedProperties&quot;" />
        <Property name="managedPropertyBuckets" value="[&quot;ManagedPropertiesBucket&quot;]" />
        <Property name="marsCallbackInfoProperty" value="&quot;marsLinkDBSynchronization&quot;" />
        <Property name="provideCallbacks" value="True" />
        <Property name="siteCollectionIdField" value="&quot;sitecollectionid&quot;" />
        <Property name="tenantIdField" value="&quot;tenantId&quot;" />
        <Property name="truncatedFlagField" value="&quot;IsPartiallyProcessed&quot;" />
      </Properties>
    </Operator>
    <Operator name="SubFlowInput" type="Microsoft.Ceres.ContentEngine.Operators.BuiltIn.SubFlow.SubFlowInput">
      <Targets>
        <Target>
          <operatorMoniker name="/Microsoft.CrawlerIndexingSubFlow/MarsWriter" />
Interesting, isn’t it? There seem to be some configurable values for stuff like “defaultMaxResultSize”.
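To read such values without scanning the XML by eye, the flow definition can be queried with standard PowerShell XML tooling. The sketch below works on a trimmed, well-formed copy of the MarsWriter operator pasted into a here-string; this post does not establish what kind of object Get-Flow returns, so feeding its output straight into this code is an assumption you would need to check. The variable names are illustrative only.

```powershell
# Trimmed, well-formed copy of the MarsWriter operator (only three properties kept).
$flowXml = @'
<OperatorGraph name="Microsoft.CrawlerIndexingSubFlow" xmlns="http://schemas.microsoft.com/ceres/studio/2009/10/flow">
  <Operators>
    <Operator name="MarsWriter" type="Microsoft.Ceres.ContentEngine.Operators.BuiltIn.MarsWriter">
      <Properties>
        <Property name="defaultMaxIndexSize" value="524288" />
        <Property name="defaultMaxResultSize" value="16384" />
        <Property name="idField" value="&quot;externalId&quot;" />
      </Properties>
    </Operator>
  </Operators>
</OperatorGraph>
'@

# The flow XML uses a default namespace, so XPath needs an explicit prefix for it.
$ns = @{ f = 'http://schemas.microsoft.com/ceres/studio/2009/10/flow' }
$xpath = "//f:Operator[@name='MarsWriter']/f:Properties/f:Property"

# Collect the operator's properties into a name -> value hashtable.
$settings = @{}
foreach ($match in (Select-Xml -Content $flowXml -Namespace $ns -XPath $xpath)) {
    $settings[$match.Node.GetAttribute('name')] = $match.Node.GetAttribute('value')
}

# Values come back as strings; quoted values keep their literal double quotes.
$settings['defaultMaxResultSize']
$settings['idField']
```

Note that the XML parser decodes `&quot;`, so a property such as idField is returned with its surrounding double quotes as part of the value.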
Now you have seen how to open the Ceres shell and look at the flow configurations.
The next post will show how to modify values in the existing flows.