Content Enrichment Web Service SharePoint 2013 – Advantages and Challenges

If you have worked with search solutions before, you will know that data often needs to be processed before it can be displayed in search results. This processing might be needed to address issues such as (but not limited to):

  • Missing metadata issues
  • Inconsistent metadata issues
  • Cleansing of content
  • Integration of semantic layers/Automatic tagging
  • Integration with 3rd party services
  • Merging data from other sources

The Content Enrichment Web Service (CEWS) in SharePoint 2013 is a SOAP-based service within the content processing component that can be used to achieve this. The figure below shows part of the process that takes place in the content processing component of SharePoint search.

Figure: Content enrichment within content processing

The Content Enrichment Web Service in SharePoint 2013 combines the goodness of both FAST for SharePoint Search and SharePoint Search to offer a whole new set of possibilities, and it comes with its own challenges. For an implementation example, see the walkthrough on MSDN, which pretty much sums up the basic steps. In this post we are going to look at some of the advantages and challenges of CEWS, coming from a FAST 2010 background:

1. CEWS is a service and you DON’T have to deploy it in your SharePoint environment: Perhaps this is the biggest architectural change from the content processing perspective. It means that your code no longer runs in a sandbox environment within SharePoint Server. The web service can be hosted anywhere outside your SharePoint server, which reduces deployment headaches and the large number of approvals required to deploy executable files. I can see the operations/infrastructure teams and administrators smiling.

2. The web service processes and returns managed properties, not crawled properties: Managed properties correspond to what actually gets indexed and displayed in search results. This removes some of the "why can't I see the updated results?" confusion (perhaps you had forgotten to map your crawled property to a managed property, and now you have to index everything AGAIN!). Nightmare!
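To make the "managed properties in, managed properties out" idea concrete, here is a minimal, language-neutral sketch in Python. It is not the SharePoint API: the function name `enrich_properties` and the cleanup rule for `Author` are assumptions chosen only to illustrate fixing missing or inconsistent metadata.

```python
from typing import Dict


def enrich_properties(props: Dict[str, str]) -> Dict[str, str]:
    """Take a dict of managed property values and return the enriched ones.

    Hypothetical rule: trim and title-case the author name, and fall back
    to "Unknown" when the property is missing or empty.
    """
    author = props.get("Author", "").strip()
    return {"Author": author.title() if author else "Unknown"}


if __name__ == "__main__":
    print(enrich_properties({"Author": "  mridu agarwal "}))
    print(enrich_properties({}))
```

A real service would receive and return these values over SOAP; the sketch only shows the shape of the transformation.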

3. You can define a trigger to limit the set of items that are processed by the web service: In FAST 2010, each item had to pass through the pipeline whether you wanted to process it or not, and the check had to be done in code. The trigger in 2013 lets us define this check outside the code, so that the web service is called only for selected content. If you only want to process a subset of the content, this improves overall performance and reduces crawl time.
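The effect of a trigger can be pictured with a small Python sketch. This is only an illustration of the idea, not how SharePoint evaluates triggers: the `crawl`, `trigger` and `enrich` functions, the `Enriched` flag and the sample items are all hypothetical.

```python
from typing import Callable, Dict, List

Item = Dict[str, str]


def crawl(items: List[Item],
          trigger: Callable[[Item], bool],
          enrich: Callable[[Item], Item]) -> List[Item]:
    """Call enrich only for items that satisfy the trigger; pass the rest through."""
    return [enrich(item) if trigger(item) else item for item in items]


calls: List[str] = []  # records which items actually reached the "web service"


def enrich(item: Item) -> Item:
    calls.append(item["Path"])
    return {**item, "Enriched": "true"}


def trigger(item: Item) -> bool:
    # Hypothetical condition: only items from the "xmlfile" content source.
    return item.get("ContentSource") == "xmlfile"


items = [
    {"Path": "a.xml", "ContentSource": "xmlfile"},
    {"Path": "b.docx", "ContentSource": "EpiFileShare"},
]
result = crawl(items, trigger, enrich)
```

Only `a.xml` reaches `enrich`; `b.docx` skips the callout entirely, which is where the crawl-time saving comes from.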

So far, so good! But there are certain challenges we need to look at, and we need to see how to overcome them. In fact, this is the most important part of architecting your CEWS solution:

1. The content enrichment callout step can only be configured with a single web service endpoint: Now this sounds very limiting. I have multiple search applications, and earlier I maintained the logic in different solutions. Do I need to combine them all into a single service? What about maintenance and change requests? There are several technologies one could consider to solve this, but what I did in my project was to create a WCF routing service and let it dispatch to my multiple web services based on filters. You could also use it to implement load balancing and fault tolerance. In the following example, I have two content sources, “xmlfile” and “EpiFileShare”, and I want two different services, “xmlsvc” and “episvc”, to process them. This is how I configure the endpoints in my WCF Routing Service:

Figure: endpoints

2. Only one condition can be configured for the trigger, while different search applications will require different triggers: This can again be solved by using WCF routers and filters, configuring separate endpoints for separate triggers. Here I am using the default managed property “ContentSource” as a trigger/filter to determine my service endpoint.

Figure: config file

To summarize, I have shown some of the advantages and challenges of the new CEWS architecture in SharePoint 2013 search and how you can overcome the challenges. I hope you now want to try this soon and share your experience with us.
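As a language-neutral illustration of the routing idea described above, the following Python sketch picks a service based on the “ContentSource” managed property. It is not WCF configuration and not the setup from my project: the `ROUTES` table and `select_service` function are hypothetical, and only the content source and service names come from the example above.

```python
from typing import Dict, Optional

# Content source -> service name, mirroring the example in this post.
ROUTES: Dict[str, str] = {
    "xmlfile": "xmlsvc",
    "EpiFileShare": "episvc",
}


def select_service(props: Dict[str, str],
                   routes: Dict[str, str] = ROUTES,
                   default: Optional[str] = None) -> Optional[str]:
    """Return the service that should handle an item, or the default
    when no filter matches its ContentSource."""
    return routes.get(props.get("ContentSource", ""), default)


if __name__ == "__main__":
    print(select_service({"ContentSource": "xmlfile"}))
    print(select_service({"ContentSource": "EpiFileShare"}))
    print(select_service({"ContentSource": "Intranet"}))
```

The single endpoint that SharePoint knows about plays the role of `select_service`; the services behind it can then be maintained and deployed separately.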

Article written by

Mridu Agarwal
Enterprise Search Consultant with a love for reading, travelling and exploring the unknown.



