Increasing the summary length in FS4SP
In the settings for the Core Result Web Part you can set the length of your hit summary. The default is 185 characters, and the upper limit seems to be somewhere around 400 when running against FAST Search Server 2010 for SharePoint.
As an example I have indexed a page on our company website. This is a paragraph from the original text which is 730 characters long.
FAST provides organizations with a scalable, high-performance enterprise search and information access platform designed to give instant access to information that is secure, relevant, accurate, and timely. With FAST and its Contextual Insight capability, it is possible to detect the context and intent of the query, search for terms and phrases, and return requested entities that appear in the context of the matching text. You will get both the contextual results with extreme precision and the contextual, dynamic navigation for further investigation of related information. Advanced linguistics and relevancy management features further improve and simplify your users’ search experience, enabling truly user-centric search.
A search on the term “FAST insight” yields this 160 character summary with the default settings:
By checking “Limit Characters In Summary” and setting the character count to 400
we get this output of 370 characters:
This is certainly a much better read than the first one. Still, the sentences are quite short, and in my opinion fewer but more complete sentences would be better.
The first step is to change how FAST generates summaries. Open up
C:\FASTSearch\etc\config_data\RTSearch\webcluster\fsearch.addon
in a text editor and add the following lines:
# Length of the generated summary in bytes. This is a hint to Juniper.
# The result may be slightly longer or shorter depending on the structure
# of the available document text and the submitted query.
juniper.dynsum.length 2048
# The number of (possibly partial) sets of keywords matching the query
# to try to include in the summary. The larger this value is set
# relative to the length parameter, the more densely the keywords
# may appear in the summary.
juniper.dynsum.max_matches 3
# The maximal number of bytes of context to prepend and append to each
# of the selected query keyword hits. This parameter defines the max
# size a summary would become if there are few keyword hits (max_matches
# set low or document contained few matches of the keywords).
juniper.dynsum.surround_max 512
# The size of the sliding window used to determine if
# multiple query terms occur together. The larger the value, the more
# likely the system will find (and present in dynamic summary) complete
# matches containing all the search terms. The downside is a potential
# performance overhead of keeping candidates for matches longer during
# matching, and consequently updating more candidates that eventually
# get thrown away.
juniper.matcher.winsize 600
This changes the default behavior for how summaries are generated. These parameters were found in the old FAST ESP, but have for some reason been left out in FS4SP. These values are all hints as to how the summary should be generated.
Apply the same settings to
C:\FASTSearch\META\config\profiles\default\templates\bliss\FsearchAddonWriter\fsearch.addon.tmpl
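Since the same four lines have to go into two files, the edit can be scripted. The following is only a minimal sketch in Python, not anything that ships with FS4SP: the helper name ensure_settings is made up for illustration, UTF-8 file encoding is an assumption, and the values are the ones used in this post. It appends a setting only when no uncommented line already starts with that key, so it is safe to run more than once.

```python
from pathlib import Path

# Values taken from this post; adjust them to taste.
JUNIPER_SETTINGS = {
    "juniper.dynsum.length": "2048",
    "juniper.dynsum.max_matches": "3",
    "juniper.dynsum.surround_max": "512",
    "juniper.matcher.winsize": "600",
}


def ensure_settings(path, settings=JUNIPER_SETTINGS):
    """Append every setting whose key is not yet present in the file.

    Hypothetical helper. Returns the list of keys that were added.
    Existing values are left untouched, even if they differ.
    """
    target = Path(path)
    # Assumption: the file is readable as UTF-8.
    text = target.read_text(encoding="utf-8")

    # Collect the keys of all non-empty, non-comment lines.
    present = set()
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            present.add(stripped.split()[0])

    missing = [key for key in settings if key not in present]
    if missing:
        # Make sure the appended block starts on its own line.
        prefix = "" if (not text or text.endswith("\n")) else "\n"
        addition = "".join(f"{key} {settings[key]}\n" for key in missing)
        target.write_text(text + prefix + addition, encoding="utf-8")
    return missing
```

Calling ensure_settings once for fsearch.addon and once for fsearch.addon.tmpl, with the two paths given above, would cover both files. Take a backup of each file first.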
Next, execute the following commands from the FAST PowerShell prompt:
nctrl stop configserver search-1
nctrl start configserver search-1
Then we up the character limit in the web part to 2000 and we get this summary of 1500 characters:
If we apply our own summary logic and pick out the sentence with the most hits, we could end up with a summary of just the highlighted sentence, which gives more context than the original summaries. This logic could either be embedded in the XSLT (preferably via a callback to make the code cleaner), or you could override the web part and modify the summary before it is passed to the XSLT.
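To make the "sentence with the most hits" idea concrete, here is a rough sketch. It is written in Python purely for illustration; in practice the logic would live in the XSLT callback or the overridden web part. The function name best_sentence is hypothetical, the sentence splitting is deliberately naive, and the sketch assumes the summary has already been reduced to plain text with any highlight markup stripped.

```python
import re


def best_sentence(summary, query):
    """Return the sentence in `summary` with the most query-term hits.

    Hypothetical helper: naive sentence splitting, case-insensitive
    whole-word matching, first sentence wins on ties.
    """
    terms = {term.lower() for term in query.split()}

    # Naive split: break after ., ! or ? when followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", summary) if s.strip()]
    if not sentences:
        return ""

    def hits(sentence):
        # Count every word occurrence that matches one of the query terms.
        words = re.findall(r"\w+", sentence.lower())
        return sum(1 for word in words if word in terms)

    # max() keeps the first sentence when several share the top score.
    return max(sentences, key=hits)
```

With the query "FAST insight" this would favour a sentence that mentions both FAST and Insight over one that only mentions FAST. A real implementation would probably also want stemming and a length cap.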
(This post is cross-posted at http://techmikael.blogspot.com/2010/11/increasing-summary-length-in-fs4sp.html)
Hi Mikael!
I agree 100% that the default settings for juniper produce summaries that are often less than readable. Usually there are too many truncated sentences.
Setting max_matches as low as 1 can often force juniper to create more coherent summaries. Here’s an example from VisitNorway with
juniper.dynsum.length 200
juniper.dynsum.max_matches 2
juniper.dynsum.surround_max 64
juniper.matcher.winsize 600
http://www.visitnorway.com/en/VN/Search/?querystring=lofoten&configuration=multi&offset=0&sortby=relevance&param=searchview,vninternationalprod&hits=10
Some unofficial juniper documentation lists a whole range of additional juniper settings that could be of interest to you:
juniper.dynsum.min_length
juniper.dynsum.tok_surround_max
juniper.dynsum.mode
juniper.dynsum.scope_limit
juniper.dynsum.allow_overlap
juniper.dynsum.merge_tags
Ask me for a reference :-)
Cheers,
Vegard
Hi,
I’m still new to FAST ESP and would like to ask this question. Do you know where to get the Juniper documentation that is used by FAST ESP?
Thanks.
You can find some information in the documentation included with ESP 5.3 (FSIS/FSIA). The other information we use has been gathered over the years of working with the product.
Hi Mikael,
Are you referring to the group of PDF documents? So there is no specific document regarding Juniper?
Thanks,
Kenneth
I haven’t seen a specific document on this. Only the general ones and the comments included in fsearch.addon. There might exist some unofficial ones around at FAST distributed to partners at one time or another (which is what Vegard refers to in his comment).
Hi Vegard,
Do you have a reference on the Juniper parameters? Thanks.
Mikael,
Is it possible to decrease the length of the generated dynamic teaser with a query parameter rather than by modifying the fsearch.addon file?
Thanks a lot!
Hi Ken,
Unfortunately there is no query parameter for limiting the teaser from Juniper. Your best bet is to do what SharePoint does: apply code logic to the teaser afterwards and cut it down to the length you want.
That was also my intention by increasing the teaser. Juniper creates a not so good teaser, so give me more text and I can create it myself.
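As an illustration of cutting a teaser in code after the fact, here is a minimal Python sketch. It is not SharePoint's actual logic, which I have not seen; the function name truncate_teaser and the choice to cut at the last word boundary are my own assumptions.

```python
def truncate_teaser(teaser, max_chars, ellipsis="..."):
    """Cut `teaser` to at most `max_chars` characters at a word boundary.

    Hypothetical helper. The ellipsis is added on top of `max_chars`,
    so the result can be up to len(ellipsis) characters longer.
    """
    # Collapse runs of whitespace so the length check is meaningful.
    teaser = " ".join(teaser.split())
    if len(teaser) <= max_chars:
        return teaser

    # Look one character past the limit: if that character is a space,
    # the last word fits completely and is kept.
    cut = teaser[: max_chars + 1]
    if " " in cut:
        head = cut.rsplit(" ", 1)[0]
    else:
        # A single word longer than the limit: fall back to a hard cut.
        head = teaser[:max_chars]
    return head.rstrip(" ,;:") + ellipsis
```

For example, truncate_teaser(summary, 185) would bring a long Juniper teaser back down to the web part's default length without chopping a word in half.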
Hi Ken!
You can pass configuration to juniper as parameters to the qrserver. Simply add juniper=<param_name>.<value>[_<param_name>.<value>]* to the qrserver URL. Pay attention to the underscore for separating options.
Some useful parameters may be dynlength, dynmatches, dynsurmax, and winsize.
An example:
juniper=near.2_dynlength.512_dynmatches.8
I’m not at liberty to share the reference, I’m afraid, but I’m happy to answer more questions.
I hope this information is useful to you!
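The parameter format described in this comment can be assembled with a few lines of code. This is just a sketch in Python based on the pattern and example given above; the helper name build_juniper_param is made up, and nothing here checks whether the qrserver actually accepts a given option name.

```python
def build_juniper_param(options):
    """Build a 'juniper=<name>.<value>_<name>.<value>' query parameter.

    Hypothetical helper based only on the format quoted in the comment
    above. Options are emitted in the order they appear in `options`.
    """
    parts = []
    for name, value in options.items():
        part = f"{name}.{value}"
        # The underscore separates options, so it cannot appear inside one.
        if "_" in part:
            raise ValueError(f"underscore not allowed in option: {part!r}")
        parts.append(part)
    return "juniper=" + "_".join(parts)
```

Passing {"near": 2, "dynlength": 512, "dynmatches": 8} reproduces the example string above, which would then be appended to the qrserver URL like any other query parameter.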
That’s great information Vegard. I wish this would work in the SharePoint space as well, but the QR server is not directly accessible, and the WCF wrappers have limited support for qr parameters.
Hi Vegard,
Thanks for your post, this is exactly what I was looking for. Would you mind explaining what the near.2 parameter means?
If I’m not mistaken dynlength sets the length of the teaser and dynmatches sets the number of matching terms in the document to fit in the teaser.
FYI.
I’m still using the stand alone product FAST ESP 5.3. :)