Increasing the summary length in FS4SP

In the settings for the Core Result Web Part you can set the length of the hit summary. The default is 185 characters, and the upper limit seems to be somewhere around 400 when running against FAST Search Server 2010 for SharePoint.

image

As an example, I have indexed a page on our company website. The following paragraph from the original text is 730 characters long.

FAST provides organizations with a scalable, high-performance enterprise search and information access platform designed to give instant access to information that is secure, relevant, accurate, and timely. With FAST and its Contextual Insight capability, it is possible to detect the context and intent of the query, search for terms and phrases, and return requested entities that appear in the context of the matching text. You will get both the contextual results with extreme precision and the contextual, dynamic navigation for further investigation of related information. Advanced linguistics and relevancy management features further improve and simplify your users’ search experience, enabling truly user-centric search.

A search for the term “FAST insight” yields this 160-character summary with the default settings:

image

By checking “Limit Characters In Summary” and setting the character count to 400

image

we get this output of 370 characters:

image

This is certainly a much better read than the first one, but the sentence fragments are still quite short. In my opinion, fewer but more complete sentences would be better.

The first step is to change how FAST generates summaries. Open

C:\FASTSearch\etc\config_data\RTSearch\webcluster\fsearch.addon

in a text editor and add the following lines:

# Length of the generated summary in bytes. This is a hint to Juniper.
# The result may be slightly longer or shorter depending on the structure
# of the available document text and the submitted query.
juniper.dynsum.length 2048

# The number of (possibly partial) sets of keywords matching the query
# to try to include in the summary. The larger this value is set
# relative to the length parameter, the more dense the keywords
# may appear in the summary.
juniper.dynsum.max_matches 3

# The maximal number of bytes of context to prepend and append to each
# of the selected query keyword hits. This parameter defines the max
# size a summary would become if there are few keyword hits (max_matches
# set low or document contained few matches of the keywords).
juniper.dynsum.surround_max 512

# The size of the sliding window used to determine if
# multiple query terms occur together. The larger the value, the more
# likely the system will find (and present in dynamic summary) complete
# matches containing all the search terms. The downside is a potential
# performance overhead of keeping candidates for matches longer during
# matching, and consequently updating more candidates that eventually
# get thrown away.
juniper.matcher.winsize 600

This changes the default behavior for how summaries are generated. These parameters were available in the old FAST ESP, but have for some reason been left out in FS4SP. The values are all hints as to how the summary should be generated, not hard limits.

Apply the same settings to

C:\FASTSearch\META\config\profiles\default\templates\bliss\FsearchAddonWriter\fsearch.addon.tmpl
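If you would rather script the edit than paste the lines into both files by hand, the following is a minimal Python sketch. The helper name apply_settings, the UTF-8 encoding, and the simple "key value" line parsing are my assumptions for illustration, not something shipped with FS4SP. It only appends keys that are not already present and does not change existing values.

```python
from pathlib import Path

# The four settings from the post (the explanatory comments are omitted here).
JUNIPER_SETTINGS = {
    "juniper.dynsum.length": "2048",
    "juniper.dynsum.max_matches": "3",
    "juniper.dynsum.surround_max": "512",
    "juniper.matcher.winsize": "600",
}


def apply_settings(path, settings=JUNIPER_SETTINGS):
    """Append each 'key value' line unless the key is already set in the file.

    Returns the list of lines that were appended.
    Assumption: the file is plain text with one 'key value' pair per line
    and '#' starting a comment line.
    """
    path = Path(path)
    text = path.read_text(encoding="utf-8") if path.exists() else ""

    # Collect the keys that are already configured (ignoring comments).
    existing = set()
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            existing.add(stripped.split()[0])

    missing = [f"{key} {value}" for key, value in settings.items()
               if key not in existing]
    if missing:
        # Make sure the appended block starts on its own line.
        separator = "" if text.endswith("\n") or not text else "\n"
        path.write_text(text + separator + "\n".join(missing) + "\n",
                        encoding="utf-8")
    return missing
```

You would call apply_settings once for fsearch.addon and once for fsearch.addon.tmpl, using the two paths given above. Take a backup copy of both files first.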

Next, execute the following commands from the FAST PowerShell prompt:

nctrl stop configserver search-1

nctrl start configserver search-1

Then we raise the character limit in the web part to 2000, and we get this summary of 1500 characters:

image

If we apply our own summary logic and pick out the sentence with the most hits, we could end up with a summary of just the highlighted sentence, which gives more context than the original summaries. This logic could either be embedded in the XSLT (preferably via a callback to keep the code cleaner), or you could override the web part and modify the summary before it is passed to the XSLT.
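To make the "sentence with the most hits" idea concrete, here is a minimal sketch in Python. It is only an illustration of the logic, not the XSLT or web part code itself. The function name best_sentence and the naive sentence splitting are my own assumptions, and it expects plain text, so any highlight markup in the real summary would have to be handled first.

```python
import re


def best_sentence(summary, query):
    """Return the sentence that contains the most query-term occurrences."""
    terms = {term.lower() for term in query.split()}

    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s.strip()
                 for s in re.split(r"(?<=[.!?])\s+", summary)
                 if s.strip()]
    if not sentences:
        return ""

    def hits(sentence):
        # Count every word in the sentence that equals one of the query terms.
        words = re.findall(r"\w+", sentence.lower())
        return sum(1 for word in words if word in terms)

    # max() returns the first sentence when several have the same hit count.
    return max(sentences, key=hits)
```

Run against the long summary with the query "FAST insight", this would favour a sentence that mentions both terms over one that mentions only "FAST". A real implementation would probably also want stemming and a fallback for summaries without any sentence punctuation.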

(This post is cross-posted at http://techmikael.blogspot.com/2010/11/increasing-summary-length-in-fs4sp.html)

Article written by

Computer literate and search enthusiast with an interest in SharePoint, coding, and life in general :)

10 responses to: «Increasing the summary length in FS4SP»

  1. November 30, 2010 at 13:53

    Hi Mikael!

    I agree 100% that the default settings for juniper, and the summaries it produces, are often less than readable. Usually there are too many truncated sentences.

    Setting max_matches as low as 1 can often force juniper to create more coherent summaries. Here’s an example from VisitNorway with

    juniper.dynsum.length 200
    juniper.dynsum.max_matches 2
    juniper.dynsum.surround_max 64
    juniper.matcher.winsize 600

    http://www.visitnorway.com/en/VN/Search/?querystring=lofoten&configuration=multi&offset=0&sortby=relevance&param=searchview,vninternationalprod&hits=10

    Some unofficial juniper documentation lists a whole range of additional juniper settings that could be of interest to you:

    juniper.dynsum.min_length
    juniper.dynsum.tok_surround_max
    juniper.dynsum.mode
    juniper.dynsum.scope_limit
    juniper.dynsum.allow_overlap
    juniper.dynsum.merge_tags

    Ask me for a reference :-)

    Cheers,
    Vegard

  2. January 18, 2011 at 11:22

    Hi,

    I’m still new to FAST ESP and would like to ask this question. Do you know where to get the Juniper documentation that is used by FAST ESP?

    Thanks.

  3. January 19, 2011 at 09:02

    You can find some information in the documentation included with ESP 5.3 (FSIS/FSIA). The other information we use has been gathered over the years of working with the product.

  4. January 19, 2011 at 10:26

    Hi Mikael,

    Are you referring to the group of PDF documents? So there is no specific document regarding Juniper?

    Thanks,
    Kenneth

  5. January 19, 2011 at 13:40

    I haven’t seen a specific document on this. Only the general ones and the comments included in fsearch.addon. There might exist some unofficial ones around at FAST distributed to partners at one time or another (which is what Vegard refers to in his comment).

  6. January 25, 2011 at 07:34

    Hi Vegard,

    Do you have a reference on the Juniper parameters? Thanks.

    Mikael,

    Is it possible to decrease the length of the generated dynamic teaser with a query parameter rather than by modifying the fsearch.addon file?

    Thanks a lot!

  7. January 25, 2011 at 09:22

    Hi Ken,
    Unfortunately there is no query parameter for limiting the teaser from Juniper. Your best bet is to do what SharePoint does: apply code logic to the teaser afterwards and cut it the way you want.

    That was also my intention by increasing the teaser. Juniper creates a not so good teaser, so give me more text and I can create it myself.

  8. January 25, 2011 at 09:50

    Hi Ken!

    You can pass configuration to juniper as parameters to the qrserver. Simply add juniper=<param_name>.<value>[_<param_name>.<value>]* to the qrserver URL. Pay attention to the underscore used to separate options.

    Some useful parameters may be dynlength, dynmatches, dynsurmax, and winsize.

    An example:

    juniper=near.2_dynlength.512_dynmatches.8

    I’m not at liberty to share the reference, I’m afraid, but I’m happy to answer more questions.

    I hope this information is useful to you!

  9. January 25, 2011 at 09:55

    That’s great information Vegard. I wish this would work in the SharePoint space as well, but the QR server is not directly accessible, and the WCF wrappers have limited support for qr parameters.

  10. January 26, 2011 at 03:08

    Hi Vegard,

    Thanks for your post, this is exactly what I was looking for. If you don’t mind, could you explain what the near.2 parameter means?

    If I’m not mistaken dynlength sets the length of the teaser and dynmatches sets the number of matching terms in the document to fit in the teaser.

    FYI.
    I’m still using the stand alone product FAST ESP 5.3. :)


