Padre query processor options

Background

This page lists the options that can be supplied to the query processor via the query_processor_options configuration option. The PArallel Document Retrieval Engine (PADRE) query processor can be finely controlled through a large set of options. These options can often be specified either in this collection configuration parameter or as a CGI parameter passed with the search request URL; options marked [Not CGI] cannot be set as CGI parameters. The list of available options is given here.

Notes

The CGI parameter for a query processor option has the same name as the option. For example, for the collapsing query processor option you would specify collapsing=on in your CGI request.
If an option is of type boolean then valid values for this are on or off.
Query processing will not occur if the query processor is given an invalid option.
Query processor options can affect Funnelback’s speed and result quality, so change them with caution.
Numerical metadata search is currently only accessible using CGI parameters and not as query processor options.
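
The notes above can be illustrated with a small helper that renders one set of options in both forms. This is only a sketch: the function names are hypothetical, the space-separated "-name=value" layout of the configuration value is assumed from the listings on this page, and no check is made for options marked [Not CGI].

```python
from urllib.parse import urlencode


def format_value(value):
    """Render booleans as on/off, as the notes require; everything else via str()."""
    if isinstance(value, bool):
        return "on" if value else "off"
    return str(value)


def to_option_string(options):
    """Build a query_processor_options value, e.g. '-collapsing=on -num_ranks=20'.

    Assumption: options are separated by single spaces.
    """
    return " ".join(f"-{name}={format_value(value)}" for name, value in options.items())


def to_cgi_query(options):
    """Build the equivalent CGI query string (same names, no leading dash)."""
    return urlencode({name: format_value(value) for name, value in options.items()})


options = {"collapsing": True, "num_ranks": 20, "SM": "qb"}
print(to_option_string(options))
print(to_cgi_query(options))
```

Keeping a single dictionary as the source of truth makes it harder for the configured defaults and the request-time overrides to drift apart.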

A. Contextual navigation options

-categorise_clusters=<boolean>: Whether contextual navigation suggestions are grouped by type.
-cnto=<float> Range: 0.000000 - unlimited: Set contextual navigation time-out to s seconds (s floating point). Processing may be omitted entirely if the elapsed time for a query already exceeds s seconds. (dflt 1.0).
-contextual_navigation=<boolean>: Whether or not to activate the contextual navigation system.
-contextual_navigation_fields=<string>: String s lists the metadata fields, separated by commas surrounded by square brackets, to scan for contextual navigation suggestions. (dflt '[c,t]'). Note that scanning of document text can be suppressed by including a minus, for example '[-,c,t]'.
-max_phrase_length=<integer> Range: 3 - 7: Maximum length (in words) of contextual navigation suggestions.
-max_phrases=<integer> Range: 0 - unlimited: After this number of candidate phrases have been checked, contextual navigation processing will stop.
-max_results_to_examine=<integer> Range: 0 - 200: Maximum number of search results to scan for contextual navigation suggestions.
-site_max_clusters=<integer> Range: 0 - unlimited: Maximum number of site clusters to present in contextual navigation.
-topic_max_clusters=<integer> Range: 0 - unlimited: Maximum number of topic clusters to present in contextual navigation.
-type_max_clusters=<integer> Range: 0 - unlimited: Maximum number of type clusters to present in contextual navigation.

B. Geospatial options

-geospatial_ranges=<boolean>: Calculate geospatial distance from origin and bounding box ranges when geospatial data is configured and available.
-maxdist=<float> Range: 0.000000 - unlimited: Exclude results not within <f> km of origin.
-origin=<string>: <lat,long> Set origin to lat, long (floating point degrees).
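
To make the interaction between -origin and -maxdist concrete, the following sketch filters points by great-circle distance. It assumes the haversine formula and a mean Earth radius of 6371 km; how Padre itself computes distance is not stated here, and all function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption, not a Padre constant


def parse_origin(origin):
    """Parse an -origin value of the form 'lat,long' (floating point degrees)."""
    lat, lon = (float(part) for part in origin.split(","))
    return lat, lon


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_maxdist(origin, point, maxdist_km):
    """True if point (lat, long) lies within maxdist_km of the origin string."""
    lat, lon = parse_origin(origin)
    return haversine_km(lat, lon, point[0], point[1]) <= maxdist_km


# Canberra as origin, Sydney as the candidate result location.
print(within_maxdist("-35.28,149.13", (-33.87, 151.21), 500.0))
```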

C. Informational options

-canq=<boolean>: Write reordered queries to log. (dflt off)
-countIndexedTerms=<string> [Not CGI]: Metadata fields to have their indexed terms counted in the result set (DAAT only). Unlike rmcf, multiple term occurrences in a single document are counted, e.g. if metadata 'author' has 'Bob Ada|Bob|Bob' in two documents the resulting counts would be 'Ada': 2, 'Bob': 6. As this counts indexed terms, long terms may be truncated depending on the indexer options used. To count fields 'a' and 'c', set this to '[a,c]'.
-countUniqueByGroup=<string> [Not CGI]: Counts the number of unique metadata values grouped by another metadata. Syntax: -countUniqueByGroup=[classToCount]:[groupBy],[classToCount]:[groupBy]. Example: -countUniqueByGroup=[author]:[project] would show us the number of authors contributing to each project. classToCount is a regex and will be expanded to all matching metadata classes e.g. [author.*]:[project] might expand to -countUniqueByGroup=[author]:[project],[authors]:[project].
-count_dates=<string>: Report facet counts for dates such as 'today', 'last week', 'this year'. Note that date categories may overlap. Only value currently supported is 'd'.
-count_urls=<integer> [Not CGI]: Display counts of results grouped by the URL path (up to depth i). If i is 0, the default value is used (dflt 5). If i is not present, count_urls is turned off.
-docsPerColl=<boolean>: Show the number of documents each collection contributed to the result set.
-rmcf=<string>: Metadata fields to have their words counted in result sets (fields representing facets). If metadata 'author' has 'Bob Ada|Bob|Bob' in two documents the counts would be 'Bob Ada': 2 'Bob': 2. To count fields 'a' and 'c', set this to '[a,c]'.
-rmrf=<string>: Numerical and geospatial fields listed will have their ranges calculated in result sets. To see the ranges of field 'height' and the bounding box geospatial field 'X' set this to '[height,X]'.
-showtimes=<boolean>: Print elapsed times for each stage of query processing.
-sum=<string> [Not CGI]: The sum of a numeric metadata in the result set. Syntax: -sum=[sumOn],[sumOn]. Example: -sum=[size] would sum all values of numeric metadata 'size' in the result set. Note that sumOn may be a regex, which expands sumOn to all matching metadata classes, e.g. -sum=[size.*] might expand to -sum=[sizeInKb],[sizeLoc].
-sumByGroup=<string> [Not CGI]: The sum of a numeric metadata by a group. Syntax: -sumByGroup=[sumOn]:[groupBy],[sumOn]:[groupBy]. Example: -sumByGroup=[size]:[project] would sum all values of numeric metadata 'size' grouped by 'project', giving output such as: project 'Foo' has size '128', project 'Bar' has size '12'. Note that sumOn may be a regex, which expands sumOn to all matching metadata classes, e.g. -sumByGroup=[size.*]:[project] might expand to -sumByGroup=[sizeInKb]:[project],[sizeLoc]:[project].
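
The difference between the countIndexedTerms and rmcf examples above can be reproduced with a short sketch. It assumes that multiple values in a field are separated by '|' (as in the 'Bob Ada|Bob|Bob' example) and that terms are split on whitespace; the function names are hypothetical and this is not how Padre is implemented.

```python
from collections import Counter


def count_indexed_terms(field_values):
    """Count every term occurrence across all documents (countIndexedTerms-style)."""
    counts = Counter()
    for raw in field_values:
        for value in raw.split("|"):
            counts.update(value.split())
    return counts


def count_values_per_document(field_values):
    """Count each distinct field value once per document (rmcf-style)."""
    counts = Counter()
    for raw in field_values:
        counts.update(set(raw.split("|")))
    return counts


# The 'author' field of two documents, as in the examples above.
authors = ["Bob Ada|Bob|Bob", "Bob Ada|Bob|Bob"]
print(dict(count_indexed_terms(authors)))
print(dict(count_values_per_document(authors)))
```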

D. Logging options

-ip_to_log=<string>: What form of ip to include in log files: (nothing|ip|ip_hash|remote_user).
-log=<boolean> [Not CGI]: Write query log entries (dflt on).
-qlog_file=<string> [Not CGI]: If writing query log entries, write them to <FILE>.
-username=<string>: A string identifying the current user to be used in padre’s query log.

E. Miscellaneous options

-countgbits=<string>: s is either "all" or a comma-separated list of gscope bitnumbers for which counts are needed. (Bits numbered from zero.)
-exit_on_bad_component=<boolean>: Fail when a component has an incompatible index relative to the first (rather than skip).
-flock=<boolean>: Use flock when locking the query logfile. If set to no, lockf is used instead. Default on Solaris is 'no', all other systems 'yes'.
-mat=<integer> Range: 0 - 2147 [Not CGI]: Set matchset size to n million (dflt 24). Only need to increase on very large collections.
-ndt=<boolean> [Not CGI]: Don’t do tests on docs, e.g. phantom, zombie, *scope, binary, expired.
-unbuf=<boolean>: Don’t buffer the standard output stream. In some specific cases, setting this to 'no' can improve performance.
-view=<string>: The collection view to perform the query against when in CGI mode. Normally 'live' (default), 'offline' or 'snapshot###'.

F. Presentation options

-EORDER=<integer> Range: 0 - 1: Specify presentation order of query biased summary excerpts. 0: natural order in doc. 1: sorted by score. (dflt 0)
-MBL=<integer> Range: 1 - unlimited: Set buffer length per displayed metadata field to n bytes (dflt 250 bytes). Warning: setting very large values will increase query processor memory demands and may cause problems.
-SBL=<integer> Range: 1 - unlimited: Set summary buffer length to n bytes. (dflt 250 bytes)
-SF=<string>: Metadata fields to include in summaries (if applicable). To include fields author and d, set this to [author,d]. This option also supports regex: to include all metadata classes, set this to [.*]; to include fields prefixed with Fun and the metadata class author, set this to [Fun.*,author].
-SHLM=<integer> Range: 0 - 7: Select highlighting method within snippets in XML. 0 - No highlighting ; 1 - HTML strong tags ; 2 - Show highlighting regexp. and unhighlighted summary [dflt]; 5 - Use HTML strong tags but remove accents from summary before highlighting, provided query was not accented.
-SM=<string>: Summary mode. Possible values are 'both' (or 'def') - display description or query-biased summary and metadata fields listed in the 'SF' option; 'snip' - display a generated snippet; 'meta' - display metadata fields listed in 'SF'; 'qb' - display a query-biased summary; 'auto' - print metadata codes if specified in user query; 'off' - turn off all summaries.
-SQE=<integer> Range: 1 - 10000: Set max no. of query biased summary excerpts to n (dflt 3).
-all_summary_text=<boolean>: Whether the text used for generating summaries is required in the result.
-countUniqueByGroupSensitive=<boolean> [Not CGI]: Treat group names and metadata items case sensitively (default no).
-ctest_mode=<integer> Range: 0 - 3: Controls behaviour of padre-sw when -ctest is used. 0: no internal evaluation; 1 - internal evaluation only. Output is brief plain text report of measures; 2 - internal evaluation only. Output in plain text with QBQ output followed by measures; 3 - internal evaluation plus normal CTOUT output in XML (with measures presented as comments)
-explain=<boolean>: Explain rankings by showing score components. (Note that -explain=on turns off result set diversification).
-explore=<integer> Range: 7 - 50: Show 'explore' links against results. The value specifies how many terms to include in the expanded query.
-gscoperesult=<string>: Specifies the bit number that results will be set to in -res gscope or -res docnums modes (dflt 1).
-mdsfhl=<boolean>: Whether query terms are highlighted only in MDSF metadata summaries.
-num_ranks=<integer> Range: 0 - unlimited: Limit number of results to n (min = 0, dflt = 10).
-num_tiers=<integer> Range: 0 - 50: Limit number of result list tiers to n (min = 0 (no limit), max = 50, dflt no limit).
-qieval=<float> Range: 0.000000 - 1.000000: Set the value presented for query independent evidence when using the qiecfg result format. (dflt 0.5).
-qwhl=<string>: Determines which parts of a search result are highlighted. S - snippet, M - metadata, U - URL, T - title. E.g. -qwhl=MUT
-res=<string>: Set result format. Possible values are: trec, web, xml, urls, qiez, qieo, gscope, docnums, ctest, qiecfg or flcfg. Note: setting res to docnums, flcfg, gscope, qiecfg, qieo or qiez will override any num_ranks setting so that all results are returned.
-results_in_facet_categories=<integer> Range: 0 - 100: Include the specified number of pre-computed search results under the rmc count element for metadata facet categories.
-rmc_maxperfield=<integer> Range: 0 - unlimited: Set maximum number of RMC items to display per field at n (dflt 100).
-rmc_sensitive=<boolean> [Not CGI]: Treat facet categories (RMC items) case sensitively (default no).
-show_qsyntax_tree=<boolean>: Include an SVG representation of the query-as-processed in output.
-start_rank=<integer> Range: 1 - unlimited: Present results starting from n (dflt 1).
-sumByGroupSensitive=<boolean> [Not CGI]: Treat group names case sensitively (default no).
-tierbars=<boolean>: Display tierbars in result list output (XML and HTML). When turned on (for all -res modes) and -sort is used, results will be first sorted by tier then by the sorting mode, otherwise if -sortall is used then all results will be sorted regardless of tier.
-translucent_DLS_fields=<string> [Not CGI]: Metadata fields which are translucent. Translucent fields are visible on documents which the user cannot see. To include fields 'a' and 'd' set this to '[a,d]'. If collapsing is enabled and the collapsing signature contains only fields defined here, then collapsing will be permitted on documents the user cannot see.
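
As an illustration of how -num_ranks and -start_rank combine for paging, the sketch below derives both values from a 1-based page number. The helper name is hypothetical and the arithmetic is simply the usual pagination formula, not something prescribed by Padre.

```python
def page_options(page, num_ranks=10):
    """Return start_rank/num_ranks values for a 1-based page number.

    start_rank is 1-based, so page 1 starts at rank 1, page 2 at
    num_ranks + 1, and so on.
    """
    if page < 1:
        raise ValueError("page must be 1 or greater")
    if num_ranks < 1:
        raise ValueError("num_ranks must be 1 or greater for paging")
    return {"start_rank": (page - 1) * num_ranks + 1, "num_ranks": num_ranks}


print(page_options(3, num_ranks=20))
```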

G. Query interpretation options

-STOP=<string> [Not CGI]: Use the stoplist specified in <file> (one word per line)
-binary=<integer> Range: 0 - 3: Determines whether or not binary documents are returned in the results. 0 - show all documents; 1 - show only binary documents; 2 - show only non-binary documents.
-clive=<string>: Dynamic metacollections. Specifies a component name within one or more .sdinfo files to make active. Can be set multiple times to enable multiple collections.
-daat_termination_type=<integer> Range: 0 - 2: Selects how DAAT early exit is determined. 0 - try for d results with every metafield and every component; 1 - try for d results over every component but not necessarily every metafield; 2 - stop as soon as d results are obtained. (d is the parameter to -daat.)
-daat_timeout=<float> Range: 0.000000 - 3600.000000 [Not CGI]: Impose a soft timeout (in seconds) on the time taken by the DAAT machinery for one query.
-dont_estimate_full_matches=<boolean>: In DAAT mode, don’t guess the number of full matches when the DAAT depth did not let us process an entire postings list.
-events=<boolean>: Must be set if event search is to be used
-fmo=<boolean>: Present full matches only.
-lang=<string>: If a 2-character language code is specified by this means, then stemmers etc specific to that language will be used, IF AVAILABLE. It is also permissible to use a 5-character code like en_GB, but padre behaviour will be the same as for en. Specifying lang also makes title and metadata sorting of results locale-specific, however support for this on Windows platforms is limited and problematic.
-loose=<integer> Range: 0 - unlimited: Phrase looseness in words (min = 0, dflt = 0).
-max_qbatch=<integer> Range: 1 - unlimited: Terminate batch query processing after the specified number of queries have been processed.
-max_terms=<integer> Range: 1 - unlimited: Truncate queries after the specified number of terms. If the query is reordered, truncation will occur after reordering.
-min_truncated_len=<integer> Range: 0 - 20 [Not CGI]: The text part of a query term with a right truncation operator must have at least this length. E.g. if min_truncated_len were 4 funnel* would be accepted but fun* would be processed as fun.
-noexpired=<boolean> [Not CGI]: Exclude expired docs from results. (Nullified by -zom)
-nulqok=<boolean> [Not CGI]: An empty query submitted via CGI will be processed as a null query. The system query must be empty as well. (dflt is to ignore the request).
-phrase_prox_word_limit=<integer> Range: 1 - unlimited [Not CGI]: Phrase or proximity terms with more than this number of words will be shortened by deleting words from the right. E.g. if this limit were 4, then 'to be or not to be' would be processed as 'to be or not'.
-prox=<integer> Range: 0 - unlimited: Proximity limit in words (min = 0, dflt = 15).
-qsup=<string>: When blending queries, determines sources of supplementary queries to be tried, with corresponding weights assigned to each source (ranging from 0 to 1). No spaces. 'off' may be specified to disable supplementary queries. E.g. -qsup=SPEL/0.9+USUK/0.4+SYNS/0.1+LANG/0.1. Available sources are: SPEL (spelling suggestions); USUK (table of spelling differences between US and UK English); SYNS (synonyms as defined by the blending.cfg file); LANG (experimental German decompounding); VSYN (vector synonyms as defined by the vector_blends.cfg file).
-query_reorder=<boolean>: Reorder terms in query so that the most discriminating (least common) appear first. Often coupled with -max_terms=
-ras=<integer> Range: 0 - 2: Remove any stopwords from the query. Possible values: 0 - remove none; 1 - remove dynamically depending on the query; 2 - remove all stopwords (dflt 1).
-service_volume=<string> [Not CGI]: Either 'high' or 'low'. A convenience setting to increase or reduce allowable query complexity and timeouts according to service volumes — large or small indexes, high or low query volumes.
-stem=<integer> Range: 0 - 3: Controls stemming of queries. 0 - do not stem (dflt); 1 - do not stem (replaces obsolete option); 2 - stem all query words (light - English/French plural/singular only); 3 - stem all query words (heavier).
-stem_lconly=<boolean>: When stemming, stem only lowercase query words (to avoid stemming proper names and acronyms).
-strip_invalid_utf8=<boolean>: Normally, invalid UTF-8 characters are removed during indexing. If this hasn’t happened, this option allows them to be removed from result packets.
-synonyms=<boolean>: If set, the query processor will expand queries using the thesaurus in synonyms.cfg.
-truncation_allowed=<integer> Range: 0 - 3 [Not CGI]: Enables the use of the * operator. Binary valued. It is only valid when used with an option that disables DAAT mode, such as -service_volume='lo' or -daat=0. When applied, all contexts are available, such as: :funnelback, funnel, back, and *:*elba.
-wildcard_thresh=<integer> Range: 0 - unlimited: If the postings list for a term is longer than the specified value (in MB) it will be treated as a wildcard.
-zom=<boolean>: Include docs in results even if noindex or killed.
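
The -qsup value format described above (SOURCE/weight pairs joined by '+', or 'off') can be parsed and validated with a sketch like the following. The function names are hypothetical; the list of sources and the 0 to 1 weight range are taken from the option description.

```python
KNOWN_SOURCES = {"SPEL", "USUK", "SYNS", "LANG", "VSYN"}


def parse_qsup(value):
    """Parse e.g. 'SPEL/0.9+USUK/0.4' into {'SPEL': 0.9, 'USUK': 0.4}.

    'off' yields an empty mapping. Raises ValueError for unknown sources,
    malformed weights or weights outside the range 0 to 1.
    """
    if value == "off":
        return {}
    weights = {}
    for part in value.split("+"):
        source, _, weight_text = part.partition("/")
        if source not in KNOWN_SOURCES:
            raise ValueError(f"unknown supplementary query source: {source!r}")
        weight = float(weight_text)
        if not 0.0 <= weight <= 1.0:
            raise ValueError(f"weight for {source} must be between 0 and 1")
        weights[source] = weight
    return weights


def format_qsup(weights):
    """Render a mapping back into a -qsup value (no spaces)."""
    if not weights:
        return "off"
    return "+".join(f"{source}/{weight}" for source, weight in weights.items())


print(parse_qsup("SPEL/0.9+USUK/0.4+SYNS/0.1+LANG/0.1"))
```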

H. Query source options

-ctest=<string> [Not CGI]: Read a batch of queries from testfile (in C_TEST format). Sets output format to RM_CTEST, but that may be overridden. (See es.csiro.au/C-TEST/ for information about C-TEST.)
-s=<string>: System-generated query inserted behind the scenes by a form or front-end.

I. Quicklinks options

-QL=<integer> Range: 0 - 5: Activate QuickLinks facility for default pages down to the specified level. 0 - off; 1 - server root pages; 2 - next level down.
-QL_rank=<integer> Range: 1 - unlimited: If QuickLinks capability is active, show quick links for search results down to the specified rank.
-QL_rank_is_relative=<boolean>: If true, the value of QL_rank will be interpreted relative to the start_rank. E.g. if QL_rank=2, the first two results on each page may show QuickLinks.
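
One possible reading of how QL_rank and QL_rank_is_relative interact is sketched below. The function is hypothetical and assumes that ranks are 1-based and that "relative" means counting from start_rank on each page, as the QL_rank=2 example suggests.

```python
def shows_quicklinks(rank, ql_rank, start_rank=1, rank_is_relative=False):
    """Decide whether the result at absolute rank `rank` may show QuickLinks.

    With rank_is_relative=False, only absolute ranks 1..ql_rank qualify.
    With rank_is_relative=True, the first ql_rank results on each page
    (counted from start_rank) qualify.
    """
    position = rank - start_rank + 1 if rank_is_relative else rank
    return 1 <= position <= ql_rank


# Second page of ten results (start_rank=11) with QL_rank=2.
print(shows_quicklinks(11, 2, start_rank=11, rank_is_relative=True))
print(shows_quicklinks(11, 2, start_rank=11, rank_is_relative=False))
```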

J. Ranking options

-SameSiteSuppressionExponent=<float> Range: 0.000000 - unlimited: Same site suppression penalty exponent (dflt 0.5, recommended range 0.2 - 0.7).
-SameSiteSuppressionOffset=<integer> Range: 0 - 1000: Number of additional documents from a site beyond the first that are allowed their full score before applying a same site suppression penalty (dflt 0)
-absscores=<boolean>: Report content scores as % of max possible Okapi score (Intended for use with -vsimple=on).
-anniemode=<integer> Range: 0 - 3: Control the use of annotation indexes. 0 - do not use annotation indexes ; 1 - Process queries using annotation indexes only; 2 - Process queries using annotation indexes, falling back to normal indexes if insufficient results. (Most query op.s stripped.) 3 - Process queries using both annotation and normal indexes (Most operators stripped from queries.). Default 0.
-b=<float> Range: 0.000000 - unlimited: Set Okapi b to f (dflt 0.75)
-cgscope1=<string>: Documents matching this gscope expression (reverse Polish) can be upweighted with -cool.68. Those not matching can be upweighted with -cool.70.
-cgscope2=<string>: Documents matching this gscope expression (reverse Polish) can be upweighted with -cool.69. Those not matching can be upweighted with -cool.71.
-cool=<boolean>: Whether to use topic distillation scoring (cool and cooler). Dflt on.
-cool.<Key>=<key/value pair>: cool.N=V Set a value for the Nth tune parameter. See cooler ranking options page for possible values of N.
-daat=<integer> Range: 0 - 10000000: Specifies the maximum number of full matches for Document-At-A-Time processing. If set to 0, Term-At-A-Time is used instead (dflt 5000).
-diversity_rank_limit=<integer> Range: 10 - unlimited: Diversification won’t alter ranking beyond rank n (default 200, min 10).
-facet_url_prefix=<string> [Not CGI]: Present only results whose URL is prefixed by the given URL. Note that the scheme and hostname part are case insensitive, for URI with scheme smb:// the entire prefix is case insensitive. The behaviour of this option may change in the future to suit facets, this should not be used outside of faceted navigation.
-gscope1=<string>: Present only results whose gscope bits match reverse Polish expression e (Bits numbered from zero). If set to off, disable any previous expression.
-k1=<float> Range: 0.000000 - unlimited: Set Okapi K1 to <f>. (dflt 2.0)
-kmod=<integer> Range: 0 - 1: Select special scoring function i for special fields. 0 = normal, 1 = AF1 (dflt 1).
-lscope=<string>: Present only results whose URL matches a sort-of left-anchored pattern.
-lscorrect=<boolean>: Whether to correct link scores across meta collection components (default yes).
-main_homepage_factor=<float> Range: 0.000000 - 1.000000: Penalise score of the homepage of a single-entity-controlled domain to prevent over representation in results sets. E.g. www.anu.edu.au/ in an index of ANU. (dflt 0.90)
-meta_suppression_field=<string>: If same_meta_suppression is activated, the specified metadata field will be the field to which it applies. Only one metadata field can be treated in this way.
-near_dup_factor=<float> Range: 0.000000 - 1.000000: The query processor will penalise a result which is a near-duplicate of a previous result by multiplying by the factor specified. The penalty stiffens with more repetition. (dflt 0.5)
-promote_urls=<string>: Insert the specified URLs at or near the top of the results list for a query. Value is a space separated list of URLs. URLs must correspond to those recorded by padre-iw. (dflt Inactive)
-quanta=<integer> Range: 10 - 100000: Set the number of possible score quantisation levels for each cool variable. In general, a high number should give more accurate ranking but may slow query processing.
-rank_limit=<integer> Range: 10 - unlimited: Limit highest rank requestable to n (dflt 1,000,000,000).
-ranking_profile=<integer> Range: 0 - 100 [Not CGI]: Choose a profile of settings for the ranking function. 0 - current default; 1 - Standard BM25; 2 - Traditional (pre-12.0) Funnelback. Setting a profile does not override explicit settings.
-recency_decay_vals=<string>: <z,w,m,y,d,c,m> - Define how recency scores decay with time. z, w, m, y, d, c, m are floats in the range 0 - 1, which specify the recency score assigned to documents 0 days, 1 wk, 1 mth, 1 yr, 1 dec, 1 cen, 1 mill. old. (dflt 1.0,0.75,0.5,0.25,0.025,0.0025) Recency scores between key values are linearly interpolated. Past the millennium, recency scores are 1/daysold.
-reference_date=<string>: If specified, recency is based on this date rather than that of most recent doc. Format is <yyyymmdd>, or 'today'.
-remove_urls=<string>: Prevent the specified URLs from appearing in the results for a query. Value is a space separated list of URLs. URLs must correspond to those recorded by padre-iw. (dflt Inactive)
-sco=<string>: <n>[<classes>] Set doc scoring mode to n, using the classes specified. Most common values: 0 - score using doc text only ; 1 - no scoring. Produce an unordered set of results ; 2 - score using anchortext and URLs as well, upweight titles (or whatever fields are configured with -specf). For example to automatically look in fields 'u' and 'v' for the query terms set -sco=2[u,v]
-scope=<string>: Present only results whose URL satisfies the include/exclude scopes included in list (comma separated). e.g. -scope=anu.edu.au,-anu.edu.au/archives
-sort=<string>: Sort top results by <string>. Possible values: 'date', 'adate' (ascending date), 'title', 'dtitle' (descending title), 'size' (file size), 'dsize' (descending filesize), 'url', 'durl' (descending url), 'coll' (collection name, then score), 'dcoll' (descending collection name, then score), 'meta<f>' (by metadata field f, then score), 'dmeta<f>' (descending metadata field f, then score), 'shuffle' (random to avoid bias), 'collapse_count' (to order by the number of collapsed documents, with the largest collapsed set first), 'acollapse_count' (with the largest collapsed set last), 'prox' (for geo search: sort top results by proximity to origin), 'dprox' (for geo search: sort top results by descending proximity to origin), 'score_ignoring_tiers' (descending score, ignoring any tiers; only useful with sortall). (dflt is case-insensitive for title and meta). '-sort=' turns off sorting.
-sort_sensitive=<boolean>: Use case-sensitive sorting when sorting results by title or metadata strings.
-sortall=<boolean>: Include partial matches in the resorting performed by -sort.
-specf=<string>: Fields listed in string s, as a list of comma separated fields surrounded by square brackets, will be scored specially and added to query when using the -sco=2 mode (dflt '[k,K]').
-sss_defeat_pattern=<string>: URLs matching the specified pattern (currently a simple string match) will not be subject to samesite suppression.
-static_cool_exponent=<float> Range: 0.000000 - 1.000000: Control the extent to which static scores are attenuated with length of query. 0 => no attenuation; 1 => max attenuation. Attenuation by len ** -f.
-unknown_daysold=<integer> Range: 0 - unlimited: A doc with unknown date is assumed to be d days old (for recency calcs) (dflt 366).
-use_Paik=<boolean>: Use the tf.idf scheme proposed by Jiaul Paik at SIGIR 2013 rather than the more conventional BM25 variant.
-use_secds=<boolean>: When working with domain-importance features in ranking, use SECDs if value is on, and raw domain names otherwise.
-vsimple=<string>: Very simple ranking. If set to 'on', equivalent to -sco=0 -cool=off -SSS=0 -kmod=0.
-weight_only_fields=<string>: Documents will not be retrieved in DAAT mode if they only match unfielded query terms in one or more of the implicit fields listed here. For example, specifying '[K,k]' will stop the query 'Monica Lewinski' matching a document solely because of click data or referring anchortext.
-wmeta.<Key>=<key/value pair>: wmeta.C=F Set upweighting factors for metadata class scoring. C - metadata class; F - weight to set. (dflt 0.5 for 'k' and 'K', 1 for everything else).
-xscope=<string>: Present only results whose URL exactly matches the provided URL (after canonicalization).
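
To show what the -k1 and -b parameters control, here is a minimal sketch of the textbook Okapi BM25 term weight using the defaults listed above (k1 = 2.0, b = 0.75). Padre's exact scoring variant is not given on this page, so the formula, the idf form (a commonly used non-negative variant) and the function name are assumptions for illustration only.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, doc_freq, num_docs, k1=2.0, b=0.75):
    """Textbook BM25 contribution of one query term to one document.

    tf       - occurrences of the term in the document
    doc_len  - document length; avg_doc_len is the collection average
    doc_freq - number of documents containing the term, out of num_docs
    k1       - term-frequency saturation; b - strength of length normalisation
    """
    idf = math.log((num_docs - doc_freq + 0.5) / (doc_freq + 0.5) + 1.0)
    length_norm = 1.0 - b + b * (doc_len / avg_doc_len)
    return idf * (tf * (k1 + 1.0)) / (tf + k1 * length_norm)


# With b = 0 document length is ignored; with b > 0 shorter documents score higher.
short_doc = bm25_term_score(tf=3, doc_len=100, avg_doc_len=200, doc_freq=10, num_docs=1000)
long_doc = bm25_term_score(tf=3, doc_len=400, avg_doc_len=200, doc_freq=10, num_docs=1000)
print(short_doc > long_doc)
```

Raising k1 lets repeated occurrences of a term keep adding to the score for longer before saturating, while b moves between no length normalisation (0) and full normalisation (1).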

K. Ranking - Result diversification options

-SSS=<integer> Range: 0 - 10: Same site suppression depth: 0 - no suppression (dflt); 2 - hosts and their top level dir’s; 10 - org domain (includes sub-domains) e.g. defence.gov.au.
-neardup=<float> Range: 0.000000 - 1.000000: Near duplicates in ranking are multiplied by f. Setting f to 1 turns off near-dup detection.
-repetitiousness_factor=<float> Range: 0.000000 - 1.000000: Penalise a repetitious result by multiplying by the factor specified. (Repetitiousness may involve same-site, same component or repeated metadata.) The penalty stiffens with more repetition. Setting to 1 turns this off. (dflt 1.0)
-same_collection_suppression=<float> Range: 0.000000 - 1.000000: While searching a meta-collection, penalise the second result from the same primary collection as a previous result by multiplying by the factor specified. The penalty stiffens with more repetition. Setting to 0 turns this off. (dflt 0)
-same_meta_suppression=<float> Range: 0.000000 - 1.000000: Penalise the second result with the same value for a specified metafield as a previous result by multiplying by the factor specified. The penalty stiffens with more repetition. Setting to 0 turns this off. (dflt 0)
-title_dup_factor=<float> Range: 0.000000 - 1.000000: The query processor will penalise a result which has exactly the same title as a previous result by multiplying by the factor specified. The penalty stiffens with more repetition. Setting to 1 turns this off. (dflt 0.5)
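
The multiplicative penalties above can be pictured with the following sketch for title_dup_factor. The page only says that the penalty "stiffens with more repetition"; the geometric form used here (factor raised to the number of earlier results with the same title) is an assumption, as is the function name.

```python
from collections import Counter


def apply_title_dup_penalty(results, factor=0.5):
    """Penalise results whose title repeats an earlier one.

    `results` is a list of (title, score) pairs in rank order. The n-th
    repeat of a title is multiplied by factor ** n (assumed form), so a
    factor of 1 leaves all scores unchanged.
    """
    seen = Counter()
    penalised = []
    for title, score in results:
        penalised.append((title, score * factor ** seen[title]))
        seen[title] += 1
    return penalised


ranked = [("Home", 1.0), ("Home", 1.0), ("About", 1.0), ("Home", 1.0)]
print(apply_title_dup_penalty(ranked, factor=0.5))
```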

L. Result collapsing options

-collapsed_docs_sort=<string>: Sort collapsed results by <string>. Possible values: 'date', 'adate' (ascending date), 'title', 'dtitle' (descending title), 'size' (file size), 'dsize' (descending filesize), 'url', 'durl' (descending url), 'coll' (collection name, then score), 'dcoll' (descending collection name, then score), 'meta<f>' (by metadata field f, then score), 'dmeta<f>' (descending metadata field f, then score), 'shuffle' (random to avoid bias), 'prox' (for geo search: sort collapsed results by proximity to origin), 'dprox' (for geo search: sort collapsed results by descending proximity to origin), 'score_ignoring_tiers' (descending score, ignoring any tiers; only useful with sortall).
-collapsing=<boolean>: Activate collapsing. Collapsing will be based on document content ('$') unless a collapsing_sig value is specified. Note that use of this option will disable result set diversification.
-collapsing_SF=<string>: Metadata fields to include in display for collapsed documents (assuming collapsing_num_ranks is non-zero). (dflt no fields). To view metadata fields 'id' and 'a' set this to '[id,a]'.
-collapsing_label=<string>: Label to indicate why items have been collapsed. (dflt "which are very similar")
-collapsing_num_ranks=<integer> Range: 0 - 1000: Specify how many collapsed results are to be shown under the uncollapsed ones. (dflt 0)
-collapsing_scoped=<boolean>: Scope to only documents which have been collapsed on. Default is off.
-collapsing_sig=<string>: The collapsing_control segment to use when collapsing. E.g. "[a,p]", collapse on author+publisher. The value must correspond to one segment of the indexing.collapse_fields string. (A segment is a comma separated list of fields surrounded by square brackets) (dflt '[$]' (Collapsing on document content.))
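
The effect of collapsing on a metadata signature such as "[a,p]" can be sketched as a grouping step over a ranked result list. This is an illustration only: the result dictionaries, the field names and both function names are hypothetical, and content-based collapsing ('$') is not modelled.

```python
def parse_signature(collapsing_sig):
    """Turn a segment such as '[a,p]' into the list of fields ['a', 'p']."""
    return collapsing_sig.strip("[]").split(",")


def collapse(results, collapsing_sig, collapsing_num_ranks=0):
    """Group rank-ordered results that share the same signature fields.

    The best-ranked member of each group is kept as the visible result;
    up to collapsing_num_ranks of the others are listed beneath it.
    """
    fields = parse_signature(collapsing_sig)
    groups = {}
    for result in results:
        key = tuple(result.get(field) for field in fields)
        groups.setdefault(key, []).append(result)
    collapsed = []
    for members in groups.values():
        head, rest = members[0], members[1:]
        collapsed.append({
            "result": head,
            "collapsed_count": len(rest),
            "collapsed": rest[:collapsing_num_ranks],
        })
    return collapsed


results = [
    {"url": "/1", "a": "Ada", "p": "ACM"},
    {"url": "/2", "a": "Bob", "p": "ACM"},
    {"url": "/3", "a": "Ada", "p": "ACM"},
]
for entry in collapse(results, "[a,p]", collapsing_num_ranks=1):
    print(entry["result"]["url"], entry["collapsed_count"])
```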

M. Security options

-dls_internal_test=<integer> Range: 0 - unlimited: This allows testing of the padre side of the custom document level security mechanism. There is no call out to an external function. The value is interpreted as a combination of bits: 1 bit - dls_internal_test is active/not active; 2 bit - selects whether MINRESULTS mode is used or not. During internal testing, every odd numbered document in the original ranking is arbitrarily treated as inaccessible.
-ipreject=<string> [Not CGI]: QUERY_LIMIT,WINDOW_SECONDS,UPPER_QUERY_LIMIT - Use an IP rejector to limit requests from a single machine. Allow QUERY_LIMIT queries per WINDOW_SECONDS, don’t record more than UPPER_QUERY_LIMIT queries.
-ldLibraryPath=<string> [Not CGI]: Full path to security plugin library
-locking_model=<string> [Not CGI]: Name of locking model, either "trim" or "sharepoint".
-no_security=<boolean> [Not CGI]: Disable DLS, available as a command line option.
-secPlugin=<string> [Not CGI]: Name of security plugin library
-translucent_DLS=<boolean> [Not CGI]: Enables translucent DLS (DAAT only).
-userkeys=<string> [Not CGI]: Conduct this search with security keys specified by s. The format is '<collectionName>;key<delim>' where delim is either ',' or a new line. Spaces are removed. For example: 'c1;k1 c2;k1,c2;k2'
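
The -ipreject value (QUERY_LIMIT,WINDOW_SECONDS,UPPER_QUERY_LIMIT) describes a per-IP rate limit. The sketch below shows one plausible sliding-window reading of it: the class is hypothetical, and in particular the assumption that rejected requests still count towards the window (up to UPPER_QUERY_LIMIT recorded entries) is a guess at the intended behaviour, not a description of Padre.

```python
from collections import defaultdict, deque


class IpRejector:
    """Sliding-window limiter configured from an -ipreject style string."""

    def __init__(self, spec):
        query_limit, window_seconds, upper_query_limit = spec.split(",")
        self.query_limit = int(query_limit)
        self.window_seconds = float(window_seconds)
        self.upper_query_limit = int(upper_query_limit)
        self._recorded = defaultdict(deque)  # ip -> timestamps inside the window

    def allow(self, ip, now):
        """Record a request from `ip` at time `now` (seconds); return True if allowed."""
        timestamps = self._recorded[ip]
        # Forget requests that have fallen out of the window.
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        # Assumed meaning of UPPER_QUERY_LIMIT: stop recording beyond this count.
        if len(timestamps) < self.upper_query_limit:
            timestamps.append(now)
        return len(timestamps) <= self.query_limit


rejector = IpRejector("2,10,5")
print([rejector.allow("192.0.2.1", t) for t in (0.0, 1.0, 2.0, 20.0)])
```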

N. Spelling options

-spelling=<boolean>: Activate spelling suggestion mechanism.
-spelling_alpha=<float> Range: 0.000000 - 1.000000: Set the weighting between 'closeness to the query' and support in the collection for a candidate suggestion. Big alpha, high weight on closeness to the query. (dflt 0.7)
-spelling_blend_thresh=<float> Range: 0.000000 - 1.000000: Confidence threshold for automatically blending results for a query suggestion with those from the user’s original query. (dflt 0.67)
-spelling_difflen_thresh=<integer> Range: 0 - 1000: Don’t make suggestions more than i characters longer or shorter than query. (dflt 2)
-spelling_dym_thresh=<float> Range: 0.000000 - 1.000000: Confidence threshold for making a 'Did you mean' suggestion. (dflt 0.5)
-spelling_edist_constant=<float> Range: 0.000000 - 1000.000000: Don’t make suggestions whose edit distance from the query exceeds f + query_length * spelling_edist_proportion. (dflt 1)
-spelling_edist_proportion=<float> Range: 0.000000 - 1.000000: Don’t make suggestions whose edit distance from the query exceeds spelling_edist_constant + query_length * f (0 <= f <= 1). (dflt 0.25)
-spelling_fullmatch_trigger_const=<float> Range: 0.000000 - unlimited: Don’t look for suggestions if there are at least f * log10(num docs) full matches. (dflt 30.0)
-spelling_include_context=<boolean>: Include the non-corrected part of the query in the suggestion link. (dflt 1)
-spelling_min_querylen=<integer> Range: 1 - 1000: Suggestions not made for queries shorter than this. (dflt 2)
-spelling_wt_thresh=<float> Range: 0.000000 - 100.000000: Sets a threshold that determines if a spelling suggestion is returned. If the generated spelling suggestion weight is less than this, the suggestion is not returned. (dflt 0.01)
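
The spelling thresholds above combine into a simple filter, sketched here with the documented defaults. The formulas spelling_edist_constant + query_length * spelling_edist_proportion and f * log10(num docs) come from the option descriptions; measuring lengths in characters, using Levenshtein distance as the edit distance, and the function names are assumptions.

```python
import math


def edit_distance(a, b):
    """Levenshtein distance between two strings (assumed edit-distance measure)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost))
        previous = current
    return previous[-1]


def candidate_allowed(query, candidate, edist_constant=1.0, edist_proportion=0.25,
                      difflen_thresh=2, min_querylen=2):
    """Apply the min_querylen, difflen_thresh and edit-distance limits."""
    if len(query) < min_querylen:
        return False
    if abs(len(candidate) - len(query)) > difflen_thresh:
        return False
    max_edist = edist_constant + len(query) * edist_proportion
    return edit_distance(query, candidate) <= max_edist


def should_look_for_suggestions(full_matches, num_docs, trigger_const=30.0):
    """Skip suggestions once there are at least trigger_const * log10(num_docs) full matches."""
    return full_matches < trigger_const * math.log10(num_docs)


print(candidate_allowed("funelback", "funnelback"))
print(should_look_for_suggestions(full_matches=10, num_docs=1000))
```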

O. TREC specific options

-trec_runid=<string>: For TREC participation: Each result in TREC format will include this runid.
-trec_topic=<integer> Range: 0 - unlimited: For TREC participation: The first query in a batch will get this topic number. Each new query will increase the number by one.
-trecids=<boolean>: For TREC participation: Each result in TREC format will use the TREC docno rather than a URL

P. Vector Search options

-fusion_ranking=<boolean>: Enable/Disable fusion ranking with vector search
-vector_search_db=<string>: Vector Search DB filename
