Manually building result collapsing
This article details how to generate result collapsing indexes manually on the command line.
The result collapsing index files are built by the padre-cc program. The documentation says you need to rebuild the index, but you can skip the rebuild and build the collapsing index manually. This is useful if you have a large index that takes a long time to build.
The usage information for padre-cc can be seen by running the program without arguments:
$ /opt/funnelback/bin/padre-cc
Purpose: To build an index.collapsig file to permit use of collapsed rankings.
Usage: /opt/funnelback/bin/padre-cc <index_stem> [-collapse_control=<string>] [-debug=on]

Utility for building a .collapsig file of collapsing signatures. If no control_string is given, a one-column file is built using the signatures from the .textsig file.

The collapse_control string must consist of sequences of metadata field characters (ASCII letters or digits) separated by commas. The characters $ and # may be used as metadata field characters and represent document summarisable text and document URL respectively.

In future, it is planned to allow field characters to be followed by a regular expression, indicating that only the part of the metadata string which matches the regex should be used in calculating the signature.

Example current control string: '$,ta'. In this case the .collapsig will have two signatures per document: Column 0 is the normal document signature and column 1 is a signature derived from the concatenation of metadata fields t and a, in that order.
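To make the column layout concrete, the following is a small illustrative sketch that splits a control string on commas and prints which fields would feed each signature column. It only mirrors the description in the usage text; `describe_collapse_control` is a hypothetical helper and is not part of padre-cc.

```shell
#!/bin/sh
# Illustration only: shows how a collapse_control string maps to columns,
# based on the usage text above. This does not call padre-cc.

describe_collapse_control() {
    # $1 = control string, e.g. '$,ta' (single-quote it so $ is not expanded)
    col=0
    old_ifs=$IFS
    IFS=,          # split the control string on commas
    set -f         # disable filename globbing while splitting
    for fields in $1; do
        printf 'column %d: %s\n' "$col" "$fields"
        col=$((col + 1))
    done
    set +f
    IFS=$old_ifs
}

describe_collapse_control '$,ta'
# prints:
# column 0: $
# column 1: ta
```

Here `$` stands for the document's summarisable text and `ta` for the concatenation of metadata fields t and a, as described in the usage output.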
You can view the collapsing command that was run automatically during a collection’s update by looking at the collection’s update log.
In the Index section, look for the line Index: COLLAPSIG - this will show the command that was used to build collapsing for the collection.
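If the update log is long, a plain grep can locate the relevant line. The sketch below assumes you already know where the collection's update log is stored; the path in the example is a placeholder and `find_collapsig_line` is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: print the lines of an update log that mention COLLAPSIG.
# The log location varies by installation, so it is passed in as an argument.

find_collapsig_line() {
    # $1 = path to the collection's update log
    # -n prefixes each match with its line number in the log
    grep -n 'COLLAPSIG' "$1"
}

# Example call (placeholder path, substitute your own log file):
# find_collapsig_line /path/to/update.log
```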
To build collapsing on an existing (live) index, you can run the command manually:
$ $SEARCH_HOME/bin/padre-cc <INDEX-STEM> -collapse_control=<INDEXING-COLLAPSE-FIELDS>
<INDEX-STEM> is the index stem that you wish to build collapsing for (eg.
<INDEXING-COLLAPSE-FIELDS> is the list of fields you wish to collapse on (this is the value that is normally read from the
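As a cautious way to assemble the command before running it against a live index, the sketch below prints the full command line for review rather than executing it. `build_collapse_cmd` is a hypothetical helper, the stem path is a placeholder, and the fallback to /opt/funnelback when SEARCH_HOME is unset is an assumption taken from the example path in the usage output.

```shell
#!/bin/sh
# Sketch only: prints the padre-cc command line instead of running it.
# build_collapse_cmd is a hypothetical helper, not part of Funnelback.

build_collapse_cmd() {
    # $1 = index stem
    # $2 = collapse_control string, e.g. '$,ta'
    # Assumption: fall back to /opt/funnelback when SEARCH_HOME is unset.
    printf "%s/bin/padre-cc %s -collapse_control='%s'\n" \
        "${SEARCH_HOME:-/opt/funnelback}" "$1" "$2"
}

# Placeholder stem; substitute the real index stem for your collection.
build_collapse_cmd "/path/to/index_stem" '$,ta'
```

The control string is wrapped in single quotes in the printed command so that a `$` field character is passed to padre-cc literally and is not expanded by the shell.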
This should create an index.collapsig file in the index folder for the collection.
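A quick check that the file was written can be sketched as follows. `check_collapsig` is a hypothetical helper, and the index folder path is something you supply; the sketch only tests that the file exists and says nothing about whether its contents are correct.

```shell
#!/bin/sh
# Sketch: confirm that index.collapsig exists in a given index folder.

check_collapsig() {
    # $1 = index folder for the collection
    file="$1/index.collapsig"
    if [ -f "$file" ]; then
        echo "found: $file"
    else
        echo "missing: $file" >&2
        return 1
    fi
}

# Example call (placeholder path, substitute your own index folder):
# check_collapsig /path/to/index/folder
```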