Funnelback 15.10.0 release notes

15.10.0 - New features

  • Redesigned accessibility auditor reporting interface, and new WCAG technique implementations.

  • Redesigned trend alerts reporting interface.

15.10.0 - Selected improvements and bug fixes

  • Curator rule editing now supports custom structured parameters for most actions.

  • Introduced a minimum length (15 characters) when users change their passwords.

  • The default hashing mechanism for passwords is now BCRYPT. Existing user passwords will continue to use the legacy hashing mechanism until the password is updated via the Administration interface or accounts API.

  • Session information is now correctly stored for very long queries.

  • Added support for configuring acceptable SSL cipher suites and protocols.

  • Web crawler will no longer report 'invalid' protocols like data: and tel:.

  • Introduced standardised filters for converting CSV or JSON to XML.

  • Fixed overwriting of changed cookie values during form interaction authentication.

  • Improved URL drilldown selection within content auditor.

  • Introduced profile management interface.

  • Introduced a default timeout (50 seconds) for IncludeURL tags, avoiding blocking threads forever in some cases.

  • Fixed cases where collection update progress would not be displayed after starting an update.

  • Fixed date sorting to work correctly for past and future documents (rather than being based on proximity to current date).

  • The query processor's (padre-sw) -SF option now accepts regular expressions.

  • The version of Jsoup included has been upgraded to 1.10.2.

  • Log files are no longer created when collections which do not exist for the current Funnelback installation are requested.

15.10.0 - Upgrade Issues

  • Due to the improved password hashing mechanism, HTTP Basic Authentication for APIs is considerably slower than in previous versions. If performance is a concern, API users are advised to switch to token-based authentication.

  • The sessions database schema has changed to allow for longer queries. The inbuilt session databases will be upgraded automatically; however, if you are using an external database for sessions, you will need to run update-session-db.groovy with the appropriate driver and URL.

  • Due to the new IncludeURL default timeout, IncludeURL calls expected to take longer than 50 seconds should manually set a higher timeout.

15.10.0 - Notice

  • Please be aware that 15.10 is the last version of Funnelback which will support running on Windows Server 2008.

Patches

Type Release version Description

3 Bug fixes

Upgrades log4j2 to version 2.16 to fix the security vulnerability where log4j2 JNDI features do not protect against attacker-controlled LDAP and other JNDI related endpoints.

3 Bug fixes

Removes the screens for file-manager rule editing which could create security issues

3 Bug fixes

Fixes an issue where support packages could contain unintended files

3 Bug fixes

Fixes an issue where the running Funnelback jetty web server could retain permissions via supplemental groups after startup

3 Bug fixes

Limits an administration CGI script to redirect only within the Funnelback administration interface as intended

3 Bug fixes

Removes the unused administration debug.cgi script which reflected input parameters without proper escaping

3 Bug fixes

Prevent XSS AngularJS sandbox bypassing injection in Freemarker templates escaped using output formats by inserting zero-width whitespace between consecutive open-curly-brackets.

3 Bug fixes

Prevent XSS AngularJS sandbox bypassing injection in Freemarker templates by inserting zero-width whitespace between consecutive open-curly-brackets.

3 Bug fixes

Prevent XSS AngularJS sandbox bypassing injection in Freemarker templates by inserting zero-width whitespace between consecutive open-curly-brackets.
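The curly-bracket handling described in the patches above can be illustrated with a minimal sketch. It is not Funnelback's implementation: the function name is hypothetical, and U+200B (zero-width space) is an assumption, since the notes only say "zero-width whitespace".

```python
import re

# Assumption: U+200B is used; the patch notes do not name the exact character.
ZERO_WIDTH_SPACE = "\u200b"


def break_double_curlies(text):
    """Insert a zero-width space after each '{' that is directly followed by '{'.

    The lookahead leaves the second bracket in place, so runs of three or
    more brackets are separated pairwise.
    """
    return re.sub(r"\{(?=\{)", "{" + ZERO_WIDTH_SPACE, text)
```

Because the inserted character is invisible, the rendered page looks the same, but an AngularJS expression delimiter such as a double open bracket no longer appears as two adjacent characters in the output.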

3 Bug fixes

Improves the Accessibility Auditor historical data storage. The data is stored in less space and is significantly faster to store and retrieve. The Accessibility Auditor historical data APIs now also need less memory, reducing the chance of 'OutOfMemoryError' exceptions being thrown. The Accessibility Auditor historical data will be automatically moved to the new storage format when Jetty is restarted (one collection at a time) or on the first Accessibility Auditor historical data API request.

3 Bug fixes

The default timeout for 'push.scheduler.delay-between-meta-dependencies-runs' has been increased to '1200' (20 minutes). This has been increased to reduce the frequency at which Accessibility Auditor historical data is recorded. This option will need to be overridden if meta collections containing push collections need a smaller delay in updating the spelling index and auto completion.

3 Bug fixes

Prevents creation of objects within Freemarker template files to ensure that template editors can not cause external code to be executed.

3 Bug fixes

Fixes security issues where:

  • The default form-not-found template reflected the given form id without proper escaping.

  • The default configuration of URL previewing could be used to expose local log file content.

Please ensure any custom form-not-found.ftl templates in collections are updated to perform correct escaping if they were derived from the previously vulnerable form-not-found.default.ftl.

Please ensure that any customised value for the global default_url_renderer.permitted_url_pattern setting in global.cfg prevents access to file:// URLs.
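A customised pattern can be checked against sample file:// URLs before it is deployed. The sketch below is only an illustration: the helper names and sample paths are hypothetical, and it assumes the setting is a regular expression matched from the start of the URL, which these notes do not confirm.

```python
import re


def is_permitted(url, permitted_pattern):
    """Return True if the URL matches the pattern from its first character.

    Assumption: start-anchored regex matching; the product's exact
    matching semantics are not stated in these notes.
    """
    return re.match(permitted_pattern, url) is not None


def exposes_local_files(permitted_pattern):
    """Return True if any sample file:// URL is permitted by the pattern."""
    samples = [
        "file:///etc/passwd",  # hypothetical sample paths
        "file://localhost/opt/funnelback/log/example.log",
    ]
    return any(is_permitted(url, permitted_pattern) for url in samples)
```

Under these assumptions a catch-all pattern such as `.*` would permit file:// URLs, while a pattern restricted to `https?://` would not.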

3 Bug fixes

The parent_group Facebook events field has been removed since it requires escalated permissions. On some Facebook collections, this field caused crawling of events to fail.

3 Bug fixes

Upgrades the version of our internal libraries to account for recent breaking changes in the Facebook Graph API. This will fix issues that caused Facebook collections to fail to update on certain user accounts, when crawling more than 200 posts in an hour, and when crawling events posted by a page. To update existing Facebook collections that may be failing, the changes noted in deployment instructions below will need to be made on each groovy script. best_page & parent_page Facebook page fields have been removed since they require escalated permissions.

3 Bug fixes

Fixes a bug where the configured ratio of full to incremental updates was not being applied, so only full updates were triggered.

3 Bug fixes

Fixes a bug for scheduled updates where the 'schedule.incremental_crawl_ratio' parameter was not being respected.

3 Bug fixes

Upgrades the twitter library to add support for the longer, 280 character tweets. For this to be used, the ConfigurationBuilder object needs to be updated to call "setTweetModeExtended(true)". With the default twitter groovy gather script, this can be done by adding "cb.setTweetModeExtended(true);" immediately after the creation of the new ConfigurationBuilder.

3 Bug fixes

Fixes a bug in Accessibility Auditor which caused the document audit view to fail when a document contained escaped or Unicode characters in its class names.

3 Bug fixes

Fixes a bug in Accessibility Auditor which meant URLs containing '+' characters could not be viewed in the audit tool.

3 Bug fixes

Fixes a bug where the Admin API was passing the comment to the publish hook as multiple arguments where it should have been passing the comment as a single argument.
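The difference between the two calling styles can be shown with a small sketch. It does not reproduce the Admin API: the stand-in hook below is hypothetical and simply reports how many arguments it received, and whitespace splitting is assumed as the cause of the multiple arguments.

```python
import subprocess
import sys

# Stand-in for a publish hook (hypothetical): prints its argument count.
COUNT_ARGS = [sys.executable, "-c", "import sys; print(len(sys.argv) - 1)"]


def call_hook_split(comment):
    """Old behaviour (assumed): each word of the comment becomes an argument."""
    result = subprocess.run(COUNT_ARGS + comment.split(),
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()


def call_hook_single(comment):
    """Fixed behaviour: the whole comment is passed as one argument."""
    result = subprocess.run(COUNT_ARGS + [comment],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()
```

With the comment "Fix typo in header", the first function reports four arguments and the second reports one, so a hook script that reads only its first argument would receive the full comment only in the second case.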

3 Bug fixes

Adds time-based reloading of type-caching objects (XStream and Jackson serialisers) to avoid leaking metaspace memory when groovy classes are serialised and reloaded over time.

By default, reloading occurs every 10 minutes, and can be configured in modernui.properties.

3 Bug fixes

Provides an internal configuration option for controlling groovy class reloading.

3 Bug fixes

Fixed an issue where the user editing interface for a user with no permitted collections would be presented with all collections selected, rather than none.

3 Bug fixes

Allow groovy servlet filters to abort processing in preFilterResponse by returning null.

3 Bug fixes

Changes the click tracking endpoint to no longer depend on the referrer. This does result in the click logs no longer containing the referrer URL.

3 Bug fixes

Fixes an issue where the audit of a PDF could not be rendered in Accessibility Auditor.

3 Bug fixes

To minimise the number of false positives reported by XSS testing tools, JSON endpoints have restricted the JSONP callback to only contain A-Za-z0-9 as well as $._-[]".
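As a rough illustration of the restriction, a callback name can be validated against the listed character set. This is a sketch rather than the product's code: the function name is hypothetical, and the trailing double quote in the note above is treated as punctuation, not as an allowed character.

```python
import re

# Allowed characters as listed in the note: letters, digits and $ . _ - [ ]
_CALLBACK_RE = re.compile(r"[A-Za-z0-9$._\-\[\]]+")


def is_allowed_callback(name):
    """Return True if every character of the JSONP callback name is allowed."""
    return _CALLBACK_RE.fullmatch(name) is not None
```

Typical library-generated names such as `angular.callbacks._0` or `jQuery123_456` pass this check, while names containing parentheses or angle brackets are rejected.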

3 Bug fixes

Fixes an issue where bin/reports-send-email.pl was not reading profile email configuration options from reporting-email.cfg.

3 Bug fixes

Fixes an issue where auto completion with partials did not respect the profile scope.

3 Bug fixes

Fixes an issue where bin/reports-send-email.pl failed to run.

3 Bug fixes

Fixes an issue where updating or editing meta collections could fail because historical data for Accessibility Auditor could not be recorded.

3 Bug fixes

Restores the behavior of update.pl such that the gatherer (e.g. the web crawler) will use the same collection.cfg file that is passed to update.pl.

3 Bug fixes

Improves the creation of snapshots on empty push collections.

3 Bug fixes

Disable production of the unused (and potentially large) graph.log file.

3 Bug fixes

Fixes an issue where instant updates would not work when external metadata was used.

3 Bug fixes

Restores a translation that was removed in a previous patch; its absence caused the current license usage to display incorrectly.

3 Bug fixes

Improvements to Push collections:

  • Fixes some bugs which may lead to data loss.

  • Collections that won’t start can now have all their data deleted to be able to return to a working state.

  • Switching modes from SLAVE to DEFAULT or vice versa is now more reliable.

  • Collections with slaves can now be deleted.

3 Bug fixes

Fixes an issue with Facebook collections preventing some posts from being gathered if a post or comment caused an error. Includes the full fix which was missing from patch 15.10.0.14.

3 Bug fixes

Updates the 'search_user' collection.cfg option such that, if '@' is missing from the name, '@' followed by the hostname of the machine will be appended to form the from address used when sending collection update related emails.
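The rule for building the from address can be sketched as follows. The function name and the optional hostname parameter are hypothetical and exist only to make the example easy to try; the actual lookup of the machine's hostname may differ.

```python
import socket
from typing import Optional


def from_address(search_user: str, hostname: Optional[str] = None) -> str:
    """Return the from address derived from a 'search_user' value.

    If the value already contains '@' it is used unchanged; otherwise
    '@' plus the machine's hostname is appended.
    """
    if "@" in search_user:
        return search_user
    host = hostname if hostname is not None else socket.gethostname()
    return f"{search_user}@{host}"
```

For example, `from_address("search", "fb01.example.com")` gives `search@fb01.example.com`, while a value that is already a full address is returned as is.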

3 Bug fixes

Fixes a bug where indexed terms within a metadata field could be revealed in DLS-enabled collections if both query processor options translucent_DLS and countIndexedTerms were enabled. Note that these options can only be enabled server side.

3 Bug fixes

Fixes an issue with Facebook collections preventing some posts from being gathered if a post or comment caused an error.

3 Bug fixes

Add support to crawl Flickr photosets.

3 Bug fixes

Improves the synonyms API to handle files with windows line endings.

3 Bug fixes

Fixes a bug in the crawler which caused the crawl to only terminate after the crawl timeout, rather than when the crawler had no more URLs to crawl.

3 Bug fixes

Updates the location of the Push sync restart API call to be consistent with other state changing calls. The existing API call is kept for compatibility.

3 Bug fixes

Adds new Push sync health API calls that never return null for the value of the boolean in the response. The new calls are under /v2/ of the API.

3 Bug fixes

Fixes a bug where indexing may fail on large indexes.

3 Bug fixes

Makes the "URL to display" field optional in the curator and best bets UI editors.

3 Bug fixes

Fixes a bug in the Marketing Dashboard where labels in the accessibility auditor chart are overlapping.

3 Bug fixes

Introduces the ability to customise the jetty access logging configuration with logback.

The default behaviour of logging is unchanged, however with this patch it is possible to configure access log compression, filtering and size-based retention policies if desired.

See Funnelback version 15.12 "Configuring embedded web server" documentation for details and example of how to customise access logging configurations.

3 Bug fixes

Fixes a bug where Content Auditor would be incompatible with collections that had facets configured.

3 Bug fixes

Updates the version of restfb so that custom Facebook gatherers may use a later version of the graph API.

3 Bug fixes

Adds a customData map in the SearchQuestion for convenience

3 Bug fixes

Fixes a bug in the query processor when promote URLs was used with URLs that contained a double dash.

3 Bug fixes

Fixes a bug in the query processor where sorting on file size did not work.

3 Bug fixes

Fixes the modern UI search endpoints so that they return config only for the selected profile rather than for all profiles.

3 Bug fixes

Reduces memory requirements for the Jetty web server JVM when Funnelback has many collections.

3 Bug fixes

Fixes a bug where the query processor would fail when sorting on a metadata class that was not defined on all components of a meta collection.

3 Bug fixes

Fixes a bug where exporting the top queries to csv on the marketing dashboard was not working in Internet Explorer 11.

3 Bug fixes

Fixes a bug where exporting the trend alerts listing to csv on the marketing dashboard was not working in Internet Explorer 11.

3 Bug fixes

Fixes a bug where the update step RecordAccessibilityAuditorHistory would not wait long enough for the API request that reports the data to complete, preventing the update from finishing.

3 Bug fixes

Fixes a bug where very long duplicate titles in the Content Auditor would incorrectly wrap.

3 Bug fixes

Fixes a bug where the auto completion for the Funnelback documentation in the admin interface would not work correctly.

3 Bug fixes

Fixes a bug where $GROOVY_COMMAND in the post_update_command collection.cfg option would not be expanded.

3 Bug fixes

Fixes a bug where Push Replication would re-attempt a connection to master without sleeping if the response from master was not a 200.
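The corrected behaviour amounts to pausing between attempts whenever the master answers with something other than 200. The sketch below shows that pattern in general terms; the function, its parameters and the delay value are hypothetical and are not taken from Push Replication itself.

```python
import time


def poll_master(fetch, attempts, delay_seconds=5.0, sleep=time.sleep):
    """Call fetch() until it returns HTTP status 200 or attempts run out.

    fetch is any callable returning a status code. After each non-200
    response the loop sleeps before retrying, so a failing master is not
    hit with back-to-back requests. Returns True on success.
    """
    for attempt in range(attempts):
        if fetch() == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)  # back off before the next attempt
    return False
```

Passing the sleep function as a parameter keeps the sketch easy to exercise without real delays.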

3 Bug fixes

Improves Push collections so that snapshots are marked incomplete during creation, which helps prevent incomplete snapshots from being used.

3 Bug fixes

Improves Push Replication performance by enabling compression on more files.

3 Bug fixes

Fixes an issue with the Push API client, Push Replication and the Push2 store, where the API client was sending duplicate requests.

3 Bug fixes

Fixes an issue with the indexer option -facet_item_sepchars where metadata for facets was not correctly separated unless the option included the | character.
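The intended effect of a separator-character option can be sketched as splitting a metadata value on any of the configured characters. This is only an illustration of the idea, with a hypothetical function name; the indexer's real parsing rules are not described in these notes.

```python
import re


def split_facet_values(raw, sepchars):
    """Split a metadata value on any character listed in sepchars.

    Empty pieces and surrounding whitespace are dropped. With no
    separator characters the value is returned as a single item.
    """
    if not sepchars:
        return [raw]
    pattern = "[" + re.escape(sepchars) + "]"
    return [part.strip() for part in re.split(pattern, raw) if part.strip()]
```

In this sketch a value such as "red;green,blue" with the separators ";," yields three facet items, and the "|" character only acts as a separator when it is listed.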

3 Bug fixes

Improves stability during query processing where very large packets produced by the query processor will be dropped to protect the Jetty web server from running out of memory. By default query processor packets over 60MB will be dropped. This option is configurable in collection.cfg using the ui.modern.padre_response_size_limit_bytes option.

3 Bug fixes

Fixes an issue with per collection logging under the modern UI which handles query processing, where the per collection logs would go to only one collection.