Funnelback 8.0.0 release notes
Released: 30th May 2008
Upgrade issues
-
Database collections have changed in layout, and now require an additional 'primary key' parameter. Please see the version 8 database collection upgrade guide for details.
-
Perl 5.8.8 is strongly recommended for all platforms:
-
Some features do not work out of the box under Perl 5.10 and Solaris.
-
Perl 5.8.5 and earlier have a bug in HTML::Entities, which may lead to incorrect encoding of apostrophes in the Funnelback system.
-
Queries are now logged in their expanded form, not their pre-expansion form.
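The Perl version guidance above can be checked before an upgrade. The following is a minimal sketch only, not part of Funnelback: the function names and the advice strings are hypothetical, and it assumes the input is version text such as the output of `perl -v`.

```python
import re


def parse_perl_version(text):
    """Extract (major, minor, patch) from text such as 'v5.8.8'."""
    match = re.search(r"v?(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no Perl version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def perl_upgrade_advice(text):
    """Map a Perl version onto the notes in this section (hypothetical helper)."""
    version = parse_perl_version(text)
    if version == (5, 8, 8):
        return "recommended"
    if version <= (5, 8, 5):
        # Per the notes: HTML::Entities may encode apostrophes incorrectly.
        return "HTML::Entities apostrophe bug"
    if version[:2] == (5, 10):
        # Per the notes: some features do not work out of the box.
        return "some features may not work out of the box"
    return "not the recommended version"
```

For example, `perl_upgrade_advice("This is perl, v5.8.8 built for i386-linux")` classifies the installed interpreter as the recommended version.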
New features
-
Document gathering from Microsoft SharePoint and Lotus Domino
-
Faceted navigation
-
User tagging of results
-
User feedback on results
-
Basic Chinese / Japanese / Korean / Thai (CJKT) support
-
Feeds API
-
Crawling of content behind web forms
-
Automatically generated "support package"
Improvements
-
Allow pre/post commands to use collection.cfg parameters
-
Broken link detection script for featured pages
-
Capability for fetching resources at query time for multiple collection types (databases, filecopy, TRIM)
-
Context-sensitive help links open in new pages, not the current page
-
Display real-time collection update status on the administration dashboard home page
-
Import and export of featured pages and query expansions
-
Instant updates support filecopy collections
-
Instant update support for more collection types
-
Java is bundled with Funnelback
-
Logs for a collection go in a collection-specific log directory, not the "system logs" directory
-
Log text on the "view file" page is more readable
-
Numerous improvements to form parsing (fixes for nested tags, res* tags that contain curly braces, etc.)
-
Option to remove all data during uninstall
-
Reporting uses much less memory
-
Reports are viewable while they are generating, and a reporting error will no longer leave the reports unusable
-
Significantly improved database search, with "workflow" interface, incremental gathering and compressed storage
-
Support for extracting links from JavaScript-generated web pages
-
Updates for all collection types may now be halted (the halt may not occur until the end of the current update phase for some collection types)
-
When upgrading an installation, the license key is preserved
Selected bug fixes
-
Add support for filtering .dot (MS Word Template) files
-
Administration dashboard should include crawler.reject_files in its processing of the "file types to crawl" checkboxes
-
Allow collection parameter editing security model (parameter whitelists) to be applied on a per collection basis
-
Allow / ignore whitespace in various collection parameters
-
Ampersands in query* parameters are not parsed correctly
-
cache.cgi displays "XML parsing error" for pages in funnelback_documentation
-
cache.cgi does not perform security checks
-
cache.cgi links do not get properly URL encoded parameters
-
cache.cgi should strip meta refresh from its displayed contents to avoid sending users to incorrect locations
-
Cached XLS files don’t display correctly in IE6
-
Can enter empty featured page and query expansion
-
Can’t map the same XPath to multiple metadata classes
-
Change crawler to use MIME type rather than URL suffix when storing binary files
-
Check Windows password is valid in installer
-
.ckpt index files should be removed by default
-
click.cgi links do not properly URL encode arguments
-
Clicking on filecopy results displays text in the error log
-
Click tracking not working by default
-
collection.cfg settings not being updated to point at new locations on an upgrade
-
Collection parameter whitelist not greying out fields
-
Collection summary rows should show successful update (green tick) after a successful index upgrade
-
Command line administration / Unix scheduling / Apache integration will not work if the Perl binary is not at /usr/bin/perl
-
Command line updates fail if not started from the bin directory
-
crawler_binaries parameter not being updated properly on an upgrade
-
Creating local collections with an unfindable source directory displays a confusing error message
-
_disabled__see_start_urls_file parameter being displayed in update log
-
Documentation CSS is indexed in the funnelback_documentation collection
-
Enable data reports for web collections on an upgrade
-
Filters not picking up title metadata from some Word docs
-
Fluster crashes when a query contains "(" or ")"
-
Fluster links have redundant CGI parameters
-
funnelback_documentation collection shouldn’t be deletable from admin interface
-
Funnelback installer should complain if empty input is given for some fields
-
htpasswd_modify is not fixed in an upgrade
-
Improved handling of URL case sensitivity in the crawler
-
Incorrect handling of numeric entities in crawled URLs
-
Investigate fallback for external filters
-
Investigate how to make query expansion work with Fluster
-
java_libraries contain duplicated path after upgrade
-
Local collection URL prefixes don’t work as expected
-
Long logs are difficult to scroll
-
new-collection.pl does not create start.urls file
-
Old Jetty HTTPS server not shut down during upgrade
-
PADRE displays result counts in minresults mode
-
PADRE failing to parse XML with empty elements
-
PADRE date sorts don’t work for documents in the 16th / 17th century
-
PADRE produces invalid XML for some documents that contain ampersands in their title
-
PADRE segfault under rare combinations of gscopes and metadata searches
-
Parsing of meta parameters is broken
-
PDF not extracted correctly but output file with binary content was created
-
PDF results include shell error output
-
Permission errors under IIS
-
Remove trailing space in spelling suggestions
-
Reporting date routines do not handle leap years
-
Report links do not work under IIS
-
rss.cgi crashes when xsltproc is not found
-
RTF files filtered in TRIM collections do not have meaningful titles
-
Schedule updates page on Windows incorrectly handles invalid input
-
Security violation displayed when empty filename is submitted for upload
-
Start URL parameter in instant update add doesn’t check for a protocol
-
The "results can’t be displayed because this collection has never updated" page looks awful
-
Various .cgi files do not have execute permission
-
Very rare hang caused by schtasks when upgrading from Funnelback 6.0.x to Funnelback 7.0.x
-
Viewing data reports forces the user displayed on the header to "admin"
-
Visual bugs when viewing administration under IIS
-
When editing a collection, changes are lost when navigating between tabs
-
Word expansion does not work with query_* parameters
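Several of the fixes above concern URL encoding of link parameters (cache.cgi and click.cgi links, ampersands in query parameters). The sketch below illustrates the general problem only. It is not Funnelback's implementation, and the `build_link` helper and the `collection` and `url` parameter names are assumptions made for the example.

```python
from urllib.parse import parse_qs, urlencode


def build_link(script, params):
    """Build a CGI link with every parameter value percent-encoded."""
    return f"{script}?{urlencode(params)}"


# A target URL that itself contains '?' and '&'. Without encoding, '&b=2'
# would be read as a separate parameter of the outer link.
target = "http://example.com/page?a=1&b=2"
link = build_link("cache.cgi", {"collection": "example", "url": target})

# Decoding the query string recovers the original value unchanged.
decoded = parse_qs(link.split("?", 1)[1])
```

Here `decoded["url"]` holds the original target URL as a single value, which is the behaviour a correctly encoded link should give.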