Funnelback 8.0.0 release notes
Released: 30th May 2008
Upgrade issues
- 
Database collections have changed in layout, and now require an additional 'primary key' parameter. Please see the version 8 database collection upgrade guide for details. 
- 
Perl 5.8.8 is strongly recommended for all platforms:
  - 
  Some features do not work out of the box under Perl 5.10 and Solaris. 
  - 
  Perl 5.8.5 and earlier have a bug in HTML::Entities, which may lead to incorrect encoding of apostrophes in the Funnelback system.
- 
Queries are now logged in their expanded form, not their pre-expansion form. 
New features
- 
Document gathering from Microsoft SharePoint and Lotus Domino 
- 
Faceted navigation 
- 
User tagging of results 
- 
User feedback on results 
- 
Basic Chinese / Japanese / Korean / Thai (CJKT) support 
- 
Feeds API 
- 
Crawling of content behind web forms 
- 
Automatically generated "support package" 
Improvements
- 
Allow pre/post commands to use collection.cfg parameters 
- 
Broken link detection script for featured pages 
- 
Capability for fetching resources at query time for multiple collection types (databases, filecopy, TRIM) 
- 
Context sensitive help links open in new pages, not the current page 
- 
Display real-time collection update status on the administration dashboard home page 
- 
Import and export of featured pages and query expansions 
- 
Instant updates support filecopy collections 
- 
Instant update support for more collection types 
- 
Java is bundled with Funnelback 
- 
Logs for a collection go in a collection specific log dir, not the "system logs" dir 
- 
Log text on the "view file" page is more readable 
- 
Numerous improvements to form parsing (fixes for nested tags, res* tags that contain curly braces, etc) 
- 
Option to remove all data during uninstall 
- 
Reporting uses much less memory 
- 
Reports are viewable while they are generating, and a reporting error will no longer leave the reports unusable 
- 
Significantly improved database search, with "workflow" interface, incremental gathering and compressed storage 
- 
Support for extracting links from JavaScript-generated web pages 
- 
Updates for all collection types may now be halted (the halt may not occur until the end of the current update phase for some collection types) 
- 
When upgrading an installation, the license key is preserved 
Selected bug fixes
- 
Add support for filtering .dot (MS Word Template) files 
- 
Administration dashboard should include crawler.reject_files in its processing of the "file types to crawl" checkboxes 
- 
Allow collection parameter editing security model (parameter whitelists) to be applied on a per collection basis 
- 
Allow / ignore whitespace in various collection parameters 
- 
Ampersands in query* parameters are not parsed correctly 
- 
cache.cgi displays "XML parsing error" for pages in funnelback_documentation 
- 
cache.cgi does not perform security checks 
- 
cache.cgi links do not get properly URL encoded parameters 
- 
cache.cgi should strip meta refresh from its displayed contents to avoid sending users to incorrect locations 
- 
Cached XLS files don’t display correctly in IE6 
- 
Can enter empty featured page and query expansion 
- 
Can’t map the same xpath to multiple metadata classes 
- 
Change crawler to use MIME type rather than URL suffix when storing binary files 
- 
Check Windows password is valid in installer 
- 
.ckpt index files should be removed by default 
- 
click.cgi links do not properly URL encode arguments 
- 
Clicking on filecopy results displays text in the error log 
- 
Click tracking not working by default 
- 
collection.cfg settings not being updated to point at new locations on an upgrade 
- 
Collection parameter whitelist not greying out fields 
- 
Collection summary rows should show successful update (green tick) after a successful index upgrade 
- 
Command line administration / Unix scheduling / Apache integration will not work if the Perl binary is not at /usr/bin/perl 
- 
Command line updates fail if not started from the bin directory 
- 
crawler_binaries parameter not being updated properly on an upgrade 
- 
Creating local collections with an unfindable source directory displays a confusing error message 
- 
_disabled__see_start_urls_file parameter being displayed in update log 
- 
Documentation CSS is indexed in the funnelback_documentation collection 
- 
Enable data reports for web collections on an upgrade 
- 
Filters not picking up title metadata from some Word docs 
- 
Fluster crashes when a query contains "(" or ")" 
- 
Fluster links have redundant CGI parameters 
- 
funnelback_documentation collection shouldn’t be deletable from admin interface 
- 
Funnelback installer should complain if empty input is given for some fields 
- 
htpasswd_modify is not fixed in an upgrade 
- 
Improved handling of URL case sensitivity in the crawler 
- 
Incorrect handling of numeric entities in crawled URLs 
- 
Investigate fallback for external filters 
- 
Investigate how to make query expansion work with Fluster 
- 
java_libraries contain duplicated path after upgrade 
- 
Local collection url prefixes don’t work as expected 
- 
Long logs are difficult to scroll 
- 
new-collection.pl does not create start.urls file 
- 
Old Jetty HTTPS server not shut down during upgrade 
- 
Padre displays result counts in minresults mode 
- 
PADRE failing to parse XML with empty elements 
- 
Padre date sorts don’t work for documents from the 16th and 17th centuries 
- 
Padre produces invalid XML for some documents that contain ampersands in their title 
- 
Padre segfault under rare combinations of gscopes and metadata searches 
- 
Parsing of meta parameters is broken 
- 
PDF not extracted correctly but output file with binary content was created 
- 
PDF results include shell error output 
- 
Permission errors under IIS 
- 
Remove trailing space in spelling suggestions 
- 
Reporting date routines do not handle leap years 
- 
Report links do not work under IIS 
- 
rss.cgi crashes when xsltproc is not found 
- 
RTF files filtered in TRIM collections do not have meaningful titles 
- 
Schedule updates page on Windows incorrectly handles invalid input 
- 
Security violation displayed when empty filename is submitted for upload 
- 
Start URL parameter in instant update add doesn’t check for a protocol 
- 
The "results can’t be displayed because this collection has never updated" page looks awful 
- 
Various .cgi files do not have execute permission 
- 
Very rare hang caused by schtasks when upgrading from Funnelback 6.0.x to Funnelback 7.0.x 
- 
Viewing data reports forces the user displayed on the header to "admin" 
- 
Visual bugs when viewing administration under IIS 
- 
When editing a collection, changes are lost when navigating between tabs 
- 
Word expansion does not work with query_* parameters
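
Several of the fixes above (cache.cgi and click.cgi links, ampersands in query parameters) concern URL encoding of CGI arguments. The following is a minimal, general illustration of that class of problem, not Funnelback's own code: the host name, script path and parameter names are hypothetical, and Python's standard library stands in for whatever the product uses internally. It shows that a value containing "&" or parentheses survives a round trip only when it is percent-encoded before being placed in a link.

```python
from urllib.parse import urlencode, parse_qs, urlsplit


def build_link(base_url, params):
    """Return base_url with params appended as a percent-encoded query string."""
    return base_url + "?" + urlencode(params)


# Hypothetical script URL and parameter names, for illustration only.
link = build_link(
    "http://search.example.com/search.cgi",
    {"collection": "example", "query": "fish & chips (fresh)"},
)
print(link)

# Decoding the link recovers the original value, ampersand included.
decoded = parse_qs(urlsplit(link).query)
print(decoded["query"][0])
```

Without the encoding step, the ampersand inside the query value would be read as a parameter separator and the value would be cut short at "fish ".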