This feature is not available to users of the Squiz Experience Cloud version of Funnelback.
make_report.pl processes a collection’s data files, producing data reports on their contents.
$ make_report.pl <--collection "collection config"> [--log] [--plain] [--datadir "data directory"] [--output "output directory"] [--hosts "host list file"]
The collection configuration file must be specified, and must be a filesystem path to an existing, readable and valid collection configuration file.
--log may also be specified, and indicates that the script should write to a log file.
--plain may also be specified, and indicates that the script should output plain HTML instead of Funnelback look and feel HTML.
--datadir "data directory" may also be specified, and gives the directory to provide reports for.
--output "output directory" may also be specified, and gives the directory to write output to.
--hosts "host list file" may also be specified, and gives the location of a file on disk that arranges sites / hosts into groups and subgroups.
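As a rough illustration of how these options combine, the following Python sketch assembles an argument list for make_report.pl. The helper name build_make_report_args and the example paths are hypothetical; only the flag names come from the usage line above. The sketch only builds the list and does not run the script.

```python
from typing import List, Optional


def build_make_report_args(
    collection_config: str,
    log: bool = False,
    plain: bool = False,
    datadir: Optional[str] = None,
    output: Optional[str] = None,
    hosts: Optional[str] = None,
) -> List[str]:
    """Assemble an argv list for make_report.pl (hypothetical helper)."""
    # --collection is the only required option.
    args = ["make_report.pl", "--collection", collection_config]
    if log:
        args.append("--log")
    if plain:
        args.append("--plain")
    # Options that take a value are added only when a value is given.
    for flag, value in (("--datadir", datadir), ("--output", output), ("--hosts", hosts)):
        if value is not None:
            args.extend([flag, value])
    return args


# Example paths are assumptions for illustration only.
example_args = build_make_report_args(
    "/opt/funnelback/conf/example/collection.cfg",
    plain=True,
    output="/tmp/example_report",
)
print(" ".join(example_args))
```

On a machine where Funnelback is installed, a list like this could be handed to a process launcher such as subprocess.run; that step is left out here because it depends on the local installation.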
make_report.pl runs over a data directory, recording statistics on the directory’s contents, and outputs reports to HTML files.
The directory that make_report.pl runs over is the collection configuration’s data_root ($SEARCH_HOME/data/$COLLECTION_NAME/offline/data), or the directory specified by the "--datadir" option. The data_root directory should already contain data gathered by an update.
make_report.pl will place output in $SEARCH_HOME/admin/data_report/<collection> by default, or in the directory specified by "--output".
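The default locations described above can be sketched as a small Python function. The name resolve_report_dirs is hypothetical, and the sketch assumes data_root has the value shown above rather than a customised setting in the collection configuration.

```python
import posixpath
from typing import Optional, Tuple


def resolve_report_dirs(
    search_home: str,
    collection: str,
    datadir: Optional[str] = None,
    output: Optional[str] = None,
) -> Tuple[str, str]:
    """Return (data directory, output directory) using the documented defaults."""
    # Default data directory: $SEARCH_HOME/data/<collection>/offline/data,
    # unless a directory was given (as with --datadir).
    data_dir = datadir or posixpath.join(search_home, "data", collection, "offline", "data")
    # Default output directory: $SEARCH_HOME/admin/data_report/<collection>,
    # unless a directory was given (as with --output).
    out_dir = output or posixpath.join(search_home, "admin", "data_report", collection)
    return data_dir, out_dir


# "/opt/funnelback" and "example" are illustrative values only.
print(resolve_report_dirs("/opt/funnelback", "example"))
```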
If "--log" is specified, the script will write a log called crawl_data_report.log to the log directory beside the specified data directory: e.g., if the data directory is /opt/funnelback/data/<collection>/offline/data/, the log file will be /opt/funnelback/data/<collection>/offline/data/crawl_data_report.log, and if the data directory is /tmp/my_own_gathered_stuff/, the log file will be
The reports produced will be plain HTML if "--plain" is specified. When this script is run by the update process, the files will include various substitutable strings, including @REPORT_BASE@. This is so that the admin UI can read these files from disk and substitute in links to the administration UI homepage, CSS files, images, etc.
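The substitution step can be pictured with a minimal Python sketch. This is not the admin UI's actual code: the function name, the report fragment and the base path are all hypothetical, and @REPORT_BASE@ is the only placeholder named in this document.

```python
from typing import Dict


def substitute_placeholders(html: str, values: Dict[str, str]) -> str:
    """Replace @NAME@ style placeholders with the supplied values."""
    for name, value in values.items():
        html = html.replace(f"@{name}@", value)
    return html


# Hypothetical report fragment and base path, for illustration only.
fragment = '<link rel="stylesheet" href="@REPORT_BASE@/report.css">'
rendered = substitute_placeholders(fragment, {"REPORT_BASE": "/admin/data_report/example"})
print(rendered)
```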
A "hosts list" may be specified. If none is specified, a default of $SEARCH_HOME/conf/<collection>/sites-by-portfolio.csv is assumed. The hosts list does not have to exist and has negligible impact on the reports. If present, the list should be of the format:
http://forums.funnelback.com,businesses,funnelback
http://www.funnelback.com,businesses,funnelback
http://www.csiro.au,governmental,australia
http://www.microsoft.com,businesses,microsoft
http://www.australia.gov.au,government,australia
http://www.health.gov.au,government,australia
Should the host list exist, various aggregate statistics will be produced. For example, statistics will not just be reported for individual sites, but for groups of sites.
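To make the idea of group-level statistics concrete, here is a minimal Python sketch that reads a hosts list in the format above and totals a per-site figure by group and by subgroup. It is not how make_report.pl is implemented. It assumes the three columns are site, group and subgroup in that order, and the document counts are invented for illustration.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Tuple

HOSTS_LIST = """\
http://forums.funnelback.com,businesses,funnelback
http://www.funnelback.com,businesses,funnelback
http://www.csiro.au,governmental,australia
http://www.microsoft.com,businesses,microsoft
http://www.australia.gov.au,government,australia
http://www.health.gov.au,government,australia
"""

# Hypothetical per-site document counts; not real crawl data.
DOC_COUNTS = {
    "http://forums.funnelback.com": 120,
    "http://www.funnelback.com": 80,
    "http://www.microsoft.com": 300,
    "http://www.health.gov.au": 50,
}


def aggregate_by_group(
    hosts_csv: str, counts: Dict[str, int]
) -> Tuple[Dict[str, int], Dict[Tuple[str, str], int]]:
    """Total per-site counts by group and by (group, subgroup)."""
    group_totals: Dict[str, int] = defaultdict(int)
    subgroup_totals: Dict[Tuple[str, str], int] = defaultdict(int)
    for row in csv.reader(io.StringIO(hosts_csv)):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        # Assumed column order: site, group, subgroup.
        site, group, subgroup = (field.strip() for field in row)
        n = counts.get(site, 0)  # sites without a count contribute 0
        group_totals[group] += n
        subgroup_totals[(group, subgroup)] += n
    return dict(group_totals), dict(subgroup_totals)


groups, subgroups = aggregate_by_group(HOSTS_LIST, DOC_COUNTS)
print(groups)
print(subgroups)
```

Note that group names are matched exactly, so "governmental" and "government" in the sample list are totalled as two separate groups.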