File-copy collections



A filecopy collection is used for indexing documents from a file share or a local disk. It is built from a copy of the documents in a local or remote filesystem directory. If you only need to index text content (no binary formats such as .DOC or .PDF), you can use a local collection instead.

An update copies new or changed files from the source folder into the collection's offline data directory, and then proceeds as normal: binary documents are converted into text, the text content is indexed, and the offline view is swapped with the live view.
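The copy step can be pictured with a minimal Python sketch. It is an illustration only, not Funnelback's implementation: the function name `copy_new_or_changed` and the rule for deciding that a file has changed (different size, or a newer modification time) are assumptions made for this example.

```python
import shutil
from pathlib import Path


def copy_new_or_changed(source: Path, offline: Path) -> list[Path]:
    """Copy files that are missing from, or differ from, the offline directory.

    Returns the copied paths, relative to the source directory.
    """
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = offline / rel
        if dst.exists():
            s, d = src.stat(), dst.stat()
            # Assumed rule: a file is unchanged when the sizes match and the
            # source is not newer than the existing copy.
            if s.st_size == d.st_size and int(s.st_mtime) <= int(d.st_mtime):
                continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves the modification time
        copied.append(rel)
    return copied
```

Running the function twice over an unchanged source folder copies nothing the second time, which is the behaviour an incremental update relies on.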

A filecopy collection is defined by the following properties:

Supported Directories

Funnelback supports indexing several types of directory. These include:

Local Directories
These are located on the search server and are addressed as local paths.
Windows file shares
These are file shares that are served using the SMB or CIFS protocols, as is standard in most Windows environments. They can be addressed as UNC paths.

How the data source is specified will depend on where the data is located. For example, a file-copy collection might have:

Note that on Linux operating systems, the default firewall rules may need to be altered to allow for SMB / CIFS name resolution.

Red Hat Linux provides instructions for mounting NFS file shares, and also comes with SMB/CIFS support.

File shares mounted on a Windows machine can be indexed in a similar way, and will provide SMB/CIFS support. Please note that drive letter mappings are made on a per-user basis, so remote file shares must be specified as UNC paths (e.g. \\fileserver\directory). Also note that local collections cannot use UNC paths or URLs as their data root.
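To check a candidate data root before configuring a collection, a short Python sketch such as the following can tell a UNC path apart from a mapped drive letter. The helper name `classify_data_root` and its three return labels are hypothetical and chosen for this example; the sketch relies only on the standard library's `PureWindowsPath`, which parses Windows-style paths on any operating system.

```python
from pathlib import PureWindowsPath


def classify_data_root(path: str) -> str:
    """Return 'unc', 'drive-letter', or 'other' for a Windows-style path."""
    drive = PureWindowsPath(path).drive
    # For a UNC path, the drive component is the \\server\share prefix.
    if drive.startswith("\\\\"):
        return "unc"
    # A mapped or local drive has a two-character drive such as "Z:".
    if len(drive) == 2 and drive[1] == ":":
        return "drive-letter"
    return "other"


for candidate in (r"\\fileserver\directory", r"Z:\reports", "/opt/data"):
    print(candidate, "->", classify_data_root(candidate))
```

A path reported as `drive-letter` may resolve only for the user who created the mapping, which is why the UNC form is the safer choice for remote shares.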

Document Level Security

Document Level Security is supported on Windows to ensure that users can only access the files they are authorized to see.

Serving Fileshare Results

Fileshare results are served by the user interface layer, which contacts the fileshare to retrieve the requested file and downloads it to the search user's browser. As part of this operation it performs all required access checks to ensure that a user only sees documents they are authorized to see.

See also