Connecting enterprise repositories using Manifold CF
This feature is not available to users of the Squiz Experience Cloud version of Funnelback.
Funnelback supports connecting to a number of enterprise repository systems through an open source connector framework called ManifoldCF. This framework, along with the associated Funnelback ManifoldCF connector, allows Funnelback to be populated with content from supported repositories and to apply document level security for repositories where ManifoldCF supports fetching security information.
- Repository support
- Configuration guide
- Configuring additional ManifoldCF repositories
- Troubleshooting ManifoldCF
- See also
As of writing, the current version of ManifoldCF, 2.3, supports the following repositories: Alfresco, Any CMIS repository, DropBox, Google Drive, HDFS, Jira, LiveLink (OpenText), Documentum, SharePoint, Meridio and FileNet (IBM).
See https://manifoldcf.apache.org/release/release-2.3/en_US/included-connectors for specific details about supported versions of these repositories.
Funnelback integrates with the connector framework by providing a connector which allows content to be filtered, formatted and added to a Push data source, as well as Funnelback components allowing Funnelback to provide document level security.
The following diagram illustrates how Funnelback interacts with the repository and the authority systems via the connector framework.
The blue lines within the diagram illustrate the gathering of content from the repository system (by the connector framework), and the addition of the content to a Funnelback push data source. The red lines within the diagram illustrate the interactions which occur during a search request to ensure that only content which should be visible to the current search user is returned.
ManifoldCF should be downloaded from http://manifoldcf.apache.org/en_US/download and deployed according to its instructions at https://manifoldcf.apache.org/release/release-2.3/en_US/how-to-build-and-deploy#Running+ManifoldCF. Please note that the current version of the Funnelback connector has only been tested with ManifoldCF 2.3. Later 2.x versions are believed to continue to work correctly, but any future ManifoldCF 3.x releases are not likely to be compatible with this version of the connector.
Once ManifoldCF has been installed, please follow the README.txt instructions provided within the manifoldcf-funnelback-connector.zip archive included in $SEARCH_HOME/tools.
Please note that for performance reasons, Funnelback strongly recommends installing Manifold CF on a dedicated server, and using a standalone database rather than the embedded one included with the default ManifoldCF installation.
The first step in configuring Funnelback to interact with an enterprise repository is to create a target push data source within the Funnelback installation. After creating it, make a note of the push data source’s name.
Data sources intended for use with ManifoldCF require a number of additional data source configuration settings to configure authentication and document level security.
The ManifoldCF system to use as a user authority (usually something like http://manifoldcf-server.example.com:8345/mcf-authority-service)
The name of the domain in which users are assumed to belong.
The user mapper to use to access user information from the authority. Use the value ManifoldCF, together with ui.modern.authentication, on Windows. The value ManifoldCFDebug can be used on Linux, or on other systems where the remote user name should be passed into Funnelback through a URL parameter (i.e. &user=THE_USERNAME).
true on Windows (this causes Funnelback to authenticate search users against the Active Directory domain in which the Funnelback server resides).
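As a rough illustration, the choice of user mapper described above can be written as a small helper that emits key=value lines in the same style used elsewhere on this page. This is only a sketch: the helper name mapper_settings is hypothetical, it assumes that the "true on Windows" value applies to ui.modern.authentication, and it deliberately leaves out the authority URL and domain settings because their key names are not shown here.

```python
import platform


def mapper_settings(system=None):
    """Return key=value lines for the user mapper, chosen by operating system.

    Assumption: ui.modern.authentication is the setting that is set to true
    on Windows. The authority URL and domain settings are not included.
    """
    system = system or platform.system()
    if system == "Windows":
        settings = {
            "security.earlybinding.user-to-key-mapper": "ManifoldCF",
            "ui.modern.authentication": "true",
        }
    else:
        # ManifoldCFDebug trusts the &user=... URL parameter, so it is only
        # suitable for testing or for deployments behind a trusted wrapper.
        settings = {"security.earlybinding.user-to-key-mapper": "ManifoldCFDebug"}
    return [f"{key}={value}" for key, value in settings.items()]


if __name__ == "__main__":
    print("\n".join(mapper_settings()))
```

Passing the system name explicitly, for example mapper_settings("Windows"), makes it possible to preview the lines for a server other than the one the script runs on.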
New output connectors can be created in ManifoldCF from the 'List Output Connectors' link on the left. After selecting Funnelback as the output type a number of Funnelback specific settings are provided as detailed below.
The server URL must point at a Funnelback server’s push-api service, which runs by default on port 8443 alongside the administration interface. An example URL for the service is https://funnelback-server.example.com:8443/push-api
The username and password provided must belong to a valid administration user with permission to access the push data source.
The filter classes setting must be in the form specified by the filter.classes data source configuration option.
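Before entering the server URL into the output connector form, it can help to build and sanity-check it programmatically. The following is a minimal sketch; the function names are hypothetical, and the check only confirms that the URL has the shape of the example above (HTTPS scheme and a /push-api path), not that the service is reachable or that the credentials are valid.

```python
from urllib.parse import urlsplit


def push_api_url(host, port=8443):
    """Build the push-api base URL for a Funnelback server.

    8443 is the default port mentioned above; pass another port if the
    administration interface has been moved.
    """
    return f"https://{host}:{port}/push-api"


def looks_like_push_api_url(url):
    """Return True if the URL uses HTTPS and ends in the /push-api path."""
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.path.rstrip("/") == "/push-api"


if __name__ == "__main__":
    url = push_api_url("funnelback-server.example.com")
    print(url, looks_like_push_api_url(url))
```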
Funnelback will access the configured authority connector to obtain information about the permissions assigned to a given user when they make a search request.
Active Directory is the most commonly used authority, and documentation on configuring it is available in ManifoldCF’s "Defining Authority Connections" documentation.
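When checking that an authority connection returns permissions for a user, it can be useful to query the ManifoldCF authority service directly. The sketch below assumes the authority service exposes a UserACLs endpoint that takes a username parameter in user@domain form and returns one KIND:value entry per line. Both the endpoint shape and the response format should be confirmed against the ManifoldCF documentation for the installed version, and the helper names are hypothetical.

```python
from urllib.parse import urlencode


def user_acls_url(authority_url, username, domain):
    """Build a UserACLs lookup URL for a user in the given domain.

    Assumption: the service accepts ?username=user@domain.
    """
    query = urlencode({"username": f"{username}@{domain}"})
    return f"{authority_url.rstrip('/')}/UserACLs?{query}"


def parse_user_acls(body):
    """Split each non-empty response line into a (kind, value) pair.

    Assumption: lines look like KIND:value; only the first colon separates
    the kind from the rest of the line.
    """
    entries = []
    for line in body.splitlines():
        line = line.strip()
        if not line:
            continue
        kind, _, value = line.partition(":")
        entries.append((kind, value))
    return entries


if __name__ == "__main__":
    print(user_acls_url(
        "http://manifoldcf-server.example.com:8345/mcf-authority-service",
        "jsmith",
        "example.com",
    ))
```

The resulting URL can be fetched with any HTTP client from a machine that can reach the ManifoldCF server, and the response body passed to parse_user_acls to see which entries were returned for that user.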
ManifoldCF will gather content from external repository systems based on the settings provided in a repository connector. ManifoldCF’s "Defining Repository Connections" documentation details how to configure repository connections.
ManifoldCF links repository connections to output connections (such as Funnelback) through a gathering job. ManifoldCF’s documentation for creating new jobs and for executing jobs provides details of how to create and execute these jobs from within the ManifoldCF administration interface.
Once the gathering job has been run, it should be possible to perform search requests against the push data source through the search box within the Funnelback interface. When using security.earlybinding.user-to-key-mapper=ManifoldCF, search requests will automatically be authenticated against Active Directory. For testing, or on Linux, security.earlybinding.user-to-key-mapper=ManifoldCFDebug can be used instead. When using ManifoldCFDebug, an additional &user=THE_USERNAME parameter must be included in the search request URL to tell Funnelback which user to secure the results for. Note that ManifoldCFDebug is not a secure approach unless other steps have been taken to prevent users from modifying this user parameter (e.g. by providing access to the search only via some other wrapper which controls the user parameter).
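For testing with ManifoldCFDebug, a search URL carrying the user parameter can be assembled as in the following sketch. Only the user parameter comes from the description above; the /s/search.html path, the collection and query parameter names, and the function name are assumptions made for illustration and should be adjusted to match the search endpoint in use.

```python
from urllib.parse import urlencode


def debug_search_url(base_url, collection, query, username):
    """Build a search URL that names the user to secure results for.

    Assumptions: the search endpoint is /s/search.html and accepts
    'collection' and 'query' parameters. 'user' is the parameter read by
    the ManifoldCFDebug user mapper.
    """
    params = urlencode({"collection": collection, "query": query, "user": username})
    return f"{base_url.rstrip('/')}/s/search.html?{params}"


if __name__ == "__main__":
    print(debug_search_url(
        "https://funnelback-server.example.com",
        "enterprise-push",
        "annual report",
        "jsmith",
    ))
```

Running the same query with two different user names is a quick way to confirm that the result sets differ according to each user's permissions.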
Due to licensing restrictions, ManifoldCF’s standard package does not include all the libraries required for some repository connectors, so these libraries must be installed and the relevant connectors enabled before they can be used. Details on specific libraries required are available in README files at
Some repository connectors, such as Documentum and FileNet, also require supporting processes to be run alongside ManifoldCF. The systems required are included within ManifoldCF at $MANIFOLD_HOME/*-process/. Further details are available in the ManifoldCF 'Building and Deploying' documentation.
Once the appropriate libraries are in place and any required processes are running, the relevant repository connector must be enabled within $MANIFOLD_HOME/connectors.xml. After changing this file, the Funnelback jetty web server process must be restarted for the change to take effect.
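To confirm which repository connectors are currently enabled, connectors.xml can be inspected with a short script. This sketch assumes that enabled connectors appear as repositoryconnector elements with name and class attributes, and that disabled connectors are wrapped in XML comments (which the parser skips). The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def enabled_repository_connectors(xml_text):
    """Return (name, class) pairs for repository connectors that are enabled.

    Connectors that are commented out in the file are ignored, because the
    default ElementTree parser discards XML comments.
    """
    root = ET.fromstring(xml_text)
    return [
        (element.get("name"), element.get("class"))
        for element in root.iter("repositoryconnector")
    ]


if __name__ == "__main__":
    import sys
    from pathlib import Path

    # Usage: python list_connectors.py /path/to/connectors.xml
    for name, class_name in enabled_repository_connectors(Path(sys.argv[1]).read_bytes()):
        print(f"{name}: {class_name}")
```

If a connector that was expected to be enabled does not appear in the output, it is likely still commented out in the file.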
Assistance with the ManifoldCF system is available from the ManifoldCF mailing lists.