Document Level Security

Introduction

This section describes the document level security features available in Funnelback. Document level security provides the capability to control which documents are visible to users using the Funnelback search engine at the level of the individual document, rather than the more general collection level security. Enabling document level security will allow repositories with complex security restrictions to be easily searched using a single interface available to a wide range of users.

Please note that when using document level security, access to the cached version of documents via the cache controller is disabled.

Late binding vs. early binding model

Funnelback can use two document level security models: early binding and late binding.

Late binding

In late binding mode, the security information for each document is retrieved at query time.

When a user performs a search, the query processor returns a set of documents matching the query. For each document, a real-time security check is performed against the document's origin repository to ensure that the user is authorized to see it.

The type of security check to perform will depend on the remote repository type and is configured using document_level_security.action. Each security check type is implemented as a Perl script under $SEARCH_HOME/bin/dls_<repository_type>_check.pl.
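
The query-time flow described above can be sketched as follows. This is only an illustration in Python, not Funnelback's implementation: the names `late_binding_filter`, `toy_check` and the `ACL` table are hypothetical, and the real check is delegated to the script selected by document_level_security.action.

```python
from typing import Callable, Iterable, List

# Hypothetical stand-in for the per-repository security check.
AccessCheck = Callable[[str, str], bool]


def late_binding_filter(user: str, matches: Iterable[str],
                        can_access: AccessCheck) -> List[str]:
    """Keep only the matching documents the user is authorized to see.

    One check is issued per matching document, which is why this mode
    is up to date but slow.
    """
    return [doc for doc in matches if can_access(user, doc)]


# Toy access table, used only for this illustration.
ACL = {"doc-a": {"alice", "bob"}, "doc-b": {"bob"}, "doc-c": {"alice"}}


def toy_check(user: str, doc: str) -> bool:
    """Pretend to ask the origin repository whether the user may read the document."""
    return user in ACL.get(doc, set())


visible = late_binding_filter("alice", ["doc-a", "doc-b", "doc-c"], toy_check)
print(visible)  # ['doc-a', 'doc-c']
```

In this sketch the cost grows with the number of matching documents, since `can_access` is called once per match.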

Pros:

  • Security information is up-to-date because security checks are performed in real time against the remote repository, for each document matching the query.

Cons:

  • Checking the security information of individual documents is a slow process for most repository types.
  • While some settings exist to limit the number of security checks performed in order to improve the response time, using them prevents Funnelback from providing exact result counts (though calculated estimates are available).


Early binding

In early binding mode, the security information of each document is embedded in the collection.

When the origin repository is gathered, Funnelback collects the security information of each document (the lock strings). When a user performs a search, Funnelback collects the user's credentials (the user keys), and the query processor tries to match those keys against the lock string embedded in each document matching the query.

The lock strings are mapped to a metadata class like any other metadata, using the content flag 4 in metamap.cfg or xml.cfg (this is done automatically when the collection is created). For example, for TRIM collections the following mapping should be used: S,4,trim.lockstring.

The metadata containing the lock string, as well as the format of the lock string itself, varies depending on the collection type and remote repository type. Please see the per-collection instructions for details.

Fetching user keys: The type of user credentials, and the way to fetch them and map them into user keys, is configured in security.earlybinding.user-to-key-mapper and depends on the remote repository type. Various user key mapper plugins are provided in the $SEARCH_HOME/share/security_plugins/ folder.

Matching user keys with lock strings: To find if a user is authorized to see a specific document the query processor tries to match the user keys against the document lock string. The default matching algorithm checks that the document lock string contains at least one of the user keys. The algorithm can be changed by specifying a plugin name in security.earlybinding.locks-keys-matcher.name. Various plugins are provided in $SEARCH_HOME/lib/plugins/.
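
The default rule ("the lock string contains at least one of the user keys") can be illustrated with a short Python sketch. The lock string format is repository specific and is not described here, so the comma-separated format, and the function names `parse_lock_string` and `default_match`, are assumptions made purely for illustration.

```python
from typing import Iterable, Set


def parse_lock_string(lock_string: str, separator: str = ",") -> Set[str]:
    """Split a lock string into individual locks.

    The separator is an assumption; real lock string formats vary by
    collection and repository type.
    """
    return {part.strip() for part in lock_string.split(separator) if part.strip()}


def default_match(user_keys: Iterable[str], lock_string: str) -> bool:
    """Return True when the lock string contains at least one of the user keys."""
    locks = parse_lock_string(lock_string)
    return any(key in locks for key in user_keys)


print(default_match(["staff", "hr"], "hr,finance"))  # True
print(default_match(["staff"], "hr,finance"))        # False
```

Under this rule a single shared key is enough to grant access. Repositories that need stricter semantics are the case the text addresses with security.earlybinding.locks-keys-matcher.name.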

Pros:

  • Nearly as fast as non-secured collections, since the security checks don't connect to a remote repository.
  • Because it is fast, every document in the collection can be checked, allowing Funnelback to return accurate document counts.

Cons:

  • Still needs a connection to the remote repository to fetch the user credentials (user keys), but this can be cached.
  • While security credentials (user keys) will be up to date (depending on the cache settings), the security information on documents (lock strings) is updated only when the collection is updated.
    • Updating the credentials of a user takes effect immediately, or as soon as any cached user keys expire.
    • Updating the security settings of a document takes effect only once the collection has been updated.
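
The trade-off between caching user keys and keeping them current can be sketched with a small time-limited cache. This is a generic Python illustration, not Funnelback's cache: the class `UserKeyCache`, its `ttl_seconds` parameter and the `fetch_keys` callback are all hypothetical names.

```python
import time
from typing import Callable, Dict, List, Tuple


class UserKeyCache:
    """Cache user keys for a limited time to avoid one repository call per query."""

    def __init__(self, fetch_keys: Callable[[str], List[str]], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch_keys = fetch_keys  # hypothetical call to the remote repository
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, List[str]]] = {}

    def get(self, user: str) -> List[str]:
        now = self._clock()
        entry = self._entries.get(user)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh: no repository call
        keys = self._fetch_keys(user)  # missing or expired: ask the repository
        self._entries[user] = (now, keys)
        return keys
```

With a longer lifetime, fewer calls reach the repository, but a change to a user's credentials takes longer to become visible to the search.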


Hybrid (late binding + early binding)

This mode combines both early and late binding modes.

In this mode the lock strings are collected with every document during the collection update, as in early binding. At query time the set of matching documents is first filtered by matching user keys and document locks. For the remaining set of documents, real time security checks are performed against the remote repository.
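
The two-stage filtering can be sketched as below. This is a simplified Python illustration under stated assumptions: locks and keys are modelled as plain sets, and `hybrid_filter`, `locks_by_doc`, `live_acl` and `check` are hypothetical names rather than Funnelback APIs.

```python
from typing import Callable, Dict, Iterable, List, Set


def hybrid_filter(user: str, user_keys: Set[str], matches: Iterable[str],
                  locks_by_doc: Dict[str, Set[str]],
                  can_access: Callable[[str, str], bool]) -> List[str]:
    """Filter by indexed locks first, then confirm survivors in real time."""
    # Stage 1 (early binding): cheap, uses locks stored at collection update time.
    candidates = [doc for doc in matches
                  if user_keys & locks_by_doc.get(doc, set())]
    # Stage 2 (late binding): real-time check, only for the reduced set.
    return [doc for doc in candidates if can_access(user, doc)]


# Toy data: d3 was readable by staff at update time, but access was revoked since.
locks_by_doc = {"d1": {"staff"}, "d2": {"admins"}, "d3": {"staff"}}
live_acl = {"d1": {"alice"}, "d2": {"root"}, "d3": set()}
checked: List[str] = []


def check(user: str, doc: str) -> bool:
    checked.append(doc)  # record which documents needed a real-time check
    return user in live_acl[doc]


result = hybrid_filter("alice", {"staff"}, ["d1", "d2", "d3"], locks_by_doc, check)
print(result)   # ['d1']
print(checked)  # ['d1', 'd3']
```

In this toy run, d2 never reaches the real-time check because the indexed locks already exclude it, while d3 is correctly dropped by the real-time check even though its indexed lock is stale.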

To set up this mode, follow the instructions for both early binding and late binding configuration.

Pros:

  • Security is up-to-date, as in late binding mode, except for documents with reduced security constraints (see cons below).
  • The set of documents to check in real time against the remote repository is reduced because it's already filtered using document locks, thus reducing the time needed to perform the checks.

Cons:

  • Real-time security checks are still performed, possibly impacting the response time.
  • Documents for which the security constraints have been reduced or removed won't be returned for users who are now authorized to see them until the next collection update.

For example, consider a document that was previously available to "Administrators" only, but is now "Public". Until the collection is updated, the document lock string is still "Administrators only", meaning that the document will be filtered out of the initial set of matching documents if a Public user runs a query.
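
The same scenario, reduced to set operations in Python. This is a toy model only: the real-time check is represented here as a lookup in a set of current readers, and all variable names are made up for the illustration.

```python
# Lock stored in the index at the last collection update.
indexed_locks = {"Administrators"}
# What the origin repository would answer right now.
live_readers = {"Administrators", "Public"}
# Keys of the user running the query.
user_keys = {"Public"}

passes_early_binding = bool(user_keys & indexed_locks)  # stale lock rejects the user
passes_late_binding = bool(user_keys & live_readers)    # repository would allow access

# Hybrid mode requires both stages to pass, so the stale lock wins.
visible_in_hybrid = passes_early_binding and passes_late_binding
print(passes_early_binding, passes_late_binding, visible_in_hybrid)
# False True False
```

The document only becomes visible to the Public user once a collection update refreshes the indexed lock.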

Available collection types

The following collection types support document level security:

Authentication and delegation

Users must be authenticated for Document Level Security to function. Funnelback currently supports document level security only on search servers running in a Windows environment, with authentication relying on the Windows infrastructure (Active Directory). Setting up the authentication mechanism depends on the user interface type in use.

Most collection types also require Windows authentication and trust delegation to be working properly. Having authentication and trust delegation working on the Windows Domain is a prerequisite for configuring Document Level Security.

Switching off document level security

Late binding

To disable late-binding security, ensure that document_level_security.mode is set to disabled. That will stop the query processor from performing real time checks and will have it return every matching document.

Early binding

Disabling early-binding security can be achieved in two ways:

Method 1: Keep the lock strings in the collection, but use a user key mapper plugin that always returns a master key matching every document. To do so:

Note: This will not work with NTFS filecopy collections. To disable early-binding security on this collection type, you need to apply the second method. It will also not work with collection types that use a custom keys / locks matching algorithm (as opposed to the default one), such as TRIM and TRIMPush collections.

Method 2: Remove the lock strings from the collection. To do so:

See also

For late-binding:

For early-binding: