Database data sources
A database data source indexes the data returned by running an SQL query against a relational database.
Funnelback must have direct access to the SQL database when using a database data source to index database content. In most situations this will not be possible due to organizational security requirements that will generally place the database within the organizations internal network, which is unreachable by Funnelback.
Because access to databases is often restricted you may need to use an alternative approach to indexing the database content that does not rely on a direct connection to the database.
Funnelback’s database connector indexes the single table of results returned by an SQL query as separate XML records.
Because of this, the table that is returned by the query must be fully denormalized (Funnelback can’t resolve any ID references to other database records and has no concept of the relational database).
The best practice method of accessing content from an SQL database is to set up a Funnelback-specific database view containing all the (denormalized) fields that either should be returned as fields in the search results, or be searchable by Funnelback.
Using a view simplifies the SQL query that Funnelback uses to retrieve the data (
select * from VIEWNAME), and increases the visibility to your database administrator that Funnelback is indexing content from the database.
For databases that have multiple types of content it is often necessary to set up several Funnelback database data sources that index each of these types separately.
The database connector utilises a JDBC database driver to connect to the database system. Funnelback ships with a driver for PostgreSQL. Indexing of other databases requires installation of an appropriate JDBC driver.
|The DXP currently does not support using third party drivers - please contact your account manager if you have a requirement for a third party driver to discuss your options.|
Database data sources are created by following the data source creation steps and selecting database from the list of data source types.
The database data source is defined by configuring the following properties:
Test the connection settings using the database connection utils API.
This can be accessed by selecting API UI from the system (cog) menu, then scrolling down to the database connection utils.
There are two API calls provided to assist with checking your database connection.
After a successful update has run, configure metadata mappings for the database fields. The update of the database data source downloads each row from the SQL query and converts it to an XML record.
The fields that were detected in the SQL results should be listed amongst the XML sources. Map the relevant fields to metadata classes. Relevant fields include:
fields that you wish to display in the search results summaries, or use for faceted navigation.
fields that contain information (such as keywords, types) that make it easier to find the record.
If there is a field containing a URL that should be used as the target when the result is clicked on then this should be mapped as the document’s URL using the advanced XML configuration options available on the data source configuration screen.
Rebuild the index by running an advanced update to rebuild the live index.
After the reindex is complete metadata should be available for display in the search results.
Search results can then be configured. This involves:
By default, the URL used in database data sources is a system-assigned URL. If the default system-assigned URL is not modified (for example by filters) the modern UI will use the cache link as the live URL. This will result in the XML of the record being shown to the user when a result is clicked. You can use XSLT on the cache controller to style the response.
Database records can be modified (locally) prior to indexing using the filter framework.