Debugging form-based authentication
This article applies to form-based authentication (form interaction) for the web crawler. It should not be confused with other forms of authentication supported by the web crawler (such as HTTP basic authentication and NTLM authentication).
This article provides guidance on debugging issues associated with form-based authentication.
Funnelback can be configured to perform form-based authentication when connecting to a website by configuring form interaction. Form-based authentication covers situations where a user has to fill in an HTML form to log in to a website. The underlying authentication mechanism can vary and may include types of authentication such as SAML.
The best way to troubleshoot form-based authentication is to use Funnelback’s debug API, which will request a specified URL and show all the request and response headers, returned data and redirects that occur when the request is made.
This tutorial assumes that form interaction has been set up for the data source.
Log in to the search dashboard and select View API UI from the menu.
The Admin API calls will be listed. Scroll down then expand the debug section.
The http-request call allows you to debug the set of requests that occur when Funnelback requests a URL listed inside the collection configuration using either a
Click on the GET /crawler/v1/debug/collections/<collection>/http-request heading to expand the API test form. Fill in the form with the data source (collection) id (this will cause the API to load the form interaction configuration from the specified data source), the url to test (this should be the URL that you used in your data source configuration) and then select the level to debug. The different level values will provide varying degrees of information.
BODY (the default value) provides the most information, including request and response headers as well as the response data. It is often best to start with BASIC and gradually increase the level, as this will give insight into redirects that may be occurring and the values of any cookies returned in the HTTP headers.
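The same call can also be scripted instead of using the API UI form. The following is a minimal sketch rather than a definitive client: the base URL is a hypothetical placeholder, and the url and level query parameter names and the X-Security-Token header name are assumptions that should be checked against the parameters shown in the API UI for your server.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

# Hypothetical Admin API base URL; replace with the address of your own server.
ADMIN_API_BASE = "https://funnelback.example.com/admin-api"


def build_debug_url(collection, target_url, level="BASIC", base=ADMIN_API_BASE):
    """Build the http-request debug URL for a data source (collection) id."""
    path = f"/crawler/v1/debug/collections/{quote(collection, safe='')}/http-request"
    # Assumption: the URL to test and the debug level are passed as query
    # parameters named 'url' and 'level'. Confirm the names in the API UI.
    query = urlencode({"url": target_url, "level": level})
    return f"{base}{path}?{query}"


def run_debug_request(collection, target_url, level="BASIC", token=None):
    """Send the debug request and return the raw response text."""
    request = Request(build_debug_url(collection, target_url, level))
    if token:
        # Assumption: the API token is sent in an X-Security-Token header.
        request.add_header("X-Security-Token", token)
    with urlopen(request) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling run_debug_request first with level="BASIC" and then with level="BODY" mirrors the advice above to start with less detail and increase it only as needed.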
The information that is returned by the debug call will provide an insight into what will need to be changed.
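When the debug output is long, it can help to pull out only the redirect targets and cookie names. The sketch below assumes the output is text in which HTTP headers appear one per line in the usual "Name: value" form. The actual output format is not described here, so treat the pattern as a starting point to adapt. The function name is hypothetical.

```python
import re

# Matches 'Location: ...' and 'Set-Cookie: ...' header lines, case-insensitively.
HEADER_LINE = re.compile(
    r"^\s*(location|set-cookie):\s*(.+?)\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def summarise_headers(debug_output):
    """Collect redirect targets and cookie name=value pairs from debug text."""
    redirects, cookies = [], []
    for name, value in HEADER_LINE.findall(debug_output):
        if name.lower() == "location":
            redirects.append(value)
        else:
            # Keep only the leading name=value pair, dropping attributes
            # such as Path or HttpOnly.
            cookies.append(value.split(";", 1)[0])
    return {"redirects": redirects, "cookies": cookies}
```

An empty redirects list alongside a missing session cookie suggests the login form was never submitted, whereas a long chain of redirects points to an intermediate step (for example a single sign-on hop) that the form interaction configuration may need to handle.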
Common problems with form interaction include: