vector.indexer_options
Background
This option configures the Funnelback Vector indexer by supplying a set of indexer flags, each of which controls one aspect of the Vector build.
The flags are set in the data source configuration.
Supported flags:
- --set-paragraph-tag
  Sets the HTML tag used for paragraph extraction, e.g. "p" for <p> tags, "div" for <div> tags, "section" for <section> tags. Default: p.
- --amalgamate-paragraphs
  Sets how many paragraphs are amalgamated into each chunk: 1 = no amalgamation (one paragraph per chunk), 2 or more = that many paragraphs per chunk. Default: 1.
- --max-elements
  Sets the maximum number of elements (paragraphs) that can be stored in the vector DB. Default: 1,000,000.
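To illustrate what --amalgamate-paragraphs means, the following Python sketch groups extracted paragraphs into chunks. It is not the indexer's implementation: the function name amalgamate is hypothetical, and both the non-overlapping grouping and the joining with a single space are assumptions made for illustration.

```python
def amalgamate(paragraphs, n=1):
    """Group paragraphs into chunks of up to n paragraphs each.

    Hypothetical illustration only: assumes consecutive, non-overlapping
    groups joined by a space, which may differ from the real indexer.
    """
    if n < 1:
        raise ValueError("n must be 1 or greater")
    # Step through the list n paragraphs at a time; the last chunk may be shorter.
    return [" ".join(paragraphs[i:i + n]) for i in range(0, len(paragraphs), n)]


paragraphs = ["A", "B", "C", "D", "E"]
print(amalgamate(paragraphs, 1))  # one paragraph per chunk
print(amalgamate(paragraphs, 2))  # two paragraphs per chunk
```

With n=1 every paragraph becomes its own chunk; with n=2 the five paragraphs above yield three chunks, the last holding the single leftover paragraph.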
Examples
Use <div> HTML tags for paragraph extraction:
vector.indexer_options=--set-paragraph-tag=div
Additionally, amalgamate 2 paragraphs per chunk:
vector.indexer_options=--set-paragraph-tag=div --amalgamate-paragraphs=2
Additionally, store at most 500,000 paragraphs:
vector.indexer_options=--set-paragraph-tag=div --amalgamate-paragraphs=2 --max-elements=500000
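If the configuration line is generated by a script, a small helper can assemble it from the documented flags. The Python sketch below is one way to do that; the function build_indexer_options and its parameter names are hypothetical, and the validation rules (alphanumeric tag name, values of 1 or greater) are assumptions rather than documented constraints.

```python
def build_indexer_options(paragraph_tag=None, amalgamate_paragraphs=None, max_elements=None):
    """Build a vector.indexer_options line from the flags that are given.

    Hypothetical helper: flags left as None are omitted, so the indexer's
    own defaults apply to them.
    """
    flags = []
    if paragraph_tag is not None:
        # Assumption: a tag name is a plain alphanumeric token such as "p" or "div".
        if not paragraph_tag.isalnum():
            raise ValueError("paragraph_tag must be an alphanumeric tag name")
        flags.append(f"--set-paragraph-tag={paragraph_tag}")
    if amalgamate_paragraphs is not None:
        # 1 means no amalgamation; anything below 1 is treated as invalid here.
        if amalgamate_paragraphs < 1:
            raise ValueError("amalgamate_paragraphs must be 1 or greater")
        flags.append(f"--amalgamate-paragraphs={amalgamate_paragraphs}")
    if max_elements is not None:
        # Assumption: a maximum below 1 is not meaningful.
        if max_elements < 1:
            raise ValueError("max_elements must be 1 or greater")
        flags.append(f"--max-elements={max_elements}")
    return "vector.indexer_options=" + " ".join(flags)


print(build_indexer_options("div"))
print(build_indexer_options("div", 2))
print(build_indexer_options("div", 2, 500000))
```

The three calls print the same three configuration lines shown in the examples above.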