Documentation
General Crawl Settings
The following settings are available for all content source configurations.
Maximum content size (MB): Limits the net document size (before text extraction) to the given value. The default is 100 MB.
Number of crawl threads: Determines how many crawl threads run in parallel to detect new, changed, and deleted documents. The default is 10 threads.
Please note the following: Make sure not to overwhelm the content source; be gentle when choosing the number of threads.
Multithreading is part of the crawl strategy and is applied wherever possible. Normally, one crawl thread takes care of one folder, space, project, or page together with its attachments. In some scenarios, multithreading may degenerate to single-threaded crawling.
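The following Python sketch illustrates why the number of crawl threads is only an upper bound. It is not the crawler's actual implementation. It assumes that each container (folder, space, or project) becomes exactly one task, and the names crawl_container and crawl are hypothetical. With a single large container, only one worker thread is ever used, no matter how many threads are configured.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


def crawl_container(name, documents):
    """Pretend to crawl one container and report which thread handled it."""
    # Hypothetical stand-in: a real crawler would list and fetch documents here.
    return name, threading.current_thread().name, len(documents)


def crawl(containers, crawl_threads=10):
    """Submit one task per container to a pool of at most crawl_threads workers."""
    with ThreadPoolExecutor(max_workers=crawl_threads) as pool:
        futures = [
            pool.submit(crawl_container, name, docs)
            for name, docs in containers.items()
        ]
        return [future.result() for future in futures]


# One folder holding all documents: there is only one task to hand out,
# so the crawl is effectively single-threaded.
single = crawl({"big-folder": [f"doc{i}" for i in range(1000)]})
workers_used = {thread_name for _, thread_name, _ in single}
print(len(workers_used))
```

Under this assumption, parallelism is bounded by the number of containers as well as by the configured thread count.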
Number of transformation threads: Determines how many threads take care of content transformation. Each thread handles one document at a time and may have to wait for heavy-lifting operations such as LLM embeddings or image extraction. Make sure to configure enough threads to achieve reasonable crawl performance. The default is 10.
Number of search engine submission threads: This value determines how many threads are used to push changes to the search engine. The default is 40.
Deletion limit per crawl: If this value is different from zero, at most this many items are deleted per crawl. If the total number of items that are no longer found in the content source is larger than this value, multiple crawls are needed to remove all of these items from the search index.
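The arithmetic behind the deletion limit can be sketched in Python as follows. This illustration assumes that every crawl deletes as many items as the limit allows and that no further items go missing between crawls. The function name deletions_per_crawl is hypothetical and not part of the product.

```python
def deletions_per_crawl(missing_items, deletion_limit):
    """Return how many items each successive crawl would delete."""
    if missing_items <= 0:
        return []
    if deletion_limit == 0:
        # Zero means no limit: everything is removed in a single crawl.
        return [missing_items]
    plan = []
    remaining = missing_items
    while remaining > 0:
        batch = min(remaining, deletion_limit)
        plan.append(batch)
        remaining -= batch
    return plan


# 2500 items are gone from the content source, the limit is 1000 per crawl.
print(deletions_per_crawl(2500, 1000))  # [1000, 1000, 500]
```

In this example, three crawls are needed before all 2500 items have been removed from the search index.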