Elasticsearch - Indexing Configuration

This dialog is available as part of the connector configuration when the connector uses Elasticsearch as its target.

  1. Document index name: this is the name of the document index.
    Based on this name and the base URL below, the connector constructs the REST API URL to which it pushes the indexed data.

  2. Principal index name: this is the name of the principal index. The principal index is where the connector stores user-group relationships.
    Based on this name and the base URL below, the connector constructs the REST API URL to which it pushes the principal data.

  3. Elasticsearch base URL: this is the URL, consisting of protocol, FQDN, and port, that points to the REST API of the Elasticsearch instance.

  4. Use basic authentication: check this box if you configured basic authentication to secure direct access to the instance.

    1. Username: the username for basic authentication.

    2. Password: the corresponding password.

  5. Public keys for SSL certificates: this setting is needed if you run the environment with self-signed certificates or with certificates that are not known to the Java key store.
    We use a straightforward approach to validate SSL certificates. To have a certificate accepted as valid, add the modulus of its public key to this text field. You can find this modulus by viewing the certificate in your browser.
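As an illustration of how the base URL, an index name, and the basic authentication credentials fit together, here is a minimal Python sketch. It is not the connector's actual implementation: the helper names `build_index_url` and `basic_auth_header`, the host name, the index names, and the credentials are all hypothetical, and the connector may append further path segments to the index URL.

```python
import base64
from urllib.parse import quote


def build_index_url(base_url: str, index_name: str) -> str:
    """Join the base URL and an index name into an index-level REST URL."""
    # Strip a trailing slash so the result never contains "//" before the index.
    return f"{base_url.rstrip('/')}/{quote(index_name, safe='')}"


def basic_auth_header(username: str, password: str) -> dict:
    """Build an HTTP Basic Authorization header from username and password."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return {"Authorization": "Basic " + token.decode("ascii")}


# Hypothetical values, for illustration only.
base_url = "https://search.example.com:9200/"
document_url = build_index_url(base_url, "documents")
principal_url = build_index_url(base_url, "principals")
headers = basic_auth_header("elastic", "changeme")

print(document_url)
print(principal_url)
```

The sketch shows why the base URL should contain protocol, host, and port: everything after it is derived from the index names.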

  1. Response timeout: the timeout, in milliseconds, when waiting for responses from the instance.

  2. Connection timeout: the timeout, in milliseconds, when waiting for a connection to the instance to be established.

  3. Socket timeout: the timeout, in milliseconds, when waiting for data on an established connection to the instance.

  4. Vector field for title: if you use vector search (cf. the vectorization stages), the vectorized title is pushed into this field.
    If you configure the search engine for querying data (cf. Search Experience and Query Pipelines), this field is also used for issuing vector queries.

  5. Vector field for body: if you use vector search, the vectorized body is pushed into this field.
    If you configure the search engine for querying data, this field is also used for issuing vector queries.

  6. Filters (aka refiners aka aggregations)

    1. Within this dialog you can add and remove filters. Adding a filter here does not make it appear in the search interface right away, but it is required so that the search engine returns the calculated filter values.

    2. Filter type: defines whether this is a term filter (such as for file types), a range filter (such as for prices), or a date filter.

    3. Fieldname: is the name of the field which the filter should be computed for.

    4. Number of values to show: for term filters this defines how many values should be returned.

    5. Sorting: defines whether the filter buckets are sorted by document frequency (number of results) or lexicographically.

    6. Sort direction: defines whether the sorting is descending or ascending (applied to the ordering type chosen under Sorting).
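To make the filter settings above more concrete, the following Python sketch maps a term filter onto an Elasticsearch `terms` aggregation. This is an assumption about how such settings could translate into a request body, not a description of what the connector or search engine actually sends. The function name `terms_aggregation`, its parameters, and the field name `file_type` are hypothetical.

```python
def terms_aggregation(field, size=10, sort_by="count", direction="desc"):
    """Build a terms aggregation from filter settings.

    field     -> "Fieldname"
    size      -> "Number of values to show"
    sort_by   -> "Sorting": "count" (document frequency) or "key" (lexicographic)
    direction -> "Sort direction": "asc" or "desc"
    """
    if sort_by not in ("count", "key"):
        raise ValueError("sort_by must be 'count' or 'key'")
    if direction not in ("asc", "desc"):
        raise ValueError("direction must be 'asc' or 'desc'")
    # Elasticsearch orders terms buckets by "_count" or by "_key".
    order_key = "_count" if sort_by == "count" else "_key"
    return {"terms": {"field": field, "size": size, "order": {order_key: direction}}}


# Hypothetical filter on a file type field, returning the five most frequent values.
request_body = {
    "size": 0,  # return only the aggregation, no hits
    "aggs": {"file_type": terms_aggregation("file_type", size=5)},
}

print(request_body)
```

Range and date filters would use different aggregation types and are left out of this sketch.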

When you have finished setting these fields, click on validate and save. If there are any issues, the validator reports them, and you can find further details in the log files.

Vector Search Dimensions

When using vector search, make sure that the embeddings used in the query and in the content transformation pipeline have exactly the same dimension as the index fields configured as the vector field for body and the vector field for title. Otherwise, indexing or querying the index will fail.
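A simple pre-flight check can catch a dimension mismatch before any data is sent. The Python sketch below assumes the vector fields are mapped as Elasticsearch `dense_vector` fields with a fixed `dims` value. The dimension 384, the field names `vector_title` and `vector_body`, and the helper `check_dimensions` are hypothetical.

```python
# Hypothetical index mapping: both vector fields share the same dimension.
INDEX_DIMS = 384
mapping = {
    "properties": {
        "vector_title": {"type": "dense_vector", "dims": INDEX_DIMS},
        "vector_body": {"type": "dense_vector", "dims": INDEX_DIMS},
    }
}


def check_dimensions(embedding, field_name, mapping):
    """Raise ValueError if the embedding length differs from the field's dims."""
    expected = mapping["properties"][field_name]["dims"]
    if len(embedding) != expected:
        raise ValueError(
            f"{field_name}: embedding has {len(embedding)} dimensions, "
            f"index expects {expected}"
        )


# An embedding of matching length passes silently.
check_dimensions([0.0] * 384, "vector_title", mapping)

# A mismatching embedding is rejected before indexing or querying.
try:
    check_dimensions([0.0] * 768, "vector_body", mapping)
except ValueError as error:
    print(error)
```

The same check applies to query embeddings, since they are compared against the same index fields.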