Use the steps in this walkthrough to set up an OCI Data Science connector to use for a
Retrieval-Augmented Generation (RAG) pipeline in OCI Search with OpenSearch.
After confirming and configuring the prerequisites, complete the tasks described in this walkthrough to configure and create the connector.
Note: When using a Data Science connector instead of a Generative AI connector, you need to
update the "llm_model" value to "oci_datascience/<your_llm_model_name>" in the
RAG query payload code examples in Perform RAG with BM25 and
Perform RAG with Hybrid Search.
Prerequisites
To use a Data Science connector with OCI Search with OpenSearch, you need a cluster configured
to use OpenSearch version 2.11 or newer. By default, new clusters are configured to use
version 2.11. To create a cluster, see Creating an OpenSearch Cluster.
For existing clusters configured for version 2.3, you can perform an inline upgrade to version 2.11. For more information, see OpenSearch Cluster Software Upgrades.
To upgrade existing clusters configured for version 1.2.3 to 2.11, you need to use the upgrade process described in OpenSearch Cluster Software Upgrades.
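If you aren't sure which OpenSearch version an existing cluster is running, one way to confirm it is to run the following request from the cluster's OpenSearch Dashboards Dev Tools console; the version number is included in the response:
GET /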
If the OpenSearch cluster is in a different tenancy than the Data Science endpoint tenancy,
you need to create policies in both tenancies to
grant access to Data Science resources.
The following policy examples include the required permissions. To use these examples,
replace <caller_tenancy_name> with the name of the tenancy for the OpenSearch cluster,
and replace <resource_host_tenancy_name> with the name of the tenancy for the Data
Science endpoint. Similarly, replace <caller_tenancy_ocid> and
<resource_host_tenancy_ocid> with the OCIDs of those tenancies.
Policy for Data Science tenancy:
define tenancy <caller_tenancy_name> as <caller_tenancy_ocid>
admit any-user of tenancy <caller_tenancy_name> to {DATA_SCIENCE_MODEL_DEPLOYMENT_PREDICT} in tenancy
Policy for OpenSearch cluster tenancy:
define tenancy <resource_host_tenancy_name> as <resource_host_tenancy_ocid>
endorse any-user to {DATA_SCIENCE_MODEL_DEPLOYMENT_PREDICT} in tenancy <resource_host_tenancy_name>
Deploy the model, as shown in the following example:
POST /_plugins/_ml/models/<model_ID>/_deploy
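The deploy request returns a task ID. If you want to confirm that the deployment finished before creating the pipeline, one option is to check the task status, as in the following sketch (the task ID comes from the deploy response):
GET /_plugins/_ml/tasks/<task_ID>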
6: Create a RAG Pipeline
Create a RAG pipeline using the <model_ID> from
the previous step, as shown in the following example:
PUT /_search/pipeline/<pipeline_name>
{
  "response_processors": [
    {
      "retrieval_augmented_generation": {
        "tag": "genai_pipeline_demo",
        "description": "Demo pipeline Using Genai Connector",
        "model_id": "<model_ID>",
        "context_field_list": ["<text_field_name>"],
        "system_prompt": "You are a helpful assistant",
        "user_instructions": "Generate a concise and informative answer for the given question"
      }
    }
  ]
}
You can specify one or more text field names for "context_field_list". To specify
multiple fields, separate the values with commas, for example (with placeholder field names):
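"context_field_list": ["<text_field_name1>", "<text_field_name2>"]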
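As noted earlier, when you use a Data Science connector, the "llm_model" value in the RAG query payload must be set to "oci_datascience/<your_llm_model_name>". The following is a minimal sketch of such a query run against the pipeline created in this step, using placeholder index, field, and question values; see Perform RAG with BM25 and Perform RAG with Hybrid Search for the complete query payload examples:
GET /<index_name>/_search?search_pipeline=<pipeline_name>
{
  "query": {
    "match": {
      "<text_field_name>": "<question_text>"
    }
  },
  "ext": {
    "generative_qa_parameters": {
      "llm_question": "<question_text>",
      "llm_model": "oci_datascience/<your_llm_model_name>",
      "context_size": 5,
      "message_size": 5
    }
  }
}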