Cohere Rerank 3.5 (New)
The cohere.rerank.3-5 model takes a query and a list of texts and returns an ordered array in which each text is assigned a relevance score. The relevance score indicates how well each text matches the query, and the model uses it to rank the texts.
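To make the shape of that output concrete, the following is a minimal sketch of the query-plus-texts in, ordered-scores out pattern. It doesn't call the model: `toy_score` is a hypothetical word-overlap stand-in for the model's relevance scoring, and the `RankedText` field names are illustrative assumptions rather than the service's response schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RankedText:
    index: int              # position of the text in the original input list
    relevance_score: float  # higher means a better match for the query
    text: str


def toy_score(query: str, text: str) -> float:
    """Stand-in scorer (NOT the Cohere model): share of query words found in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & text_words) / len(query_words)


def rerank(query: str, texts: List[str], top_n: Optional[int] = None) -> List[RankedText]:
    """Score every text against the query and return them ordered best-first."""
    ranked = [RankedText(i, toy_score(query, t), t) for i, t in enumerate(texts)]
    ranked.sort(key=lambda r: r.relevance_score, reverse=True)
    return ranked if top_n is None else ranked[:top_n]


if __name__ == "__main__":
    texts = [
        "bananas are yellow",
        "paris is a city in france",
        "the capital of france is paris",
    ]
    for item in rerank("capital of france", texts):
        print(item.index, round(item.relevance_score, 2), item.text)
```

Each result keeps the index of the text in the original list, so a caller can map the ranked output back to its source documents.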
Available in These Regions
- Brazil East (Sao Paulo) (dedicated AI cluster only)
- Germany Central (Frankfurt) (dedicated AI cluster only)
- Japan Central (Osaka) (dedicated AI cluster only)
- Saudi Arabia Central (Riyadh) (dedicated AI cluster only)
- UK South (London) (dedicated AI cluster only)
- US East (Ashburn) (dedicated AI cluster only)
- US Midwest (Chicago) (dedicated AI cluster only)
Access this Model
Key Features
- Dedicated mode only.
- Not available on-demand or in the playground.
- Access the model hosted on a cluster through the API or an SDK.
- For dedicated mode, create an endpoint on a hosting dedicated AI cluster to host the model, and then call the RerankText API or the corresponding SDK operation.
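As a rough illustration of the last step, the sketch below assembles a request body for a rerank call that targets a dedicated endpoint. The field names (`compartmentId`, `servingMode`, `servingType`, `endpointId`, `input`, `documents`, `topN`) are assumptions about the request shape and should be confirmed against the RerankText API documentation. The OCIDs are hypothetical placeholders, and the helper `build_rerank_request` is not part of any SDK.

```python
import json
from typing import Dict, List, Optional


def build_rerank_request(
    compartment_id: str,
    endpoint_id: str,
    query: str,
    documents: List[str],
    top_n: Optional[int] = None,
) -> Dict[str, object]:
    """Assemble a rerank request body for a dedicated endpoint (assumed field names)."""
    if not documents:
        raise ValueError("documents must contain at least one text")
    body: Dict[str, object] = {
        "compartmentId": compartment_id,
        # Dedicated mode: the request is routed to the endpoint created on the hosting cluster.
        "servingMode": {"servingType": "DEDICATED", "endpointId": endpoint_id},
        "input": query,
        "documents": list(documents),
    }
    if top_n is not None:
        body["topN"] = top_n  # optional cap on how many ranked results come back
    return body


if __name__ == "__main__":
    request_body = build_rerank_request(
        compartment_id="ocid1.compartment.oc1..example",          # placeholder
        endpoint_id="ocid1.generativeaiendpoint.oc1..example",    # placeholder
        query="capital of france",
        documents=["bananas are yellow", "the capital of france is paris"],
        top_n=1,
    )
    print(json.dumps(request_body, indent=2))
```

Because on-demand mode isn't available for this model, the sketch only builds the dedicated serving mode and has no on-demand branch.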
Dedicated AI Cluster for the Model
To reach a model through a dedicated AI cluster in any listed region, you must create an endpoint for that model on a dedicated AI cluster. For the cluster unit size that matches this model, see the following table.
Base Model | Fine-Tuning Cluster | Hosting Cluster | Pricing Page Information | Request Cluster Limit Increase |
---|---|---|---|---|
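Cohere Rerank 3.5 (cohere.rerank.3-5) | Not available for fine-tuning | Unit size: RERANK_COHERE | | Limit name: dedicated-unit-rerank-cohere-count |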
If your tenancy doesn't have enough cluster limits to host the Cohere Rerank 3.5 model on a dedicated AI cluster, request an increase of 1 for the dedicated-unit-rerank-cohere-count limit.
Endpoint Rules for Clusters
- A dedicated AI cluster can hold up to 50 endpoints.
- Use these endpoints to create aliases that all point either to the same base model or to the same version of a custom model, but not both types.
- Several endpoints for the same model make it easy to assign them to different users or purposes.
Hosting Cluster Unit Size | Endpoint Rules |
---|---|
RERANK_COHERE | |
- To increase the call volume supported by a hosting cluster, increase its instance count by editing the dedicated AI cluster. See Updating a Dedicated AI Cluster.
- For more than 50 endpoints per cluster, request an increase for the endpoint-per-dedicated-unit-count limit. See Requesting a Service Limit Increase and Service Limits for Generative AI.
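The endpoint rules in this section can be expressed as a small pre-flight check. The following is a minimal sketch, assuming a hypothetical `Endpoint` record and a planning step done outside the service. It isn't an OCI API, and the default limit of 50 would change if a limit increase is granted.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_ENDPOINTS_PER_CLUSTER = 50  # default limit; can be raised through a limit increase


@dataclass(frozen=True)
class Endpoint:
    name: str
    model: str                            # for example, "cohere.rerank.3-5"
    custom_version: Optional[str] = None  # None means the endpoint targets the base model


def validate_cluster_endpoints(
    endpoints: List[Endpoint], limit: int = MAX_ENDPOINTS_PER_CLUSTER
) -> List[str]:
    """Return a list of rule violations; an empty list means the plan looks valid."""
    problems = []
    if len(endpoints) > limit:
        problems.append(f"{len(endpoints)} endpoints exceed the limit of {limit}")
    # All endpoints must point to one target: the same base model
    # or the same version of a custom model, never a mix.
    targets = {(e.model, e.custom_version) for e in endpoints}
    if len(targets) > 1:
        problems.append("endpoints on one cluster must all point to the same model target")
    return problems


if __name__ == "__main__":
    plan = [
        Endpoint("search-team", "cohere.rerank.3-5"),
        Endpoint("support-team", "cohere.rerank.3-5"),
    ]
    print(validate_cluster_endpoints(plan))
```

Several endpoints for the same target pass the check, which matches the use case of assigning separate endpoints to different users or purposes.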
Cluster Performance Benchmarks
Review the Cohere Rerank 3.5 cluster performance benchmarks for different scenarios.
Release and Retirement Dates
Model | Release Date | On-Demand Retirement Date | Dedicated Mode Retirement Date |
---|---|---|---|
cohere.rerank.3-5 | 2025-05-14 | On-demand mode isn't available for this model. | At least 6 months after the release of the 1st replacement model. |
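For planning purposes, the dedicated mode retirement rule can be turned into a date calculation. This sketch assumes that "6 months" means calendar months and uses a hypothetical replacement release date, because no replacement model is named here. The result is only a lower bound; the actual retirement date is whatever the service announces.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of the target month."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def earliest_dedicated_retirement(replacement_release: date) -> date:
    """Earliest possible retirement: 6 months after the first replacement model's release."""
    return add_months(replacement_release, 6)


if __name__ == "__main__":
    hypothetical_replacement_release = date(2026, 1, 15)  # illustrative only
    print(earliest_dedicated_retirement(hypothetical_replacement_release))
```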
Rerank Model Parameters
For the Rerank model parameters, see the RerankText API documentation.