By using metrics, you can monitor the endpoints in Generative AI Agents. Review the following topics for more information about these metrics.
Endpoint Metrics
This section lists the metrics for agent endpoints in Generative AI Agents. You can get the following metrics on an endpoint's details page.
Number of calls: Number of calls that the agent hosted on this endpoint has processed.
Total processing time (ms): Total time, in milliseconds, for a call to finish processing.
Service errors count: Number of calls that failed with an error on the service side.
Client errors count: Number of calls that failed with an error on the client side.
Total input characters consumed: Number of input characters that the agent hosted on this endpoint has processed.
Total output characters produced: Number of output characters that the agent hosted on this endpoint has produced.
Number of error traces: Number of traces with an error. This metric applies only if tracing is enabled for this endpoint.
Success rate: Successful calls divided by the total number of calls.
Tip
On an endpoint's details page in the Generative AI Agents service, select the Options menu in any of the endpoint metric charts to get the following options:
View Query in Metrics Explorer
Copy chart URL
Copy query in Monitoring Query Language (MQL)
Create an alarm on this query
Table View
Viewing Query in Metrics Explorer
The metrics explorer is a resource in the Monitoring service. To get permission to work with the Monitoring service resources, ask an administrator to review the IAM policies in Securing Monitoring and grant you the proper access for your role.
For each endpoint metric, select the Options menu in the metric's chart and then click View Query in Metrics Explorer. The following list shows the metric parameter and the Monitoring Query Language (MQL) expression used for each endpoint metric.
Number of calls (TotalInvocationCount): TotalInvocationCount[1m].count()
Total processing time (InvocationLatency): InvocationLatency[1m].mean()
Service errors count (ServerErrorCount): ServerErrorCount[1m].count()
Client errors count (ClientErrorCount): ClientErrorCount[1m].count()
Total input characters consumed (InputCharactersCount): InputCharactersCount[1m].sum()
Total output characters produced (OutputCharactersCount): OutputCharactersCount[1m].sum()
Number of error traces (ErrorTraceCount): ErrorTraceCount[1m].sum()
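To run one of these queries outside the console, you can call the Monitoring API directly. The following is a minimal sketch using the OCI Python SDK; the namespace name, compartment OCID, and time window are placeholder assumptions, so copy the exact query and namespace from the chart's Copy query in Monitoring Query Language (MQL) option or from Metrics Explorer.

```python
# Minimal sketch: running one of the endpoint MQL queries through the OCI
# Monitoring API with the Python SDK. The namespace name, compartment OCID,
# and time window are placeholder assumptions; copy the exact query and
# namespace from the chart's "Copy query in Monitoring Query Language (MQL)"
# option or from Metrics Explorer.
from datetime import datetime, timedelta, timezone

import oci

config = oci.config.from_file()  # reads the default ~/.oci/config profile
monitoring = oci.monitoring.MonitoringClient(config)

end_time = datetime.now(timezone.utc)
start_time = end_time - timedelta(hours=1)

details = oci.monitoring.models.SummarizeMetricsDataDetails(
    namespace="oci_generativeaiagent",         # assumed namespace name
    query="TotalInvocationCount[1m].count()",  # "Number of calls" from the list above
    start_time=start_time,
    end_time=end_time,
)

response = monitoring.summarize_metrics_data(
    compartment_id="ocid1.compartment.oc1..example",  # placeholder compartment OCID
    summarize_metrics_data_details=details,
)

for metric in response.data:
    for point in metric.aggregated_datapoints:
        print(point.timestamp, point.value)
```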
The success rate is calculated as the number of successful calls divided by the total number of calls.
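As a rough illustration of that calculation, the sketch below sums the call and error counts over a window and derives a rate. It reuses the client, time window, and placeholder namespace and compartment OCID from the previous example, and it assumes that a successful call is one that raised neither a service error nor a client error.

```python
# Rough sketch: deriving a success rate from the counts above. Reuses the
# `monitoring` client, time window, and placeholder namespace/compartment from
# the previous example, and assumes a "successful" call is one that raised
# neither a service error nor a client error.
def window_total(query: str) -> float:
    """Sum the datapoints returned for a count-style MQL query."""
    details = oci.monitoring.models.SummarizeMetricsDataDetails(
        namespace="oci_generativeaiagent",  # assumed namespace name
        query=query,
        start_time=start_time,
        end_time=end_time,
    )
    data = monitoring.summarize_metrics_data(
        compartment_id="ocid1.compartment.oc1..example",  # placeholder compartment OCID
        summarize_metrics_data_details=details,
    ).data
    return sum(p.value for metric in data for p in metric.aggregated_datapoints)


total_calls = window_total("TotalInvocationCount[1m].count()")
failed_calls = (
    window_total("ServerErrorCount[1m].count()")
    + window_total("ClientErrorCount[1m].count()")
)
success_rate = (total_calls - failed_calls) / total_calls if total_calls else 0.0
print(f"Success rate over the window: {success_rate:.2%}")
```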
For each endpoint metric, select the Options menu in the metric's chart and then click Create an alarm on this query to open a prepopulated Create alarm page in the Monitoring service. Fill in the remaining fields to set an alarm for the metric that you selected.
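You can also create the alarm programmatically. The following sketch uses the OCI Python SDK's create_alarm operation; the namespace, OCIDs, threshold, and display name are placeholder assumptions, and the prepopulated console page remains the simplest path.

```python
# Sketch: creating an alarm on an endpoint metric query with the OCI Python
# SDK instead of the console wizard. The namespace, OCIDs, threshold, and
# display name are placeholder assumptions.
import oci

config = oci.config.from_file()
monitoring = oci.monitoring.MonitoringClient(config)

alarm_details = oci.monitoring.models.CreateAlarmDetails(
    display_name="agent-endpoint-service-errors",             # assumed alarm name
    compartment_id="ocid1.compartment.oc1..example",          # where the alarm lives
    metric_compartment_id="ocid1.compartment.oc1..example",   # where the metric is emitted
    namespace="oci_generativeaiagent",                         # assumed namespace name
    query="ServerErrorCount[1m].count() > 0",                  # alarm condition on the MQL query
    severity="WARNING",
    destinations=["ocid1.onstopic.oc1..example"],              # notifications topic OCID
    is_enabled=True,
)

alarm = monitoring.create_alarm(alarm_details).data
print(alarm.id, alarm.lifecycle_state)
```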