Metrics in Generative AI Agents

By using metrics, you can monitor the endpoints in Generative AI Agents. Review the following topics for more information about these metrics.

Endpoint Metrics

This section lists the metrics for agent endpoints in Generative AI Agents. You can view the following metrics on an endpoint's detail page.

Metric Display Name | Description
Number of calls | Number of calls that the agent hosted on this endpoint has processed
Total processing time (ms) | Time, in milliseconds, that a call takes to finish
Service errors count | Number of calls with an error from the service side
Client errors count | Number of calls with an error from the client side
Total input characters consumed | Number of input characters that the agent hosted on this endpoint has processed
Total output characters produced | Number of output characters that the agent hosted on this endpoint has produced
Number of error traces | Number of traces with an error (applies only if tracing is enabled for this endpoint)
Success rate | Successful calls divided by the total number of calls
Tip

In the Generative AI Agents service, on an endpoint's detail page, select the Options menu in any endpoint metric chart to get the following options:
  • View Query in Metrics Explorer
  • Copy chart URL
  • Copy query in Monitoring Query Language (MQL)
  • Create an alarm on this query
  • Table View

Viewing Query in Metrics Explorer

The metrics explorer is a resource in the Monitoring service. To get permission to work with the Monitoring service resources, ask an administrator to review the IAM policies in Securing Monitoring and grant you the proper access for your role.

For any endpoint metric, select the Options menu in that metric's chart, and then select View Query in Metrics Explorer. The following table lists the metric parameter and the Monitoring Query Language (MQL) expression used for each endpoint metric.

Metric Display Name | Metric Parameter | MQL
Number of calls | TotalInvocationCount | TotalInvocationCount[1m].count()
Total processing time (ms) | InvocationLatency | InvocationLatency[1m].mean()
Service errors count | ServerErrorCount | ServerErrorCount[1m].count()
Client errors count | ClientErrorCount | ClientErrorCount[1m].count()
Total input characters consumed | InputCharactersCount | InputCharactersCount[1m].sum()
Total output characters produced | OutputCharactersCount | OutputCharactersCount[1m].sum()
Number of error traces | ErrorTraceCount | ErrorTraceCount[1m].sum()
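
Each expression in the table above follows the same pattern: metric parameter, interval in square brackets, then a statistic. The following Python sketch shows one way to assemble these strings, optionally scoped to a single endpoint with the resourceId dimension that the success rate query on this page uses. The names ENDPOINT_METRICS and build_mql are hypothetical helpers for illustration and aren't part of any Oracle SDK.

```python
from typing import Optional

# Metric parameter -> statistic, copied from the table above.
ENDPOINT_METRICS = {
    "TotalInvocationCount": "count",
    "InvocationLatency": "mean",
    "ServerErrorCount": "count",
    "ClientErrorCount": "count",
    "InputCharactersCount": "sum",
    "OutputCharactersCount": "sum",
    "ErrorTraceCount": "sum",
}


def build_mql(metric: str, endpoint_ocid: Optional[str] = None, interval: str = "1m") -> str:
    """Return an MQL expression string for one endpoint metric (hypothetical helper).

    If endpoint_ocid is given, a resourceId dimension filter is added,
    mirroring the filter shown in the success rate query on this page.
    """
    statistic = ENDPOINT_METRICS[metric]  # raises KeyError for unknown metrics
    dimensions = f'{{resourceId = "{endpoint_ocid}"}}' if endpoint_ocid else ""
    return f"{metric}[{interval}]{dimensions}.{statistic}()"


if __name__ == "__main__":
    print(build_mql("InvocationLatency"))
    print(build_mql("ServerErrorCount", endpoint_ocid="<endpoint-OCID>"))
```

The resulting string can be pasted into the Metrics Explorer query editor. Replace <endpoint-OCID> with the OCID of the endpoint that you want to monitor.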

The success rate is the number of successful calls (calls with status code 200) divided by the total number of calls, expressed as a percentage. It's calculated with the following MQL:

TotalInvocationCount[1m]{resourceId = "<endpoint-OCID>", StatusCode="200"}.grouping().count()
/ TotalInvocationCount[1m]{resourceId = "<endpoint-OCID>"}.grouping().count() * 100
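
To check the arithmetic behind this query outside the Console, the same ratio can be computed from two call counts. The following is a minimal Python sketch, assuming you already have the count of calls with status code 200 and the total count for the same interval. The function name success_rate_percent is hypothetical, and returning None for an interval with no calls is a choice made for this example, not documented service behavior.

```python
from typing import Optional


def success_rate_percent(successful_calls: int, total_calls: int) -> Optional[float]:
    """Successful calls divided by total calls, times 100 (hypothetical helper).

    Returns None when there were no calls in the interval, because the
    ratio is undefined in that case (an assumption made for this sketch).
    """
    if total_calls == 0:
        return None
    return 100.0 * successful_calls / total_calls


if __name__ == "__main__":
    # Example: 3 of 4 calls returned status code 200.
    print(success_rate_percent(3, 4))
```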

Creating an Alarm for an Endpoint Metric

For any endpoint metric, select the Options menu in that metric's chart, and then select Create an alarm on this query. The Create alarm page opens in the Monitoring service with the query already filled in. Complete the remaining fields to set an alarm for the metric that you selected.
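
An alarm query in the Monitoring service is typically a metric query followed by a comparison against a threshold. The following Python sketch illustrates that shape by appending a condition to one of the endpoint metric expressions from this page. The helper name alarm_query and the threshold value are hypothetical and chosen only for illustration. Confirm the exact trigger rule on the Create alarm page.

```python
# Comparison operators accepted by this sketch (an assumption for illustration).
ALLOWED_OPERATORS = {">", ">=", "<", "<=", "==", "!="}


def alarm_query(base_query: str, operator: str, threshold: int) -> str:
    """Append a threshold condition to an MQL expression (hypothetical helper)."""
    if operator not in ALLOWED_OPERATORS:
        raise ValueError(f"unsupported operator: {operator}")
    return f"{base_query} {operator} {threshold}"


if __name__ == "__main__":
    # Example: a condition for more than 5 service errors in a one-minute interval.
    print(alarm_query("ServerErrorCount[1m].count()", ">", 5))
```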