The nebuly platform provides full support for the IBM watsonx SDK. In this section,
we show how to monitor the requests made to IBM watsonx models.
First, let's perform a simple chat request using the IBM watsonx SDK, recording
the interaction's start and end times:
import datetime

from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import Model
from ibm_watsonx_ai.foundation_models.utils.enums import ModelTypes

ibm_model_id = ModelTypes.GRANITE_20B_CODE_INSTRUCT

time_start = datetime.datetime.utcnow().isoformat()

model = Model(
    model_id=ibm_model_id,
    credentials=Credentials(
        api_key="<IAM_API_KEY>",
        url="<IBM_URL>"
    ),
    project_id="<IBM_PROJECT_ID>"
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who won the world series in 2020?"}
]

generated_response = model.chat(messages=messages)
assistant = generated_response['choices'][0]['message']['content']

time_end = datetime.datetime.utcnow().isoformat()
Now, let's build the payload with all the information to be sent to the nebuly
platform. We include the conversation details, along with the details of the
model used, which the platform uses to compute and track the cost of the
interaction.
Costs are computed only for the following IBM Granite foundation models:
- ibm/granite-13b-instruct-v2
- ibm/granite-8b-japanese
- ibm/granite-20b-multilingual
- ibm/granite-3-2b-instruct
- ibm/granite-3-8b-instruct
- ibm/granite-guardian-3-2b
- ibm/granite-guardian-3-8b
- ibm/granite-3b-code-instruct
- ibm/granite-8b-code-instruct
- ibm/granite-20b-code-instruct
- ibm/granite-34b-code-instruct
You can also send traces for other Granite and third-party models, but their
cost will not be computed.
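If you want to know up front whether an interaction's cost will be tracked, one option is to keep the supported model ids in a set and test membership before sending. This is just a convenience sketch on our side (the set mirrors the list above; it is not part of the nebuly or watsonx APIs):

```python
# Models for which nebuly computes interaction costs (mirrors the list above).
COST_SUPPORTED_MODELS = {
    "ibm/granite-13b-instruct-v2",
    "ibm/granite-8b-japanese",
    "ibm/granite-20b-multilingual",
    "ibm/granite-3-2b-instruct",
    "ibm/granite-3-8b-instruct",
    "ibm/granite-guardian-3-2b",
    "ibm/granite-guardian-3-8b",
    "ibm/granite-3b-code-instruct",
    "ibm/granite-8b-code-instruct",
    "ibm/granite-20b-code-instruct",
    "ibm/granite-34b-code-instruct",
}

def cost_is_tracked(model_id: str) -> bool:
    """Return True when nebuly will compute the interaction cost for this model id."""
    return model_id in COST_SUPPORTED_MODELS
```

For a `ModelTypes` enum member, pass its string value (e.g. `cost_is_tracked(ibm_model_id.value)`).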
import uuid

nebuly_traces = [  # Optional, it can be an empty list
    {
        "model": ibm_model_id.value,
        "messages": messages,
        "output": assistant,
        "input_tokens": generated_response['usage']['prompt_tokens'],  # Needed to compute the cost
        "output_tokens": generated_response['usage']['completion_tokens'],  # Needed to compute the cost
    }
]

request_body = {
    "interaction": {
        "conversation_id": str(uuid.uuid4()),
        "input": messages[-1]['content'],
        "output": assistant,
        "time_start": time_start,
        "time_end": time_end,
        "end_user": "<USER_ID>"
    },
    "anonymize": False,
    "traces": nebuly_traces
}
At this point, we have all the information needed to send the request to the nebuly platform.
The following code snippet shows how to send it:
import requests

url = "https://backend.nebuly.com/event-ingestion/api/v2/events/trace_interaction"
headers = {
    "Authorization": "Bearer <NEBULY_API_KEY>",
    "Content-Type": "application/json"
}

response = requests.post(url, json=request_body, headers=headers)
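Since `requests.post` returns a `Response` object, you will usually want to surface non-2xx answers (for example, an invalid API key) instead of silently ignoring them. A minimal sketch of one way to do this — the `send_interaction` helper and the `timeout` value are our own additions, not part of the nebuly API:

```python
import requests

def send_interaction(url: str, headers: dict, request_body: dict) -> requests.Response:
    """POST the interaction payload and raise on HTTP errors."""
    response = requests.post(url, json=request_body, headers=headers, timeout=10)
    # raise_for_status() turns 4xx/5xx answers into a requests.HTTPError,
    # so ingestion failures are not silently dropped.
    response.raise_for_status()
    return response
```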
Below is a detailed explanation of some of the specific parameters used in the
code snippets above:
- end_user: An id or username uniquely identifying the end-user. We recommend hashing their username or email address, in order to avoid sending us any identifying information.
- conversation_id: A unique identifier for the conversation. It is used to group all the interactions exchanged during the conversation.
- anonymize: If set to True, a PII detection algorithm will be applied to the input and output messages to remove any personal information.
You can find more details in the API Reference section.
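To illustrate how conversation_id groups interactions: generate the id once per conversation and reuse it in every payload. The `build_interaction` helper below is hypothetical, and the remaining interaction fields (timestamps, end_user, traces) are omitted for brevity:

```python
import uuid

# Generated once, at the start of the conversation, then reused for every turn.
conversation_id = str(uuid.uuid4())

def build_interaction(user_input: str, assistant_output: str) -> dict:
    """Hypothetical helper: every turn of the conversation shares the same id."""
    return {
        "conversation_id": conversation_id,
        "input": user_input,
        "output": assistant_output,
    }

turn_1 = build_interaction("Who won the world series in 2020?", "The LA Dodgers.")
turn_2 = build_interaction("Where was it played?", "Globe Life Field in Arlington, Texas.")
```

Because both payloads carry the same conversation_id, the platform can display them as a single conversation rather than two unrelated interactions.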