Overview
The hTech AI Ticket Assistant tracks detailed information about every AI request it makes. These statistics help you monitor usage, understand token consumption, diagnose performance issues, and optimize for cost or speed depending on your setup. This guide explains what each call statistic means and how to use that information effectively.
What Are Call Stats?
Call Stats are recorded every time the AI module sends a prompt to the selected AI model. Each entry captures:
- The model used
- Token usage (input and output)
- Processing time
- Whether the request succeeded or failed
- Any error messages returned by the API
- The related ticket (if applicable)
- The approximate cost of the request
These stats help you understand exactly how the assistant is being used.
Where to View Call Stats
You can view call statistics inside the module:
- Log in to the WHMCS admin area
- Select Addons
- Click "hTech AI Ticket Assistant"
- Open the "Call Stats" or "Usage Details" panel
Understanding Token Usage
AI models process text as tokens. A token is a small chunk of text, typically a few characters or part of a word, and the cost of a call depends on how many tokens are processed.
The module records two types of tokens:
- Tokens In – The number of tokens sent to the model (your prompt).
- Tokens Out – The number of tokens returned by the model (the AI reply).
Total tokens = Tokens In + Tokens Out
Large messages, long ticket histories, or detailed replies increase token usage.
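The arithmetic above can be sketched in a few lines. This is a minimal illustration; the field names are hypothetical and do not reflect the module's actual schema.

```python
# Hypothetical call-stat record; field names are illustrative only.
call = {"tokens_in": 1850, "tokens_out": 420}

# Total tokens = Tokens In + Tokens Out
total_tokens = call["tokens_in"] + call["tokens_out"]
print(total_tokens)  # 2270
```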
Why Token Tracking Matters
Token usage is important for:
- Monitoring resource consumption
- Avoiding Hosted Mode usage limits
- Controlling costs in BYO mode
- Identifying inefficient prompts or bloated SOPs
High token usage could indicate overly long SOPs or KB entries being embedded unnecessarily.
Understanding Runtime (Processing Time)
Each call includes a "runtime_ms" value showing how long the AI model took to generate the reply. This helps you identify performance problems.
- 0–500 ms – Very fast
- 500–1500 ms – Normal
- 1500–3000 ms – Slower but acceptable
- 3000+ ms – Model may be busy or overloaded
High runtimes may indicate peak OpenAI usage periods or unusually large inputs.
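If you export call stats for your own reporting, the guideline ranges above could be bucketed like this. This is a sketch for analysis outside the module, not part of the product itself.

```python
def classify_runtime(runtime_ms: int) -> str:
    """Bucket a call's runtime_ms using the guideline ranges above."""
    if runtime_ms < 500:
        return "very fast"
    if runtime_ms < 1500:
        return "normal"
    if runtime_ms < 3000:
        return "slower but acceptable"
    return "possibly busy or overloaded"

print(classify_runtime(1200))  # normal
```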
Success vs Failure
Each AI call is marked as either:
- Success – The model responded with a valid reply.
- Failure – The call returned an error or empty response.
If a call fails, the error message is saved to help diagnose the issue.
Common Error Messages
Examples of errors you may see:
- Invalid API key – BYO mode key incorrect or expired.
- Model not available – Selected model is offline or misconfigured.
- Rate limit reached – Too many calls in a short time.
- Input too long – Ticket + SOP + KB exceeded model limits.
- Connection error – Temporary communication issue.
Most errors resolve automatically when conditions return to normal.
Estimated Cost
The system may record an estimated cost of the API call. This estimate depends on:
- The model's price per 1,000 tokens (input and output tokens are often billed at different rates)
- The number of input tokens
- The number of output tokens
This helps BYO users forecast monthly billing and optimize prompt and SOP sizes.
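For BYO users who want to sanity-check the estimate, the calculation works roughly as follows. The rates shown are made up for illustration; always check your provider's current pricing page.

```python
def estimate_cost(tokens_in: int, tokens_out: int,
                  in_price_per_1k: float, out_price_per_1k: float) -> float:
    """Estimate call cost from token counts and per-1,000-token rates.

    Input and output tokens are often billed at different rates,
    so each side is priced separately.
    """
    return (tokens_in / 1000) * in_price_per_1k + (tokens_out / 1000) * out_price_per_1k

# Example with placeholder rates, not real pricing:
cost = estimate_cost(1850, 420, in_price_per_1k=0.0005, out_price_per_1k=0.0015)
print(round(cost, 6))  # 0.001555
```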
How to Reduce Token Usage
Token consumption can be reduced by:
- Shortening SOPs or breaking them into sections
- Keeping KB articles concise
- Using fewer past messages in Threaded Memory
- Limiting Auto Reply in high-volume departments
- Choosing more efficient models
Large inputs increase costs and slow down replies.
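One of the tips above, limiting how much past conversation is sent, can be pictured as a simple trim of the message history. This is a conceptual sketch; the module's Threaded Memory setting handles this internally.

```python
def trim_history(messages: list[str], keep_last: int = 5) -> list[str]:
    """Keep only the most recent messages to cap the input sent to the model."""
    return messages[-keep_last:]

history = ["msg1", "msg2", "msg3", "msg4", "msg5", "msg6", "msg7"]
print(trim_history(history))  # ['msg3', 'msg4', 'msg5', 'msg6', 'msg7']
```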
Identifying Problematic Tickets
Some tickets may consume more resources than others. Call Stats help identify tickets that:
- Cause multiple retries
- Generate long or complex responses
- Contain excessive ticket history
- Trigger repeated errors
Consider flagging these tickets to disable AI if needed.
Best Practices for Monitoring Call Stats
- Review stats weekly to track high usage patterns.
- Adjust department modes if automation is excessive.
- Enable Escalation Detection to avoid wasted calls on risky tickets.
- Shorten SOPs if you notice large token input numbers.
- Use analytics to balance speed, accuracy, and cost.
Summary
The Call Stats and Token Usage system gives you complete visibility into how the hTech AI Ticket Assistant is interacting with your customers. By reviewing model usage, token counts, cost estimates, and processing time, you can optimize your automation strategy and ensure efficient, predictable behavior.
If you need help interpreting call stats or improving efficiency, feel free to open a support ticket.