Overview
The hTech AI Ticket Assistant tracks detailed information about every AI request it makes. These statistics help you monitor usage, understand token consumption, diagnose performance issues, and optimize for cost or speed depending on your setup. This guide explains what each call statistic means and how to use that information effectively.
What Are Call Stats?
Call Stats are recorded every time the AI module sends a prompt to the selected AI model. Each entry captures:
- The model used
- Token usage (input and output)
- Processing time
- Whether the request succeeded or failed
- Any error messages returned by the API
- The related ticket (if applicable)
- The approximate cost of the request
These stats help you understand exactly how the assistant is being used.
Where to View Call Stats
You can view call statistics inside the module:
- Log in to the WHMCS admin area
- Select Addons
- Click "hTech AI Ticket Assistant"
- Open the "Call Stats" or "Usage Details" panel
Understanding Token Usage
AI models process text as tokens. A token is a small chunk of text, typically a few characters or part of a word, and the cost of a call depends on how many tokens are processed.
The module records two types of tokens:
- Tokens In – The number of tokens sent to the model (your prompt).
- Tokens Out – The number of tokens returned by the model (the AI reply).
Total tokens = Tokens In + Tokens Out
Large messages, long ticket histories, or detailed replies increase token usage.
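The arithmetic above can be sketched in a few lines. This is a minimal illustration; the field names are hypothetical and do not reflect the module's actual schema.

```python
# Hypothetical call-stat record; field names are illustrative only.
call = {"tokens_in": 1850, "tokens_out": 420}

# Total tokens = Tokens In + Tokens Out
total_tokens = call["tokens_in"] + call["tokens_out"]
print(total_tokens)  # 2270
```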
Why Token Tracking Matters
Token usage is important for:
- Monitoring resource consumption
- Avoiding Hosted Mode usage limits
- Controlling costs in BYO mode
- Identifying inefficient prompts or bloated SOPs
High token usage could indicate overly long SOPs or KB entries being embedded unnecessarily.
Understanding Runtime (Processing Time)
Each call includes a "runtime_ms" value showing how long the AI model took to generate the reply. This helps you identify performance problems.
- 0–500 ms – Very fast
- 500–1500 ms – Normal
- 1500–3000 ms – Slower but acceptable
- 3000+ ms – Model may be busy or overloaded
High runtimes may indicate peak OpenAI usage periods or unusually large inputs.
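If you export call stats for your own reporting, the guideline ranges above could be bucketed like this. This is a sketch for analysis outside the module, not part of the product itself.

```python
def classify_runtime(runtime_ms: int) -> str:
    """Bucket a call's runtime_ms using the guideline ranges above."""
    if runtime_ms < 500:
        return "very fast"
    if runtime_ms < 1500:
        return "normal"
    if runtime_ms < 3000:
        return "slower but acceptable"
    return "possibly busy or overloaded"

print(classify_runtime(1200))  # normal
```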
Success vs Failure
Each AI call is marked as either:
- Success – The model responded with a valid reply.
- Failure – The call returned an error or empty response.
If a call fails, the error message is saved to help diagnose the issue.
Common Error Messages
Examples of errors you may see:
- Invalid API key – BYO mode key incorrect or expired.
- Model not available – Selected model is offline or misconfigured.
- Rate limit reached – Too many calls in a short time.
- Input too long – Ticket + SOP + KB exceeded model limits.
- Connection error – Temporary communication issue.
Most errors resolve automatically when conditions return to normal.
Estimated Cost
The system may record an estimated cost of the API call. This estimate depends on:
- The model's price per 1,000 tokens (input and output tokens are often billed at different rates)
- The number of input tokens
- The number of output tokens
This helps BYO users forecast monthly billing and optimize prompt and SOP sizes.
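For BYO users who want to sanity-check the estimate, the calculation works roughly as follows. The rates shown are made up for illustration; always check your provider's current pricing page.

```python
def estimate_cost(tokens_in: int, tokens_out: int,
                  in_price_per_1k: float, out_price_per_1k: float) -> float:
    """Estimate call cost from token counts and per-1,000-token rates.

    Input and output tokens are often billed at different rates,
    so each side is priced separately.
    """
    return (tokens_in / 1000) * in_price_per_1k + (tokens_out / 1000) * out_price_per_1k

# Example with placeholder rates, not real pricing:
cost = estimate_cost(1850, 420, in_price_per_1k=0.0005, out_price_per_1k=0.0015)
print(round(cost, 6))  # 0.001555
```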
How to Reduce Token Usage
Token consumption can be reduced by:
- Shortening SOPs or breaking them into sections
- Keeping KB articles concise
- Using fewer past messages in Threaded Memory
- Limiting Auto Reply in high-volume departments
- Choosing more efficient models
Large inputs increase costs and slow down replies.
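One of the tips above, limiting how much past conversation is sent, can be pictured as a simple trim of the message history. This is a conceptual sketch; the module's Threaded Memory setting handles this internally.

```python
def trim_history(messages: list[str], keep_last: int = 5) -> list[str]:
    """Keep only the most recent messages to cap the input sent to the model."""
    return messages[-keep_last:]

history = ["msg1", "msg2", "msg3", "msg4", "msg5", "msg6", "msg7"]
print(trim_history(history))  # ['msg3', 'msg4', 'msg5', 'msg6', 'msg7']
```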
Identifying Problematic Tickets
Some tickets may consume more resources than others. Call Stats help identify tickets that:
- Cause multiple retries
- Generate long or complex responses
- Contain excessive ticket history
- Trigger repeated errors
Consider flagging these tickets to disable AI if needed.
Best Practices for Monitoring Call Stats
- Review stats weekly to track high usage patterns.
- Adjust department modes if automation is excessive.
- Enable Escalation Detection to avoid wasted calls on risky tickets.
- Shorten SOPs if you notice large token input numbers.
- Use analytics to balance speed, accuracy, and cost.
Summary
The Call Stats and Token Usage system gives you complete visibility into how the hTech AI Ticket Assistant is interacting with your customers. By reviewing model usage, token counts, cost estimates, and processing time, you can optimize your automation strategy and ensure efficient, predictable behavior.
If you need help interpreting call stats or improving efficiency, feel free to open a support ticket.