Phonely’s A/B Testing tool helps you experiment, measure, and improve your voice AI’s performance with real-world data. It lets you compare different versions of your AI agent — such as voice, workflow, or conversation settings — to determine which one performs best in live calls. This guide explains how A/B testing works, how to create and run a test, and how to interpret the results.

What is A/B Testing in Phonely?

A/B Testing lets you run controlled experiments between two versions of your AI agent:
  • The Base Agent (Control) — your existing setup.
  • The Test Agent (Variant) — a duplicate with specific changes.
Incoming calls are automatically split between the two versions. Phonely then measures how each one performs based on the success criteria you define — such as call duration, outcomes, or end reasons. This gives you clear, data-driven insight into which changes actually improve performance, instead of relying on guesswork.

Where to Access A/B Testing

  1. Go to your Phonely menu and open Testing.
  2. Choose A/B Testing from the top navigation bar.
You’ll see three sections:
  • Planned – Tests you’ve set up but haven’t started yet.
  • In Progress – Tests currently running on live calls.
  • Completed – Finished tests where you can review performance results.

Creating a New A/B Test

Click Create a new test in the Planned section to begin. A step-by-step setup window will appear.

Name and Describe Your Test

  • Test Name: Give your test a descriptive name that identifies what you’re testing.
    Example: “Friendly Voice vs Formal Voice – Support Line.”
  • Description: Explain your test goal.
    Example: “Evaluate whether a friendly voice style improves appointment confirmations.”

Choose What You’d Like to Test

Phonely supports multiple types of tests depending on your experiment goal. You can select one of the following:
| Type | What It Tests | Common Use Case |
| --- | --- | --- |
| Voice | Compares different AI voices or tones | Test whether a friendly voice leads to higher customer engagement |
| Workflow | Tests different conversation flows or logic | Compare two call scripts or routing paths |
| Agent Settings | Evaluates settings like interruption, delay, or background noise | Find the balance between quick responses and natural flow |
| Knowledge Base | Tests different documentation sources | See which knowledge sources improve answer accuracy |
| Other | For any test outside these categories | |
Once selected, click Next.

Define End Criteria and Call Distribution

Here, you’ll specify how long the test should run and what share of calls should be routed to your test version.

End Criteria

Choose when the test should stop automatically:
  • By Number of Calls: Ends after a set number of test calls.
    Example: Stop after 1,000 calls routed to the test version.
  • By Number of Days: Runs for a fixed duration (e.g., 10 days).
  • AI-Determined (Coming Soon): Will allow Phonely to automatically decide when enough data is collected.

Call Route Percentage

Use the slider to define how much traffic is sent to the test version.
  • Example: Route 30% of calls to the test, and keep 70% on the base agent.
  • Recommendation: Start small (20–30%) to ensure stability before scaling up.
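Under the hood, a percentage split like this can be modeled as a weighted random assignment per call. This is a minimal sketch of the idea, not Phonely’s implementation:

```python
import random

def route_call(test_share: float = 0.30) -> str:
    """Randomly assign an incoming call to the test or base agent."""
    return "test" if random.random() < test_share else "base"

# Over many calls the observed split converges on the configured percentage:
calls = [route_call(0.30) for _ in range(10_000)]
test_fraction = calls.count("test") / len(calls)  # close to 0.30
```

Random per-call assignment keeps the two groups comparable, since nothing about an individual caller influences which agent they reach.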
After configuring both, click Next.

Set Success Criteria

This step defines what “success” means for your test. You can base success on how calls end, what outcomes are tagged, or how long they last.

Call Ended Reason-Based Testing

Evaluates success based on how the call ended.
Use this if you care about the technical or behavioral outcome of the call.
  • Examples of end reasons:
    • Call Transferred
    • Voicemail
    • Max Duration
    • Silence Timeout
    • Customer Ended
    • Agent Ended
Use case: “We want more calls to end in transfers to the sales team.”

Call Outcome-Based Testing

Evaluates based on your defined business outcomes — which you can configure inside your flow.
  • Examples:
    • Appointment Booked
    • Lead Qualified
    • Legal Inquiry Logged
    • Support Issue Resolved
Use case: “We want to see if the new workflow increases lead qualification rate.”

Duration-Based Testing

Optimizes for call length.
  • Shorter Calls: Indicate more efficiency or faster resolution (ideal for support or routing).
  • Longer Calls: Indicate better engagement or deeper discussion (ideal for sales).
Use case: “Does the new prompt shorten average support calls by 15%?”
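All three criteria above reduce to the same calculation: count which calls qualify as successes and divide by the total. Here is a hedged sketch — the field names (`end_reason`, `outcome`, `duration_s`) are hypothetical, not Phonely’s actual data model:

```python
def success_rate(calls: list[dict], is_success) -> float:
    """Fraction of calls matching the chosen success criterion."""
    if not calls:
        return 0.0
    return sum(1 for c in calls if is_success(c)) / len(calls)

# Hypothetical call records for illustration only.
calls = [
    {"end_reason": "Call Transferred", "outcome": "Lead Qualified", "duration_s": 180},
    {"end_reason": "Customer Ended", "outcome": None, "duration_s": 95},
    {"end_reason": "Call Transferred", "outcome": "Appointment Booked", "duration_s": 240},
]

# The same records scored three ways, one per criterion type:
by_end_reason = success_rate(calls, lambda c: c["end_reason"] == "Call Transferred")
by_outcome    = success_rate(calls, lambda c: c["outcome"] is not None)
by_duration   = success_rate(calls, lambda c: c["duration_s"] < 120)  # shorter = success
```

Only the success predicate changes between criterion types; the comparison between Base and Test agents works the same way in each case.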

LLM-Based Evaluation (Coming Soon)

A future option will allow Phonely’s AI to analyze transcripts and automatically evaluate call quality based on context.
Once you’ve chosen and configured your success criteria, click Next.

Editing the Test Agent

After setup, Phonely automatically duplicates your base agent into a Test Agent.
You’ll see a banner:
“You are editing a test agent. This agent will be used to test the new changes.”
Keep all other elements identical to ensure that results reflect only the changes you made. Once done, click Continue to save your test agent.

Running the A/B Test

After your setup is complete, you’ll return to the A/B Testing dashboard.
  1. Under the Planned section, find your new test.
  2. Click Begin Test to start routing calls.
Your test will then appear under In Progress, showing live metrics such as:
  • Success Rate for each agent.
  • Total Answered Calls.
  • Traffic Allocation.
Calls will automatically be divided between your Base Agent and Test Agent.

Monitoring and Analyzing Results

You can monitor ongoing results anytime during the test:
  • Track Success Rate Trends to see which variant performs better.
  • Check if call allocation percentages remain balanced.
  • Review call outcomes and end reasons to ensure tagging consistency.
Once your call limit or duration target is reached, the test moves to the Completed section. Click View Results to analyze:
  • Performance metrics (success rate, duration, end reason distribution).
  • Comparative insights between Base and Test agents.
  • Which version achieved better alignment with your success criteria.
The variant with the higher success percentage is your winning configuration — which you can apply to your main agent for future calls.
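Choosing the winner is ultimately a comparison of success percentages between the two variants. A minimal sketch, with illustrative names and numbers:

```python
def pick_winner(base_successes: int, base_calls: int,
                test_successes: int, test_calls: int) -> str:
    """Name the variant with the higher success percentage.

    Ties favor the base agent: don't change anything without an improvement.
    """
    base_rate = base_successes / base_calls
    test_rate = test_successes / test_calls
    return "test" if test_rate > base_rate else "base"

# Illustrative numbers: 70/30 split, test converts at 11% vs. base at ~8.6%.
winner = pick_winner(base_successes=60, base_calls=700,
                     test_successes=33, test_calls=300)
```

With small call volumes, a gap like this can be noise; letting the test run to its configured call limit before acting on the result keeps the comparison meaningful.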

Editing an A/B Test

Phonely allows you to update your A/B test at any point before or during the experiment. This is useful when you want to refine the test name, description, routing percentage, or modify the Test Agent itself. You can edit your test from the Planned or In Progress section.
  1. Open your A/B Testing dashboard.
  2. Find the test you want to modify.
  3. Click the ⋮ menu in the top-right corner of the test card.
  4. Choose one of the following options:
Edit A/B Test

Use this option to update:
  • Test name
  • Description
  • What you’re testing (Voice, Workflow, Agent Settings, etc.)
  • End criteria (number of calls or days)
  • Call route percentage
  • Success criteria
This opens the same guided setup window you used when creating the test, allowing you to adjust any configuration step-by-step.

Edit Test Agent

Selecting this option opens the Test Agent, which is the duplicate created during setup.
You’ll see a banner reminding you:
“You are editing a test agent. This agent will be used to test the new changes.”
Only modify the specific variables you want to test.
All other settings should remain identical to your Base Agent to ensure fair and reliable results.
When finished, click Continue to save your changes.

Delete Test

If you want to remove a planned or completed test entirely, choose Delete Test.
(Tests already running cannot be deleted until they finish.)

When to Edit a Test

You might want to make edits when:
  • The description or test name needs clarification.
  • You want to adjust call routing (e.g., from 30% to 45%).
  • You need to modify the workflow, voice, or KB version in the Test Agent.
  • You decide to extend the test duration from 1 day to 7 days.
  • You want to tighten or change the success criteria.
Any changes you make will immediately update the test configuration.