By Nick Morris, Senior Director, Data Analytics, LiveVox
Buzz about the potential benefits of artificial intelligence has grown steadily since generative models like ChatGPT first emerged in late 2022. Amid this wave of enthusiasm, the prospect of using AI to reshape customer engagement and enhance operational efficiency has sparked intense debate over the ethics of replacing human customer service agents with artificial intelligence. Using generative AI to converse with customers has also raised significant concerns: it is known to provide incorrect or biased information, and it has the potential for misuse.
Despite these misgivings about generative AI models, the versatility of the underlying technology has already demonstrated excellent potential for a wide range of contact center applications, such as automating and streamlining processes, generating insights for data-driven decisions, and personalizing customer interactions.
Consider the following use cases for generative AI in contact centers:
- Speech Recognition: AI can analyze speech patterns and identify keywords or phrases to route calls to the right agent, reducing wait times and improving customer satisfaction.
- Voice Cadence & Volume Visualization and Sentiment Analysis: By analyzing the tone and wording of customer interactions, AI can detect patterns of dissatisfaction or frustration. Providing contact center agents with access to real-time sentiment helps them identify customer emotions so they can provide proactive, personalized service.
- Predictive Analytics: AI can analyze historical data and identify trends that predict future customer behavior. This allows contact centers to proactively address issues before they occur, reducing the number of customer complaints and improving overall satisfaction.
- Agent Assist: AI can provide real-time support for contact center agents by offering suggestions or recommendations, as well as step-by-step procedures to resolve customer issues.
- Automated Chatbots: Use AI and natural language processing (NLP) to analyze and understand input text and generate appropriate responses.
- Virtual Assistants: Leverage advanced AI, NLP, and machine learning to engage customers in natural conversation, freeing human agents for more complex tasks.
With these scenarios in mind, it’s time to think seriously about integrating AI/ML into everyday, high-impact contact center solutions and creating a scalable AI strategy that aligns with your organization’s objectives and vision. It all begins with understanding the foundational building blocks of these technologies. Let’s dive into the essential concepts of machine learning and how they tie into contact center solutions.
Essential concepts of machine learning
Recognizing AI’s capabilities within the contact center is the first step to leveraging it effectively. At LiveVox, much of our work to date has centered on machine learning: training models to find patterns in data and make predictions from them. There are three basic approaches, each with distinct applications for contact centers.
Supervised learning
Supervised learning uses labeled data to train models for specific prediction tasks. In this method, the desired outcomes are already known and provided in the training dataset. The model learns to make predictions based on the input-output relationships in the data. Speech transcription, sentiment analysis, and intent recognition fall under this category.
Supervised learning methods are widely used across various technologies in use today. LiveVox’s Customer Happiness Index is a good example. The tool analyzes multiple data points and predicts an outcome based on the patterns discovered. For example, let’s say a customer has dropped out of the IVR twice, called customer care seven times in the last 90 days, and expressed negative sentiment in 50% of those interactions. Based on these data points and the patterns identified, we can predict that this customer might have a negative experience and be considered at risk.
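To make this concrete, here is a minimal supervised learning sketch along those lines. It is illustrative only, not LiveVox’s actual Customer Happiness Index: the features, the training data, and the choice of scikit-learn’s logistic regression are all assumptions made for the example.

```python
from sklearn.linear_model import LogisticRegression

# Hypothetical labeled data: [IVR drop-outs, calls in last 90 days,
# share of interactions with negative sentiment]. The labels are known
# outcomes: 1 = customer had a negative experience, 0 = did not.
X_train = [
    [0, 1, 0.0],
    [1, 3, 0.2],
    [2, 7, 0.5],
    [3, 9, 0.8],
    [0, 2, 0.1],
    [2, 6, 0.6],
]
y_train = [0, 0, 1, 1, 0, 1]

model = LogisticRegression().fit(X_train, y_train)

# Score the customer from the example: 2 IVR drop-outs, 7 calls in
# 90 days, negative sentiment in 50% of interactions.
risk = model.predict_proba([[2, 7, 0.5]])[0][1]
print(f"Probability this customer is at risk: {risk:.2f}")
```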
Unsupervised learning
Unsupervised learning is more challenging. The model learns patterns and structures within the data without any predetermined outcomes. Because labeled data is usually unavailable, the model is not given target answers during training; instead, it is fed numerous data points and left to discover structure in them on its own. These methods are used for tasks like customer segmentation, anomaly detection, and topic modeling. Typical applications include recommendation systems, fraud detection, and membership or subscriber forecasting.
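As a sketch of what this looks like in practice, the example below uses k-means clustering, one common unsupervised method, to segment customers by usage patterns. The features and cluster count are hypothetical, chosen purely for illustration.

```python
from sklearn.cluster import KMeans

# Hypothetical customer features: [calls per month, avg. handle time (minutes)].
# No labels are provided; the algorithm discovers the groupings on its own.
X = [
    [1, 3.0], [2, 2.5], [1, 4.0],      # occasional, quick callers
    [8, 6.0], [9, 5.5], [10, 7.0],     # frequent callers
    [4, 15.0], [5, 14.0],              # long, complex conversations
]

segments = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print(segments)  # cluster label assigned to each customer
```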
Deep learning
Another more advanced technique within machine learning is deep learning, which employs supervised learning for training and can also be applied to unsupervised learning tasks. Deep learning focuses on training artificial neural networks with large amounts of data to address highly intricate and open-ended problems. For contact centers, deep learning can be found in applications like real-time agent assistance, intelligent call routing, and predictive analytics.
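As a toy illustration, the sketch below trains a small feed-forward neural network with scikit-learn’s MLPClassifier to flag calls for routing to a senior agent. Production deep learning systems use far larger networks and frameworks such as TensorFlow or PyTorch, and the features here are hypothetical.

```python
from sklearn.neural_network import MLPClassifier

# Hypothetical routing features: [minutes on hold, prior escalations,
# negative-sentiment score]. Label: 1 = route to a senior agent.
X_train = [
    [1, 0, 0.1], [2, 0, 0.2], [3, 1, 0.4],
    [8, 2, 0.7], [10, 3, 0.9], [7, 2, 0.8],
]
y_train = [0, 0, 0, 1, 1, 1]

# Two hidden layers of artificial neurons; stacking more (and wider) layers
# is what lets deep networks tackle intricate, open-ended problems.
net = MLPClassifier(hidden_layer_sizes=(16, 8), max_iter=2000, random_state=0)
net.fit(X_train, y_train)

print(net.predict([[9, 3, 0.85]]))  # e.g., [1]: route to a senior agent
```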

Overall, machine learning enables contact centers to use their data to gain more significant insights and drive improved customer experiences. By refining the data that is used to train these models, contact centers can further improve the accuracy and capabilities of their ML applications.
Refining data: Transforming raw forms into useful formats
Machine learning models are only as effective as the data they’re trained on. You may have heard the saying, “Data is the new oil” — and it’s true. Like oil, data requires processing and analysis to transform its raw form into a valuable resource (insights).
The input data provided during the training phase is what ML algorithms use to identify patterns and make predictions. Biased or limited data significantly reduces a model’s performance, so the data must be accurate, and it must also capture the essence of the problem(s) you are trying to solve.
Knowing what data you have and preparing it properly is key.
Where do you find the data?
You can find data from a variety of sources. Data can be structured or unstructured, from log files to customer interactions. It can be siloed in different departments or distributed across multiple locations. Your database administrator can help you determine what’s available, how to access it, and whether the data is usable or needs cleansing.
Most organizations will find that only about half of their data is immediately usable. The most organized data is typically customer data, such as who called you or who you’ve called. However, the attributes surrounding this data — credit scores or how many times you’ve contacted the customer in the past year — will likely need cleansing.
For instance, most companies don’t store voice, email, and SMS conversations in the same format, so this data isn’t readily available for training ML models. To create a holistic picture, consider: How many times has the customer emailed us? Texted us? Called our contact center? Visited our website? Connecting all these data points is where much of your time and effort will be spent. But that’s not the only challenge. Not all your data will come from a single data source. Your data science team will need to spend significant time converting data from multiple sources into a usable format.
How do you know if the data is usable?
Any numerical data can be used as training features for an ML model without disrupting the algorithms. However, it’s also important to check whether the data is suitable, or “fit,” for the model. This check requires either expertise in the subject (the quicker method) or a deep dive into the data, and the latter only works if you already know what outcome you’re looking for. It’s crucial to ensure all the data features, whether they are originally numerical or not, pass this suitability check.
Data that’s divided into categories, like whether a phone call is “inbound” or “outbound,” can also be used. You just have to change these categories into numbers that the computer can understand. There’s no limit to how many different values you can have; deciding on the right number is often a matter of trial and error.
Freeform text, like sentences or paragraphs, is trickier. It’s usually not used in small-scale learning models because it’s hard to translate into a format that’s useful for machine learning. You can give some structure to freeform text by using basic Natural Language Processing techniques or by searching for specific keywords. For example, if you know that a comment field is meant to capture payment information, you can look for words like “paid” and then search for any numbers in the same text. If you can’t do this, it’s not an issue as long as you can replace empty fields (also known as NULL or NaN) with numerical values or another method to remove them.
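Here is a rough sketch of that keyword approach, assuming a hypothetical comment field. The pattern matching is deliberately simple; a real pipeline would lean on more robust NLP.

```python
import re
import pandas as pd

# Hypothetical comment field; the column name and text are illustrative.
df = pd.DataFrame({"comment": ["Customer paid $150 on 5/2", "Will call back", None]})

def extract_payment(text):
    """If a payment keyword appears, return the first number in the text."""
    if not isinstance(text, str):
        return None
    if re.search(r"\bpaid\b", text, flags=re.IGNORECASE):
        match = re.search(r"\$?(\d+(?:\.\d{2})?)", text)
        if match:
            return float(match.group(1))
    return None

df["payment_amount"] = df["comment"].apply(extract_payment)
# Replace empty fields (NULL/NaN) with a numerical value so the model can use them.
df["payment_amount"] = df["payment_amount"].fillna(0.0)
print(df)
```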
Techniques for refining your data
The following are a few techniques and considerations when refining data for training ML models. This will give you a glimpse into the complexity of the process, and highlight the significant effort that goes into fine-tuning data.
- The first step in processing data is to ensure all data is in a numerical format. Categorical data can be encoded by mapping each category to a specific number on a scale. However, this only works when the categories have a clear ranking or hierarchy. For example, if categories range from “least likely” to “most likely,” they could be mapped to the numbers 1 through 5. If there is no inherent ranking among the categories, alternative encoding methods like one-hot encoding should be employed. This method introduces a new numerical feature for each category, assigning a binary 1 for the presence of that category and a 0 for its absence. (Both encodings appear in the sketch following this list.)
- Raw numerical data is often not helpful “as-is.” You may end up with thousands to millions of features that dilute the predictive power of each feature. In such situations, techniques for reducing dimensionality are typically used to remove redundant and irrelevant features while preserving important ones. This improves ML models’ accuracy and reduces the complexity of the data.
- Another beneficial numerical transformation involves normalizing values to a specific range, with decimals between 0 and 1 being a preferred choice due to the scaling issues many methods face. Alternatively, converting these values into sample statistics can also be useful. While these are not theoretically limited in range, they often fall within narrow, finite boundaries in practice.
- Instead of attempting to identify a subset of numeric features that are linearly independent, it’s possible to create new “refined” features. These are derived from a formulaic calculation based on one or more of the “raw” features. This approach proves particularly effective when known dependency relationships exist between features. Combining them yields a more accurate measurement of the single aspect they all relate to.
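To tie these techniques together, here is a minimal sketch that applies ordinal encoding, one-hot encoding, 0-to-1 normalization, and dimensionality reduction (using PCA, one common approach) to a tiny dataset. The column names and values are hypothetical, invented purely for illustration.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from sklearn.decomposition import PCA

# Hypothetical dataset; column names and values are illustrative only.
df = pd.DataFrame({
    "purchase_likelihood": ["least likely", "unlikely", "most likely"],
    "call_direction": ["inbound", "outbound", "inbound"],
    "calls_last_90_days": [7, 2, 15],
})

# 1. Ordinal encoding: categories with a clear ranking map onto a numeric scale.
scale = {"least likely": 1, "unlikely": 2, "neutral": 3, "likely": 4, "most likely": 5}
df["purchase_likelihood"] = df["purchase_likelihood"].map(scale)

# 2. One-hot encoding: unranked categories each become a binary 0/1 feature.
df = pd.get_dummies(df, columns=["call_direction"], dtype=int)

# 3. Normalization: rescale raw counts into the 0-to-1 range.
df[["calls_last_90_days"]] = MinMaxScaler().fit_transform(df[["calls_last_90_days"]])

# 4. Dimensionality reduction: project the features down to fewer components
#    while preserving as much of the variation as possible.
reduced = PCA(n_components=2).fit_transform(df)
print(reduced.shape)  # (3, 2)
```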
Securing and using data responsibly
In addition to engineering overhead, data governance is another aspect that requires careful consideration. You have to decide what to do with this data. In most cases, you would want it integrated into a machine learning system. However, this step involves moving potentially confidential, regulated, and private data from various sources outside the traditional data infrastructure into your organization. Getting all company stakeholders to agree to this data movement is usually difficult.

Additionally, there is the task of determining which data can safely be used to train the ML model. There are a couple of key questions to consider:
- What is considered proprietary data?
- Which data could potentially put us at risk due to privacy concerns?
Carefully assess what range of data you’re willing to expose to the model. For instance, you wouldn’t want to train a customer service model with personally identifiable information, such as social security numbers or email addresses, only to have the AI engine respond with that information to answer a customer’s question. Keep in mind that there is always a risk of bleed-through, and you could potentially expose sensitive data in the AI engine.
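One common safeguard is to scrub PII from transcripts before they ever enter a training set. The snippet below is a minimal, hypothetical redaction pass; the two regex patterns are illustrative only and fall far short of what production-grade compliance requires.

```python
import re

# Illustrative PII patterns; a real system needs far more robust detection.
PII_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def redact(text: str) -> str:
    """Replace recognizable PII with labeled placeholders."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text

print(redact("My SSN is 123-45-6789; reach me at jane@example.com"))
# -> "My SSN is [SSN REDACTED]; reach me at [EMAIL REDACTED]"
```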
Navigating the infrastructure and resources
Although data preparation is often viewed as the most labor-intensive aspect of machine learning modeling, it is only one component. Many other elements can influence your AI/ML initiative’s outcome, such as cost, infrastructure, and the requisite skills. The following are a few key considerations when developing a sustainable AI strategy.
Cost modeling challenges
While cost modeling for a hosted AI/ML model like ChatGPT is flexible, it can be difficult to predict. Unlike traditional models, where costs are primarily associated with data storage, generative AI models use token-based pricing: costs are calculated per token processed. A token may be as short as a single character or as long as a word, and the cost covers both the input (the customer’s query) and the output (the generated response). This makes costs difficult to estimate ahead of time, as the number of tokens required can vary significantly from one interaction to the next.
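A back-of-the-envelope calculation shows how quickly these per-token costs add up at contact center volumes. The prices below are assumptions for illustration, not any provider’s actual rates.

```python
# Hypothetical per-token prices; real rates vary by provider and model.
PRICE_PER_1K_INPUT = 0.0015   # USD per 1,000 input tokens (assumed)
PRICE_PER_1K_OUTPUT = 0.002   # USD per 1,000 output tokens (assumed)

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost of one interaction; both the query and the response are billed."""
    return ((input_tokens / 1000) * PRICE_PER_1K_INPUT
            + (output_tokens / 1000) * PRICE_PER_1K_OUTPUT)

# 50,000 interactions a day averaging 400 input / 250 output tokens each.
daily = 50_000 * estimate_cost(400, 250)
print(f"Estimated daily spend: ${daily:,.2f}")  # -> $55.00
```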
While it’s possible to set guardrails on cost, such as not exceeding $20,000 a day, this isn’t a fail-safe method. If the model hits the cost ceiling, it will simply shut down, negatively impacting your customers’ experience.
Instead, a more proactive approach involves thorough upfront testing, data loading, model training, and quality assurance. The model should be tested across various prompts and responses to confirm its expected behavior. Regular data monitoring is also essential for maintaining accuracy and quality. Finally, the model should be regularly updated and improved to adapt to new and evolving inputs.
Infrastructure choices
There are different infrastructure choices for hosting AI models. Cloud-based services that let you load your model directly into their engine without many additional infrastructure modifications simplify the process considerably; Amazon SageMaker is one example. This ease of use, however, comes at a higher cost, so it’s worth considering whether the convenience of this type of solution aligns with your organization’s budget and strategic goals.
On the other hand, you could create your own model-hosting infrastructure, which gives you more control and flexibility, but may be more expensive and require more management time. It takes a lot of planning to figure out what level of computing power you need to make the model work. The trend is shifting from CPUs to GPU-based machines due to their superior performance. If you decide to go this route, you’ll need to do a lot of upfront research to pick the right machine size and assess how scalable it is.
It is important to note that GPU-based machines incur costs regardless of whether the software is running. Unlike many cloud services where you pay as you go, these machines charge a fixed cost per day. This pricing model can make testing more expensive, which should be factored into your overall cost assessment.
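To see how that pricing difference plays out, consider the simplified comparison below. Both rates are hypothetical placeholders, not real vendor prices.

```python
# Simplified comparison of pay-as-you-go vs. fixed daily GPU pricing.
ON_DEMAND_PER_HOUR = 4.10   # assumed hourly rate, billed only while running
GPU_FIXED_PER_DAY = 72.00   # assumed flat daily rate, billed regardless of use

hours_of_testing_per_day = 3
pay_as_you_go = ON_DEMAND_PER_HOUR * hours_of_testing_per_day

print(f"Pay-as-you-go: ${pay_as_you_go:.2f}/day vs. fixed GPU: ${GPU_FIXED_PER_DAY:.2f}/day")
```

With only a few hours of testing a day, the flat daily charge dwarfs what a pay-as-you-go service would bill for the same workload, which is why testing on dedicated GPU machines deserves its own line in the cost assessment.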
Must-have skills for your data science team
The data science team is responsible for building models and discovering new and feasible use cases for these models. They decide how to integrate the model into the existing tech stack, and work closely with DataOps to ensure new data, like numerical features, can be transmitted and stored efficiently.
Data science is a multidisciplinary field, and your team members must be proficient in various areas. Your data science team should have the following skills and experience:
- Numerical & Statistical Analysis: Essential for data interpretation and model development.
- Data Structures & Algorithms: Key for understanding ML algorithms.
- Database Management Systems: Crucial for handling large data volumes.
- Data Pipeline Design/Build/Manage: Needed for seamless data flow.
- Experimentation Experience: Provides a rigorous approach to research.
- Communication Skills: Important for explaining complex concepts and reporting findings.
- Advanced Degrees: Typically a requirement for these roles.
“Nice-to-have” skills, such as time series analysis and scientific computation, often become necessities, though they are usually covered within the areas above. Remember, not every project requires a dedicated data scientist if your team collectively covers these skills, although oversight by a data scientist can be beneficial.
Partner with LiveVox to turn AI challenges into contact center success
There is no doubt that AI/ML has transformative potential for contact center operations. However, navigating this terrain is complicated and resource-intensive, particularly given the skilled personnel, good data, robust infrastructure, precise strategic planning, adequate funding, time, and regulatory knowledge needed to implement an AI/ML strategy effectively.
With a reliable partner, AI deployment challenges can be transformed into opportunities for growth and innovation. LiveVox has a proven track record and a comprehensive suite of AI solutions. We offer contact centers of all sizes an effortless entry point to AI-powered solutions designed to enhance customer engagement, control costs, and enrich the overall customer experience. Importantly, our AI solutions enable organizations to innovate with the latest generative AI while staying security- and cost-conscious.
Ready to unlock the full potential of AI and ML for your business? Partner with LiveVox to integrate these technologies seamlessly, delivering tangible value and driving your organization toward a future defined by informed decision-making and enhanced customer experiences.
Let’s shape the future together. Talk to a specialist now.