Harnessing Crowdsourcing for Effective Customer Support Categorization
Introduction to Customer Support Categorization
Congratulations on successfully launching your app or product! Reaching this milestone is a significant achievement, as many do not get this far. The next step is to engage with your customers. What feedback are they providing? Are there any reported issues or bugs?
Instead of sifting through countless reviews to gauge customer sentiment, this article explores how crowdsourcing can streamline the categorization of customer support requests, drawing insights from one of the largest web browsers.
Review Categorization Challenges
Picture yourself receiving 15,000 reviews monthly—how would you manage them? While it seems logical to have an in-house support team read and respond to each review, this approach is neither efficient nor cost-effective.
Interestingly, nearly half of the reviews you receive do not require the full attention of your team, because they neither report bugs nor request new features. You can therefore sort reviews into two distinct groups: nonspecific and specific.
Nonspecific reviews tend to express feelings without providing context. For example, a review stating, "I don't like your app" lacks detail. On the other hand, specific reviews convey emotions alongside clear reasons, such as "I dislike your app because it lacks a night mode." These specific instances are the ones that should be escalated to your support team.
Initial Solutions for Review Filtering
So, how can you effectively filter these two types of reviews? One option is to establish a second tier of in-house support that categorizes them manually; another is to employ a machine learning (ML) model for automated classification.
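For context, here is a minimal sketch of what such an ML baseline might look like, using scikit-learn. The training examples and labels are made up purely for illustration:

```python
# A minimal text-classification baseline (hypothetical training data).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# In practice you would need thousands of labeled reviews;
# this tiny sample only illustrates the shape of the data.
reviews = [
    "I don't like your app",             # nonspecific
    "Crashes every time I open a PDF",   # specific
    "Great browser!",                    # nonspecific
    "Please add a night mode",           # specific
]
labels = ["nonspecific", "specific", "nonspecific", "specific"]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(reviews, labels)

print(model.predict(["The app freezes when I open a new tab"]))
```

The catch is that a model like this only performs well with a large, well-labeled training set, which is exactly the data the team did not have.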
However, the browser team found both methods to be costly. The first required maintaining an additional in-house team, while the second demanded substantial data for model training. Ultimately, the team opted for crowdsourcing through Toloka, which offers a straightforward way to acquire quality labeled data without building and maintaining your own labeling pipeline. Setting up a project involves uploading your data, writing clear instructions, defining categories, and providing examples.
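As a rough illustration, uploading reviews as labeling tasks with the toloka-kit Python client might look like the sketch below. The pool ID and the `review` input field are placeholders, and the exact setup depends on how your project is configured in Toloka:

```python
# Sketch: sending reviews to an existing Toloka pool for labeling.
# The pool ID and the 'review' input field are placeholders.
import toloka.client as toloka

client = toloka.TolokaClient(token='YOUR_API_TOKEN', environment='PRODUCTION')

reviews = ["I don't like your app", "Crashes when I open a PDF"]
tasks = [
    toloka.Task(pool_id='YOUR_POOL_ID', input_values={'review': text})
    for text in reviews
]

# allow_defaults=True applies the pool's default overlap settings.
client.create_tasks(tasks, allow_defaults=True)
```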
Initially, the approach involved categorizing reviews into six groups, only one of which captured specific reviews needing further attention from the support team. The other five covered various types of nonspecific reviews:
- Positive reviews lacking details
- Negative reviews lacking details
- Reviews stating "the app does not work"
- Reviews stating "the app has problems"
- Spam
Nonspecific reviews could then receive automated responses tailored for their category.
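A simple way to implement those canned replies is a lookup from category to response template. The category names and the send_reply helper below are hypothetical:

```python
# Hypothetical mapping from nonspecific categories to canned replies.
AUTO_REPLIES = {
    "positive_no_details": "Thanks for the kind words!",
    "negative_no_details": "Sorry to hear that. Could you tell us more?",
    "app_does_not_work": "Sorry for the trouble! Please try reinstalling.",
    "app_has_problems": "Thanks for reporting this. What exactly went wrong?",
    "spam": None,  # spam gets no reply
}

def send_reply(review_id: str, text: str) -> None:
    """Stand-in for your actual reply mechanism (e.g., a store API)."""
    print(f"-> {review_id}: {text}")

def respond(review_id: str, category: str) -> None:
    """Send the canned reply for a category, if one exists."""
    template = AUTO_REPLIES.get(category)
    if template is not None:
        send_reply(review_id, template)

respond("review-42", "negative_no_details")
```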
Although this initial design cost less than the alternatives, the results were unsatisfactory: labeling accuracy stood at only 78%, and the categories were ambiguous.
Refining the Categorization Process
Because the first pipeline performed insufficiently, the team introduced two new categories: ads and hate speech. The revised pipeline improved labeling accuracy from 78% to 88%, and the categories became more distinct. Despite this advancement, the company sought to improve the results even further.
The third pipeline incorporated a two-tier classification system. During the first stage, crowd annotators labeled reviews as positive, negative, spam, or specific. Specific reviews were directed immediately to the support team, while the remaining reviews underwent a second classification stage.
As the routing sketch below illustrates, each first-tier category had its own set of subcategories in the second tier, resulting in four distinct labeling projects across the pipeline.
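In code form, the routing between the two tiers might look like this. The classify functions stand in for the crowd-labeling projects described above; here they are stubbed with fixed answers so the sketch runs end to end:

```python
# Illustrative two-tier routing. The classify_* functions stand in for
# the crowd-labeling projects; here they are stubbed for demonstration.

def classify_stage_one(review: str) -> str:
    """First tier: 'positive', 'negative', 'spam', or 'specific'."""
    return "specific"  # resolved by crowd annotators in practice

def classify_stage_two(review: str, category: str) -> str:
    """Second tier: a subcategory within the first-tier category."""
    return f"{category}/subcategory"  # one labeling project per category

def route(review: str) -> str:
    category = classify_stage_one(review)
    if category == "specific":
        return "escalate_to_support"       # goes straight to the team
    subcategory = classify_stage_two(review, category)
    return f"auto_reply:{subcategory}"     # answered automatically

print(route("Crashes whenever I open a new tab"))
```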
This final pipeline design slowed down the review process but significantly improved accuracy to 93%. At this point, the solution was deemed acceptable and is currently in use for the project.
Implementing Your Own Review Categorization Pipeline
If you wish to adapt this project to your own needs, the process is relatively straightforward. As mentioned above, you can use Toloka's ready-to-use solutions to set up your project. After uploading data via the API, you receive results as a JSON file, which lends itself well to automated analysis: for instance, you can gather all reviews of a given category and send automated responses.
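Processing the downloaded results might look like the following sketch. The JSON schema here (a list of objects with `review` and `category` fields) is an assumption; the real layout depends on your project's output specification:

```python
# Group labeled reviews by category from a Toloka-style results file.
# The 'review'/'category' field names are assumptions about the schema.
import json
from collections import defaultdict

with open("labeling_results.json", encoding="utf-8") as f:
    results = json.load(f)

by_category = defaultdict(list)
for item in results:
    by_category[item["category"]].append(item["review"])

# E.g., pull every 'specific' review for the support team,
# leaving the rest for automated replies.
for review in by_category["specific"]:
    print("escalate:", review)
```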
Conclusion
In this article, we explored how crowdsourcing can be used to classify product reviews effectively. You have seen the pipelines the team experimented with for browser review classification. I hope this helps you build your own classification pipeline for similar projects.
To delve deeper into this project, watch Natasha's presentation. Additionally, if you're interested in further exploring data labeling pipelines, consider joining a data-empowered community.
Below are two related videos you may find useful:
The first video, "Improve your Customer Satisfaction Ratings - CSAT - by Categorizing your Requests," provides insights on enhancing customer satisfaction through effective categorization.
The second video, "HaloPSA | Ticket Categorisation (Custom Fields)," discusses custom fields for ticket categorization in customer support.