As the demand for video and audio capabilities within applications continues to grow, so does the need for precise and accurate audio quality tests.
In today’s culture—which has become highly connected thanks to video conferencing, online gaming, and remote learning—real-time communication capabilities are an expectation for organizations of any size. Take recruiting, for example: according to Gartner research, 86% of companies conducted interviews using a video conferencing tool during the pandemic, and that model has become the norm in many industries. Similarly, in 2022, over 77% of businesses planned to use video conferencing as their primary method of B2B interaction.
In this context, audio quality is often taken for granted until things go awry. When a call fails to deliver consistent, high-quality audio, conversations become irritants that impede productivity and frustrate customers.
To avoid these outcomes, businesses that depend on real-time communication can objectively analyze audio quality to detect and avoid issues. Ensuring consistent and high-quality audio is important for companies implementing WebRTC, an industry standard that enables real-time communications within web pages and mobile apps. With a robust, objective testing methodology, businesses can easily observe how changes to their system architecture and infrastructure affect audio quality in the presence of constraints like packet loss. Using these kinds of tools to detect issues can safeguard the communications experience for a company’s customers and users.
What Is WebRTC, and Why Is It Important?
Before we get into the nuts and bolts of audio quality analysis, it’s helpful to understand what WebRTC is and the role it plays in enabling real-time communications in modern applications. WebRTC is a web standard that defines simple JavaScript application programming interfaces (APIs) that developers can use to implement real-time voice and video communications within web pages and applications. This technology is available in all modern browsers and can be embedded in native clients on every major platform. Notably, WebRTC underlies video conferencing platforms such as Google Meet, Slack, and Microsoft Teams. WebRTC is also used in settings such as telehealth, enabling video appointments for patients and physicians; in workforce collaboration tools, allowing colleagues to brainstorm both visually and verbally within the same platform; and in surveillance applications, allowing employees to monitor security camera feeds from browsers and mobile devices.
Regardless of the use case, the reality is that we all prefer real, authentic connections through the apps, sites, and services we engage with. As we collectively spend more and more time online, we don’t just want generic comment threads. Instead, we gravitate toward communicating in a way that feels organic and similar to real life.
WebRTC is essential to these use cases because it implements reliable, low-latency communication between browsers and other devices; without low latency, natural collaboration over voice and video quickly becomes impractical. WebRTC provides a robust, full-featured implementation that developers can use to quickly and easily add real-time communication features to existing applications.
How Top-Level WebRTC Providers Perform Audio Quality Tests
These communication features are only effective if the audio is clear and intelligible. If you want confidence that a WebRTC provider can deliver good, reliable audio, checking that they analyze audio quality as described below is a good place to start.
There are many different ways to analyze audio quality. Media layer models, which take actual audio signals as input, are often used because they provide a high level of precision, even though models that work only from network or codec parameters are computationally cheaper.
Media layer models can work in different ways, depending on the amount of data that is available. Full-reference models compare a decoded audio signal to the original source signal. Reduced-reference models compare the decoded signal against features of the original; these models are often used when the complete source signal is not available for analysis. Finally, no-reference models analyze a decoded signal on its own, without reference to the original data, by looking for recognizable distortions and artifacts that might have been introduced during encoding and transport.
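The difference between the three categories is mostly a difference in what each function is allowed to see. The sketch below makes that concrete with deliberately simple stand-in metrics (signal-to-noise ratio, per-frame energy, and a clipping check). These are illustrative assumptions chosen for brevity, not the perceptual models discussed in this article, and all function names are hypothetical.

```python
import math

def full_reference_snr_db(reference, degraded):
    """Full-reference: needs both complete, sample-aligned signals."""
    signal = sum(r * r for r in reference)
    noise = sum((r - d) ** 2 for r, d in zip(reference, degraded))
    if noise == 0:
        return math.inf  # identical signals
    return 10 * math.log10(signal / noise)

def frame_energies(samples, frame_size=160):
    """A compact feature that could be kept instead of the full source signal."""
    return [
        sum(s * s for s in samples[i:i + frame_size]) / frame_size
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]

def reduced_reference_energy_error(reference_features, degraded, frame_size=160):
    """Reduced-reference: sees only features of the source, not the source itself."""
    degraded_features = frame_energies(degraded, frame_size)
    diffs = [abs(a - b) for a, b in zip(reference_features, degraded_features)]
    return sum(diffs) / len(diffs)

def no_reference_clipping_ratio(degraded, limit=0.99):
    """No-reference: looks for a known artifact in the decoded signal alone."""
    return sum(1 for s in degraded if abs(s) >= limit) / len(degraded)

# Synthetic test data (an assumption): a 440 Hz tone sampled at 8 kHz, with
# 20 ms zeroed out in the degraded copy to mimic one lost packet.
reference = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(1600)]
degraded = list(reference)
degraded[320:480] = [0.0] * 160

print(full_reference_snr_db(reference, degraded))
print(reduced_reference_energy_error(frame_energies(reference), degraded))
print(no_reference_clipping_ratio(degraded))
```

Note that the no-reference check reports nothing unusual here, because a silent gap is not the artifact it was written to find. That is one reason models with access to the source signal tend to be more accurate.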
Full-reference models have been the subject of extensive research in academia and industry, and because they have access to both the source and decoded signals, they can measure changes in audio quality with a high level of accuracy. Additionally, they are a good fit for assessing the performance of WebRTC-based media platforms, as they examine the differences between a source signal and a decoded version of the same signal, which allows the effects of the platform on audio quality to be evaluated.
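A test harness built around a full-reference model might look roughly like the following sketch: degrade a known source signal under a controlled loss rate, then score the result against the source. Everything here is an assumption made for illustration. In a real test the signal would pass through the actual media platform (encoder, network, decoder) rather than the hypothetical `simulate_packet_loss` helper, and a perceptual model would replace the plain signal-to-noise ratio.

```python
import math
import random

def simulate_packet_loss(samples, loss_rate, packet_size=160, seed=0):
    """Zero out whole packets at random, mimicking loss with no concealment."""
    rng = random.Random(seed)  # fixed seed keeps test runs repeatable
    out = list(samples)
    for start in range(0, len(out), packet_size):
        if rng.random() < loss_rate:
            end = min(start + packet_size, len(out))
            out[start:end] = [0.0] * (end - start)
    return out

def snr_db(reference, degraded):
    """Placeholder score; a perceptual full-reference model would go here."""
    signal = sum(r * r for r in reference)
    noise = sum((r - d) ** 2 for r, d in zip(reference, degraded))
    if noise == 0:
        return math.inf
    return 10 * math.log10(signal / noise)

# Synthetic source (an assumption): one second of a 440 Hz tone at 8 kHz.
source = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(8000)]

for loss_rate in (0.0, 0.1, 0.3):
    decoded = simulate_packet_loss(source, loss_rate)
    print(f"loss={loss_rate:.0%} score={snr_db(source, decoded):.1f} dB")
```

Because the source signal is fixed and the loss pattern is seeded, any change in the score between runs can be attributed to the system under test rather than to the test itself.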
Industry Standard Full-Reference Models
Several full-reference models are in common use today. POLQA (Perceptual Objective Listening Quality Analysis) and ViSQOL (Virtual Speech Quality Objective Listener) are among the best for most use cases.
POLQA is an ITU-T (International Telecommunication Union, Telecommunication Standardization Sector) recommendation, published as P.863. It succeeded the previous standard, PESQ (Perceptual Evaluation of Speech Quality, ITU-T P.862), in 2011. POLQA compares the source and decoded signals using a perceptual psycho-acoustic model that is based on the way humans perceive audio.
ViSQOL is a more recent, open-source model developed by Google. It uses a spectro-temporal measure of similarity between the source and decoded signals to measure the degradation in quality during encoding and transport.
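To give a feel for what a spectro-temporal similarity measure involves, the sketch below turns both signals into magnitude spectrograms (frequency content over time) and compares them with a single structural-similarity-style score. This is a heavily simplified illustration and not ViSQOL's algorithm: it assumes equal-length, time-aligned signals, uses a naive DFT, computes one global score instead of comparing local patches, and the constants `c1` and `c2` are stabilizing values assumed for the example.

```python
import cmath
import math

def magnitude_spectrogram(samples, frame_size=64):
    """List of frames, each a list of DFT magnitudes (Hann window, naive DFT)."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        frames.append([
            abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                    for n in range(frame_size)))
            for k in range(frame_size // 2 + 1)
        ])
    return frames

def spectrogram_similarity(reference, degraded, c1=0.01, c2=0.03):
    """Global similarity of two spectrograms; close to 1.0 means nearly identical."""
    ref = [v for frame in magnitude_spectrogram(reference) for v in frame]
    deg = [v for frame in magnitude_spectrogram(degraded) for v in frame]
    n = len(ref)
    mu_r, mu_d = sum(ref) / n, sum(deg) / n
    var_r = sum((v - mu_r) ** 2 for v in ref) / n
    var_d = sum((v - mu_d) ** 2 for v in deg) / n
    cov = sum((a - mu_r) * (b - mu_d) for a, b in zip(ref, deg)) / n
    intensity = (2 * mu_r * mu_d + c1) / (mu_r ** 2 + mu_d ** 2 + c1)
    structure = (cov + c2) / (math.sqrt(var_r * var_d) + c2)
    return intensity * structure

# Synthetic data (an assumption): a 440 Hz tone at 8 kHz with a 20 ms dropout.
reference = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(1600)]
degraded = list(reference)
degraded[320:480] = [0.0] * 160

print(spectrogram_similarity(reference, reference))
print(spectrogram_similarity(reference, degraded))
```

A production model additionally handles alignment, perceptually motivated frequency bands, and the mapping from similarity to a quality score, all of which are omitted here.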
POLQA and ViSQOL perform comparably on most measures, but ViSQOL has a few advantages for evaluating the performance of WebRTC-based infrastructure. It supports both speech and high-fidelity audio, which matters for applications in which content other than speech plays a role. Additionally, it is open source and written in C++, like the native WebRTC library, which makes it easier to integrate into a WebRTC media platform.
The Significance of Audio Quality Tests
The growing need for real-time audio features in applications and services requires enterprises to be mindful of the impact of audio quality on customer satisfaction. Full-reference models like ViSQOL can perform well for systems built on WebRTC, but businesses should consider the details of their requirements and use cases when choosing a strategy.
Whatever approach an organization implements, audio quality should be monitored over time so that the effects of changes in the organization’s infrastructure and implementation can be measured.
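One simple way to monitor quality over time is to record a score for every test run and flag runs that fall noticeably below the recent baseline. The sketch below assumes scores on a 1-to-5 quality scale; the function name, window size, threshold, and sample data are all hypothetical and would need tuning against real measurements.

```python
from statistics import mean

def find_regressions(scores, window=3, max_drop=0.2):
    """Return indices of runs scoring more than max_drop below the mean
    of the previous `window` runs."""
    flagged = []
    for i in range(window, len(scores)):
        baseline = mean(scores[i - window:i])
        if baseline - scores[i] > max_drop:
            flagged.append(i)
    return flagged

# Made-up nightly scores for illustration only.
nightly_scores = [4.3, 4.2, 4.3, 4.25, 3.7, 4.2, 4.3]
print(find_regressions(nightly_scores))  # prints [4]
```

Comparing against a rolling baseline rather than a fixed target keeps the check useful as the platform evolves, though a fixed minimum score can be added alongside it if there is a hard quality requirement.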