Recommendation Engine for Audio Content

Executive summary

Our Customer

Trinity Audio is a company that specializes in developing an AI-driven ecosystem of solutions that help manage audio experiences for publishers and content creators. These solutions encompass a wide range of features, including voice editing, content discovery, virtual assistant skills, and data analytics, among many others.

The Obstacles They Faced

The customer aimed to elevate the user experience through real-time content recommendations based on the context and popularity of articles. However, they faced the challenge of developing such a system efficiently, within the constraints of time and budget.

How We Helped

The recommender system we developed and implemented has significantly boosted the audio completion rate in the client’s solution – a metric of core value to the client’s business.

The Challenge

The client’s product monetization strategy depends on advertising, making it imperative that users stay engaged and listen to audio content through to completion on various web platforms that incorporate their product. The crux of the challenge lay in elevating the content completion rate.

The objective, therefore, was to give users highly relevant and appealing content recommendations that would keep them on the platforms for longer. This, in turn, would increase exposure to and consumption of advertisements.

The Solution

Choosing the recommendation approach

There are two main types of recommendation engines: collaborative filtering and content-based filtering. Collaborative filtering systems rely on user behavior data. Since we do not collect such data about our providers or their end-users, this type did not suit us. Content-based filtering, on the other hand, works from the content itself, but it requires extensive domain knowledge and accurate content descriptions.

Opting for content-based filtering entailed the following issues:

  1. Feature extraction: The process involves transforming raw data into numerical features that can be analyzed and used for recommendations.
  2. Constant updates: The system requires regular updates to ensure that data remains current, thereby offering relevant recommendations.
  3. Dependence on data quality: Any inaccuracies, inconsistencies, or lack of relevant data can result in suboptimal recommendations.
  4. High system performance: It is essential for the recommendation system to operate at high efficiency, providing immediate suggestions to users.
  5. Cost-effectiveness of development: The development of the system must be financially sustainable to ensure affordability without compromising on quality.

Engines used for making content recommendations

Crafting the optimal content recommendation formula

To recommend the most compelling content to the solution’s users, we combined three factors into our proprietary recommendation formula. This multifaceted approach keeps our recommendations both precise and pertinent, catering to the diverse preferences of our audience.

Contextual score
Our approach involved a meticulous analysis of article data to identify keywords that encapsulate overarching topics and to determine a content matching score through the TF-IDF technique. This analysis encompassed several dimensions of the articles, including:

  • Tags/sections: identifying the categories or themes the article belongs to.
  • Title: the headline or primary identifier of the article.
  • Description: a brief overview outlining the article’s content.
  • Text: the full body of the article, where the core message resides.
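
To make the idea concrete, here is a minimal sketch of a TF-IDF based contextual score. It builds a TF-IDF vector for each field, compares every article with a source article using cosine similarity, and averages the per-field similarities with weights. The field weights, the tokenizer, and all function names are assumptions made for this illustration; the case study does not disclose the production weighting, and tags are treated here as a space-separated string.

```python
import math
import re
from collections import Counter

# Hypothetical field weights, chosen only for illustration.
FIELD_WEIGHTS = {"tags": 3.0, "title": 2.0, "description": 1.5, "text": 1.0}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per document string."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({
            # Term frequency times a smoothed inverse document frequency.
            term: (count / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def contextual_scores(articles, source_index):
    """Score each article against articles[source_index].

    The result is a weighted mean of per-field cosine similarities.
    """
    total_weight = sum(FIELD_WEIGHTS.values())
    scores = [0.0] * len(articles)
    for field, weight in FIELD_WEIGHTS.items():
        vectors = tfidf_vectors([a.get(field, "") for a in articles])
        for i, vec in enumerate(vectors):
            similarity = cosine(vectors[source_index], vec)
            scores[i] += weight * similarity / total_weight
    return scores
```

Weighting short, curated fields such as tags above the full text is one plausible design choice: a shared tag is usually a stronger topical signal than a shared word in the body.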

Performance score
We integrated article performance metrics into our formula, such as the number of times content was started (referred to as the ‘Content Started’ number), completion rates, and the publication date. This incorporation allows us to recommend content that not only aligns closely with user interests but also ranks high in popularity and timeliness.
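
As a hedged illustration of how these three inputs could be blended, the sketch below combines a log-scaled ‘Content Started’ count, the completion rate, and an exponential recency decay into a single value between 0 and 1. The function name, the weights, the half-life, and the normalization cap are all assumptions for this example and are not taken from the production formula.

```python
import math
from datetime import datetime, timezone


def performance_score(content_started, completion_rate, published_at, now,
                      max_started=10_000, half_life_days=3.0,
                      weights=(0.4, 0.4, 0.2)):
    """Blend popularity, completion rate, and recency into a value in [0, 1].

    `completion_rate` is expected in [0, 1]. The defaults (max_started,
    half_life_days, weights) are illustrative assumptions only.
    """
    # Log-scale the start count so a few viral articles do not dominate.
    popularity = min(
        1.0, math.log1p(content_started) / math.log1p(max_started)
    )
    # Exponential decay: the recency term halves every `half_life_days`.
    age_days = max(0.0, (now - published_at).total_seconds() / 86_400)
    recency = 0.5 ** (age_days / half_life_days)
    w_popularity, w_completion, w_recency = weights
    return (w_popularity * popularity
            + w_completion * completion_rate
            + w_recency * recency)


if __name__ == "__main__":
    now = datetime(2024, 1, 10, tzinfo=timezone.utc)
    published = datetime(2024, 1, 9, tzinfo=timezone.utc)
    print(performance_score(500, 0.6, published, now))
```

With identical start counts and completion rates, a newer article scores higher than an older one, which matches the intent of favoring timely content.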

Relevance metric
This metric ensures that our recommendations not only resonate with the users’ interests and preferences but also prioritize the freshness of content. The relevance metric is designed to capture the ever-evolving interests of users, who seek content that reflects the latest trends and developments.

Here is the final formula we established for the content recommendation engine:

Content recommendation engine formula

Technical implementation of the recommender system

In addition to configuring recommendations, our solution also conducts thorough testing. As a result, content publishers can select the preferred criteria for recommendations: by context, by performance, or a combination of both (or any other potential types, as we explore what customers want). The providers can also selectively remove articles from the recommended content.
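
The publisher-facing options described above can be sketched as a small ranking function. It assumes each candidate article already carries a precomputed contextual score and performance score; the dictionary keys, the mode names, and the linear blend used for the combined mode are hypothetical and stand in for the proprietary formula.

```python
def rank_articles(candidates, mode="combined", excluded_ids=(),
                  context_weight=0.5, limit=5):
    """Return article ids ordered by the publisher's chosen criterion.

    candidates: dicts with 'id', 'context_score', and 'performance_score'.
    mode: 'context', 'performance', or 'combined' (an assumed linear blend).
    excluded_ids: articles the publisher removed from recommendations.
    """
    scorers = {
        "context": lambda a: a["context_score"],
        "performance": lambda a: a["performance_score"],
        "combined": lambda a: (
            context_weight * a["context_score"]
            + (1 - context_weight) * a["performance_score"]
        ),
    }
    if mode not in scorers:
        raise ValueError(f"unknown recommendation mode: {mode}")
    excluded = set(excluded_ids)
    allowed = [a for a in candidates if a["id"] not in excluded]
    ranked = sorted(allowed, key=scorers[mode], reverse=True)
    return [a["id"] for a in ranked[:limit]]


if __name__ == "__main__":
    sample = [
        {"id": "a", "context_score": 0.9, "performance_score": 0.2},
        {"id": "b", "context_score": 0.5, "performance_score": 0.8},
        {"id": "c", "context_score": 0.1, "performance_score": 0.9},
    ]
    print(rank_articles(sample, mode="context"))
    print(rank_articles(sample, mode="combined", excluded_ids=["b"]))
```

Keeping the scoring strategies in a lookup table makes it straightforward to add further recommendation types later without touching the filtering or sorting logic.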

Here is a detailed explanation of the engine’s technical architecture and the components involved:

  • Data streaming and processing
    We utilized Amazon EMR, equipped with Apache Spark, for our ETL (Extract, Transform, Load) processes. The EMR runtime for Apache Spark offers a performance enhancement of over 3x compared to clusters not utilizing the EMR runtime. This efficiency accelerates workload processing and reduces compute costs, all while requiring no modifications to existing applications.
  • Data storage and retrieval
    For data storage, we opted for Elasticsearch running on Amazon EC2 instances. This choice was driven by Elasticsearch’s ability to score documents by their relevance to a query and to filter documents by specific attributes. It also handles operations efficiently at scale, both in processing throughput and in managing large datasets. This implementation enables the swift creation of recommendation playlists, with response times as low as 30 milliseconds.
  • Continuous data update
    To ensure the recommendation engine remains current, we implemented a system where the Elasticsearch index is regularly updated with the latest content. Spark updates the index daily with data from the past two weeks, and throughout the day, it adds new articles to capture trending topics that emerge. The approach guarantees that our recommendations include the most recent and relevant content.
  • Scalable code execution
    We employ a Lambda architecture, a design pattern that combines batch and real-time processing, to integrate both historical and real-time data, ensuring our system recommends the most popular articles at any given moment. Separately, the AWS Lambda service plays a crucial role in our solution by executing Python code in a serverless environment. It is a natural fit for this type of task and is easily triggered from the backend of the application (Audio Player).
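
The retrieval path described above can be sketched as a Lambda-style Python handler that builds an Elasticsearch query and turns the hits into a playlist. The sketch uses the Elasticsearch `more_like_this` query inside a `bool` query, with an `ids` clause to exclude the current article and any articles the publisher removed. The index name, the field names, the event shape, and the injected `search` callable are assumptions for illustration, not the production code.

```python
import json


def build_playlist_query(index, article_id, excluded_ids=(), size=5):
    """Build a search body for articles similar to `article_id`.

    Field names and the index layout are hypothetical.
    """
    return {
        "size": size,
        "query": {
            "bool": {
                "must": [{
                    "more_like_this": {
                        "fields": ["tags", "title", "description", "text"],
                        # Use the stored source article as the "like" text.
                        "like": [{"_index": index, "_id": article_id}],
                        "min_term_freq": 1,
                        "max_query_terms": 25,
                    }
                }],
                # Never recommend the current or publisher-removed articles.
                "must_not": [
                    {"ids": {"values": [article_id, *excluded_ids]}}
                ],
            }
        },
    }


def make_handler(search, index="articles"):
    """Return a Lambda-style handler.

    `search(index, body)` is injected so the sketch runs without a live
    cluster; it must return a response shaped like an Elasticsearch
    search result.
    """
    def handler(event, context):
        body = build_playlist_query(
            index, event["article_id"], event.get("excluded_ids", ())
        )
        response = search(index, body)
        playlist = [hit["_id"] for hit in response["hits"]["hits"]]
        return {"statusCode": 200, "body": json.dumps({"playlist": playlist})}
    return handler


if __name__ == "__main__":
    def fake_search(index, body):
        # Stand-in for a real cluster call.
        return {"hits": {"hits": [{"_id": "b"}, {"_id": "c"}]}}

    handler = make_handler(fake_search)
    print(handler({"article_id": "a"}, None))
```

In a real deployment, `search` would be a thin wrapper around whichever Elasticsearch client the service uses. Injecting it keeps the handler easy to test in isolation.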

Content Recommendation Engine – Architecture Diagram


Amazon Web Services utilized:

  • Amazon EMR (with Apache Spark)
  • AWS Lambda
  • Amazon EC2 (hosting Elasticsearch)

What We Achieved Together

By offering recommendations that are both relevant and engaging, we’ve seen a notable increase in user interaction with the content, which has in turn led to a higher volume of advertisement listens. The core achievement of this project is the improvement of the audio completion rate by up to 40% – an essential metric for our client’s business.

More specifically, the key achievements of implementing the content-based recommendation system include:

  • High performance with affordable development
    We engineered a solution that excels in performance while keeping the costs manageable. This efficiency is evident in the computational resources utilized, as well as the development time invested.
  • Rapid recommendation playlist creation
    The engine can generate a recommendation playlist for each user in as little as 30 milliseconds. These immediate content suggestions keep users engaged without noticeable delays.
  • Flexible content discovery preferences
    Allowing the publishers to choose their preferred type of recommendation aligns content discovery with personal interests and behaviors, making each interaction with the platform more satisfying.
  • Amplified user engagement
    A/B testing conducted post-implementation has demonstrated a tangible increase in the average time users spend on a page. This metric is a clear indicator of the system’s effectiveness in capturing and maintaining user interest.

Why Romexsoft

Romexsoft is an AWS-certified Consulting Partner, trusted Software Development Company and Managed Service Provider, founded in 2004. We help customer-centric companies build, run, and optimize their cloud systems on AWS with creative, stable, and cost-efficient solutions.

Our key values

  • Delivery of quality solutions
  • Customer satisfaction
  • Long-term partnership

We have successfully delivered 100+ projects and have a proven track record in FinTech, HealthCare, AdTech, and Media industries.

Romexsoft possesses a 5-star rating on Clutch due to its strong expertise, responsiveness, and commitment. 60% of our clients have been working with us for over 4 years.

Let’s Talk about Your Business Needs!
