GenAI at AutoScout24

Markus Ludwig and Scott Stevens · March 18, 2024 · 6 min read

data science · machine learning

At AutoScout24, we are excited about the possibilities GenAI presents for unlocking value, both internally and for our users.

We were early adopters of Transformers, the neural-network architecture underlying large language models (LLMs). In early 2019, we went live with a feature that allows our users to search for cars using natural language. Under the hood, we employed a sequence-to-sequence model to translate user input into filters on our results page. While this early application used the same underlying technology as current LLMs, the models available in early 2024 are not merely larger and more powerful, but qualitatively different, offering both new possibilities and new risks. As a consequence, making effective use of GenAI still poses several unsolved challenges, and, like many others, we are still learning how best to deal with them.
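To make the idea concrete, here is a minimal sketch of that feature's interface: free-form text in, structured filters out. The filter schema and the keyword matcher below are illustrative stand-ins of our own; the production system used a learned sequence-to-sequence model, not rules.

```python
from dataclasses import dataclass

@dataclass
class SearchFilters:
    make: str | None = None
    fuel_type: str | None = None
    max_price_eur: int | None = None

def translate_query(text: str) -> SearchFilters:
    """Map free-form user input to results-page filters.

    Toy keyword rules stand in for the actual seq2seq model."""
    t = text.lower()
    return SearchFilters(
        make="BMW" if "bmw" in t else None,
        fuel_type="diesel" if "diesel" in t else None,
        max_price_eur=15_000 if "15k" in t or "15,000" in t else None,
    )

print(translate_query("diesel bmw under 15k"))
# SearchFilters(make='BMW', fuel_type='diesel', max_price_eur=15000)
```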

One of the key differences between our early natural-language search feature and current LLM-powered products is that our model, similar to traditional machine-learning approaches, only saw a relatively small subset of internal data. LLMs, on the other hand, have basically seen the entire internet. The exposure to a large variety of different topics means they can make sense of a broad range of natural language inputs. The downside is that they have not only seen helpful, high-quality content, but have also been exposed to darker, less friendly corners of the internet. Furthermore, they have learned a compressed and somewhat blurry representation of the internet and tend to make up for the lack of exact factual recall with plausible-sounding but ultimately incorrect information. This is one of the reasons it remains challenging to build production systems that rely on LLMs.

Alongside issues like cost and response latency, the main difficulty boils down to finding use cases that solve pain points while minimizing the potential risks associated with incorrect or harmful information.

The rapid advances in GenAI make it challenging to outline a long-term strategy, but at the moment, we are pursuing a two-pronged approach:

  • Use internal tooling to improve the efficiency of our employees
  • Develop user-facing features that enable new use cases and add value for our users

Internal Tooling

We want to encourage and drive the adoption of GenAI tools across the company so we can be faster and better at what we do. Internal tooling can help with a variety of tasks, from code suggestions and chat offered by GitHub Copilot, to text generation and summarization capabilities provided by Copilot for Microsoft 365, to automatically generated meeting summaries via the Zoom AI Companion, as well as the many other assistants for products like Miro, Jira, and Confluence.

We are driven by the belief that companies that adapt and learn to efficiently use these tools will move ahead, while those that don’t will fall behind. The challenge then becomes to encourage and drive their use amongst our employees. We feel the most effective approach for this is to build an active community of users who share their use cases, success stories, and learnings. To facilitate experimentation, we provide getting-started guides for engineers who want to explore LLMs on their own, as well as for those interested in evaluating third-party offerings.

Our approach focuses on avoiding decision paralysis around GenAI tool selection by promoting hands-on experimentation within teams. This practical exploration encourages feedback on tool performance and will hopefully guide future decisions based on real-world experience and utility.

User-Facing Features

When building product features that are powered by LLMs, we need to consider a different set of risks. While our developers will likely be able to spot and ignore nonsensical code suggestions, a user looking for guidance on how to navigate the many car options available is more likely to fall for incorrect advice that sounds plausible. In order to increase the likelihood of helpful and relevant responses, we need to make sure the LLM has all the context it needs. That means it has to be aware of both what the user is doing on our website and what they currently see. Augmenting the textual user input with this contextual information improves the relevance of responses, as they are now grounded in the experience of our users and our actual inventory.
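As a sketch of what this grounding can look like in practice, the page state travels alongside the user's message. The field names and prompt layout here are our illustration, not the exact production format:

```python
import json

def build_prompt(user_message: str, page_context: dict) -> list[dict]:
    """Prepend the user's current page state (active filters, visible
    listings) so the response is grounded in our actual inventory."""
    system = (
        "You help users find cars on AutoScout24. Answer only from the "
        "provided page context; if it is insufficient, say so."
    )
    context = json.dumps(page_context, ensure_ascii=False)
    return [
        {"role": "system", "content": system},
        {"role": "user",
         "content": f"Page context:\n{context}\n\nUser message: {user_message}"},
    ]

messages = build_prompt(
    "Which of these would suit a small family?",
    {"active_filters": {"make": "Volkswagen", "price_to": 20000},
     "visible_listings": [{"id": "a1", "model": "Golf", "price_eur": 17900}]},
)
```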

We also need to tie down our use case as much as possible, since we don’t want to offer a chatbot that writes poems or helps students with their homework. We do this by assessing the user’s input before even asking an LLM to write a response.
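A minimal sketch of such a gate, with a toy keyword check standing in for what would in practice be a lightweight classifier or a cheap, constrained LLM call:

```python
ALLOWED_TOPICS = ("car", "suv", "price", "listing", "dealer", "financing")

def in_scope(user_message: str) -> bool:
    """Toy scope check; a real system would use a learned classifier or a
    small LLM call constrained to return a single label."""
    text = user_message.lower()
    return any(topic in text for topic in ALLOWED_TOPICS)

def handle(user_message: str) -> str:
    if not in_scope(user_message):
        # Refuse off-topic requests (poems, homework, ...) before spending
        # tokens on a full LLM response.
        return "I can only help with questions about cars on AutoScout24."
    return f"[LLM response for: {user_message}]"  # placeholder for the real call

print(handle("Write me a poem about spring"))
print(handle("Is this SUV fairly priced?"))
```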

For certain applications, restricting interactions to a single conversational turn can also be beneficial. While the current popularity of LLMs is closely linked to chat interfaces, these might not always be ideal. Recently, more alternatives have emerged, offering a different set of trade-offs. For example, allowing users to refine their input directly—rather than through a series of back-and-forth interactions—ensures the model is not distracted by the previous conversation. This approach improves the chances of receiving a high-quality response and increases the likelihood of the LLM adhering to our system instructions.
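Sketched in code, the difference is simply that we resend only the latest, refined input instead of an ever-growing message history (the message format below is the common chat-completion convention, not a specific API):

```python
def single_turn_messages(system_prompt: str, refined_input: str) -> list[dict]:
    """Build the request from scratch each time: the user edits and resubmits
    their input, so no previous turns can dilute the system instructions."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": refined_input},  # latest version only
    ]
```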

Another aspect of traditional chat interfaces worth reconsidering is the practice of directly streaming the LLM response to the user, i.e., displaying each word of the text as it comes in, rather than waiting for the finished response. We believe it’s essential to leverage LLMs where they excel and complement them with traditional tools for other tasks. This approach involves shifting from direct streaming to capturing output as conventional JSON responses, which can then be parsed and validated by traditional software components to create hybrid systems. In such systems, the primary use of the LLM is to understand and act on user input with tools, and then to partially narrate the outcomes. This strategy enables the development of richer interfaces incorporating mixed content, extending beyond text to include previews of search results, images, and even individual listings.
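The following sketch shows this hybrid pattern under assumed action names and an assumed schema (our illustration, not the production contract): the LLM's raw output is parsed and validated as JSON before anything reaches the user, and the validated result can then drive a normal search backend and a rich UI.

```python
import json
from dataclasses import dataclass

@dataclass
class AssistantAction:
    action: str     # e.g. "search" or "narrate" (assumed action names)
    filters: dict   # structured filters a normal search backend can execute
    narration: str  # short text the LLM wants shown to the user

def parse_llm_output(raw: str) -> AssistantAction:
    """Validate the LLM's JSON before anything reaches the user."""
    data = json.loads(raw)  # raises on malformed JSON -> retry or fall back
    if data.get("action") not in {"search", "narrate"}:
        raise ValueError(f"unexpected action: {data.get('action')}")
    return AssistantAction(
        action=data["action"],
        filters=data.get("filters", {}),
        narration=str(data.get("narration", "")),
    )

raw = ('{"action": "search", '
       '"filters": {"make": "Audi", "price_to": 25000}, '
       '"narration": "Here are some Audis within your budget."}')
action = parse_llm_output(raw)
# The validated filters can now drive the regular search backend, while the
# UI mixes the narration with result previews, images, and listings.
```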

We are excited to see where things go next! The field is evolving so quickly that this post will likely soon seem outdated—perhaps within weeks. But we will continue to share insights into how we are leveraging GenAI in future posts—the next one will look deeper into Adopting GenAI in day-to-day tasks at Mobile Engineering.


About the authors

Markus Ludwig

Markus is a staff data scientist.

Scott Stevens

Scott is a senior data scientist. Lately, he has been working mostly on churn and price prediction, as well as a few other projects less easily described. He is currently leading a project to predict car prices for multiple companies across multiple countries in Europe.
