Since the pandemic, the airline industry has seen a significant rise in customer dissatisfaction, reaching unprecedented levels around the world.
We documented this trend in our air travel passenger sentiment analysis, and other data sources support it.
According to Reuters, regulatory agency data from countries like Germany and Canada show that complaints in 2023 have surged past pre-pandemic highs.
An examination of the U.S. Department of Transportation’s (USDOT) Air Travel Consumer Report Archive further illustrates this troubling trend.
- In 2022, passenger complaints skyrocketed by over 400% compared to pre-pandemic figures in 2019, a level of discontent that further intensified into 2023.
- By mid-2023, the volume of complaints had grown so large that the USDOT ceased publishing complaint data, overwhelmed by the sheer number of cases.
This alarming rise in complaints primarily stems from operational disruptions, including significant increases in flight delays and baggage issues. A critical factor exacerbating these challenges is the sharp reduction in airline and airport personnel—a repercussion of the pandemic-induced travel downturn.
An Oxford Economics study shows that employment in the aviation industry was 21% lower in 2022 compared to 2019. It’s no wonder that the industry struggles with a workforce shortage that directly impacts operational smoothness and the overall traveler experience.
Addressing the Crisis with GenAI
In the face of ongoing labor shortages that are unlikely to be resolved in the short to medium term, the airline industry must rely on technological innovation and automation to enhance operational efficiencies and manage customer complaints.
This necessity raises a critical question at the heart of today’s analysis:
Can Generative AI (GenAI) help airlines improve their customer service?
While GenAI has numerous back-office applications, this analysis focuses on its customer-facing potential, particularly through the use of GenAI-powered airline chatbots.
Such AI-driven chatbots could have massive potential to alleviate passenger frustrations. They would not resolve issues like flight delays or baggage mishandling directly. Instead, they could efficiently manage the cascade of customer service tasks these issues generate, such as cancellations, rebookings, and refunds.
We have solid reasons to believe this is possible. The evolution of chatbots, bolstered by advances in GenAI and large language models (LLMs), has reportedly already improved user-friendliness and problem-solving capabilities significantly at a few carriers.
- For instance, IndiGo’s “6Eskai” chatbot has reportedly cut the workload of its customer service agents by 75%.
- Similarly, Air India’s “AI.g” chatbot boasts a containment rate of 93%, meaning only 7% of traveler queries require human intervention.
If these achievements could be mirrored industry-wide—enabling approximately 90% of passenger requests to be effectively resolved via next-generation AI-powered chatbots—the potential for managing and reducing customer complaints is substantial.
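To make the containment metric concrete, here is a minimal Python sketch. The function name and the sample volume of 1,000 queries are illustrative assumptions, not figures published by either airline.

```python
def containment_rate(total_queries: int, escalated_to_human: int) -> float:
    """Share of queries resolved by the chatbot without a human agent."""
    if total_queries <= 0:
        raise ValueError("total_queries must be positive")
    if not 0 <= escalated_to_human <= total_queries:
        raise ValueError("escalated_to_human must be between 0 and total_queries")
    return (total_queries - escalated_to_human) / total_queries


# Hypothetical volume: 1,000 queries, 70 of which needed a human agent.
rate = containment_rate(total_queries=1000, escalated_to_human=70)
print(f"Containment rate: {rate:.0%}")  # prints "Containment rate: 93%"
```

A containment rate of 93% and an escalation rate of 7% are two views of the same number, which is why the two are quoted together.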
Therefore, our analysis today delves into the current sophistication of airline chatbots and the extent to which GenAI has been integrated into these systems.
Ultimately, we aim to assess whether GenAI-enhanced chatbots can indeed elevate airline customer support to the levels demonstrated by IndiGo and Air India.
How We Tested the Impact of GenAI on Airline Chatbots
To assess the performance of current airline chatbots and determine the impact of GenAI, we designed a comprehensive benchmarking test. This involved evaluating whether airlines are incorporating GenAI into their chatbots and how this technology affects chatbot performance compared to traditional chatbots.
Here’s how we selected the airlines for our analysis:
- We began by identifying all available customer-facing chatbots used by major airlines worldwide, irrespective of whether they are GenAI-powered or not.
- We categorized these airlines into three major geographic regions: EMEA, the Americas, and APAC.
- For each region, we selected at least two low-cost carriers and four full-service carriers (based on their size). This ensures a representative sample of the entire airline industry.
- We limited our selection to no more than two airlines per country, except for the United States, where we tested three airlines given their size and industry relevance.
- Where possible, we prioritized airlines with a reputation for developing innovative digital products.
In total, we tested chatbots from 21 major airlines.
Each chatbot was evaluated on the same 15 functionalities, which cover the most relevant traveler self-service needs across the typical airline journey.
For each functionality, we scored the airline chatbots based on their ability to fulfill the task: fully (1 point), partially (0.5 points), or not at all (0 points). Therefore, the maximum score an airline chatbot could achieve was 15 points (100%).
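The scoring scheme can be sketched in a few lines of Python. This is an illustrative reconstruction, not the tooling we used: the names `Fulfillment` and `benchmark_score` are hypothetical, and the sample results are made up to show the arithmetic.

```python
from enum import Enum


class Fulfillment(Enum):
    """Points awarded for one functionality."""
    FULL = 1.0
    PARTIAL = 0.5
    NONE = 0.0


NUM_FUNCTIONALITIES = 15  # every chatbot is tested on the same 15 tasks


def benchmark_score(results: list[Fulfillment]) -> tuple[float, float]:
    """Return (points, percentage of the 15-point maximum)."""
    if len(results) != NUM_FUNCTIONALITIES:
        raise ValueError(
            f"expected {NUM_FUNCTIONALITIES} results, got {len(results)}"
        )
    points = sum(r.value for r in results)
    return points, 100 * points / NUM_FUNCTIONALITIES


# Hypothetical chatbot: 7 tasks fulfilled fully, 4 partially, 4 not at all.
example = (
    [Fulfillment.FULL] * 7 + [Fulfillment.PARTIAL] * 4 + [Fulfillment.NONE] * 4
)
points, pct = benchmark_score(example)
print(f"{points} / {NUM_FUNCTIONALITIES} points ({pct:.1f}%)")
```

Because every functionality carries the same weight, a chatbot that partially supports many tasks can score the same as one that fully supports a few.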
Let’s explore what we found out.
Insight #1: GenAI Airline Chatbots Clearly Outperform the Competition
Our analysis reveals that airline chatbots using GenAI technology provide a much more comprehensive customer service experience compared to those that don’t.
On average, GenAI-based chatbots scored 24.2 percentage points higher in our benchmarking, achieving a score of 63.3%. By implication, traditional chatbots averaged roughly 39.1%.
Notably, only GenAI-based chatbots offered itinerary planning and fully supported flight bookings within the chatbot interface.
Furthermore, the qualitative experience with GenAI-based chatbots was significantly better due to their superior ability to understand user intent. This advanced natural language understanding, a hallmark of GenAI technology, reduces the need for travelers to navigate through extensive menus. Instead, users can simply use natural language to find answers to their questions, making interactions more intuitive and efficient.
Insight #2: GenAI Chatbots Excel in Early Journey Phases and Inclusivity
While the overall superior performance of GenAI airline chatbots is notable, a deeper look into specific functionalities reveals more nuanced insights.
Here are the key findings from our detailed comparison.
- GenAI-powered chatbots particularly excel in the early stages of the traveler journey, such as itinerary planning. These functionalities showed the greatest performance gap compared to traditional chatbots, indicating prime areas for airlines to enhance their chatbots with GenAI. This opens up opportunities for airlines to engage with customers much earlier in the travel planning stage. Today, most travelers first interact with an airline brand only when they are ready to book a flight.
- Given the airline industry’s global nature (Lufthansa’s route network, for example, spans over 80 countries), it is particularly encouraging to see multilingual support in today’s GenAI-powered airline chatbots. This makes an airline’s chatbot accessible and helpful to everyone, not only those communicating in the airline’s native language or English. GenAI is particularly strong in this regard: the technology’s foundational breakthrough, the “Transformer” deep learning architecture, was originally developed to translate text between languages. Multilingual support could therefore open the door to new markets of passengers who might otherwise feel uncomfortable navigating an airline’s services in a foreign language, creating a more inclusive travel experience.
- The only functionalities where non-GenAI chatbots scored above 50% on average were flight booking and flight status. Interestingly, flight status was the only function where non-GenAI chatbots outperformed their GenAI counterparts. This suggests that while GenAI offers many benefits, it is not a cure-all.
- In areas such as flight rebooking management (as well as ancillary services, loyalty, and post-flight feedback sharing), GenAI-powered chatbots performed better than traditional chatbots, but only by a relatively small margin. This indicates that while GenAI can boost overall chatbot performance, it is not a silver bullet for complex tasks like advanced rebooking management during flight disruptions.
Recommendations for Implementing GenAI-Powered Chatbots
Our benchmarking analysis makes clear that GenAI-powered airline chatbots can greatly enhance digital self-service options for travelers, thereby relieving pressure on service hotlines and reducing overall customer dissatisfaction.
However, we must acknowledge that we are still far from having GenAI chatbots capable of effectively handling complex tasks like flight rebooking, which travelers desperately need during disruptions.
Our recommendation to airlines is to start developing GenAI-powered chatbot interfaces immediately but with a keen awareness of the critical challenges involved.
It’s a common misconception that implementing this technology is as simple as integrating a ChatGPT wrapper around an internal information database. In reality, the process is much more complex.
As highlighted in a blog post by Arcus, converting chat messages into accurate database queries with the correct context to retrieve the desired information is a particularly challenging problem. Hastily deployed systems risk retrieving incorrect information and presenting it confidently as fact, a phenomenon known as hallucination. This not only frustrates users but also leaves airlines vulnerable to legal repercussions.
- For instance, a recent court ruling held Air Canada liable for incorrect information provided by its chatbot.
- Although this example doesn’t specifically refer to a GenAI chatbot, the challenges of accurate chat-to-query translation and the tendency for GenAI systems to hallucinate are significant issues.
Therefore, airlines must ensure that any GenAI chatbot systems are meticulously trained on up-to-date and reliable information.
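One common way to limit this risk is to let the language model produce only a structured query, validate that query, and answer strictly from retrieved records, handing off to a human when nothing matches. The Python sketch below illustrates the idea under stated assumptions: the `StructuredQuery` schema, the intent names, and the in-memory `FLIGHT_STATUS_DB` are all hypothetical stand-ins, and the LLM step that would fill in the query is not shown.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for an airline's flight-status system of record.
FLIGHT_STATUS_DB = {
    ("XX123", "2024-07-01"): "Delayed, new departure 14:35",
    ("XX456", "2024-07-01"): "On time",
}

# Only intents with a vetted data source may be answered automatically.
ALLOWED_INTENTS = {"flight_status"}


@dataclass
class StructuredQuery:
    """What an LLM intent parser would be asked to output (assumed schema)."""
    intent: str
    flight_number: Optional[str] = None
    date: Optional[str] = None  # ISO format, e.g. "2024-07-01"


def answer(query: StructuredQuery) -> str:
    """Answer only from retrieved records; otherwise clarify or escalate."""
    if query.intent not in ALLOWED_INTENTS:
        return "ESCALATE: unsupported request"
    if not query.flight_number or not query.date:
        return "CLARIFY: please provide flight number and date"
    flight = query.flight_number.upper()
    record = FLIGHT_STATUS_DB.get((flight, query.date))
    if record is None:
        # No record means no factual answer: never let the model guess.
        return "ESCALATE: no matching flight found"
    return f"{flight} on {query.date}: {record}"


print(answer(StructuredQuery("flight_status", "xx123", "2024-07-01")))
print(answer(StructuredQuery("flight_status", "XX999", "2024-07-01")))
print(answer(StructuredQuery("refund_policy")))
```

The design choice here is that the model never writes the factual part of the reply. It only selects which record to look up, so a parsing mistake leads to a clarification or an escalation rather than a confident wrong answer.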
Learning from OTAs: Best Practices for GenAI-Powered Chatbots
We also recommend that airlines look beyond their immediate industry to learn from best practices in the broader travel-tech context.
- Online Travel Agencies (OTAs) have been pioneers in rolling out GenAI-powered chatbots, offering valuable insights for airlines.
- For instance, Kayak and Expedia were among the first to release ChatGPT plugins in early 2023, experimenting with this technology’s potential from the outset.
We find OTA chatbot use cases particularly relevant because users’ customer journeys are quite similar to those of airlines, making their advancements especially pertinent.
After testing multiple OTA apps, we were particularly impressed by the chatbots of Trip.com and Despegar. These platforms stood out due to their accessibility and innovative use of GenAI.
Here are some best practices observed from Trip.com and Despegar that airlines could adopt to enhance their own chatbot implementations:
Best Practice #1: Ensure Chatbots are Easily Accessible
Mobile accessibility is crucial for travel industry chatbots, as customers are often “on the move” and may only have access to their mobile devices in hectic situations. Therefore, it is essential that airline chatbots are easily accessible within their mobile apps.
However, our benchmarking analysis reveals that most airline chatbots have limited or no in-app accessibility. Specifically:
- Only 38% of the airline chatbots we tested are natively accessible in the airline app.
- 33% of airline apps pull up the chatbot within a mobile browser, resulting in a far clunkier user experience.
- The remaining 29% of airline apps provide no access to a chatbot whatsoever.
Even where airlines offer chatbots directly within the app, these are often deeply nested within link trees among numerous other links, making them difficult to find. This is far from the smooth user experience customers are used to from Amazon’s one-click shopping.
In contrast, the Trip.com and Despegar apps have their chatbots natively integrated and prominently featured on the home screen for easy access.
This is critical in the context of customer service, as HubSpot research shows that 90% of consumers expect an “immediate” answer to customer service questions, defined as within 10 minutes of the issue or question arising.
Best Practice #2: Proactively Guide Users to Their Desired Actions
Among the airline chatbots we tested, most only provide information about a requested topic without helping users take action. This means that the majority of airline chatbots leave customers to their own devices, even when they understand the type of help travelers need. When chatbots do attempt to assist, the process is often unrefined or cumbersome.
For instance, when asked to help with booking a new flight:
- 67% of the airline chatbots we tested do not collect any data and simply inform users to visit the airline website or app, sometimes even without providing a relevant link.
- The remaining 33% have dedicated flows to choose flights in-chat or at least collect flight requirements.
- Only 10% of all airlines tested keep users within the app to complete the booking instead of opening a browser.
In stark contrast, the Trip.com chatbot automatically pulls up the app’s flight booking flow when asked for business-class flight fares, ensuring a seamless user experience.
That’s how it’s supposed to be.
For a smooth user experience, travelers should be able to get the information they need and get directed to where they can take action, all within a single app environment without being redirected to a browser. This integrated approach ensures that users can complete their tasks efficiently and conveniently.
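As a rough illustration of this pattern, the Python sketch below maps a recognized intent to an in-app deep link rather than a generic website referral. The `exampleair://` scheme, the route names, and the `action_link` helper are hypothetical. Real apps define their own deep-link schemes, and this is not how Trip.com or any airline implements it.

```python
from typing import Optional
from urllib.parse import urlencode

# Hypothetical in-app deep-link scheme and web fallback.
APP_SCHEME = "exampleair://"
WEB_FALLBACK = "https://www.example.com/"

# Map each recognized intent to the screen where the user can act on it.
INTENT_ROUTES = {
    "book_flight": "booking/search",
    "rebook_flight": "trips/rebook",
    "request_refund": "trips/refund",
}


def action_link(
    intent: str, params: dict[str, str], in_app: bool = True
) -> Optional[str]:
    """Build a link that opens the relevant flow, pre-filled with known details."""
    route = INTENT_ROUTES.get(intent)
    if route is None:
        return None  # unknown intent: the chatbot should ask or escalate
    base = APP_SCHEME if in_app else WEB_FALLBACK
    query = urlencode(params)
    return f"{base}{route}?{query}" if query else f"{base}{route}"


link = action_link("book_flight", {"from": "FRA", "to": "JFK", "cabin": "business"})
print(link)  # exampleair://booking/search?from=FRA&to=JFK&cabin=business
```

Passing along the details the chatbot has already collected matters as much as the link itself, because it spares the traveler from re-entering them in the booking flow.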
Best Practice #3: Offer More Than Just Text Communication
Multimodal input, meaning the ability to interpret information beyond typed text, such as voice or images, is a rapidly developing trend that can significantly enhance user experience.
This is especially true in stressful situations like flight cancellations or when users are on the go.
However, out of the 21 airline chatbots we tested, only one offered an option besides text to communicate with the chatbot, and that was voice support.
In contrast, both Trip.com and Despegar include voice support as a standard feature.
One particularly innovative use of multimodality in the travel industry, although not yet integrated into a chatbot, is Kayak’s PriceCheck feature.
- This allows users to search for flights based on a screenshot of flight details from other apps or websites.
- The screenshot is analyzed for details like origin, destination, and date, and users are automatically brought to Kayak’s flight results page (aligning with Best Practice #2) without needing to input the information manually.
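The general idea behind such a feature can be sketched in Python. This is not Kayak’s implementation: it assumes the screenshot has already been converted to text by an OCR or vision model (not shown), it only recognizes three-letter airport codes and ISO dates, and the `example.com` results URL is a placeholder.

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import Optional
from urllib.parse import urlencode

# Matches routes such as "FRA → JFK", "FRA - JFK", or "FRA to JFK".
ROUTE_RE = re.compile(r"\b([A-Z]{3})\s*(?:-|→|to)\s*([A-Z]{3})\b")
# Matches ISO dates such as "2024-07-01".
DATE_RE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")


@dataclass
class FlightSearch:
    origin: str
    destination: str
    departure: date


def parse_screenshot_text(text: str) -> Optional[FlightSearch]:
    """Extract origin, destination, and date from text recognized in a screenshot."""
    route = ROUTE_RE.search(text)
    day = DATE_RE.search(text)
    if not route or not day:
        return None
    try:
        departure = date(*map(int, day.groups()))
    except ValueError:  # e.g. month 13 from a misread digit
        return None
    return FlightSearch(route.group(1), route.group(2), departure)


def results_url(search: FlightSearch) -> str:
    """Build a pre-filled search link (placeholder domain)."""
    params = {
        "from": search.origin,
        "to": search.destination,
        "date": search.departure.isoformat(),
    }
    return "https://www.example.com/flights?" + urlencode(params)


ocr_text = "Your trip  FRA → JFK  2024-07-01  1 adult, Economy"
search = parse_screenshot_text(ocr_text)
if search:
    print(results_url(search))
```

A production system would need far more robust extraction (city names, localized date formats, return legs), but the flow stays the same: extract the details, validate them, then send the user straight to the results.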
As technology advances, offering communication options beyond traditional text interfaces becomes increasingly feasible. It would be encouraging to see more airlines adopting such options, allowing travelers to communicate in their preferred manner.
The Path Forward for Airline Chatbots
The current state of airline chatbots is largely unexceptional.
So far, only a minority of airlines have launched GenAI-powered chatbots. Even these chatbots, while offering more comprehensive customer service options than traditional ones, still only achieve a 63% score in our benchmarking, partly because of their limited rebooking functionalities.
However, the existing GenAI airline chatbots showcase the immense potential of this technology.
Hence, we firmly believe in the potential for GenAI chatbots to provide superior customer service in the airline context. When properly integrated with back-end systems, GenAI’s ability to recognize user intent can enable unprecedented levels of passenger self-service at scale and in multiple languages.
These chatbots may even evolve into AI travel agents that create regular customer touchpoints. Such touchpoints are increasingly critical for fostering relationships between airlines and their customers as traditional loyalty programs lose their effectiveness. For example, Qatar Airways is already advancing in this direction with its Sama chatbot.
We urge airline managers to act now and leverage these insights to build the next generation of chatbots that meet and exceed passenger expectations, securing a competitive edge in an ever-evolving industry.