Transforming Location Retrieval at Airbnb: A Journey from Heuristics to Reinforcement Learning | by Dillon Davis | The Airbnb Tech Blog | Nov, 2024

How Airbnb leverages machine learning and reinforcement learning techniques to solve a unique information retrieval task in order to provide guests with unique, affordable, and differentiated accommodations around the world.
By: Dillon Davis, Huiji Gao, Thomas Legrand, Weiwei Guo, Malay Haldar, Alex Deng, Han Zhao, Liwei He, Sanjeev Katariya
Airbnb has transformed the way people travel around the globe. As Airbnb’s inventory spans diverse locations and property types, providing guests with relevant options in their search results has become increasingly complex. In this blog post, we’ll discuss moving from simple heuristics to advanced machine learning and reinforcement learning techniques to transform what we call location retrieval in order to address this challenge.
Guests typically start searching by entering a destination in the search bar and expect the most relevant results to be surfaced. These destinations can be countries, states, cities, neighborhoods, streets, addresses, or points of interest. Unlike traditional travel accommodations, Airbnb listings are spread across different neighborhoods and surrounding areas. For example, a family searching for a vacation rental in San Francisco might find better options in nearby cities like Daly City, where there are larger single-family homes. Thus, the system needs to account for not just the searched location but also nearby areas that might offer better options for the guest. This is evidenced by the locations of booked listings when searching for San Francisco shown below.
Given Airbnb’s scale, we cannot rank every listing for every search. This presented the challenge of creating a system that dynamically infers a relevant map area for a query. This system, known as location retrieval, needed to include a wide variety of listings to appeal to all guests’ needs while still being relevant to the query. Our search ranking models can then efficiently rank the subset of our inventory that is within the relevant map area and surface the most relevant inventory to our guests. This system and more is outlined below.
Initially, Airbnb relied on heuristics to define map areas based on the type of search. For example, if a guest searched for a country, the system would use administrative boundaries to filter listings within that country. If they searched for a city, the system would create a 25-mile radius around the city center to retrieve listings.
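As a rough illustration of the city heuristic, the sketch below turns a center point and a 25-mile radius into a latitude/longitude bounding box. The function name, the spherical-Earth approximation, and the choice of a box rather than a circle are assumptions made for illustration; the post does not describe how these areas were actually computed.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius (spherical approximation)


def radius_to_bounding_box(center_lat, center_lng, radius_miles=25.0):
    """Return (south, west, north, east) for a box around a center point.

    Hypothetical helper. Approximate: ignores the poles and the antimeridian.
    """
    # Angular distance covered by the radius along a meridian.
    lat_delta = math.degrees(radius_miles / EARTH_RADIUS_MILES)
    # A degree of longitude shrinks by cos(latitude), so widen accordingly.
    lng_delta = lat_delta / math.cos(math.radians(center_lat))
    return (
        center_lat - lat_delta,
        center_lng - lng_delta,
        center_lat + lat_delta,
        center_lng + lng_delta,
    )


# Example: a 25-mile box around an approximate San Francisco center.
south, west, north, east = radius_to_bounding_box(37.7749, -122.4194)
```

A fixed radius like this treats every city the same, which is exactly the rigidity the later approaches try to remove.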
Improving these heuristics proved to be profoundly impactful. One such example is the introduction of a log-scale parameterized smooth function to compute an expansion factor for the diagonal size of the administrative bounds of the searched destination. We applied this to very precise locations like addresses, buildings, and POIs, resulting in a 0.35% increase in uncancelled bookers on the platform when tested in an online A/B experiment against the baseline heuristics. Figures below display how search results for a building in Ibiza, Spain improved dramatically with this heuristic by surfacing significantly more and higher quality inventory.
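The post does not give the form of this function, so the following is only a hypothetical sketch of what a smooth, log-scale expansion factor could look like. The formula, the parameter names `max_factor` and `scale_km`, and their default values are all assumptions chosen for illustration: small bounds (an address or building) get a large expansion, and the factor decays smoothly toward 1 as the bounds grow.

```python
import math


def expansion_factor(diagonal_km, max_factor=20.0, scale_km=1.0):
    """Hypothetical smooth, log-scale expansion factor.

    Equals max_factor for a zero-size bound and decays toward 1 as the
    diagonal grows. All parameters are illustrative, not from the post.
    """
    if diagonal_km < 0:
        raise ValueError("diagonal_km must be non-negative")
    return 1.0 + (max_factor - 1.0) / (1.0 + math.log1p(diagonal_km / scale_km))


def expanded_diagonal_km(diagonal_km, **params):
    """Diagonal of the retrieval area after applying the expansion factor."""
    return diagonal_km * expansion_factor(diagonal_km, **params)


# A building-sized bound is expanded far more than a city-sized one.
small = expansion_factor(0.1)
large = expansion_factor(30.0)
```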
These heuristics were simple and worked well enough to start, but they had limitations. They couldn’t differentiate between different types of searches (e.g., a family looking for a large home versus a solo traveler looking for a small apartment), and they didn’t adapt well to new data as Airbnb’s inventory and guest preferences evolved.
With more data available over time from these intuition-based heuristics, we thought there might be a way to use this historical user booking behavior to improve location retrieval. We built a dataset for each travel destination that recorded where guests booked listings when searching for that destination. Based on this data, the system could create retrieval map areas that included 96% of the nearest booked listings for a given destination.
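One way to picture this statistical approach is as a percentile over booking distances. The sketch below assumes the map area is summarized by a radius around the destination center and uses a nearest-rank percentile; both choices, and the helper names, are assumptions for illustration rather than the method described in the post.

```python
import math


def haversine_miles(lat1, lng1, lat2, lng2):
    """Great-circle distance in miles on a spherical Earth."""
    radius = 3958.8
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def coverage_radius_miles(center, booked_points, coverage=0.96):
    """Smallest radius around `center` covering `coverage` of bookings.

    Hypothetical helper: nearest-rank percentile of booking distances.
    """
    if not booked_points:
        raise ValueError("need at least one booked listing")
    distances = sorted(
        haversine_miles(center[0], center[1], lat, lng) for lat, lng in booked_points
    )
    # Small tolerance guards against floating-point noise in the product.
    rank = max(1, math.ceil(coverage * len(distances) - 1e-9))
    return distances[rank - 1]
```

Because the radius depends only on the destination, every search for that destination gets the same area, whatever the group size or dates.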
We tested these newly constructed retrieval map areas in lieu of the intuition-based heuristics outlined above, based on the hypothesis that they would show guests a more bookable selection of inventory. While this statistical approach was more aligned with guest booking behavior, it still had limitations. It treated all searches for a location the same, regardless of specific search parameters like group size or travel dates. This uniform approach meant that some guests might not see the best listings for their particular needs. As a result, this statistics-based method had no detectable increase in uncancelled bookers on the platform when tested against the heuristics outlined above in an online A/B experiment. This led us to believe that location retrieval may require more advanced techniques such as machine learning.
Instead of relying solely on past booking data, the new system could learn from various search parameters, such as the number of guests and stay duration. By analyzing this data, a model could predict more relevant map areas for each search, rather than applying a one-size-fits-all approach.
For example, a group of ten travelers searching for a San Francisco vacation rental might prefer larger homes in the suburbs, whereas solo travelers might prioritize central locations. The machine learning model could distinguish between these different preferences and adjust the retrieval map areas accordingly, providing more tailored results.
We built our machine learning model in the following manner. This is the result of three iterations that launched the machine learning model, expanded its feature set, and expanded search attribution. The architecture is depicted in the figure below.
- Training Examples: Searches issued by a booker by entering a destination in the search bar or manipulating the map that contained the booked listing in their search results on the same day or one day before the booking. We discard any bookings that are canceled 7 days after booking.
- Training Features: We derive features directly from the search request such as location name, stay length, number of guests, price filters, location country, etc. There are 9 continuous features and 19 categorical features in total.
- Training Labels: The latitude and longitude coordinates of the booked listing attributed to the search.
- Architecture: A two-layer neural network of size 256 was chosen in order to have more flexibility for loss formulation compared to traditional regression and decision tree based approaches.
- Model Output: Four floats that define the latitude and longitude offsets from the center latitude and longitude coordinates of the searched destination that represent the relevant map area.
- Loss: Trained to predict map areas that contain their associated booked listing while minimizing the size of the predicted map area and the incidence of predictions that cannot construct a valid rectangular map area.
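The three goals in the loss description can be sketched as a per-example penalty. The post states the objectives but not the formula, so everything specific here is an assumption: the four outputs are read as signed offsets for the south-west and north-east corners, the penalties are simple hinge terms measured in degrees, and `size_weight` and `validity_weight` are hypothetical tuning knobs.

```python
def map_area_loss(offsets, center, booked, size_weight=0.1, validity_weight=10.0):
    """Hypothetical per-example loss for a predicted rectangular map area.

    offsets: (sw_lat_off, sw_lng_off, ne_lat_off, ne_lng_off), assumed layout.
    center:  (lat, lng) of the searched destination.
    booked:  (lat, lng) of the booked listing (the training label).
    """
    sw_lat_off, sw_lng_off, ne_lat_off, ne_lng_off = offsets
    south, west = center[0] + sw_lat_off, center[1] + sw_lng_off
    north, east = center[0] + ne_lat_off, center[1] + ne_lng_off
    lat, lng = booked

    def relu(x):
        return max(0.0, x)

    # Containment: how far the booked listing falls outside each edge.
    miss = relu(south - lat) + relu(lat - north) + relu(west - lng) + relu(lng - east)
    # Size: penalize the extent of the box (only valid, positive extents).
    size = relu(north - south) + relu(east - west)
    # Validity: penalize inverted corners that cannot form a rectangle.
    invalid = relu(south - north) + relu(west - east)
    return miss + size_weight * size + validity_weight * invalid


# A box that contains the booking pays only the size term.
contained = map_area_loss((-1.0, -1.0, 1.0, 1.0), (0.0, 0.0), (0.5, 0.5))
```

Hinge terms like these are piecewise linear, which is one reason a neural network trained by gradient descent is a more natural fit here than a tree-based regressor.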
The machine learning system increased the recall of booked listings (i.e., how often the system retrieved a listing that was eventually booked) by 7.12% and reduced the size of the retrieval map area by 40.83%. It had a cumulative impact of +1.8% in uncancelled bookers on the platform. The initial model was evaluated against the baseline, and each subsequent model iteration was evaluated against the previous outgoing model.
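For readers who want the recall metric spelled out, here is a minimal sketch, assuming each evaluation example pairs a predicted rectangular map area with the coordinates of the listing that was eventually booked. The tuple layout and function names are illustrative assumptions, and the containment check ignores areas that cross the antimeridian.

```python
def contains(box, point):
    """True if `point` (lat, lng) lies inside `box` (south, west, north, east)."""
    south, west, north, east = box
    lat, lng = point
    return south <= lat <= north and west <= lng <= east


def booked_listing_recall(examples):
    """Fraction of examples whose booked listing falls inside the map area.

    examples: iterable of (box, booked_point) pairs (assumed layout).
    """
    examples = list(examples)
    if not examples:
        raise ValueError("need at least one example")
    hits = sum(contains(box, point) for box, point in examples)
    return hits / len(examples)


box = (37.6, -122.6, 37.9, -122.3)
recall = booked_listing_recall([
    (box, (37.77, -122.42)),   # inside
    (box, (37.70, -122.45)),   # inside
    (box, (37.80, -122.35)),   # inside
    (box, (38.10, -122.40)),   # outside (too far north)
])
```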
Figures below display how search results for a specific street in Lima, Peru improved dramatically with the model by surfacing results that are much closer to the searched street.
Before
After
While machine learning improved the system’s ability to differentiate search results, there was still room for improvement, particularly in learning whether locations that had never been surfaced before were relevant to guests for a search. To address this, Airbnb introduced reinforcement learning to the location retrieval process.
Reinforcement learning allowed the system to continuously learn from guest interactions by surfacing new areas for a given destination and adjusting the retrieval map area based on guest booking behavior. This approach, known as a contextual multi-armed bandit problem, involved balancing exploration (surfacing new areas) with exploitation (surfacing previously successful areas). The system could actively experiment with different retrieval map areas, learning from guest bookings to refine its predictions.
Applying a contextual multi-armed bandit traditionally requires defining an active contextual estimator, a method for uncertainty estimation, and an exploration strategy. We took the following approach given product constraints, system constraints, and the nature of our model formulation. The architecture is depicted in the figure below.
- Active contextual estimation: We employed our existing machine learning model for location retrieval, retrained daily to regularly learn from any new bookings data that we collect while surfacing previously unshown areas.
- Uncertainty estimation: We modified our model architecture with a random dropout layer to generate 32 distinct predictions for a given search (Monte Carlo Dropout). This allows us to measure the mean and standard deviation of our prediction while minimizing negative impact to system performance and changes to our existing model formulation.
- Exploration strategy: We compute an upper confidence bound using the mean and standard deviation of our prediction in order to construct larger retrieval map areas based on the model’s confidence in its prediction for the search.
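The uncertainty and exploration steps can be sketched together. In the code below, `stochastic_predict` stands in for a model evaluated with dropout active, and `toy_predictor` simulates it with Gaussian noise so the example runs without a trained network. The corner-offset layout, the `exploration_weight` parameter, and the way the bound pushes each edge outward are assumptions for illustration; only the 32 samples and the use of mean and standard deviation come from the post.

```python
import random
import statistics

NUM_SAMPLES = 32  # number of Monte Carlo Dropout predictions, per the post


def ucb_map_area(stochastic_predict, features, exploration_weight=1.0,
                 num_samples=NUM_SAMPLES):
    """Return (mean_box, ucb_box) as (sw_lat, sw_lng, ne_lat, ne_lng) offsets."""
    samples = [stochastic_predict(features) for _ in range(num_samples)]
    columns = list(zip(*samples))  # one column per output coordinate
    means = [statistics.mean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]
    w = exploration_weight
    # Push the south-west corner down/left and the north-east corner up/right,
    # so less certain predictions produce larger retrieval areas.
    ucb = (
        means[0] - w * stds[0],
        means[1] - w * stds[1],
        means[2] + w * stds[2],
        means[3] + w * stds[3],
    )
    return tuple(means), ucb


def toy_predictor(noise):
    """Hypothetical stand-in for a dropout-enabled model's forward pass."""
    base = (-0.1, -0.1, 0.1, 0.1)

    def predict(_features):
        return tuple(b + random.gauss(0.0, noise) for b in base)

    return predict


random.seed(0)
confident_mean, confident_ucb = ucb_map_area(toy_predictor(0.001), None)
uncertain_mean, uncertain_ucb = ucb_map_area(toy_predictor(0.05), None)
```

With the noisier predictor, the gap between the mean box and the upper-confidence box is wider, which is the mechanism that drives more exploration where the model is less sure.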
This strategy successfully explored more for less-traveled locations where it was less confident and explored less for locations that are often searched and booked. For example, pictured below are the mean (inner) and upper confidence bound (outer) estimates of retrieval map areas for San Francisco, CA (left) and Smith Mountain Lake, Virginia (right). San Francisco is searched almost 25x more than Smith Mountain Lake, with proportionately more bookings as well. As a result, the model is more confident in its retrieval map area estimate for San Francisco than for Smith Mountain Lake, resulting in 2–3x less exploration for San Francisco queries.
The reinforcement learning system was also tested against the outgoing machine learning model in online A/B experiments, showing a cumulative 0.51% increase in uncancelled bookers and a 0.71% increase in 5 star trip rate over two iterations that launched reinforcement learning and optimized scoring of the more complex model.
Airbnb’s journey from simple heuristics to sophisticated machine learning and reinforcement learning models demonstrates the power of data-driven approaches in transforming complex systems. By continually iterating and improving its location retrieval process, Airbnb has not only enhanced the relevance of its search results but also helped guests experience more 5 star trips.
This transformation cumulatively results in a 2.66% increase in uncancelled bookers, a major achievement for a company operating at Airbnb’s scale. More details can be found in our technical paper. As Airbnb continues to innovate, we’re continuously evaluating and introducing more advanced features and retrieval mechanisms, like retrieving with complex polygons. These will further refine and enhance the search experience for millions of guests worldwide.
If this type of work interests you, check out some of our related positions and more at Careers at Airbnb!
All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.