Metis: Building Airbnb’s Next Generation Data Management Platform | by Xiaobin Zheng | The Airbnb Tech Blog | Jun, 2023

How Airbnb evolved our data catalog into a platform for managing and governing our data warehouse at scale.
By: Erik Ritter, Jiaxin Ye, Sylvia Tomiyama, Woody Zhou, Xiaobin Zheng, Zuzana Vejrazkova
At Airbnb, millions of data assets exist in a complex ecosystem to inform our business and improve our products. The Data Management team’s mission is to empower the company to manage its data ecosystem at scale.
To do this, we need an accurate understanding of all of the assets in our ecosystem and how they relate to each other. In other words, we need accurate metadata. Our data management platform Metis, named for the Greek goddess of good counsel, is our solution to ensure that trustworthy metadata can be captured, managed, and consumed at scale.
Metis is an evolution of our existing foundation of metadata products within Airbnb.
Dataportal was our first effort towards democratizing data: successfully enabling data consumers to find trusted data. It was a huge boon to productivity and quite ahead of its time.
As data reliability and compliance regulations became important, we needed a more comprehensive and detailed understanding of how data was transformed. This led to our adoption of Apache Atlas as our data lineage solution. Apache Atlas powers products like SLA Tracker (see Visualizing Data Timeliness at Airbnb), which combines landing time metadata and lineage to enable debugging of upstream data delays.
As our requirements for metadata grew to cover more areas, such as cost management and data quality, our needs for a data catalog expanded:
- Ability to govern both the data and the metadata describing it
- Guardrails and recommendations to improve data quality
- Auditability of a dataset’s history, for both debugging and governance purposes
We soon found that data management had to be pursued as a discipline, which led us to build Metis as the one-stop shop for accessing all data metadata.
Metis is made up of three core products: Dataportal, Unified Metadata Service (UMS), and Lineage Service. Together, this platform enables Airbnb to manage millions of data assets across many domains. A short list of the assets we support includes:
- Apache Hive and Trino datasets
- Metrics and Dimensions, powered by Airbnb’s Metric Platform: Minerva
- Charts and Dashboards from Apache Superset and Tableau®
- Data Models, including those certified by Midas
- Machine Learning features and models
- Teams and employees of Airbnb (not technically a data asset, but key to supporting high-quality ownership and ensuring metadata stays up to date for all the above data assets)
At a high level, Metis consists of the following components:
Dataportal: serves as a catalog and management UI for human users.
Viaduct: Airbnb’s in-house GraphQL API layer, which models the offline data ecosystem.
UMS Core service: a backend service holding the system schema and business logic needed for metadata management.
Metadata storage:
- MySQL: primarily stores critical metadata that needs to be centrally managed
- Lineage Graph: a centralized service collecting and serving data lineage
- Elasticsearch: serves search and discovery use cases
Offline Component: runs outside the UMS Core service to perform offline tasks, e.g. offline metadata consistency checks and policy enforcement.
Offline Dataset: an offline export of metadata for analytics use cases.
Dataportal serves as the UI for Airbnb’s data catalog and is a place for people to find and manage all the assets supported by Metis. It’s built as a Single Page Application using React and TypeScript and is therefore flexible enough to serve the large variety of workflows required for data management and governance. The frontend communicates with UMS and other services via a GraphQL API; this is especially important because we want to prevent both sequential fetches of lineage information and over-fetching of large amounts of metadata, to ensure a performant user experience.
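To make the point about sequential fetches and over-fetching concrete, here is a minimal sketch in Python of the idea behind a GraphQL-style selection: the caller names the fields it wants, including nested lineage, and gets exactly those back in one call. It is not Viaduct or a real GraphQL server; the table names, fields, and the `resolve` helper are all hypothetical.

```python
from typing import Any, Dict

# Hypothetical in-memory metadata; names and fields are made up for illustration.
TABLES: Dict[str, Dict[str, Any]] = {
    "core.bookings": {
        "owner": "team-a",
        "description": "One row per booking",
        "upstream": ["raw.events"],
    },
    "raw.events": {
        "owner": "team-b",
        "description": "Raw event log",
        "upstream": [],
    },
}


def resolve(name: str, selection: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the selected fields for a table.

    A nested selection under "upstream" is resolved inside the same call,
    so the caller does not issue one follow-up request per parent table.
    """
    record = TABLES[name]
    result: Dict[str, Any] = {}
    for field, sub_selection in selection.items():
        if field == "name":
            result[field] = name
        elif field == "upstream":
            result[field] = [resolve(parent, sub_selection) for parent in record["upstream"]]
        else:
            result[field] = record[field]
    return result


# Ask for the table name plus the name and owner of each upstream table.
selection = {"name": True, "upstream": {"name": True, "owner": True}}
response = resolve("core.bookings", selection)
```

Because `description` was not selected, it is never returned, and the upstream owners arrive in the same response instead of through a second round trip.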
The Dataportal experience starts with search, so that both data consumers and data owners can find the assets they need. We’ve designed our search and discovery experience with a few principles in mind:
- Display relevant metadata directly in the search results to help people find the exact asset they’re looking for
- Uprank high-quality and commonly used data assets, in case the user is unaware of the exact asset they need
As a result, search results tend to return high-quality, certified datasets, along with the description, recent user count, and last modified time, to help the user decide which asset to select.
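The upranking principle can be sketched as a simple scoring function. This is only an illustration under assumed inputs: the article does not describe the actual ranking formula, so the `Asset` fields, the logarithmic usage term, and the `certified_boost` value are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    text_match: float   # assumed 0..1 relevance from the search engine
    certified: bool     # assumed quality signal
    recent_users: int   # assumed usage signal


def rank_score(asset: Asset, certified_boost: float = 1.5) -> float:
    """Combine text relevance with quality and usage signals (hypothetical formula)."""
    usage = 1.0 + math.log1p(asset.recent_users)  # dampen very large user counts
    quality = certified_boost if asset.certified else 1.0
    return asset.text_match * usage * quality


assets = [
    Asset("tmp.bookings_copy", text_match=0.9, certified=False, recent_users=2),
    Asset("core.bookings", text_match=0.8, certified=True, recent_users=400),
]
ranked = sorted(assets, key=rank_score, reverse=True)
```

In this toy example the certified, widely used table ranks first even though its raw text match is slightly lower, which is the behavior the second principle describes.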
Once the desired asset is located, the user can visit the Entity Page to perform a large variety of consumption, management, and governance actions. We structure all the content on the entity page into tabs grouped by category of information or action.
Consumption and documentation related tabs make it easy for people to learn how to use a table, with column and table descriptions in the Configuration tab, owner and consumer information in the Points of contact tab, and further details on how to use the table in the Documentation tab. Beyond that, these pages also allow users to take management actions, as seen in the screenshots below.
The above screenshot highlights only a subset of the ways we leveled up the Dataportal from a searchable data catalog into the one centralized place to manage and govern all your data assets.
Unified Metadata Service, or UMS, is the backend core of our centralized data management platform. It provides:
- A centralized schema, with a GraphQL API layer on top of it to access metadata
- A centralized relationship graph to attach siloed metadata
- Centralized metadata management capabilities that let systems meet compliance and governance requirements without reinventing the wheel
The centralization of metadata into UMS means metadata providers and consumers no longer need to integrate with each other directly; instead, every provider and consumer only has to integrate with UMS.
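The benefit of centralization can be shown with a little arithmetic. The sketch below assumes the worst case, where every provider would otherwise integrate with every consumer; the counts used are made up.

```python
def point_to_point_integrations(providers: int, consumers: int) -> int:
    """Worst case without a hub: every provider integrates with every consumer."""
    return providers * consumers


def hub_integrations(providers: int, consumers: int) -> int:
    """With a central service, each system integrates once, with the hub."""
    return providers + consumers


# Hypothetical numbers, purely for illustration.
direct = point_to_point_integrations(5, 8)
via_hub = hub_integrations(5, 8)
```

With 5 providers and 8 consumers this is 40 integrations versus 13, and the gap widens as either side grows.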
UMS plays various roles across metadata integrations and use cases. In a decentralized data ecosystem, we’re very opinionated about what metadata should be stored in, replicated to, or served by UMS.
UMS supports proxying read requests to many data systems. This includes proxying read requests to:
- Hive Metastore for table schema and table properties.
- Lineage service for raw Hive table data lineage.
- Data Governance service for the data governance status of datasets.
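The proxying role listed above can be sketched as a small facade that routes each kind of read to the system that owns it. This is a minimal sketch, not the UMS API: `ReadProxy`, the `kind` strings, and the lambda fetchers standing in for the source systems are all hypothetical.

```python
from typing import Callable, Dict

Fetcher = Callable[[str], dict]


class ReadProxy:
    """Routes a read for a given kind of metadata to the system that owns it."""

    def __init__(self) -> None:
        self._fetchers: Dict[str, Fetcher] = {}

    def register(self, kind: str, fetcher: Fetcher) -> None:
        self._fetchers[kind] = fetcher

    def read(self, kind: str, dataset: str) -> dict:
        try:
            fetcher = self._fetchers[kind]
        except KeyError:
            raise LookupError(f"no source system registered for {kind!r}") from None
        # The proxy stores nothing itself; it forwards the call to the owner.
        return fetcher(dataset)


proxy = ReadProxy()
# Stand-ins for real clients of the source systems (hypothetical payloads).
proxy.register("schema", lambda ds: {"dataset": ds, "columns": ["id", "ds"]})
proxy.register("governance_status", lambda ds: {"dataset": ds, "status": "compliant"})

schema = proxy.read("schema", "core.bookings")
```

Consumers see one entry point, while the source of truth for each kind of metadata stays in its own system.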
UMS centrally manages a small set of critical business metadata and stores it in its own metadata database, with the following management capabilities:
- Validation and authorization for updates
- Audit historical past
- Approval workflow for sensitive operations on critical metadata
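The three capabilities above can be sketched together in a small in-memory store. This is an illustration under stated assumptions, not how UMS is implemented: the class names, the per-key editor lists, and the rule that an approver must differ from the requester are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class AuditEntry:
    key: str
    old: Optional[str]
    new: str
    actor: str
    at: datetime


@dataclass
class ManagedMetadata:
    editors: Dict[str, Set[str]]                       # key -> actors allowed to edit it
    sensitive_keys: Set[str] = field(default_factory=set)
    values: Dict[str, str] = field(default_factory=dict)
    audit: List[AuditEntry] = field(default_factory=list)
    pending: List[Tuple[str, str, str]] = field(default_factory=list)

    def update(self, key: str, value: str, actor: str) -> str:
        if not value.strip():                          # validation
            raise ValueError("value must be non-empty")
        if actor not in self.editors.get(key, set()):  # authorization
            raise PermissionError(f"{actor} may not edit {key}")
        if key in self.sensitive_keys:                 # approval workflow
            self.pending.append((key, value, actor))
            return "pending_approval"
        self._apply(key, value, actor)
        return "applied"

    def approve(self, index: int, approver: str) -> None:
        key, value, requester = self.pending[index]
        if approver == requester:                      # assumed rule: no self-approval
            raise PermissionError("requester cannot approve their own change")
        del self.pending[index]
        self._apply(key, value, requester)

    def _apply(self, key: str, value: str, actor: str) -> None:
        # Record the previous value before overwriting it (audit history).
        now = datetime.now(timezone.utc)
        self.audit.append(AuditEntry(key, self.values.get(key), value, actor, now))
        self.values[key] = value


store = ManagedMetadata(
    editors={"owner_team": {"alice"}, "retention_days": {"alice"}},
    sensitive_keys={"retention_days"},
)
status_plain = store.update("owner_team", "data-platform", "alice")
status_sensitive = store.update("retention_days", "30", "alice")
store.approve(0, "bob")
```

Keeping validation, authorization, approval, and audit in one write path is what lets other systems rely on the stored values without each re-implementing those checks.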
As part of Airbnb’s Data Quality Initiative, we implemented data quality scores that are directly tied to each data asset in the data warehouse. Data quality scores for datasets are generated offline and ingested into the UMS metadata database for online consumption.
Similar to a traditional data catalog, UMS centrally manages indexes in an Elasticsearch cluster for different entities to power data discovery.
There are cases where metadata needs to be stored in or replicated into the Metis storage layer. UMS integrates with metadata providers through a variety of paved mechanisms that ingest metadata using Airbnb’s tech stack. These include:
- Stream processing (Flink) jobs ingesting metadata change events.
- ETL (Airflow) jobs that run daily to pull from metadata providers and push to UMS.
- Direct calls to UMS API.
When we onboard a new metadata provider, the key work involved is identifying product requirements and aligning on the scope of the metadata integration, followed by finalizing the exact integration mechanism.
The final major piece of Metis is our Lineage Service. We adopted Apache Atlas as Airbnb’s data lineage solution for the Data Warehouse back in 2020.
At Airbnb, Apache Atlas holds a large lineage graph containing over 100 million nodes and 300 million edges. Most of the lineage data comes from production Hive tables and a large number of intermediate Hive tables in our Data Warehouse.
We’ve extensively customized and tuned Apache Atlas to handle the large-scale lineage events in our Data Warehouse:
- Applying a sharding strategy to lineage events to increase parallelism.
- Improving Atlas server code efficiency on top of a graph database.
- Fine-tuning the underlying storage systems backing the graph database for scalability and latency.
- Read path optimization and filtering support for accessing lineage data more efficiently.
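As an illustration of the first item, one common way to shard events for parallel processing is stable hash partitioning. The article does not say which sharding key or scheme is used with Atlas, so the choice of the output table as the key, the event shape, and the function names below are all assumptions.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard with a stable hash (same key, same shard, across runs)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(events: List[dict], num_shards: int) -> Dict[int, List[dict]]:
    """Group events by shard, keeping arrival order within each shard."""
    shards: Dict[int, List[dict]] = defaultdict(list)
    for event in events:
        # Assumed key: the table an event writes to, so that all updates to one
        # table are handled by one worker, in order.
        shards[shard_for(event["output_table"], num_shards)].append(event)
    return shards


events = [
    {"output_table": "core.bookings", "seq": 1},
    {"output_table": "core.listings", "seq": 2},
    {"output_table": "core.bookings", "seq": 3},
]
shards = partition(events, num_shards=4)
```

Each shard can then be consumed by its own worker, while events for the same table still arrive at a single worker in their original order.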
Atlas’s lineage-related components, including its Graph Engine (JanusGraph), Type System, Ingest (with Hook integrations), and lineage API, have allowed us to efficiently collect and serve lineage data, providing valuable insights into the relationships between various data assets and pipelines. It powers many critical data compliance, data reliability, and data quality products. See Visualizing Data Timeliness at Airbnb.
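To show how lineage can support debugging of upstream delays, here is a minimal sketch that walks a lineage graph upstream and keeps the ancestors that landed late. It uses a plain dictionary rather than the Atlas lineage API, and the table names, the `parents` map, and the delay values are hypothetical.

```python
from collections import deque
from typing import Dict, List


def upstream(table: str, parents: Dict[str, List[str]]) -> List[str]:
    """Breadth-first walk over parent edges; each ancestor is returned once."""
    seen = {table}
    order: List[str] = []
    queue = deque([table])
    while queue:
        current = queue.popleft()
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


def late_upstreams(
    table: str, parents: Dict[str, List[str]], delay_minutes: Dict[str, int]
) -> List[str]:
    """Ancestors of a table whose landing time was later than expected."""
    return [t for t in upstream(table, parents) if delay_minutes.get(t, 0) > 0]


# Hypothetical lineage: report <- (agg_a, agg_b) <- raw
parents = {"report": ["agg_a", "agg_b"], "agg_a": ["raw"], "agg_b": ["raw"]}
delay_minutes = {"raw": 45, "agg_b": 0}

ancestors = upstream("report", parents)
suspects = late_upstreams("report", parents, delay_minutes)
```

The `seen` set matters at scale: shared ancestors such as `raw` are visited once even when several paths lead to them.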
As shown above, Airbnb’s approach to data management has evolved significantly over the past 6 years. We started building Dataportal with a goal to “democratize data” at Airbnb, and we now have Metis: a platform that enables anyone at Airbnb to search, discover, consume, and manage all the data and metadata in our offline warehouse. Metis serves critical roles across data compliance, data reliability, and data quality initiatives, and supports 1000+ data users every week.
Our future work will involve two key priorities. First, we will focus on evolving our system architecture and underlying technology to keep pace with the rapid evolution of our data ecosystem. Second, we plan to expand our coverage to more systems and enable more advanced data management capabilities, reflecting our ongoing commitment to investing in data here at Airbnb.
Metis would not have been possible without the members of the data management team as well as our cross-functional and cross-org collaborators. They include, but are not limited to: Adam Kocoloski, Adam Wong, Cindy Yu, Dave Nagle, Erik Ritter, Jerry Wang, Jiaxin Ye, John Bodley, Jyoti Wadhwani, Liyin Tang, Michelle Thomas, Nathan Towery, Paul Ellwood, Sylvia Tomiyama, Vyl Chiang, Woody Zhou, Xiaobin Zheng, and Zuzana Vejrazkova.
Apache Airflow, Apache Atlas, Apache Hive, Apache Superset, Atlas, and Hive are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.
All trademarks, service marks, company names and product names are the property of their respective owners. Any use of these is for identification purposes only and does not imply sponsorship or endorsement.