Airbnb builds Himeji – a scalable centralized authorization system



Airbnb recently described how it built Himeji, a scalable centralized authorization system. Himeji stores permission data and performs permission checks as a central source of truth. It uses a sharded and replicated in-memory cache to improve performance and reduce latency, and has been serving production checks for about a year. Its throughput increased from 0 in March 2020 to 850,000 entities/s in March 2021, while maintaining 99.999% availability and 12-millisecond latency at the 99th percentile.

The image below illustrates Himeji’s three-layered architecture.


First, the orchestration layer receives requests from clients and is responsible for fetching data from the cache. The caching layer, which is partitioned and replicated, is responsible for in-memory filtering and for loading from the database on a cache miss. Airbnb is targeting a cache hit rate of around 98%. Finally, the data layer uses Amazon Aurora for durable database storage. Airbnb's SpinalTap detects data mutations and publishes notifications over Apache Kafka to invalidate the cache.
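The interplay between the caching layer and the data layer can be sketched as a read-through cache with explicit invalidation. This is only an illustration under stated assumptions, not Himeji's implementation: the `ReadThroughCache` class, the key format, and the plain dictionary standing in for the database are all hypothetical, and the direct `invalidate` call stands in for whatever a mutation notification would trigger.

```python
from typing import Callable, Dict, Hashable


class ReadThroughCache:
    """Hypothetical in-memory cache that loads from a backing store on a miss."""

    def __init__(self, load_from_db: Callable[[Hashable], object]) -> None:
        self._load_from_db = load_from_db
        self._entries: Dict[Hashable, object] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: Hashable) -> object:
        if key in self._entries:
            self.hits += 1
            return self._entries[key]
        # Cache miss: fall back to the durable store and remember the result.
        self.misses += 1
        value = self._load_from_db(key)
        self._entries[key] = value
        return value

    def invalidate(self, key: Hashable) -> None:
        # Drop the entry so the next read reloads fresh data.
        self._entries.pop(key, None)

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


# A plain dict stands in for the durable database (assumption for the sketch).
database = {"LISTING:10#OWNER": {"User(123)"}}
cache = ReadThroughCache(lambda key: set(database.get(key, set())))

first = cache.get("LISTING:10#OWNER")    # miss, loaded from the database
second = cache.get("LISTING:10#OWNER")   # hit, served from memory

# A mutation happens in the database; a notification would trigger invalidation.
database["LISTING:10#OWNER"] = {"User(123)", "User(456)"}
cache.invalidate("LISTING:10#OWNER")
third = cache.get("LISTING:10#OWNER")    # miss again, reloads the new value
```

Without the invalidation step, the third read would return the stale owner set, which is why mutation notifications matter for a cache that aims at a high hit rate.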

Airbnb engineer Alan Yao describes the reasons that led to this architecture:

Over the past two years, Airbnb engineering has moved from a monolithic Ruby on Rails architecture to a service-oriented architecture (SOA). In our Rails architecture, we had one API per resource to access the underlying data. These APIs had authorization checks to protect sensitive data. Since there was only one way to access a resource's data, managing these checks was easy. In the transition to SOA, we moved to a layered architecture in which data services wrap databases and presentation services hydrate from multiple data services.

According to Yao, Airbnb initially moved authorization checks to the presentation services, as shown in the image below.


This choice led to several problems. First, authorization checks were now duplicated and difficult to manage. Second, each authorization check had to fan out to multiple services to run the required logic, which severely degraded performance and reliability. The solution was to move authorization checks from presentation services to data services and to create Himeji, which stores authorization data centrally and in a scalable manner. The figure below illustrates Himeji and its use in the Airbnb system.


Himeji's check API allows data services to perform authorization checks. Data services can ask Himeji whether a particular principal (for example, a user) has a relation (for example, a privilege or action) on a specific entity. For example, a data service might ask: "Can user 123 write the description of listing 10?" This structure is called a tuple. It is inspired by Zanzibar, Google's global authorization system.
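A check of this kind can be pictured as a lookup on an (entity, relation, principal) tuple. The following is a minimal sketch under stated assumptions: `RelationTuple`, `PermissionStore`, and the string formats for entities and principals are hypothetical names chosen for illustration and are not Himeji's actual API.

```python
from typing import NamedTuple, Set


class RelationTuple(NamedTuple):
    """Hypothetical shape of a permission tuple."""
    entity: str      # e.g. "LISTING:10"
    relation: str    # e.g. "WRITE"
    principal: str   # e.g. "User(123)"


class PermissionStore:
    """Toy in-memory store; a real system would persist and cache tuples."""

    def __init__(self) -> None:
        self._tuples: Set[RelationTuple] = set()

    def grant(self, entity: str, relation: str, principal: str) -> None:
        self._tuples.add(RelationTuple(entity, relation, principal))

    def check(self, entity: str, relation: str, principal: str) -> bool:
        # "Does this principal have this relation on this entity?"
        return RelationTuple(entity, relation, principal) in self._tuples


store = PermissionStore()
store.grant("LISTING:10", "WRITE", "User(123)")

can_write = store.check("LISTING:10", "WRITE", "User(123)")
other_user = store.check("LISTING:10", "WRITE", "User(999)")
```

The sketch only covers tuples stored directly; rules derived from configuration require an extra evaluation step on top of such lookups.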

Authorization rules in Himeji can be stored in the database or derived from configuration. For example, the following configured rule allows a principal to read the location of a listing if the principal is the owner of the listing (owner is a listing permission stored in the database). Alternatively, it also allows the guests of a reservation linked to the listing to read the location.

        - #OWNER
        - LISTING : $id # RESERVATION @ 
          Reference(RESERVATION : $reservationId # GUEST)

Therefore, if a guest tries to read the location of the listing, the data service will check whether that user's principal has permission to do so. Based on the above rule, Himeji will automatically check whether the principal is a guest on a reservation for the listing, and it will return the appropriate result.
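The evaluation described above amounts to a union of two branches: a direct owner lookup, and a lookup that follows the listing's reservations and checks their guests. The sketch below illustrates that logic under stated assumptions; the `stored` dictionary, the identifiers, and the function names are hypothetical and do not reflect how Himeji evaluates its configuration.

```python
from typing import Dict, Set, Tuple

# Hypothetical stored tuples: (entity, relation) -> principals or referenced entities.
stored: Dict[Tuple[str, str], Set[str]] = {
    ("LISTING:10", "OWNER"): {"User(123)"},
    ("LISTING:10", "RESERVATION"): {"RESERVATION:500"},
    ("RESERVATION:500", "GUEST"): {"User(456)"},
}


def has_direct(entity: str, relation: str, principal: str) -> bool:
    """True if the tuple is stored directly."""
    return principal in stored.get((entity, relation), set())


def can_read_location(listing: str, principal: str) -> bool:
    """Union of the two branches of the configured rule."""
    # Branch 1: the principal owns the listing.
    if has_direct(listing, "OWNER", principal):
        return True
    # Branch 2: follow each reservation linked to the listing
    # and check whether the principal is a guest on it.
    for reservation in stored.get((listing, "RESERVATION"), set()):
        if has_direct(reservation, "GUEST", principal):
            return True
    return False


owner_allowed = can_read_location("LISTING:10", "User(123)")
guest_allowed = can_read_location("LISTING:10", "User(456)")
stranger_allowed = can_read_location("LISTING:10", "User(789)")
```

Here the owner and the guest are both allowed, while a principal with neither relation is denied, and the caller never has to know which branch granted access.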

To reduce onboarding time and drive developer adoption, Airbnb has built supporting tools. These include tools for migrating pre-existing authorization data with Apache Airflow and Apache Spark, and scripts that automatically generate Java and Scala code.

