Airflow at Adyen: Adoption as ETL/ML Orchestrator
The Adyen way of engineering portrays an engineer as not just a coder, but also a designer, architect, tester, and operations engineer, all in one. We believe that having the fewest possible links in the chain results in the fastest time to market, which gets our products to our customers more quickly.
In this article, we’ll discuss the creation of a recent application, illustrating how the many roles of an engineer bring an application from its requirements to a working solution in production.
In recent years, our team has launched and iterated on a product that enables Adyen to transfer funds to our customers using our own banking platform. One of the regulatory requirements of this platform is reporting on account balances. To supply the data for these reports, we implemented a data warehousing application that allows us to store and retrieve any changes made to account balances.
Before an engineer shifts into high gear, we first address any uncertainties about the application with the product team. In this case, the most significant uncertainties were the data points that Adyen needs to report on, taking into account that some reports may span multiple historical years. Adding new data points to the application later would be tricky, as we would need to back-fill all historical records with that data. Also, in order to future-proof the application, our team needed to consider any prospective use cases of the product.
Most new applications start with the architecture phase. The architecture phase is a high-level process in which an engineer defines how a new application fits into the existing architecture of the Adyen platform. Our team follows strict guidelines that not only help our engineers navigate the architecture phase, but also ensure that our platform remains manageable for our customers.
Distribution of the application is one of the aspects that an engineer needs to consider when defining a new application within the Adyen platform. After all, our platform operates globally. If we know that the application will only be accessed by European customers, our team will initially distribute the application only in European data centers in order to optimize response times. As we scale the application, we’ll continue deployment in new regions with new customers. Note that in any case, a new application is deployed in multiple data centers in the same region for failover procedures.
For our particular application, from a database perspective, we decided to use a distinct database for all data relevant to the warehouse product. Since the data warehouse product is data-heavy by nature, a dedicated database made sense.
With the architecture defined, we now know how our application will be distributed, and where it needs to store its produced data.
When it comes to designing the application, our team first assesses what the software and database should look like. For example, in the case of this particular application:
For the software, our team implemented our own streaming framework, which allows consumers to both read and filter data from a stream. We took a simple approach here: a consumer reads from the stream that stores all the mutations on balances. The consumer normalizes the data it reads into POJOs (plain old Java objects) with key-value pairs. These POJOs map to the table design in the database, allowing the consumer to easily insert their data into the database.
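To make the read, filter, and normalize steps concrete, here is a minimal Java sketch. It is an illustration under stated assumptions, not our actual streaming framework: the names `BalanceMutation` and `BalanceMutationConsumer`, the event keys (`type`, `accountId`, `currency`, `amount`), and the use of an in-memory `Iterable` in place of a real stream are all hypothetical.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical POJO (written as a Java 16+ record) mirroring one table row.
record BalanceMutation(String accountId, String currency, BigDecimal amount) {}

// Hypothetical consumer: filters raw key-value events and normalizes them.
final class BalanceMutationConsumer {
    private final Predicate<Map<String, String>> filter;

    BalanceMutationConsumer(Predicate<Map<String, String>> filter) {
        this.filter = filter;
    }

    // Reads every event, keeps those that pass the filter, and maps each
    // surviving event to a POJO that is ready to be inserted.
    List<BalanceMutation> consume(Iterable<Map<String, String>> stream) {
        List<BalanceMutation> mutations = new ArrayList<>();
        for (Map<String, String> event : stream) {
            if (filter.test(event)) {
                mutations.add(new BalanceMutation(
                        event.get("accountId"),
                        event.get("currency"),
                        new BigDecimal(event.get("amount"))));
            }
        }
        return mutations;
    }
}
```

A caller would construct the consumer with a filter such as `e -> "BALANCE_MUTATION".equals(e.get("type"))` and pass the resulting list on to the insert logic.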
In terms of the database design, we chose a star schema approach. This schema is well-known for data warehousing and allows for a flexible database design. In essence, we decided to go with a single fact table, which stores all information related to the mutation of balances (e.g., amount and currency). Alongside it, there are multiple dimension tables that store meta information regarding the mutation. We can add more dimension tables if any additional meta information is needed for the reports.
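The fact and dimension relationship can be sketched in plain Java, with in-memory collections standing in for the tables. This is only a rough model under assumptions: the `AccountDim` dimension, its columns, and the `totalByCountry` report are hypothetical examples, not our real schema.

```java
import java.math.BigDecimal;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical dimension row: meta information about an account.
record AccountDim(int accountKey, String accountHolder, String country) {}

// Hypothetical fact row: one balance mutation, linked to a dimension by key.
record BalanceMutationFact(int accountKey, String currency, BigDecimal amount) {}

final class StarSchemaSketch {
    // Roughly what a report query would do: join each fact to its account
    // dimension, keep one currency, and sum the amounts per country.
    static Map<String, BigDecimal> totalByCountry(List<BalanceMutationFact> facts,
                                                  Map<Integer, AccountDim> accounts,
                                                  String currency) {
        Map<String, BigDecimal> totals = new TreeMap<>();
        for (BalanceMutationFact fact : facts) {
            if (!fact.currency().equals(currency)) {
                continue; // amounts in different currencies must not be summed
            }
            AccountDim account = accounts.get(fact.accountKey());
            if (account == null) {
                throw new IllegalStateException(
                        "No dimension row for account key " + fact.accountKey());
            }
            totals.merge(account.country(), fact.amount(), BigDecimal::add);
        }
        return totals;
    }
}
```

Supporting a new kind of meta information would mean adding another dimension record and a key on the fact row, which is the flexibility the star schema is chosen for.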
At this point, we know what to build and where to build it. Our focus now shifts to implementing the software. In the context of this application, this means making sure that the SQL queries from our architecture and design phase become realized and tangible.
At Adyen, we don’t necessarily have strict guidelines regarding code design. However, there is one principle that we promote internally: your code should be understandable at 04:00 AM under pressure. What does this look like in terms of the existing application?
As we know from the design phase, we needed to implement a consumer that reads data and creates POJOs from that data. A software design pattern that fits this model well is the builder pattern, which allows for flexible and understandable creation of POJOs from a set of data. This pattern can also help us understand why the consumer attempted to insert a record that violates the database constraints. For the SQL part, we implemented simple CRUD operations to store the data in the database.
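A minimal sketch of the builder pattern for such a POJO follows. The class name `MutationRow`, its fields, and the validation rules (required account ID, three-letter currency code, required amount) are assumptions made for illustration and do not reflect the real table constraints.

```java
import java.math.BigDecimal;

// Hypothetical POJO for one row of the fact table.
final class MutationRow {
    private final String accountId;
    private final String currency;
    private final BigDecimal amount;

    private MutationRow(Builder builder) {
        this.accountId = builder.accountId;
        this.currency = builder.currency;
        this.amount = builder.amount;
    }

    String accountId() { return accountId; }
    String currency() { return currency; }
    BigDecimal amount() { return amount; }

    static Builder builder() { return new Builder(); }

    static final class Builder {
        private String accountId;
        private String currency;
        private BigDecimal amount;

        Builder accountId(String value) { this.accountId = value; return this; }
        Builder currency(String value) { this.currency = value; return this; }
        Builder amount(BigDecimal value) { this.amount = value; return this; }

        // Validates before construction, so a bad record fails here with a
        // readable message instead of at insert time in the database.
        MutationRow build() {
            if (accountId == null || accountId.isBlank()) {
                throw new IllegalStateException("accountId is required");
            }
            if (currency == null || currency.length() != 3) {
                throw new IllegalStateException(
                        "currency must be a 3-letter code, got: " + currency);
            }
            if (amount == null) {
                throw new IllegalStateException("amount is required");
            }
            return new MutationRow(this);
        }
    }
}
```

Usage would read as `MutationRow.builder().accountId("A1").currency("EUR").amount(new BigDecimal("10.50")).build()`. Keeping the checks in `build()` puts the reason a record was rejected in one easy-to-find place, which fits the 04:00 AM principle.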
Regarding testing, our team invests in three different techniques to make sure all code is tested properly:
To provide another layer of code assessment, all code and tests we write at Adyen as engineers are reviewed by at least two other engineers. These additional review steps make sure that the written code is understandable for others, and is in line with the general guidelines we have at Adyen regarding software development.
At this point, all the bits and pieces are ready and it is time to put the consumer in production. As engineers, we are responsible for monitoring any new process that goes to production. Our team has weekly release cycles, so effectively, every week sees new code going into production. If any issues occur during production, the engineer develops a fix for it. This continues until all processes do what they should be doing without error.
As an engineer at Adyen, you are responsible for overseeing the entire lifecycle of a new application. This runs from laying out the initial requirements of an application, through engineering, to finally running code in production. In my personal experience, engineering is never about just picking up a ticket, developing, and moving on to the next ticket. You are always a key stakeholder in different areas of an application, and that’s what makes every day unique as an engineer at Adyen.