The "Scaling up data culture" series is a set of blog posts that spin off from the journey taken by individuals and companies, like Adyen, that started investing in and embracing data in their organizations some years ago and have adapted since then. This post describes how we adapted the organization to provide clarity, autonomy and empowerment by establishing a guideline around the role each person plays.
In the case of Adyen, one of the top payment processors, the backend requires us to collect loads of extremely valuable data, which unlocks massive potential for our merchants and their shoppers. Scaling up data culture means professionalizing the way value is extracted from data, and that inevitably means growing a team, embedding a culture around data that did not exist before, choosing a tech stack, making choices, failing, succeeding and constantly striving for improvement.
At Adyen we see extracting value from data as a continuous process, since technologies, products, the market and we as a company evolve at tremendous speed. We have learnt a ton of lessons by creating and applying new algorithms, by putting ML models into a production flow where hundreds of billions circulate, and by creating an adequate infrastructure to do that. But we believe this is only the beginning: there's still a big wave to surf, and we do that by learning from our journey, getting better and adapting. In this post, we show how we adapted the role definitions to provide the clarity required for that growth.
If someone asks you “what does a data scientist do?”, you will likely be comfortable giving a high-level description. When more detailed questions follow, you’ll probably be uncertain of the role specifics and answer something like “well, it depends on the company”.
Do you need to write production-level code? Do you need to use Big Data technologies? What about statistics? Deep learning? Do you need a PhD? It gets tricky. And this is only considering the definition of “data scientist”. What about data analysts? Is machine learning the same as data science? These can be quite profound questions.
Smaller organizations quickly solve this by having a single role - typically labeled data scientist - covering all the trades.
As companies grow, this single role no longer fits the requirements of teams that are becoming more specialized. A better categorization is necessary, so teams start to differentiate between “data analyst” (closer to Business Intelligence) and “data scientist” (closer to algorithms).
At some point companies become more data-savvy and start calling, from their production flows, algorithms that have been trained and have learned patterns. Once again new roles emerge, as “data engineers” and “machine learning engineers” add color to conversations.
On the other hand, becoming too granular will not help. The illusion of choice can overwhelm both aspiring candidates and team leaders hoping to find someone who answers their questions and helps solve their problems.
Chances are that if your company is facing those questions, it isn’t alone. In fact, it is good to see what steps fellow companies have taken to fix that problem and to consider whether to adopt them. At Adyen, as we grew more professional with our data practices, we wanted to clarify our view of the roles within data. We defined the primary focus of each role and the tech stack and skills needed, and aligned those definitions more closely with market trends.
Our categorization spans two levels: a broad level with three tracks (data analysis, engineering, science) and, within each track, a further breakdown where necessary. This setup gives a clear breakdown with the minimum amount of granularity, resulting in nine roles in total.
|Track||Role||In a nutshell|
|Analysis||Product Data Analyst||Domain expert who uses data analysis to find and drive insights.|
|Analysis||BI (Business Intelligence)||Drives the operations, plan and future of the BI platform.|
|Analysis||UX (User eXperience)||Uses data analysis on UX to improve products.|
|Engineering||Analytics Engineer||Researches and answers business problems and opportunities, then implements pipelines that deliver analytics.|
|Engineering||Data Engineer||Implements and deploys the tooling around the Big Data Platform.|
|Engineering||ML Engineer||Implements and deploys the tooling for training and deploying ML models.|
|Science||Data Scientist||Uses data science and statistics to unlock answers by applying inference, experimentation and explainability.|
|Science||ML Scientist||Designs, develops and deploys ML models into production.|
|Science||ML Researcher||Researches new algorithmic methodologies that add business value.|
Analysts sit close to product, BI and UX to soak up the domain knowledge that translates into data needs, and they use data to answer questions and drive insights.
At this point we need to clarify the difference between a data analyst and a business analyst. A data analyst makes extensive use of data to answer questions, whether by querying databases or by building dashboards, Python scripts or big data jobs to retrieve information. Business analysts have a well-defined domain (AML, finance, fraud) and use their own tooling, which could be a custom-made interface or spreadsheets.
Product data analysts love staying close to the product and its users, our merchants. They are people with deep domain knowledge who know their way around data, whether querying data sources or creating insightful visualizations. Their focus is on the product, and they leverage data analysis and data tooling to understand and improve it.
The BI platform is used widely by a colourful array of people (account managers, sales, analysts, devs). The BI data analysts bridge the gap between the developers and the users of the BI platform by ensuring that the data is always flowing, the data models are correct and the platform is up. They also get people on-boarded on the BI platform by providing training, and help them with everything they need to squeeze answers out of it.
We need to quantify the way that people use our products. UX Data Analysts use tools that collect user behavior data and analyse it to obtain results and insights about the usability and quality of our UX.
Skill chart for data analysts
The engineering track is focused on building tooling that empowers analysts and scientists to build end products on top of it. Involvement with the product is therefore needed, at different depths, to ensure that engineers are building the right thing.
Analytics engineers understand the business context and the data needs behind it, and they know how to implement quality data pipelines on our Big Data Platform to fuel stunning visuals, reports, and raw or refined datasets for internal or external use.
We previously called this job “ETL (Extract, Transform, Load) engineer”, but changed it to “analytics engineer” because we are evolving from ETL to ELT and other paradigms, and because people in this role answer a lot of product questions using analytics.
Data engineers develop, maintain and operate the tooling that enables data developers on the Big Data Platform (data engineers, data scientists) to commit data pipelines, access data, query data, serve data, schedule workflows, backfill datasets and harness the metastore.
ML engineers develop and improve the tooling that data scientists and machine learning scientists need to train, track, register and deploy machine learning models, in both online (i.e. serving) and offline flows.
Skill chart for the engineering track
The science track contains the roles that are closer to maths, algorithms and machine learning models. What distinguishes these roles is their level of exposure to product, their mathematical depth and how much production-level code they write.
Data scientists can jump into any problem and analyse, forecast, detect patterns and draw insights by leveraging data science algorithms and the big data platform, and thus enable teams to go further. Deep knowledge of statistics and data science algorithms is a must, as is the ability to communicate complex outcomes clearly to a wide range of audiences.
We used to call this job “Data Science Inference”, to emphasize the focus on inference and statistical analysis.
Sitting at the intersection of algorithms, mathematics and development, ML scientists solve problems by designing and implementing production-ready data science algorithms and machine learning models. Domain knowledge is needed to ensure they are answering the right questions, but there is a team around them to provide it.
We used to call this job “Data Science Algorithms”.
Sometimes out-of-the-box algorithms aren’t enough and more complex ad-hoc solutions are needed. The ML researcher has a deep mathematical understanding of the science behind the algorithms, which allows her to modify existing algorithms and models or create new ones.
Skill chart for the Science track
Many engineers, at some point in their careers, have looked at the future and seen that their progression would inevitably mean taking on the challenge of managing people. Some look at it with enthusiasm, driven by their will to help fellow colleagues and practitioners; others look at it with reluctance, seeing it as a necessary evil.
Managing people is not the only way forward. We believe that people deliver their best when they are happy, feel comfortable and are motivated about the future. Engineers and scientists can therefore grow and advance their careers on a technical track that does not involve managing a team, progressing and amplifying their impact by becoming staff and principal engineers and scientists.
At this point we want to emphasize that, at Adyen, building great products and solving problems is always a team game. The complexity of the problems, the challenges to deliver on, and the experience of working at Adyen revolve, without exception, around the idea of team spirit. This means helping others, getting help from others, sparring over ideas, and sharing successes and frustrations. We expect and encourage staff and principal engineers and scientists to grow their presence by mentoring, helping and guiding fellow colleagues.
Of course, evolving into a tech lead is also a possibility, and one with enormous impact. It requires managing a team both on the technical side, as the technical architect of the product, and on the human side, by taking care of the people in the team and growing their careers.
Another trend that we observed is people signing up as data analysts with the hope of becoming data scientists. In fact, at some point people would look at a data analyst as a person that is not yet good enough to be a data scientist, possibly driven by the market hype behind the so-called sexiest profession of the 21st century.
That is wrong. Data analysts add enormous value to our teams by combining deep domain knowledge of our products with data analytical skills, such as querying our databases or our big data platform, and then communicating the results back to non-technical audiences, including customers. This is very powerful and needed across all our teams.
If a person shines in this range of skills, and enjoys it, there should be absolutely no urge or pressure to go and learn how to optimize the hyperparameters of a random forest. All of us do our best work doing what we love, so you can grow your career in any domain without feeling the need to jump into other areas.
The absence of pressure does not close the door, though. A data analyst can perfectly well evolve into a data scientist if they wish to. Or an analytics engineer, or a product manager.
In the Adyen formula you’ll find an entry about “creating your own path”. If you are new to the tech culture, this can easily be misunderstood as a vague corporate message, but it really is not. It means that when you are part of Adyen you will have some guidelines about what is expected from you (at the end of the day, this is your role), but there is absolutely nothing preventing you from exploring, learning, helping out and shaping your future. We actively encourage that and leave room in your calendar for it. This has consistently proven to yield loads of long-term benefits in career growth, delivery, happiness and product discovery.