Driving accessibility with governed data products, seamlessly managed in Atlan
At a glance
- Kiwi.com, a global travel technology company that powers over 100 million searches a day for the perfect travel route, aimed to improve access to data.
- By choosing Atlan as its modern data catalogue, Kiwi.com streamlined the aggregation, curation and monitoring of thousands of data assets, consolidating them into 58 discoverable data products for simpler and more efficient consumption.
- This approach to data products has reduced the workload of their central engineering team by 53% and increased data user satisfaction by 20% since onboarding more than 20 teams to share and use data responsibly across their organization.
Data is central to the success of travel technology company Kiwi.com, powering over 100 million daily searches for optimal travel routes and supporting 22.9 billion kilometres of travel by 2023. Its innovative algorithm enables customers to discover and book affordable flights that other search engines often miss, with billions of price checks performed daily on 95% of global flight content.
For Kiwi.com employees working every day to improve their products, experience and operations, easy access to trusted data is crucial. And leading the charge to improve how her colleagues use that data is Martina Ivanicova, Data Engineering Manager.
At the Gartner Data & Analytics Summit 2024 in London, Martina joined Atlan to share her experience and lessons learned in bridging the gap between data producers and consumers, introducing data products, and leveraging active metadata to deliver trusted and understandable data that drives the next big business decision. Reflecting on her experience, she posed a crucial question:
“Let’s put ourselves in the shoes of a new data analyst at our company. Where do I find the data in the first place?”
Journey to the center of the data stack
Kiwi.com operates on a microservices architecture, with services organized by business domain. Leveraging Google Cloud Platform, they pull data in batches to BigQuery, their data warehouse, while Dataflow manages real-time data processing and stores results in Google Cloud Storage or directly in BigQuery. Additional transformations ensure that data is available in Looker, their business intelligence tool. Metadata for all components is managed in Dataplex, a GCP service that hosts vast amounts of their data assets.
While this setup may seem simple, it operates at scale, orchestrating and managing a massive volume of data across the ecosystem.
“We have 100 Postgres databases, tens of thousands of tables, thousands of BigQuery datasets, tens of thousands of BigQuery tables, hundreds of Airflow DAGs, and thousands of Looker objects,” Martina shared.
Finding focus with data products
For Kiwi.com’s data analysts, accessing accurate and reliable information is crucial to making informed decisions. However, even with data consolidated in one place, finding the right data resource was still a challenge. This prompted Martina and her team to look beyond simple accessibility and to simplify navigation in their complex data ecosystem.
“If you were a data analyst and you went to Dataplex to search for the keyword ‘Destination,’ you would find over 200,000 entries. How would you find data for your use case? The answer is that you are not looking for data, you are looking for a data product.”
Martina and her team embraced the concept of treating data as a product, ensuring it is trustworthy, easy to use, discoverable and valuable, making it a key operational asset. This approach led to the development of a data product governance framework, establishing key standards for scaling and managing data quality, documentation, security and related processes.
A governed data product model
Martina and her team’s approach to data products was designed to focus on their most important data assets. They divided these assets into tiers, with the top tier being the most critical for strategic decisions and financial reporting. The majority of their data consumers’ use cases depended on this top-level data, which drove strict requirements for reliability and accessibility and inspired a six-part framework for creating, managing, and improving these data products.
Ownership: Establish clear ownership and responsibility for each data product, at both the product and technical level.
Documentation: Make every data product easy to find and understand, with the right context and insights.
Quality: Monitor the freshness, accuracy and reliability of each data product using an internal monitoring platform.
Architecture: Optimize the creation, modification, storage and access to data products with a solid technological configuration.
Security: Comply with security and privacy standards, ensure data protection and maintain regulatory compliance.
Processes: Implement procedures to maintain data reliability, including data contracts that ensure all consumers and producers agree to specific SLAs and SLOs, and incident and change management protocols.
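To make the six parts concrete, here is a minimal sketch of how a data product record could be checked against them. It is an illustration only, not Kiwi.com's or Atlan's implementation: the `DataProduct` fields, the `governance_gaps` function, and all example values are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataProduct:
    """Hypothetical record holding one field (or more) per framework part."""
    name: str
    tier: int
    product_owner: Optional[str] = None        # ownership (product level)
    technical_owner: Optional[str] = None      # ownership (technical level)
    description: str = ""                      # documentation
    freshness_sla_hours: Optional[float] = None    # quality target
    hours_since_refresh: Optional[float] = None    # quality measurement
    storage_location: Optional[str] = None     # architecture
    contains_pii: Optional[bool] = None        # security classification
    has_data_contract: bool = False            # processes


def governance_gaps(product: DataProduct) -> List[str]:
    """Return the framework parts this product does not yet satisfy."""
    gaps = []
    if not (product.product_owner and product.technical_owner):
        gaps.append("ownership")
    if not product.description.strip():
        gaps.append("documentation")
    # Quality fails if freshness is unmeasured or the SLA is breached.
    if (
        product.freshness_sla_hours is None
        or product.hours_since_refresh is None
        or product.hours_since_refresh > product.freshness_sla_hours
    ):
        gaps.append("quality")
    if not product.storage_location:
        gaps.append("architecture")
    # Security only requires that a classification has been recorded.
    if product.contains_pii is None:
        gaps.append("security")
    if not product.has_data_contract:
        gaps.append("processes")
    return gaps


# Example values are invented for illustration.
example = DataProduct(
    name="bookings",
    tier=1,
    product_owner="analytics-team",
    technical_owner="data-engineering",
    description="Confirmed bookings per day.",
    freshness_sla_hours=24,
    hours_since_refresh=30.0,
    storage_location="hypothetical-project.bookings",
    contains_pii=False,
    has_data_contract=True,
)
print(governance_gaps(example))  # ['quality']
```

In this example the product passes every part except quality, because the last refresh (30 hours ago) exceeds the assumed 24-hour freshness SLA.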
While Martina’s team could support this framework by combining a powerful set of existing tools, such as Terraform, BigQuery, Dataplex, Monte Carlo, and Looker, managing data products would mean moving from one tool to another as data flowed from producer to consumer. In evaluating the active metadata management market for a solution that would unify these disparate tools into a single view, Kiwi.com chose Atlan.
By seamlessly integrating with their data stack, Atlan helps Martina and her team ensure their data products are accessible and understandable, efficient and reliable, and aligned with their high quality and security standards.
“Atlan was flexible enough to provide an umbrella over all the metadata we were trying to track. It also helped us evaluate the performance of our data products against specific criteria, ensuring they meet the required standards.”
An optimized data landscape
By moving from searches that returned thousands of results across a complex dataset to creating governed, easily discoverable data products, Martina and her team have significantly improved the capabilities of the data engineering function and are driving user satisfaction to unprecedented levels.
Today, her team manages 58 of these top-tier data products, a carefully curated set that has focused and optimized their workload. This landscape has since been organized by domain, curating data assets and ensuring clear ownership and documentation.
This approach has allowed data teams to take full responsibility for their data, fostering a culture of accountability. “We managed to organize the data landscape in a way that matches our domains. The data is owned by the teams that are in these domains,” Martina explained.
Through this transformation, Kiwi.com successfully onboarded over 20 teams to share and use data responsibly across the organization. Routine internal surveys revealed a 20% increase in data user satisfaction, showing the significant positive impact of this initiative.
The lesson of “less is more”
Instead of having to search through 272,000 difficult-to-analyze results, a Kiwi.com analyst can now find exactly what they need with Atlan. In a simple, easy-to-use interface, they are given a complete view, from ownership to related assets, data contracts, service level agreements and any data quality issues.
Sharing the most important lessons she learned from her experience, Martina stresses that access to large amounts of data is just the first step towards data democratization, not the destination. With curiosity and a focus on the needs of data consumers, getting the most value possible from data means delivering an experience that provides discoverable, understandable, and trustworthy data, instantly and at their fingertips.
“People say they want access to all the data, all the time. Take this with a grain of salt. It is not enough for us to break down silos and connect all the data sources in the company. It is not enough for us to offer self-service analytics tools to the company. It is important that we provide reliable and discoverable data, and that less is more.”
Photo by kychan on Unsplash