How a Pioneer in Digital Content Extended the Value of Snowflake to its Data Consumers
Hosting thousands of podcasts, millions of songs, and on-demand programming for radio shows, this organization is a leader in online audio content. With millions upon millions of customers, each with unique preferences across a seemingly limitless number of content choices, the organization sits atop a massive amount of useful data. And among the key people responsible for ensuring the organization makes the most of this data is their Data Engineering and Architecture leader.
“I’ve been here for almost two years, but have been in Software Engineering for something like 15 years. I joined because we’re doing some really complex and interesting things with data, and have some innovative ways of supporting data-driven decision making,” he shared.
The organization’s 12-member Data Engineering and Architecture team carefully stewards an array of technologies from transformation layers to their data warehouse, providing the technology and assets that downstream teams like Data Science need to develop forecasting models, map customer journeys, and enable strong business decisions.
“We have multiple stakeholders like analytics or product teams, and our goal is to give them what they need to make quality, data-driven decisions so we’re ready to innovate in the months and years to come,” he shared.
The modern data stack maintained by their team consists of Airflow, Snowpipe, Snowflake, and Tableau. Long supported by Confluence for documentation, context about their data assets is now accessible through Atlan, their platform of choice for Data Governance, documentation, and discoverability.
“Snowflake doesn’t provide the option for documentation, so we used Confluence. There was a lot of old documentation that was unfortunately not easy to search for,” their leader shared. “You could document as much as you wanted, but it wasn’t sitting on top of my Snowflake tables, which is what I wanted. It was also hard to discover ancillary and complementary datasets that were relevant to what you were looking for, and it was even hard to know where or when to add or update documentation.”
In the absence of a modern data catalog, each time data consumers searched for data or sought to understand more about it, they directed numerous questions at the Data Analysis team. While the team was knowledgeable about the data sets in Snowflake, this persistent back-and-forth drew away their focus and resulted in longer time-to-insight for data consumers.
“They’re a small team, and they can’t help everybody, so we wanted to make all of this self-serviceable. I want people to be able to quickly find what they’re looking for, then ask a more educated question like ‘I found this Sales Analytics data, but I want to determine what’s happening with advertisers, too. Is that possible?’,” he shared.
After a thorough evaluation of available tools, the organization chose Atlan to present and contextualize valuable assets stored in Snowflake to their stakeholders, improving time to insight and driving savvier use of data, and doing away with the costly service model that had plagued their Data Analysis team.
“What stood out to me about Atlan is that our stakeholders could log in and query information directly. Instead of my team providing direct access to all these data stores in Snowflake, I could provide access to Atlan, and they could find what they needed. It’s a one-stop-shop model for governance and discovery, and that’s what I was looking for. I didn’t want more than one tool.”
Data Engineering and Architecture Leader
After integrating Atlan with the foundational systems of their data stack, like Snowflake and Airflow, the team got to work documenting their most important tables in Snowflake. Then, using Jira, they set up a process for continuous documentation, prompting upstream owners of data assets to document their work in Atlan whenever a change occurs.
“Now, when you work on a table, one of the automated tasks that’s set up in Jira is that the owner needs to document that table in Atlan. If they work on a table that hasn’t been enriched in Atlan, that’s their time to take 10 minutes and write a blurb, explain the columns, add relevant information, and maybe add a few tags,” their leader explained.
The first group to adopt this process is their Data Analysis team, who are documenting their deep knowledge of the organization’s Snowflake ecosystem, followed by Product teams, who have unique context to provide about the organization’s numerous offerings. Downstream, this ensures that data consumers, from a marketing user running a simple analysis to an executive curious about how a dashboard is assembled, can make faster, more informed decisions.
“A lot of times, people just want reports, but there are also a lot of brilliant technical people that want to dive more deeply into data, and we’re able to provide something to all of them,” their leader shared. “Having a documentation tool means we’re aligning better, and aligning better helps us make faster decisions.”
An important effect of the organization’s adoption of Atlan is improved adoption of Snowflake’s capabilities and of the assets stored within it, and a more judicious policy of granting direct access to the data warehouse.
“People have learned over the course of the last couple of years that we have a central data store, and they just want access to it. We’ve got a couple of power users who make great use of raw data and know exactly what they’re doing, but there’s a growing number of people that are less technical and heard they can get access to data if they get access to Snowflake,” their leader explained.
More and more frequently, the organization’s less-technical data consumers would request access to Snowflake, but without experience navigating data warehouses, or the ability to use SQL, would be quickly overwhelmed by the volume and complexity of the data available to them.
While their data team could divert some of these end-user requests to tools like Amplitude, Atlan was the missing piece that could make their organization’s data assets browseable, understandable, and consumable to a spectrum of roles and skill sets.
“For people that want to do a little bit of analysis on their own, my goal is to send them to Atlan,” he explained. “There, they’ll have all the context they need and some saved queries they can get started with. They can make better sense of it.”
Now, when requests for access to Snowflake are made, the team responds with a structured set of questions that assesses the requester’s technical capabilities and determines the best place to send them to find and apply data. And by enabling this new group of users with Atlan, the Data Engineering and Architecture team avoids confusing its end users and sidesteps a costly back-and-forth of training and questions on concepts like SQL.
“I don’t want to over-protect data and be scared of it. We’re secure, we’re privacy first, but we also want to democratize it. The data is there for a reason, and people can’t make good decisions if we’re hoarding it and not making it available. And even if we are making it available, if it’s not understandable, then it doesn’t matter. We’re making sure it’s understandable, it’s reliable, it’s high-quality, and it’s discoverable.”
Data Engineering and Architecture Leader
With a steadily growing library of well-contextualized data assets, and a climbing number of end users now capable of deriving value from Snowflake, the data team is turning its attention to Data Governance: not only security and privacy, but also making sense of disparate but similar data sources, breaking down knowledge silos, and increasing visibility of their entire data estate.
“You have to have one place where everything is put together and documented. Atlan provides all the tools I need to do good governance on the assets I have,” their leader shared. “We want to enable quality, data-driven decisions, and make the right decisions with the right insights. We can’t do that without proper governance.”
After a successful implementation of Atlan, democratizing access to valuable assets in their Snowflake ecosystem, their data leader shared what he believes are key considerations for his peers as they weigh investing in a modern data catalog.
“This is the first and only cataloging tool I’ve used. We got some demos from other companies, but the Atlan team innovates really quickly, and was responsive for any issues or requests we had early on; issues that have mostly gone away at this point, a year later. My advice would be to look for something that’s not only usable by your technical teams, but simple to understand and navigate for all the people who are going to be your consumers. If it’s not, they’re not going to touch it.”
Data Engineering and Architecture Leader