After debuting our blog series on data pandemic stories with a story from Tableau and a perspective from Privacera, we are excited to bring you the third installment, authored by Kaycee Lai, CEO of Promethium, a Starburst partner. The year 2020 will be etched in history as a rather unusual one: one in which the outbreak of a global pandemic wreaked havoc but also taught us important lessons. Behavioral patterns changed forever, data science models were put to the test, and the need for fast data access and "analytics anywhere" emerged as a key growth driver. Companies with strong digital foundations thrived while others were forced to adapt. According to a report by McKinsey, digital transformation “vaulted five years forward.” If you have a compelling data pandemic story to tell and would like to be featured on the Starburst Blog, please write to us at firstname.lastname@example.org.
—Your friends at Starburst
One of the things we’ve seen in the pandemic is that organizations that can adapt quickly by making data-driven decisions have thrived. At Promethium, we’ve witnessed that the change ushered in by COVID boils down to one word: Speed.
Before COVID, it typically took months to answer a question with data. When the pandemic hit, the acceptable time frame – including discovery, ETL, prep, complex queries, etc. – went from months to minutes. People started saying, "We're not sure what our inventory levels will be in four months. We need to make decisions now." Companies need to be able to pivot on a dime and react to changing market conditions.
So how do companies go from a data analytics time frame of months to one in which answers are delivered in minutes? There are three main factors at play here:
1. Breaking down data silos. We've had this challenge forever because we've created data silos. "My data warehouse is here...my database is here...my cloud is here...by the way, marketing doesn't like this cloud, so they went with that cloud." But technologies have come along that allow you to cross these barriers and get to the data very quickly, such as the work done at Trino (formerly “PrestoSQL”) in data federation and virtualization.
2. Collaboration. Data analytics is a team sport. You need collaboration between the people who have the technical skill sets to get the data, and the people who provide the business context. Too often the data analytics narrative goes something like this: "Tell me what you want, and then leave me alone for two months, and then hopefully I’ll have something for you…[two months pass]...Oh, it's wrong? Whoops! Let me do it again." We don't do this in our everyday lives. We get things done very quickly because we can collaborate and communicate in real time. People are now starting to expect this level of service in data analytics.
3. Skill sets. For a long time the skills to analyze data were centralized in one team. Advancements in technology, such as natural language processing, have created a big shift toward self-service analytics. For the next generation of users, if you can use Google, you’ll be able to find the data you're looking for very quickly.
Major technology shifts in these areas have made it possible to go from taking months to only minutes to get data-backed answers to business questions.
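As a rough illustration of the first factor, the sketch below answers one question with a single query across two separate data stores. It is an analogy only: it uses Python's built-in sqlite3 module with two attached in-memory databases standing in for silos, not Trino itself, and every database, table, and column name (warehouse, marketing, inventory, campaigns) is hypothetical.

```python
import sqlite3

# Two independent in-memory databases stand in for separate silos
# (for example, a data warehouse and a marketing cloud). Names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS warehouse")
conn.execute("ATTACH DATABASE ':memory:' AS marketing")

conn.execute("CREATE TABLE warehouse.inventory (sku TEXT, units INTEGER)")
conn.execute("CREATE TABLE marketing.campaigns (sku TEXT, campaign TEXT)")
conn.executemany(
    "INSERT INTO warehouse.inventory VALUES (?, ?)",
    [("A1", 120), ("B2", 0)],
)
conn.executemany(
    "INSERT INTO marketing.campaigns VALUES (?, ?)",
    [("A1", "spring"), ("B2", "clearance")],
)


def campaigns_with_stock(connection):
    """Join across both stores in one query, without copying data first."""
    query = """
        SELECT c.campaign, i.sku, i.units
        FROM marketing.campaigns AS c
        JOIN warehouse.inventory AS i ON i.sku = c.sku
        WHERE i.units > 0
        ORDER BY c.campaign
    """
    return connection.execute(query).fetchall()


rows = campaigns_with_stock(conn)
print(rows)
```

In a federation engine such as Trino, the same idea is expressed by qualifying each table with the catalog it lives in, so the analyst writes one query instead of moving data between systems first.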
Addressing the knowledge gap with crowdsourcing and transparency
Clearly we're getting over the coding gap, but we’ll always require the ability to understand how to use data. So an interesting question begins to pop into everyone’s mind: What are the types of skills that you need in order to understand whether you’ve found the right ‘needle’ in the haystack?
We do this by going back to human nature and looking at the most intuitive processes. Traditionally we relied on tribal knowledge, but that creates problems in the data analytics lifecycle because often we have trouble tracking down – or perhaps don’t have a solid working relationship with – the right subject matter expert.
The answer is to rely on something that has worked in many other industries: Crowdsourcing. For instance, we use Yelp to source the experiences of thousands of other people through something as simple as a five-star rating system. This kind of rating helps tremendously in guiding people to the right data in a timely manner, and complements traditional approaches such as tags.
Another factor that influences collaboration is transparency. During the pandemic, food delivery apps like DoorDash have soared in popularity, and one key reason for this is that they tell you what step of the order process you're in. This gives people more confidence in the process, and can also speed it along because it introduces structure and accountability. If the person who requested a data-derived answer has full visibility into the data steward’s process for getting that answer, and can communicate in real time throughout the process, we find that it proceeds much more quickly, and the answer tends to be much more on-target.
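To make the rating idea concrete, here is a minimal sketch of how crowdsourced star ratings could be used to rank candidate datasets. It is an assumption for illustration, not a description of any product: the function name `rank_datasets`, the `min_votes` threshold, and the dataset names are all hypothetical.

```python
from statistics import mean


def rank_datasets(ratings, min_votes=2):
    """Rank datasets by average star rating, ignoring those with too few votes.

    `ratings` maps a dataset name to a list of 1-5 star ratings.
    Returns (name, average, vote_count) tuples, best first.
    """
    scored = [
        (name, round(mean(stars), 2), len(stars))
        for name, stars in ratings.items()
        if len(stars) >= min_votes
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical ratings left by colleagues who have used each dataset.
ratings = {
    "sales_2020_clean": [5, 4, 5],
    "sales_2020_raw": [2, 3],
    "sales_draft": [5],  # only one vote, so it is excluded from the ranking
}

ranked = rank_datasets(ratings)
for name, average, votes in ranked:
    print(f"{name}: {average} stars from {votes} ratings")
```

The vote threshold is one simple way to keep a single enthusiastic rating from outranking a dataset that many people have vetted.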
So, what’s the next chapter of the data pandemic?
In my opinion, the next chapter of the ‘data pandemic’ will be the return to service of industries and businesses that have been largely shut down. Consider travel and leisure as an example. Disney announced that its Disneyland California parks are reopening on April 30, and Royal Caribbean announced it is launching seven-night cruises from the Bahamas. For companies like these, reopening isn’t just a matter of turning the lights back on. They need to reconnect with massive supply chains, reboard/onboard tens of thousands of employees, and more. Data needs to be available now – not months from now – to make accurate forecasting, supply chain, operations, and human resource decisions for an effective return to service.