Starburst Blog

Presto Turns Eight Years Old!

Presto turned eight years old just a few weeks ago, and it's only getting started.

Let’s take a journey back to 2012. Facebook had ramped up its data warehouse to take in massive amounts of data, to the tune of 250 petabytes. For those of you who don’t know the story, Hive was the data warehousing software of the day. It had also grown out of Facebook, and it enabled users to run batch processing jobs over Hadoop. Jobs were submitted using a Hive-specific SQL dialect, so users didn’t need to know how to program in the complex MapReduce paradigm. Running SQL on a Hadoop cluster was revolutionary in the early Hadoop era, when adopting Hadoop meant paying a high price to hire experts who could access the data in HDFS.

However, Hive was not designed with a human in the loop in mind, and it had other drawbacks, such as a nonstandard SQL dialect. As one Facebook data scientist put it at the time, running even six queries on Hive made for a good day. Earlier solutions could not scale well enough to let people sift through this data in real time and still get accurate results. Facebook badly needed to improve the time-to-value for business analysts, developers, and data scientists.

Following the time-honored Facebook tradition, engineers Dain Sundstrom, David Phillips, Eric Hwang, and Martin Traverso set out to build an entirely new system that could handle the existing petabyte scale of data at Facebook and keep up as it grew. These were the humble beginnings of the Presto system we have all come to know and love.

Presto was designed to return results quickly and correctly, adhere to the ANSI SQL standard, and above all, be open source and community-driven. Not only did the project achieve what it set out to do at Facebook, but it also expanded well beyond that thanks to its flexible connector architecture. The alchemy of the connector SPI architecture, combined with the open-source culture, ultimately drove the success of Presto. The open-source philosophy provides plenty of benefits, including a collaborative testing suite and user ownership of and influence on the project, which makes for better, more robust software. This culture gives interested parties a stake in the direction of Presto and weaves their story into the Presto legacy. This birthday therefore celebrates not only the accomplishments of Presto as software but also the accomplishments of the individual contributors and companies that helped get Presto to where it is today. No one reflects this notion better than Martin Traverso, one of the original co-creators of Presto.

With that, we here at Starburst want to send a heartfelt thanks to everyone who has contributed and look forward to the next eight years of success with Presto.

Brian Olsen

Brian is a U.S. Marine turned software engineer and developer advocate working to foster the open-source Presto community. Brian spent four years as a data engineer at a cybersecurity company working on pipeline maintenance and query optimization. In this role, Brian was responsible for maintaining data pipelines and migrations, including replacing some legacy data warehousing systems with open-source Presto. Brian has published at ACM and IEEE geospatial database conferences.

Posted on 26 Aug 2020 | community, presto
