Warsaw 2014 - Proposal



How to build a petabyte-scale data infrastructure


Back in 2011, our BI platform at Prezi was a single SQL query that ran on a few megabytes of data and produced somewhat accurate numbers that satisfied basic business needs. Today we run a data infrastructure of around 70 high-performance servers that crunch hundreds of gigabytes of data and feed hundreds of reports every day. Along the way we used standard Unix tools and statistical software, then on-premise Hadoop clusters, NoSQL databases and third-party BI tools. Learning from our mistakes, we rebuilt our data infrastructure and ETL systems many times. We’ll share the successes and misses we encountered on this journey, with a special focus on our current experience with managed solutions.


Julianna Göbölös-Szabó
Data Engineer

A mathematician by education, Julcsi worked for years at a research institute analysing large networks. Now she is a data engineer at Prezi, building a reliable and easy-to-use ETL system for both developers and non-technical colleagues. In her free time she enjoys eating delicious food and then burning off the calories by running, swimming or biking.

Zoltán Tóth
Data Engineer

Prior to joining Prezi, Zoltán worked as a developer/architect for pharmaceutical market research companies. Now, as Senior Data Engineer, he helps Prezi build and operate a world-class data infrastructure. In his free time he enjoys cooking and simplicity, and he is a wannabe triathlete.
