by Nishant Bangarwa (@nishantbangarwa) on Wednesday, 6 July 2016

Status: Confirmed & Scheduled
Section
Full talk

Technical level
Intermediate

Abstract

Traditional SaaS solutions based on Hadoop datastores such as Hive and HBase, or on a classical RDBMS, work well for storing data, but they are not optimized for ingesting data and making it immediately available for interactive, ad hoc, low-latency queries at very high scale. Long query latencies make these solutions suboptimal choices for powering interactive applications. This talk will introduce Druid as a complementary solution for scalable real-time ingestion and analytics.

Druid is an open source, distributed data warehouse designed to support OLAP-like queries, and it is used in production at numerous companies. It was inspired by Google’s Dremel, PowerDrill, and search framework. This talk will cover Druid's architecture, its storage internals, and the common use cases Druid is a good fit for.

Outline

  • History and Motivation
  • Live Demo
  • Druid Architecture
  • Storage Internals
  • Druid in Practice
  • Common Use Cases

Speaker bio

Nishant is an active contributor to and PMC member of Druid. He is part of the Business Intelligence team at Hortonworks. Prior to that, he was part of the Metamarkets backend team, where he was responsible for analytics infrastructure, including real-time analytics in Druid. He holds a B.Tech in Computer Science from the National Institute of Technology, Kurukshetra, India.

Comments

  • virendra goswami (@invincibleveer) 11 months ago

    Hi, can you share the slides? Thanks.

  • Nishant Bangarwa (@nishantbangarwa) Proposer 11 months ago

    Slides added.
