The Fifth Elephant 2015

A conference on data, machine learning, and distributed and parallel computing

DATE

16-18 July 2015, Bangalore

STATUS

Awaiting jury selection


Machine Learning, Distributed and Parallel Computing, and High-performance Computing are the themes for this year’s edition of The Fifth Elephant.

The deadline for submitting a proposal is 15 June 2015.

We are looking for talks and workshops from academics and practitioners who are in the business of making sense of data, big and small.

Track 1: Discovering Insights and Driving Decisions

This track is about general, novel, fundamental, and advanced techniques for making sense of data and driving decisions from data. This could encompass applications of the following ML paradigms:

  • Statistical Visualizations
  • Unsupervised Learning
  • Supervised Learning
  • Semi-Supervised Learning
  • Active Learning
  • Reinforcement Learning
  • Monte Carlo techniques and probabilistic programming
  • Deep Learning

These paradigms may be applied across various data modalities, including multivariate, text, speech, time series, images, video, and transactions.

Track 2: Speed at Scale

This track is about tools and processes for collecting, indexing, and processing vast amounts of data. The theme includes:

  • Distributed and Parallel Computing
  • Real Time Analytics and Stream Processing
  • MapReduce and Graph Computing frameworks
  • Kafka, Spark, Hadoop, MPI
  • Stories of parallelizing sequential programs
  • Cost/Security/Disaster Management of Data

Commitment to Open Source

HasGeek believes in open source as the binding force of our community. If you are describing a codebase for developers to work with, we’d like it to be available under a permissive open source license. If your software is commercially licensed or available under a combination of commercial and restrictive open source licenses (such as the various forms of the GPL), please consider picking up a sponsorship. We recognize that there are valid reasons for commercial licensing, but ask that you support us in return for giving you an audience. Your session will be marked on the schedule as a sponsored session.

Workshops

If you are interested in conducting a hands-on session on any of the topics falling under the themes of the two tracks described above, please submit a proposal under the workshops section. We also need you to tell us about your past experience in teaching and/or conducting workshops.


Confirmed sessions

"Thinking Machines"

Shailesh Kumar (@shkumar)

  • Keynote
  • Advanced
  • 1 upvote
  • 0 comments
  • Tue, 7 Jul

Future patterns in data ecosystem

Amod Malviya

  • Sponsored Keynote
  • Intermediate
  • 2 upvotes
  • 0 comments
  • Tue, 7 Jul

Igniting your data with Apache Spark

Yagnik (@yagnik)

  • Workshop
  • Beginner
  • 0 upvotes
  • 5 comments
  • Thu, 2 Jul

Deploying Batch and Streaming Architectures on AWS

Russell Nash (@russnash)

  • Sponsored
  • Intermediate
  • 4 upvotes
  • 0 comments
  • Thu, 18 Jun

Data Comes in Shapes

Tim Poston (@timposton)

  • Keynote
  • Beginner
  • 5 upvotes
  • 0 comments
  • Tue, 16 Jun

When Apache ZooKeeper is good fit

Rakesh R (@rakeshadr)

  • Crisp Talk
  • Intermediate
  • 20 upvotes
  • 0 comments
  • Tue, 16 Jun

Dead Simple Scalability Patterns

Vedang Manerikar (@vedang)

  • Crisp Talk
  • Beginner
  • 34 upvotes
  • 0 comments
  • Mon, 15 Jun

Call me maybe: Jepsen and flaky networks

Shalin Mangar (@shalinmangar)

  • Full Talk
  • Advanced
  • 15 upvotes
  • 0 comments
  • Mon, 15 Jun

Graph Algorithms and Computer Vision

Sumod Mohan (@sumod)

  • Full Talk
  • Intermediate
  • 7 upvotes
  • 0 comments
  • Mon, 15 Jun

Harnessing the power of the Erlang VM at Housing

Abhijit Pratap Singh (@sabhi)

  • Crisp Talk
  • Intermediate
  • 13 upvotes
  • 0 comments
  • Mon, 15 Jun

Exploratory data analysis using Apache Lens and Apache Zeppelin

Pranav Agarwal (@praagarw)

  • Crisp Talk
  • Intermediate
  • 8 upvotes
  • 1 comment
  • Mon, 15 Jun

Keeping Moore's law alive: Neuromorphic computing

Anand Chandrasekaran (@madstreetden)

  • Full Talk
  • Beginner
  • 20 upvotes
  • 0 comments
  • Mon, 15 Jun

Deep Learning for Natural Language Processing

Devashish Shankar (@devashishshankar)

  • Full Talk
  • Intermediate
  • 18 upvotes
  • 1 comment
  • Mon, 15 Jun

Joining data streams at scale for fun and profit

Aniruddha Gangopadhyay (@aniruddha9591)

  • Crisp Talk
  • Beginner
  • 8 upvotes
  • 0 comments
  • Mon, 15 Jun

Hardware Accelerated Big Data Processing

Reetinder Sidhu (@sidhu1f)

  • Crisp Talk
  • Intermediate
  • 6 upvotes
  • 0 comments
  • Mon, 15 Jun

Building a E-commerce search engine: Challenges, insights and approaches

Vinodh Kumar R (@vinodhkumarr)

  • Sponsored
  • Beginner
  • 10 upvotes
  • 0 comments
  • Mon, 15 Jun

Are these the same pair of shoes? - Matching retail products at scale

Nikhil Ketkar (@nikhilketkar)

  • Full Talk
  • Intermediate
  • 76 upvotes
  • 1 comment
  • Mon, 15 Jun

Using Modes for Time Series Classification

Rohit Chatterjee (@rohitchatterjee)

  • Crisp Talk
  • Beginner
  • 5 upvotes
  • 0 comments
  • Mon, 15 Jun

Apache Tez - Present and Future

Rajesh Balamohan

  • Full Talk
  • Intermediate
  • 3 upvotes
  • 0 comments
  • Mon, 15 Jun

Approximate algorithms for summarizing streaming data

Himadri Sarkar (@himadri)

  • Full Talk
  • Intermediate
  • 45 upvotes
  • 5 comments
  • Sun, 14 Jun

CAP Theorem: You don’t need CP, you don’t want AP, and you can’t have CA

Siddhartha Reddy (@sids)

  • Full Talk
  • Intermediate
  • 20 upvotes
  • 4 comments
  • Sun, 14 Jun

POC: How to slice, dice & search billions of users events in seconds (from scratch)

Bhasker Kode (@bhaskerkode)

  • Crisp Talk
  • Beginner
  • 11 upvotes
  • 0 comments
  • Sun, 14 Jun

The many ways of parallel computing with Julia

Viral B. Shah (@viralbshah)

  • Full Talk
  • Beginner
  • 5 upvotes
  • 0 comments
  • Sun, 14 Jun

Escher - democratizing beautiful visualizations

Shashi Gowda (@g0wda)

  • Crisp Talk
  • Beginner
  • 5 upvotes
  • 1 comment
  • Fri, 12 Jun

Recommendation System beyond traditional Collaborative filtering

Gagan Agrawal (@gagana24)

  • Full Talk
  • Intermediate
  • 7 upvotes
  • 0 comments
  • Fri, 12 Jun

Running natural language queries against NoSQL schema

Deepak Krishnan (@deepakgk)

  • Crisp Talk
  • Advanced
  • 30 upvotes
  • 3 comments
  • Thu, 11 Jun

Search at Petabyte scale

Anup Nair (@anair)

  • Crisp Talk
  • Intermediate
  • 8 upvotes
  • 0 comments
  • Thu, 11 Jun

Building tiered data stores using Aesop to bridge SQL and NoSQL systems

Regunath Balasubramanian (@regunathb)

  • Full Talk
  • Intermediate
  • 9 upvotes
  • 0 comments
  • Wed, 10 Jun

HawkEye: A Real-Time Anomaly Detection System

Satnam Singh, PhD (@satnam-datageek)

  • Crisp Talk
  • Beginner
  • 9 upvotes
  • 2 comments
  • Mon, 8 Jun

A review of important results in distributed systems

Vaidhy Gopalan (@vaidhy)

  • Full Talk
  • Intermediate
  • 10 upvotes
  • 1 comment
  • Thu, 28 May

Making a contextual recommendation engine using Python and Deep Learning at ParallelDots

Muktabh Mayank (@muktabhm)

  • Crisp Talk
  • Beginner
  • 10 upvotes
  • 0 comments
  • Wed, 27 May

Critical pipe fittings: What every data pipeline requires

Yagnik (@yagnik)

  • Full Talk
  • Intermediate
  • 5 upvotes
  • 0 comments
  • Wed, 27 May

Understanding supervised machine learning hands on!

Harshad Saykhedkar (@harshss)

  • Workshop
  • Beginner
  • 14 upvotes
  • 1 comment
  • Mon, 25 May

Processing large data with Apache Spark

Venkata Naga Ravi (@venkatanagaravi)

  • Full Talk
  • Intermediate
  • 3 upvotes
  • 0 comments
  • Sat, 23 May

Building Recommender system

Swaroop Krothapalli (@swaroop)

  • Crisp Talk
  • Beginner
  • 5 upvotes
  • 0 comments
  • Thu, 21 May

Visualising Multi Dimensional Data

Amit Kapoor (@amitkaps)

  • Full Talk
  • Intermediate
  • 9 upvotes
  • 0 comments
  • Tue, 19 May

Introduction to Deep Learning

Bargava Subramanian (@barsubra)

  • Workshop
  • Intermediate
  • 20 upvotes
  • 0 comments
  • Mon, 18 May

Two Years Wiser: The Nilenso Experiment

Steven Deobald (@stevendeobald)

  • Full Talk
  • Beginner
  • 6 upvotes
  • 0 comments
  • Mon, 11 May

Instrumenting your kafka & storm pipeline

Bhasker Kode (@bhaskerkode)

  • Full Talk
  • Intermediate
  • 12 upvotes
  • 0 comments
  • Mon, 11 May

Unconfirmed proposals

Real Time Bid Modification @ Million Requests per second...

Jatinder Singh (@jatinder)

  • Crisp Talk
  • Intermediate
  • 5 upvotes
  • 0 comments
  • Wed, 17 Jun

Introduction to MaelStorm and Performance Engineering

Jatinder Singh (@jatinder)

  • Workshop
  • Advanced
  • 2 upvotes
  • 0 comments
  • Tue, 16 Jun

Reviews and Ratings Spam Detection

Mohit Kumar (@mohitkum)

  • Crisp Talk
  • Intermediate
  • 9 upvotes
  • 1 comment
  • Mon, 15 Jun

AB testing: What, Why & How

Renuka Khandelwal (@renuka)

  • Full Talk
  • Beginner
  • 9 upvotes
  • 0 comments
  • Mon, 15 Jun

Building a distributed cache system with redis, clojure and math

Kapil Reddy (@kapilr)

  • Full Talk
  • Intermediate
  • 23 upvotes
  • 0 comments
  • Mon, 15 Jun

From Search to Discovery at Housing

Mudit Gupta (@mudit-housing)

  • Full Talk
  • Beginner
  • 10 upvotes
  • 0 comments
  • Mon, 15 Jun

High Performance Tiled Map Service

Shubham Bansal (@shubham-bansal)

  • Full Talk
  • Intermediate
  • 2 upvotes
  • 0 comments
  • Mon, 15 Jun

Think Incremental with hive.

ravi teja (@ravi-teja)

  • Crisp Talk
  • Intermediate
  • 11 upvotes
  • 0 comments
  • Mon, 15 Jun

Scalable real-time personalized recommendation system

Jasvinder Singh (@jasvinder)

  • Full Talk
  • Intermediate
  • 6 upvotes
  • 0 comments
  • Mon, 15 Jun

How to stop admiring and start using Deep Learning

Vivek Mehta (@vivekmehta)

  • Full Talk
  • Intermediate
  • 20 upvotes
  • 0 comments
  • Mon, 15 Jun

Holistic Security Process for Humanitarian Projects

chinmayi sk

  • Full Talk
  • Intermediate
  • 7 upvotes
  • 1 comment
  • Mon, 15 Jun

Stream Processing in production: Metrics that matter

Siddhartha Reddy (@sids)

  • Crisp Talk
  • Intermediate
  • 11 upvotes
  • 0 comments
  • Mon, 15 Jun

Map Tile Server

Niranjan Bala V (@niranjanbalav)

  • Crisp Talk
  • Intermediate
  • 45 upvotes
  • 0 comments
  • Mon, 15 Jun

Designing distributed components in a multi tenant architecture

Ronak (@ronak-kothari)

  • Full Talk
  • Intermediate
  • 7 upvotes
  • 0 comments
  • Mon, 15 Jun

Data Infrastructure for Real Time Analysis of User Click Stream Data

Aditya Prasad Narisetty (@adityaprasadn)

  • Full Talk
  • Beginner
  • 9 upvotes
  • 0 comments
  • Mon, 15 Jun

What does your website look like to a web-crawler

Gagandeep singh (@gagan-goku)

  • Full Talk
  • Intermediate
  • 7 upvotes
  • 0 comments
  • Mon, 15 Jun

Developing a Hybrid Recommender System for Some of Life’s Most Important Choices

Paul Meinshausen (@pmeins)

  • Full Talk
  • Intermediate
  • 7 upvotes
  • 0 comments
  • Mon, 15 Jun

Solr compute cloud - An elastic Solr infrastructure

Suchitra Amalapurapu (@asmsuchi)

  • Full Talk
  • Advanced
  • 11 upvotes
  • 0 comments
  • Mon, 15 Jun

postgres clusters and their nuances

Srihari Sriraman (@ssrihari)

  • Full Talk
  • Intermediate
  • 2 upvotes
  • 0 comments
  • Mon, 15 Jun

Practical Approach to Python based Supervised Machine Learning: User Generated Text Classification Techniques

Kausik Ghatak (@kausikg)

  • Full Talk
  • Intermediate
  • 15 upvotes
  • 0 comments
  • Mon, 15 Jun

An Integrated Weblog Processing and Machine Learning Workflow for Building and Deploying Intent Prediction Models at Scale

Dhanesh Padmanabhan (@dhanesh123us)

  • Full Talk
  • Intermediate
  • 4 upvotes
  • 0 comments
  • Mon, 15 Jun

Building Real time solution within 30 minutes

Sudhir Rawat (@rawatsudhir)

  • Crisp Talk
  • Beginner
  • 1 upvote
  • 0 comments
  • Mon, 15 Jun

Anatomy of Decision Trees using an example from Kaggle

Saurabh Banerjee (@saurabhbanerjee)

  • Full Talk
  • Intermediate
  • 18 upvotes
  • 5 comments
  • Mon, 15 Jun

Getting Started with IoT

Sudhir Rawat (@rawatsudhir)

  • Full Talk
  • Intermediate
  • 2 upvotes
  • 0 comments
  • Mon, 15 Jun

Automating news discovery in real-time

Anand S (@sanand0)

  • Full Talk
  • Beginner
  • 14 upvotes
  • 0 comments
  • Mon, 15 Jun

Static & Interactive Exploratory Data Analysis in R

Amit Kapoor (@amitkaps)

  • Workshop
  • Intermediate
  • 4 upvotes
  • 0 comments
  • Sun, 14 Jun

Deconstructing Linear Regression

Vishal (@vishalgokhale)

  • Crisp Talk
  • Beginner
  • 4 upvotes
  • 0 comments
  • Sun, 14 Jun

The many ways of parallel computing with Julia

Viral B. Shah (@viralbshah)

  • Full Talk
  • Beginner
  • 4 upvotes
  • 0 comments
  • Sun, 14 Jun

Big Data Engineering made easy

Kaushik Paranjape (@kaushik-paranjape)

  • Full Talk
  • Intermediate
  • 4 upvotes
  • 4 comments
  • Sun, 14 Jun

Benchmarks from JVM to Big Data

Srinivasa Rao Aravilli (@aravilli)

  • Full Talk
  • Intermediate
  • 4 upvotes
  • 0 comments
  • Sun, 14 Jun

Ensemble Learning

Swaroop Krothapalli (@swaroop)

  • Full Talk
  • Beginner
  • 11 upvotes
  • 0 comments
  • Sat, 13 Jun

Aerospike : High Performance NoSQL store with flash optimization

Gagan Agrawal (@gagana24)

  • Full Talk
  • Intermediate
  • 5 upvotes
  • 0 comments
  • Sat, 13 Jun

Building Complex Data Workflows with Cascading on Hadoop

Gagan Agrawal (@gagana24)

  • Full Talk
  • Intermediate
  • 5 upvotes
  • 0 comments
  • Sat, 13 Jun

IT Operations Analytics: Using Text Analytics and Statistical Modeling in IT Operations Data

Vishnuteja Nanduri (@vishnunanduri)

  • Full Talk
  • Intermediate
  • 1 upvote
  • 1 comment
  • Mon, 8 Jun

High Performance Computing in R

Ravishankar Rajagopalan (@vioravis)

  • Workshop
  • Intermediate
  • 19 upvotes
  • 0 comments
  • Wed, 3 Jun

Anomaly Detection Using Apache Spark

Kiran Veigas (@kiranveigas) (proposing)

  • Crisp Talk
  • Advanced
  • 6 upvotes
  • 1 comment
  • Mon, 1 Jun

Squirrel – Enabling Accessible Analytics for All

sudipta mukherjee (@samthecoder)

  • Crisp Talk
  • Intermediate
  • 7 upvotes
  • 0 comments
  • Sun, 31 May

Leveraging Cloud for BigData Analytics - Patterns, Options and Practical Next Steps

Amit Jain (@jnamit)

  • Full Talk
  • Intermediate
  • 3 upvotes
  • 0 comments
  • Thu, 28 May

Securing your Enterprise Hadoop Cluster

Manoj Sundaram (@manojsundaram)

  • Full Talk
  • Intermediate
  • 33 upvotes
  • 0 comments
  • Wed, 27 May

Building Spark as Service in Cloud using YARN

Rajat (@rgupta)

  • Full Talk
  • Intermediate
  • 27 upvotes
  • 0 comments
  • Mon, 25 May

Big Data Benchmarking

Venkata Naga Ravi (@venkatanagaravi)

  • Full Talk
  • Intermediate
  • 1 upvote
  • 0 comments
  • Sat, 23 May

On building a cloud-based black-box predictive modeling system

Bargava Subramanian (@barsubra)

  • Full Talk
  • Beginner
  • 5 upvotes
  • 0 comments
  • Thu, 21 May

Building Data Products for Small / Mid-Sized Data

Ramesh Sampath (@sampathweb)

  • Full Talk
  • Intermediate
  • 7 upvotes
  • 0 comments
  • Tue, 12 May

Deprecating MapReduce Patterns with Apache Spark

Rahul Kavale (@rahulkavale)

  • Full Talk
  • Intermediate
  • 3 upvotes
  • 2 comments
  • Thu, 7 May

Scrap Your MapReduce - Introduction to Apache Spark

Rahul Kavale (@rahulkavale)

  • Full Talk
  • Beginner
  • 7 upvotes
  • 0 comments
  • Thu, 7 May

Anatomy of RDD : A Deep dive into Spark RDD Data structure.

Madhukara Phatak (@phatak-dev)

  • Full Talk
  • Advanced
  • 14 upvotes
  • 0 comments
  • Wed, 6 May

Big data analysis with Apache Spark

Madhukara Phatak (@phatak-dev)

  • Workshop
  • Beginner
  • 9 upvotes
  • 0 comments
  • Wed, 6 May

Networks and Network Analysis

Dr. Jai Ganesh (@jaiganesh)

  • Full Talk
  • Advanced
  • 7 upvotes
  • 0 comments
  • Mon, 27 Apr

Tackling ML's black boxes with probabilistic programming

Rudraksh MK (@rudrakshmk)

  • Full Talk
  • Advanced
  • 13 upvotes
  • 0 comments
  • Sat, 18 Apr