Big data is big money, and a relative newcomer to the game is trying to make a big impact. 0xdata (pronounced hexadata), started by SriSatish Ambati, is that newcomer. Its current flagship product, simply titled H2O, is an open source platform used to crunch huge amounts of data and produce more accurate analytic results. It handles these large data sets by combining machine learning with advanced mathematical algorithms. H2O allows customers to use their entire data sets, instead of the sample sets traditionally used for such processes.
We recently had a chance to talk with SriSatish Ambati, CEO and co-founder of 0xdata, to help shed more light on their product.
1. A little bit about yourselves, please, and how long have you been working on this problem?
H2O by 0xdata brings better algorithms to big data. H2O is the fast open source in-memory prediction engine & machine learning platform. With H2O, enterprises can use all of their data (instead of sampling) in real time for better predictions. Data Scientists can take both simple & sophisticated models to production from the same interactive platform used for modeling, within R and JSON. H2O is also used as an algorithms library for Making Hadoop Do Math. Our earliest customers have built powerful domain-specific predictive engines for Recommendations, Pricing and Outlier detection in Fraud & Insurance. 0xdata is the maker of H2O and is nurturing a grassroots movement of math, systems and data scientists to herald the new wave of Discovery with Big Data Science.
2. How would you explain H2O to the everyday person?
Google's Search and AdSense are powered by machine learning algorithms, much like algorithmic trading in finance and the recommendation engines of Amazon & Netflix. H2O brings Google-scale machine learning to enterprise customers so they too can build smarter applications to transform their customer experiences.
3. Your main product is H2O. Care to explain a little more about what it's used for?
H2O brings fast, scalable machine learning to build the smarter applications that will power the internet of the future. H2O rewrote fundamental analytical algorithms from math, statistics and machine learning to be fast and to run on big datasets.
4. How is H2O able to accomplish crunching large amounts of data and then presenting important statistical information?
H2O was built from the ground up to work as a mathematical engine that can be distributed over a large number of servers. It does this by taking stored data that is too large to fit into the memory of any one machine and breaking it into multiple smaller, compressed pieces. Data scientists interact with H2O to conduct statistics and machine learning with pre-built models and standard tools for end-to-end analysis, and view the results in a simple-to-use UI or in standard tools like R, Tableau, Excel, and our web-based application.
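As an illustration of the chunking idea in the answer above, here is a minimal Python sketch. It is not H2O's implementation or API: the function names, the use of `pickle` and `zlib`, and the single-process loop standing in for many servers are all assumptions made for the example. It splits a data set into small compressed pieces, has each piece produce a partial summary, and combines those summaries into a global mean.

```python
import pickle
import zlib


def split_into_compressed_chunks(values, chunk_size):
    """Split a list of numbers into compressed chunks of at most chunk_size items."""
    chunks = []
    for start in range(0, len(values), chunk_size):
        piece = values[start:start + chunk_size]
        # Serialize and compress each piece so it takes less memory while at rest.
        chunks.append(zlib.compress(pickle.dumps(piece)))
    return chunks


def partial_sum_and_count(compressed_chunk):
    """Work for one hypothetical worker: decompress its chunk and summarize it."""
    piece = pickle.loads(zlib.decompress(compressed_chunk))
    return sum(piece), len(piece)


def distributed_mean(chunks):
    """Combine the per-chunk summaries into a global mean."""
    total, count = 0, 0
    for chunk in chunks:  # in a real cluster, each chunk would live on a different machine
        chunk_sum, chunk_count = partial_sum_and_count(chunk)
        total += chunk_sum
        count += chunk_count
    return total / count


data = list(range(1, 101))
chunks = split_into_compressed_chunks(data, chunk_size=25)
mean = distributed_mean(chunks)
```

Only the small per-chunk summaries (a sum and a count) need to be brought together, which is why no single machine has to hold the full data set in this style of computation.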
5. H2O seems great for larger businesses, corporations, etc. Would a small business see any benefits from H2O?
H2O helps organizations large and small that use data science to drive decision making in all aspects of their business. Customers of H2O have found value when they reach the memory or computation limits of traditional software, or when they already have data sitting in Hadoop or other enterprise data warehouses.
6. You guys obviously like big data, where do you see big data going in the next 5-10 years?
Big data is only going to continue to grow over the next 10 years, especially unstructured and text data, to well over 10,000 exabytes according to IDC's 2013 report. Most of this data is coming from sensors or is otherwise machine generated. Standard querying or visualization tools are not going to be able to handle this influx of data. Over the next decade, big data is giving way to purpose-built applications, and just as batteries are opening up a whole new era of transportation, machine learning is opening up a whole new world of smart applications.
7. So, care to tell us a little bit more about Sparkling Water? The team-up of Apache Spark and H2O?
Sparkling Water is the latest innovation to combine two best-of-breed open source technologies, Apache Spark and H2O. Sparkling Water is the newest application on the Apache Spark in-memory platform to extend Machine Learning for better predictions and to quickly deploy models into production. H2O is proud to partner with Cloudera and Databricks to bring this capability to a wide audience.
8. Anything you would like to close with?
Big data is coming of age. First there was data storage for cost reductions, next there was business intelligence, and now we are finally getting into the era of actually building the next generation of smarter applications. It's a perfect storm and H2O is at the center of it.
Most recently, 0xdata announced its latest innovation, Sparkling Water. Sparkling Water is the combination of two open source technologies, 0xdata's H2O and Apache Spark. Sparkling Water is an application used on Apache Spark to help extend the Machine Learning process. This technology has helped drive advancements in Deep Learning, bringing Machine Learning closer to one of its original goals of artificial intelligence.
As data sets and analytics continue to become more integral to business growth and efficiency, companies like 0xdata are hoping to lead the way. With products like H2O and Sparkling Water, 0xdata seems to be doing just fine on that front.