H2O in Big Data Environments
H2O is the open source math and machine learning engine for big data. It brings distribution and parallelism to powerful algorithms while keeping the widely used languages of R and JSON as an API. Its elegant, Lego-like infrastructure applies fine-grained parallelism to math over simple distributed arrays.
Customers can use data stored in HDFS as a data source. H2O is a first-class citizen of the Hadoop infrastructure and interacts naturally with the Hadoop JobTracker and TaskTrackers on all major distributions.
Similarly, users can import data sitting in S3 by launching H2O on EC2 instances that have access to the S3 buckets. This gives customers the flexibility to add nodes to a cloud with minimal difficulty.
Read more about H2O on Big Data Environments from our VP of Engineering, Tom Kraljevic: