Get Started with Serengeti

Project Serengeti is an open-source project initiated by VMware to automate the deployment and management of Apache Hadoop clusters in virtualized environments such as vSphere. Project Serengeti supports multiple Hadoop distributions from multiple vendors.

Deploy and run an Apache Hadoop cluster in 10 minutes

  • Deploy clusters with HDFS, MapReduce, Pig, Hive, and Hive server
  • One command to deploy and scale out Apache Hadoop clusters
  • Tune Apache Hadoop configurations

Fully customizable configuration profile to meet your needs

  • Dedicated machines or machines shared with other workloads
  • Shared or local storage
  • Static IP or DHCP network
  • Fully control the placement of Apache Hadoop nodes

Speed up time to insight

  • Upload and download data, and run MapReduce jobs and Pig and Hive scripts, from the Project Serengeti interface
  • Consume data in HDFS through a Hive server SQL connection using existing tools

Elastic scalability on demand

  • Separate compute nodes from data nodes without losing data locality
  • Scale out and decommission compute nodes on demand
  • Spin up compute-only clusters to analyze data in an existing HDFS

Improved availability for Apache Hadoop clusters

  • One click to enable a highly available NameNode and JobTracker, avoiding a single point of failure
  • Fault tolerance (FT) for NameNode and JobTracker
  • One-click HA for the Apache Hadoop tools Pig, Hive, and HBase
  • VMware vMotion to reduce planned downtime

Support for multiple Apache Hadoop distributions

Try Serengeti

Download the Serengeti Virtual Appliance for free.
