About Hadoop®


Apache™ Hadoop® is an open source software project that enables the distributed processing of large data sets across clusters of commodity servers. It is designed to scale up from a single server to thousands of machines, with a very high degree of fault tolerance. Rather than relying on high-end hardware, these clusters get their resiliency from the software’s ability to detect and handle failures at the application layer.

What is Hadoop?
Hadoop defined in 3 minutes by Rafael Coss, manager of Big Data Enablement at IBM


Main subprojects

Apache Hadoop has two main subprojects:

Hadoop is supplemented by an ecosystem of Apache projects, such as Pig, Hive and ZooKeeper, that extend the value of Hadoop and improve its usability.


So what’s the big deal?


Hadoop changes the economics and the dynamics of large scale computing. Its impact can be boiled down to four salient characteristics.

Hadoop enables a computing solution that is:


Think Hadoop is right for you?


Eighty percent of the world’s data is unstructured, and most businesses don’t even attempt to use this data to their advantage. Imagine if you could afford to keep all the data generated by your business. Imagine if you had a way to analyze that data.

IBM InfoSphere BigInsights brings the power of Hadoop to the enterprise. With built-in analytics, extensive integration capabilities and the reliability, security and support that you require, IBM can help put your big data to work for you.

InfoSphere BigInsights Quick Start Edition, the latest addition to the InfoSphere BigInsights family, is a free, downloadable, non-production version.

With InfoSphere BigInsights Quick Start, you get access to hands-on learning through a set of tutorials designed to guide you through your Hadoop experience. Plus, there is no data capacity or time limitation, so you can experiment with large data sets and explore different use cases on your own schedule.


Want to learn more about using Hadoop?



IBM BigInsights Quick Start

Learn about InfoSphere BigInsights Quick Start Edition

Download InfoSphere BigInsights Quick Start now


  • Hadoop in the cloud

    Leverage big data analytics easily and cost-effectively with IBM InfoSphere BigInsights
    Get the eBook


  • Business Case for Enterprise Big Data Deployments

    Comparing Costs, Benefits and Risks for Use of IBM InfoSphere BigInsights and open source Apache Hadoop
    Get the report


  • SQL-on-Hadoop without compromise

    How Big SQL 3.0 from IBM represents an important leap forward for speed, portability and robust functionality in SQL-on-Hadoop solutions
    Get the white paper


  • Big Data Hadoop Solutions, Q1 2014

    Read the report to see why IBM InfoSphere BigInsights was named a leader and how it stands in relation to other big data Hadoop vendors.
    Get the report


  • Hadoop appliances: the key to simplicity, speed, scalability, and stability in big data

    Overcoming challenges for early Hadoop adopters
    Get the white paper


  • Understanding Big Data

    Analytics for Enterprise Class Hadoop and Streaming Data
    Download the eBook