Skip to main content

About Hadoop

Apache Hadoop is an open source software project that enables the distributed processing of large data sets across clusters of commodity servers. It is designed to scale up from a single server to thousands of machines, with a very high degree of fault tolerance. Rather than relying on high-end hardware, the resiliency of these clusters comes from the software’s ability to detect and handle failures at the application layer.

Apache Hadoop has two main subprojects:

Hadoop is supplemented by an ecosystem of Apache projects, such as Pig, Hive and Zookeeper, that extend the value of Hadoop and improves its usability.

So what’s the big deal?

Hadoop changes the economics and the dynamics of large scale computing. Its impact can be boiled down to four salient characteristics.

Hadoop enables a computing solution that is:

Think Hadoop is right for you?

Eighty percent of the world’s data is unstructured, and most businesses don’t even attempt to use this data to their advantage. Imagine if you could afford to keep all the data generated by your business? Imagine if you had a way to analyze that data?

IBM InfoSphere BigInsights brings the power of Hadoop to the enterprise. With built-in analytics, extensive integration capabilities and the reliability, security and support that you require, IBM can help put your big data to work for you.

Want to learn more on the use of Hadoop?

Contact IBM

live-assistance

Easy ways to get the answers you need.


Or call us at:
800-966-9875
Priority code:
109HF03W