"Why is Spark special? It has a lot of dimensions from a business value standpoint; it addressed needs we were starting to hear from customers, and that we knew would only become more pronounced: speed of analysis and insight, reducing time-to-value by an order of magnitude, flexibility of the way Spark was designed to work, the ability of Spark to address a number of existing, pressing business needs."
Rod Smith, VP of Emerging Technologies, IBM
What is Spark?
Apache Spark is an open source framework and engine designed to process large amounts of data quickly and efficiently. Spark is similar to Hadoop, but differs in that it holds the data in memory for analysis, giving it significant speed advantages over Hadoop: up to 100x faster in some cases. In addition, Spark is designed as a platform and can connect to disparate data sources, including traditional databases, big data file systems such as HDFS, and even cloud-based storage.
How is IBM involved with Spark?
Recently, IBM announced its own Spark initiative, dedicating more than 3,000 employees across more than a dozen labs. IBM has been, and will continue to be, an active member of the Spark community. Work is currently underway to create a Spark as a Service offering on the company's cloud platform, Bluemix, and numerous IBM products are being enhanced and extended to support Spark (including IBM's SystemML, which brings the company's machine learning expertise to the platform). IBM is also partnering with other organizations, including Databricks (founded by the creators of Spark), AMPLab, and Galvanize, to contribute to the development of the platform. In addition, IBM has created a Spark Technology Center, based in San Francisco, where the Spark community can collaborate on building Spark solutions.
Interested in knowing more?
IBM Emerging Technologies and jStart are currently building solutions and working with customers who are interested in seeing how Spark can be leveraged today to solve real business challenges, create business value, and enhance operations. Keep an eye out for announcements by the team about those technologies, as well as demos of the customer pilots we've been working on for the past few months. If you would like to start a conversation about Spark with the Emerging Technologies team today, feel free to contact us.
What's in store for Spark?
Spark addresses scalability, speed, and integration of data sources, but where does the technology go from here? Rod pointed to four possible avenues along which Spark might evolve:
- Extending Spark to the Cloud as a service. Cloud platforms will begin offering "Spark as a Service", giving cloud app developers the ability to leverage Spark's analytics capabilities within their apps. Work is already being done to make this happen, and it should be a near-term development for the framework.
- Portable Analytics. With Spark being deployed in a variety of environments, from traditional on-prem IT infrastructures to the cloud, and even bridging the two (hybrid cloud, public/private data stores, etc.), the ability to move analytics capabilities and models seamlessly from environment to environment rapidly changes from a "nice to have" feature to a "have to have" capability.
- Evolution into an Integration Platform. As Spark integrates with an increasingly diverse set of data sources, it will naturally become a de facto integration platform. By tying together data from disparate internal and external stores, accessed through a variety of endpoints and methods, Spark has the potential to be a critical component in IT infrastructures.
- Development of Next Gen Spark Systems. Companies will work to make the implementation of Spark simpler and easier for customers. The creation of Spark as a Service on cloud platforms should allow customers to avoid having to build and maintain complex back-end systems. Beyond that, the development of front-end systems will lower the barrier to entry for data analysts and scientists by giving them an intuitive interface that lets them get started on analysis rapidly, without having to climb a steep learning curve.
Start Small, Grow Fast
Learn how the jStart Team can help your business get started using our "start small, grow fast" engagement process. Today's business challenges aren't just about huge amounts of information; rather, they are about leveraging the valuable insights and opportunities living within that data. jStart is a highly skilled team focused on providing fast, smart, and valuable business solutions leveraging the latest technologies. The team typically focuses on emerging technologies which have commercial potential within 12-18 months. This allows the team to keep ahead of the adoption curve, while being prepared for client engagements and partnerships. The team’s focus includes predictive and prescriptive analytics, cognitive computing, cloud technologies, big data, social data, and mobile platforms.