HPCC Systems tunes big data platform for Amazon cloud
HPCC Systems, the division of LexisNexis that's pushing a big-data processing-and-delivery platform, has tuned its software to run on Amazon's cloud computing platform. Interested developers can now experiment with the open source software without having to wrangle physical servers for that purpose, which brings HPCC one step closer to establishing itself as a viable alternative to the uber-popular Hadoop framework.
When I last spoke with HPCC Systems CTO Armando Escalante in September, he explained that although he thinks his company will have little trouble attracting risk-averse large enterprise and government customers, it will be tougher to establish a developer ecosystem similar to what Hadoop has built. As good as HPCC might be (and at least some analysts are starting to sing its praises), having a vibrant community goes a long way.
Hadoop has no shortage of startups, large vendors and individual developers committed to it already. That gives potential users the confidence that not only will Hadoop products be supported for a long time, but that the code will continue to improve and interoperate across a variety of different vendors' data products, Hadoop-based or not.
Microsoft killing its Dryad data-processing platform to focus on Hadoop opened a door for HPCC Systems, but also served to block its entry into the room. Now there are really only two unstructured-data processing platforms of note, but having Microsoft on the Hadoop bandwagon is yet another sign that Hadoop is for real.
Making HPCC run on AWS, or any cloud, is a good start for HPCC Systems, as it gives developers a low-risk way to get started on the platform. It will be even more appealing when HPCC's software is supported by AWS's Elastic MapReduce service, which the company says is the next step. Assuming one has data there to work with, the cloud is a great place to get started with big data tools because they generally require server clusters that are cheaper to rent than to buy, at least in the short term.
Technically, HPCC has only tuned a portion of its platform, the Thor Data Refinery Cluster, to run on Amazon Web Services, but that's the part that matters most. Thor does the data processing for HPCC, which makes it the apples-to-apples comparison with Hadoop. The platform also consists of the Roxie Data Query Cluster, a data-warehouse and query layer that's akin to the higher-level Hive and HBase projects that have been developed for Hadoop.
HPCC Systems is quick to point out, however, that its entire platform uses a single language, Enterprise Control Language, whereas Hadoop itself uses MapReduce while projects such as HBase and Hive have their own languages.