Top 15 Hortonworks Data Platform Alternatives and Similar Software | May 2024

Hortonworks develops, distributes, and supports a 100% open source distribution of Apache Hadoop for the enterprise, and also offers training, support, and services.

1. Microsoft HDInsight

* A Data Lake service
* Scale to petabytes on demand
* Crunch all data: structured, semi-structured, unstructured
* Develop in Java, .NET, and more
* Skip buying and maintaining hardware
* Spin up Apache Hadoop, Spark, and R clusters in the cloud
* Use Excel or your favorite BI tool to visualize Hadoop data
* Connect on-premises......

2. Azkaban

Azkaban is a batch workflow job scheduler created at LinkedIn to run Hadoop jobs. Azkaban resolves the ordering through job dependencies and provides an easy-to-use web user interface to maintain and track your workflows.

Features:
- Compatible with any version of Hadoop
- Easy to use web UI
- Simple web and......
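To make "resolving the ordering through job dependencies" concrete, here is a minimal sketch using a topological sort from Python's standard library. It is only an illustration of the general idea, not Azkaban's implementation or job file format; the job names and the `execution_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical jobs: each job maps to the set of jobs it depends on.
dependencies = {
    "extract": set(),
    "clean": {"extract"},
    "aggregate": {"clean"},
    "report": {"aggregate"},
}


def execution_order(deps):
    """Return one run order in which every job follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

A dependency cycle would make `TopologicalSorter` raise `graphlib.CycleError`, which is why schedulers of this kind require workflows to be acyclic.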

3. World Programming System (WPS)

The WPS industrial analytics platform is designed for data science and heavyweight data processing with the languages of SAS and R. Best known for its SAS language compiler, the WPS software includes advanced graphical user interfaces, robust, high-performance data processing and production-ready application frameworks. WPS software is versatile and is used......

4. Apache Oozie

Oozie is a workflow scheduler system to manage Apache Hadoop jobs. Oozie Workflow jobs are Directed Acyclical Graphs (DAGs) of actions. Oozie Coordinator jobs are recurrent Oozie Workflow jobs triggered by time (frequency) and data availability. Oozie is integrated with the rest of the Hadoop stack supporting several types of Hadoop jobs out......

5. Amazon Elastic MapReduce

Amazon Elastic MapReduce is a web service that enables businesses, researchers, data analysts, and developers to easily and cost-effectively process vast amounts of data.......

6. MapR

MapR makes Apache Hadoop more affordable and easier to use for big data analytics, business intelligence, distributed computing, machine learning, distributed file systems and map reduce grid computing.......

7. ProActive Workflows & Scheduling

ProActive Workflows & Scheduling allows you to easily execute all your company jobs and business applications, monitor activity and quickly access job results. Allowing your IT to scale up and down according to your actual workload, it will ensure the optimal match between availability and cost. It ensures more work......

8. Recombee

Recombee is a real-time Recommendation-as-a-service (SaaS). It offers an intuitive RESTful API, excellent documentation and SDKs in many different languages (Java, Python, PHP, Ruby,...). It is domain-independent and has successful applications in areas such as job offers, news, online retail, cultural events, video-on-demand or mobile apps. Even the most complex......

9. JobsPikr

JobsPikr is a job data delivery platform that extracts data directly from company websites. It runs on top of automated crawlers powered by machine learning techniques to extract the latest job listings directly from the career pages of company websites and delivers the data feed in the form of pre-packaged......

10. Soley Studio

Soley GmbH develops agile and innovative software solutions for data analysis in engineering. With Soley Studio, experts digitalize their knowledge, automate time-consuming processes and, thus, overcome existing complexity. At the push of a button, practicable workflows – from the consolidation of data, through data analysis to the visualization of the......

11. Deep.BI

Deep.BI measures content consumption metrics to help publishers distribute content across platforms and grow audiences. Deep.BI collects all kinds of raw event data related to publishing (reader behavior and content performance) and lets publishers analyze this data in real time. By collecting raw data, publishers get unprecedented flexibility and can build their own......

12. Spinn3r

Spinn3r is a web service for indexing the blogosphere. We provide raw access to every blog post being published, in real time. We provide the data and you can focus on building your app. You can be up and running with Spinn3r in less than an hour. We ship a......

13. Greenplum HD

Greenplum HD is an open-source certified and supported version of the Apache Hadoop stack. It includes Hadoop Distributed File System (HDFS), MapReduce, Hive, Pig, HBase, and ZooKeeper. Greenplum HD’s packaged Hadoop distribution removes the need to build out a Hadoop cluster from scratch, which is required with other distributions. Isilon......

14. Cloudera CDH

Cloudera's open-source Apache Hadoop distribution, CDH (Cloudera Distribution Including Apache Hadoop), targets enterprise-class deployments of that technology. Cloudera says that more than 50% of its engineering output is donated upstream to the various Apache-licensed open source projects (Apache Hive, Apache Avro, Apache HBase, and so on) that combine to form......

15. Platfora

Platfora puts the power of Big Data Analytics into the hands of business users, providing self-service analytics capability across all of your customer interaction, machine and transactional data sets. With Platfora, you can visualize insights and make decisions that were never before possible, all at the speed of business and without......