This post shows how Java applications can query data on HDFS using HAWQ, a SQL-based MPP engine on Hadoop. Having a SQL interface to data sitting on a Hadoop cluster opens up a lot of possibilities and interesting use cases for analytics and visualization of structured and unstructured data.
Pivotal Greenplum Database and HAWQ are quite easy to integrate with Java applications. The first barrier to entry when moving from an RDBMS to an MPP/Hadoop-based datastore is the work involved in changing the application code to work with the new data sources and environment.
Postgres is a popular RDBMS, and if you have ever used it you are in luck!
Both Greenplum and HAWQ work with the same JDBC drivers used with Postgres.
So if you want to scale your app to an MPP (Massively Parallel Processing) database, you just need to point it at a Greenplum or HAWQ cluster.
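To make that concrete, here is a minimal sketch of what "pointing the app at the cluster" can look like with plain JDBC. It is not the code from the repo; the class name `HawqJdbcSketch`, the host `hawq-master.example.com`, the database `gpadmin`, and port 5432 are assumptions for illustration, and it expects the standard PostgreSQL JDBC driver jar on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical sketch: host, port, database and table names are placeholders.
public class HawqJdbcSketch {

    // Builds an ordinary PostgreSQL JDBC URL. The same URL form is used
    // whether the server behind it is Postgres, Greenplum or HAWQ.
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:postgresql://" + host + ":" + port + "/" + database;
    }

    // Runs a row count against one table and returns the result.
    // The table name is concatenated into the SQL, so pass only trusted,
    // hard-coded names here, never user input.
    static long countRows(String url, String user, String password, String table)
            throws SQLException {
        try (Connection conn = DriverManager.getConnection(url, user, password);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT count(*) FROM " + table)) {
            rs.next();
            return rs.getLong(1);
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 5) {
            // Without connection details, only show the URL that would be used.
            System.out.println("usage: HawqJdbcSketch <host> <database> <user> <password> <table>");
            System.out.println("example URL: " + jdbcUrl("hawq-master.example.com", 5432, "gpadmin"));
            return;
        }
        String url = jdbcUrl(args[0], 5432, args[1]);
        long rows = countRows(url, args[2], args[3], args[4]);
        System.out.println(args[4] + " has " + rows + " rows");
    }
}
```

In an existing Postgres-backed application, the only line that would normally change is the JDBC URL, which is usually kept in configuration rather than in code.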
Below is a link to some PoC code that shows how to query data on HDFS (Hadoop Distributed File System) from Java applications using HAWQ. Although this use case is specific to HDFS, the point I want to drive home is that the Postgres JDBC driver works seamlessly with both Greenplum and HAWQ.
I hope this post encourages you to consider Greenplum or HAWQ as an option when you want to scale out your Postgres or other RDBMS-backed applications.
The GitHub repo is here: https://github.com/amithn/pivotal-hawq-jdbc