Apache Phoenix is an open source relational database layer on top of a NoSQL store such as Apache HBase. Phoenix provides a JDBC driver that hides the intricacies of the NoSQL store, enabling users to create, delete, and alter SQL tables, views, indexes, and sequences; upsert and delete rows singly and in bulk; and query data through SQL.
Installation:
The following steps configure Apache Phoenix in Cloudera Distribution for Hadoop (CDH):
- Login to Cloudera Manager, click on Hosts, then Parcels.
- Select Edit Settings.
- Click the + sign next to an existing Remote Parcel Repository URL, add the following URL, and click Save Changes: http://archive.cloudera.com/cloudera-labs/phoenix/parcels/latest/
- Select Hosts, then Parcels.
- In the list of Parcel Names, CLABS_PHOENIX is now available. Select it and choose Download.
- The first cluster is selected by default. To choose a different cluster for distribution, select it. Find CLABS_PHOENIX in the list, and click Distribute.
- If you plan to use secondary indexing, add the following to the hbase-site.xml advanced configuration snippet. Go to the HBase service, click Configuration, and search for HBase Service Advanced Configuration Snippet (Safety Valve) for hbase-site.xml. Paste in the following XML, then save the changes:
<property>
  <name>hbase.regionserver.wal.codec</name>
  <value>org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec</value>
</property>
- Restart the HBase service.
Using Apache Phoenix Utilities
Several command-line utilities for Apache Phoenix are installed into /usr/bin.
Prerequisites: Before using the Phoenix utilities, set the JAVA_HOME environment variable in your terminal session, and ensure that the java executable is in your path. Adjust the following commands to match your operating system’s configuration.
$ export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
$ export PATH=$PATH:$JAVA_HOME/bin
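A quick way to confirm the prerequisite before launching any of the utilities is a small shell check along the following lines. This is only a sketch: the function name check_java is hypothetical and is not part of Phoenix or CDH.

```shell
# Hypothetical helper (not shipped with Phoenix): verify the Java prerequisite.
# Returns 0 when JAVA_HOME is set, contains bin/java, and java is on PATH.
check_java() {
  if [ -z "${JAVA_HOME:-}" ]; then
    echo "JAVA_HOME is not set" >&2
    return 1
  fi
  if [ ! -x "$JAVA_HOME/bin/java" ]; then
    echo "no java executable under $JAVA_HOME/bin" >&2
    return 1
  fi
  if ! command -v java >/dev/null 2>&1; then
    echo "java is not on PATH" >&2
    return 1
  fi
  return 0
}
```

Calling check_java after the two export commands returns a non-zero status and prints the reason if either setting is missing.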
phoenix-sqlline.py: An interactive command-line interface for executing SQL. It takes a single argument, which is the ZooKeeper quorum of the corresponding HBase cluster. For example:
$ /usr/bin/phoenix-sqlline.py zookeeper01.test.com:2181
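When the ZooKeeper ensemble has more than one node, the quorum argument is usually written as a comma-separated host list followed by the client port. The helper below is a minimal sketch of composing that string. The function name zk_quorum is hypothetical, and the zk1, zk2, zk3 host names in the usage note are placeholders.

```shell
# Hypothetical helper: build a quorum string of the form host1,host2,...:port
# Usage: zk_quorum PORT HOST [HOST...]
zk_quorum() {
  if [ "$#" -lt 2 ]; then
    echo "usage: zk_quorum PORT HOST [HOST...]" >&2
    return 2
  fi
  port=$1
  shift
  # Join the remaining arguments with commas (IFS change is confined to the subshell).
  hosts=$(IFS=,; printf '%s' "$*")
  printf '%s:%s\n' "$hosts" "$port"
}
```

For example, `/usr/bin/phoenix-sqlline.py "$(zk_quorum 2181 zk1 zk2 zk3)"` would pass zk1,zk2,zk3:2181 as the quorum.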
phoenix-psql.py: A command-line interface to load CSV data or execute SQL scripts. It takes the ZooKeeper quorum followed by one or more CSV or SQL files, which are processed in the order given. For example:
$ /usr/bin/phoenix-psql.py zookeeper01.test.com:2181 create_stmts.sql data.csv
$ /usr/bin/phoenix-psql.py zookeeper01.test.com:2181 create_stmts.sql query.sql
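The article does not show what the three input files contain, so the sketch below writes one plausible set for experimentation. Everything in it is an assumption: the function name write_phoenix_samples, the table name data, and the columns id, label, and score are all invented for illustration. The table is named data on the assumption that the loader derives the target table from the CSV file name when no table is specified; verify this against your Phoenix version.

```shell
# Hypothetical helper: write sample input files for phoenix-psql.py.
# Table and column names are illustrative and not from the original article.
write_phoenix_samples() {
  dir=${1:-.}
  mkdir -p "$dir" || return 1

  # Schema definition, executed first.
  cat > "$dir/create_stmts.sql" <<'EOF'
CREATE TABLE IF NOT EXISTS data (
    id BIGINT NOT NULL PRIMARY KEY,
    label VARCHAR,
    score INTEGER
);
EOF

  # One row per line, columns in the same order as the table definition.
  cat > "$dir/data.csv" <<'EOF'
1,alpha,10
2,beta,20
3,gamma,30
EOF

  # A query to run once the rows are loaded.
  cat > "$dir/query.sql" <<'EOF'
SELECT label, score FROM data ORDER BY score DESC;
EOF
}
```

After running `write_phoenix_samples .`, the two commands above can be run from the same directory against a test cluster.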
phoenix-performance.py: A command-line interface to create a given number of rows and run timed queries against the data. It takes two arguments, the ZooKeeper quorum and the number of rows to create. For example:
$ /usr/bin/phoenix-performance.py zookeeper01.test.com:2181 100000
References:
https://en.wikipedia.org/wiki/Apache_Phoenix
https://phoenix.apache.org/