Hive II: Using it with MySQL as a metadata db
In the previous post, we setup a Hive with metadata stored in Derby.
Sometimes this is not desirable, since Derbty DB it's not as popular as MySQL / Posgre or other more known dbms.
In this post, we will make hive store it's metadata in a MySQL database. Remember that the data itself is still being stored in the HDFS. What we're changing here is only the location for the metadata info.
We wont deal with data migration, for the sake of simplicity, so please know that any metadata you have will be lost. We're assuming that you are doing this in the beginning of your system setup. Be aware of that!
MySQL Install
This is already covered in a previous topic, where we installed a MySQL to be accessed by the Spark engine. Please refer to it if you have any other questions, but in summary, what we'll do is:
sudo apt install mysql-server
sudo mysql_secure_installation
sudo mysql -u root
mysql> USE mysql;
mysql> UPDATE user SET plugin='mysql_native_password' WHERE User='root';
mysql> FLUSH PRIVILEGES;
mysql> ALTER USER 'root'@'localhost' IDENTIFIED BY 'MyNewPass';
mysql> exit;
check for connection:
mysql -u root -p
mysql> exit;
Now, let's install the mysql connector and link it to our hive instance:
ln -s /usr/share/java/mysql-connector-java-8.0.28.jar /opt/hive/lib/mysql-connector-java.jar
mysql -u root -p
mysql> CREATE USER hive@'%' identified by '(password)';
mysql> GRANT all on *.* to hive;
mysql> flush privileges;
mysql> exit;
Hive Setup
Finally, we need to change hive-site.xml to make it go for MySQL instead of Dearby:
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost/metastore?createDatabaseIfNotExist=true</value>
<description>metadata is stored in a MySQL server</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>MySQL JDBC driver class</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hiveuser</value>
<description>user name for connecting to mysql server</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>hivepassword</value>
<description>password for connecting to mysql server</description>
</property>
Now, all we have to do is to format and initialize the new metadata schema:
/opt/hive/bin/schematool -dbType mysql -initSchema
And that's it:
hive> create table tab_users (id int, name string, age int, sex string) clustered by(id) into 2 buckets stored as orc TBLPROPERTIES ('transactional'='true');
hive> show tables;
OK
tab_users
Time taken: 0.188 seconds, Fetched: 1 row(s)
Comentários
Postar um comentário