Hive Dataset as external table with HDFS Dataset #461
Comments
It is not a great solution, but you can repair[1] the table with:
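For reference, Hive offers two standard ways to register partition directories that already exist on HDFS: `MSCK REPAIR TABLE`, which scans the table location for `key=value` directories, and an explicit `ALTER TABLE ... ADD PARTITION` per directory. The sketch below only builds those statements as strings; it is an illustration and not necessarily the exact command suggested above. It assumes Hive-style `key=value` directory names, and the partition columns `year` and `month` in the usage example are hypothetical, since the actual layout is not shown in this thread.

```python
from urllib.parse import urlparse


def partition_spec(table_location: str, partition_dir: str) -> dict:
    """Extract {column: value} pairs from key=value segments below the table root."""
    root = urlparse(table_location).path.rstrip("/")
    path = urlparse(partition_dir).path.rstrip("/")
    if not path.startswith(root + "/"):
        raise ValueError(f"{partition_dir} is not below {table_location}")
    spec = {}
    for segment in path[len(root) + 1:].split("/"):
        key, sep, value = segment.partition("=")
        if not sep or not key or not value:
            raise ValueError(f"segment {segment!r} is not in key=value form")
        spec[key] = value
    return spec


def add_partition_sql(table: str, table_location: str, partition_dir: str) -> str:
    """Build an ALTER TABLE statement that registers one existing directory."""
    spec = partition_spec(table_location, partition_dir)
    # Values are emitted as quoted literals; no escaping is attempted in this sketch.
    cols = ", ".join(f"{key}='{value}'" for key, value in spec.items())
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION ({cols}) LOCATION '{partition_dir}';"
    )


def repair_sql(table: str) -> str:
    """Build the statement that asks Hive to discover all missing partitions."""
    return f"MSCK REPAIR TABLE {table};"


if __name__ == "__main__":
    root = "hdfs://10.0.1.63:8020/user/pnda/PNDA_datasets/datasets/kafka/depa_raw"
    # Hypothetical partition directory; the real layout may differ.
    new_dir = root + "/year=2017/month=01"
    print(repair_sql("depa_raw"))
    print(add_partition_sql("depa_raw", root, new_dir))
```

The generated statements could then be run through the Hive CLI or Beeline, for example on a schedule after each ingest. `MSCK REPAIR TABLE` is simpler but rescans the whole table location, while per-directory `ADD PARTITION` statements touch only the new paths.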
@mkwhitacre Thank you very much, it really solved my problem. I have one more question:
The MapReduce job always fails without a specific exception. Only when I set That is why I created a new Hive dataset:
So, what could be causing this problem?
Does it also fail when you use "dataset:hdfs://nameservice/path/to/depa_raw"? Without a specific exception it is harder to diagnose, but I'm guessing the problem is that your config is not being populated with the configuration for the Hive Metastore, or that the required jars are not on your classpath.
No matter what format I use, as long as it is the All the configuration and the class I've used are here:
I created a dataset on HDFS with a schema and a partition strategy:
I then use Gobblin to continuously ingest data from Kafka into HDFS. The partition layout looks like:
This part works well.
Next, I wanted to query this data with Hive, so I created a new Hive dataset as an external table by passing the --location parameter:
After that, I can find the table default/depa_raw and its data in Hive.
But one thing is wrong. As data keeps arriving from Kafka, new partition directories are created in HDFS by path, but no partitions are created automatically in the Hive table, which means I can't see the newly arrived data in Hive.
So what can I do to solve this problem? (I just want to see newly arriving data in Hive.)
I ran kite-dataset delete depa_raw, intending to create a new external Hive table afterwards, but all the data on HDFS was gone after the command.
I ran kite-dataset update depa_raw --location hdfs://10.0.1.63:8020/user/pnda/PNDA_datasets/datasets/kafka/depa_raw, but nothing happened.