add parquet support #46

Open
sorenmacbeth opened this issue Apr 2, 2014 · 7 comments

@sorenmacbeth
Collaborator

I'd like to add parquet support in addition to sequence files.

@jeroenvandijk

I'd like to have this too. I haven't used Parquet yet, but it seems it would speed up most of my queries. Do you have a list of the work that needs to be done?

@sorenmacbeth
Collaborator Author

No, I don't have a list. I have a branch with a skeleton for it locally.
The pail storage format is abstracted, so it is essentially just copying what
is there for sequence files, except using Parquet files instead.


@jeroenvandijk

Ok so you are saying subclassing PailFormat for Parquet like in SequenceFileFormat.java would do the trick, right?

@sorenmacbeth
Collaborator Author

basically, yes.

On Tue, Apr 15, 2014 at 9:09 AM, Jeroen van Dijk wrote:

Ok so you are saying subclassing PailFormat for Parquet like in
SequenceFileFormat.java
(https://github.com/nathanmarz/dfs-datastores/blob/develop/dfs-datastores/src/main/java/com/backtype/hadoop/pail/SequenceFileFormat.java)
would do the trick, right?
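
For anyone picking this up, here is a minimal, self-contained sketch of the pattern discussed above: storage code that depends only on a format abstraction, so that a new format is just one more implementation. Every name in it (RecordFormat, RecordWriter, RecordReader, LengthPrefixedFormat, roundTrip) is hypothetical and invented for illustration. It does not reproduce the real PailFormat or SequenceFileFormat signatures from dfs-datastores, and it uses plain java.nio files rather than Hadoop or Parquet, so treat it as an outline of the idea rather than code that plugs into the project.

```java
import java.io.*;
import java.nio.charset.StandardCharsets;
import java.nio.file.*;
import java.util.*;

public class FormatSketch {
    // Hypothetical stand-in for a pluggable storage format.
    // This is NOT the real PailFormat interface.
    interface RecordFormat {
        RecordWriter openWriter(Path path) throws IOException;
        RecordReader openReader(Path path) throws IOException;
    }

    interface RecordWriter extends Closeable {
        void write(byte[] record) throws IOException;
    }

    interface RecordReader extends Closeable {
        byte[] next() throws IOException; // returns null at end of file
    }

    // A simple length-prefixed format, playing the role that the existing
    // sequence-file format plays in the discussion. A Parquet-backed format
    // would be a second class implementing the same RecordFormat interface.
    static class LengthPrefixedFormat implements RecordFormat {
        public RecordWriter openWriter(Path path) throws IOException {
            DataOutputStream out = new DataOutputStream(
                    new BufferedOutputStream(Files.newOutputStream(path)));
            return new RecordWriter() {
                public void write(byte[] record) throws IOException {
                    out.writeInt(record.length); // 4-byte length prefix
                    out.write(record);
                }
                public void close() throws IOException { out.close(); }
            };
        }

        public RecordReader openReader(Path path) throws IOException {
            DataInputStream in = new DataInputStream(
                    new BufferedInputStream(Files.newInputStream(path)));
            return new RecordReader() {
                public byte[] next() throws IOException {
                    int len;
                    try {
                        len = in.readInt();
                    } catch (EOFException e) {
                        return null; // no more records
                    }
                    byte[] buf = new byte[len];
                    in.readFully(buf);
                    return buf;
                }
                public void close() throws IOException { in.close(); }
            };
        }
    }

    // The caller only sees RecordFormat, so swapping formats needs no changes here.
    static List<String> roundTrip(RecordFormat format, List<String> records)
            throws IOException {
        Path tmp = Files.createTempFile("format-sketch", ".bin");
        try {
            try (RecordWriter w = format.openWriter(tmp)) {
                for (String r : records) {
                    w.write(r.getBytes(StandardCharsets.UTF_8));
                }
            }
            List<String> result = new ArrayList<>();
            try (RecordReader r = format.openReader(tmp)) {
                for (byte[] rec = r.next(); rec != null; rec = r.next()) {
                    result.add(new String(rec, StandardCharsets.UTF_8));
                }
            }
            return result;
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip(new LengthPrefixedFormat(), Arrays.asList("a", "b")));
    }
}
```

In this sketch the only format-specific code lives in LengthPrefixedFormat; roundTrip would work unchanged with any other RecordFormat. That is the property that should make a Parquet variant mostly a matter of mirroring the sequence-file implementation, assuming the real abstraction is similarly narrow.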

@jeroenvandijk

Cool, I'll give it a try soon

@caminic

caminic commented Nov 23, 2016

Hi Jeroen, were you able to get this to work? I'm looking at doing the same thing.

@jeroenvandijk

@caminic Sorry for the late response. No, I didn't get to it. Priorities shifted, and I also didn't fully see my way through all the Java indirection. Parquet support still sounds useful, as it is supported by quite a number of tools these days.
