Support CREATE EXTERNAL TABLE using DataFusion #125

Closed
gohalo opened this issue Aug 31, 2024 · 5 comments
Labels
feature rust Related to Rust codebase

Comments

@gohalo
Contributor

gohalo commented Aug 31, 2024

Just like the Java client routines.

HoodieTableMetaClient.withPropertyBuilder()
        .setTableType(HoodieTableType.MERGE_ON_READ)
        .setTableName(tableName)
        .setTableCreateSchema(SCHEMA)
        .setPayloadClassName(HoodieAvroPayload.class.getName())
        .initTable(conf, tablePath);

Currently, I've run into some problems.

  1. The object_store crate doesn't support creating directories, because its Path struct stores the raw path without leading or trailing delimiters:
pub struct Path {
    /// The raw path with no leading or trailing delimiters
    raw: String,
}

But in most cloud object stores, including MinIO, you can put an empty object at a path like /data/example/.hoodie/. Maybe we could change the private raw field of the Path struct with unsafe code, or create marker files like .hoodie/.schema/marker.

  2. We should create config files compatible with Java properties, especially hoodie.properties.
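
To make the two points above concrete, here is a minimal std-only sketch. It is not the object_store API: normalize only models the "no leading or trailing delimiters" behaviour described in the doc comment, marker_key is a hypothetical helper for the marker-file idea, and render_properties covers only a subset of the Java .properties escaping rules (no \uXXXX handling, for instance). The table name and path are illustrative values.

```rust
use std::collections::BTreeMap;

/// Model of a delimiter-free object path: empty segments are dropped, so
/// leading, trailing, and repeated '/' all disappear. (Assumption: this only
/// imitates the behaviour described for Path, it does not call the crate.)
fn normalize(path: &str) -> String {
    path.split('/')
        .filter(|part| !part.is_empty())
        .collect::<Vec<_>>()
        .join("/")
}

/// Hypothetical helper: key of a marker object that makes the metadata
/// "directory" exist without needing a trailing delimiter.
fn marker_key(table_path: &str) -> String {
    format!("{}/.hoodie/.schema/marker", normalize(table_path))
}

/// Escape a key or value for a Java-style .properties line (subset only).
fn escape(s: &str, is_key: bool) -> String {
    let mut out = String::new();
    for (i, c) in s.chars().enumerate() {
        match c {
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '=' | ':' | '#' | '!' => {
                out.push('\\');
                out.push(c);
            }
            // Spaces are escaped everywhere in keys, but only at the start of values.
            ' ' if is_key || i == 0 => out.push_str("\\ "),
            _ => out.push(c),
        }
    }
    out
}

/// Render key=value lines in sorted key order so the output is deterministic.
fn render_properties(props: &BTreeMap<String, String>) -> String {
    let mut out = String::new();
    for (k, v) in props {
        out.push_str(&escape(k, true));
        out.push('=');
        out.push_str(&escape(v, false));
        out.push('\n');
    }
    out
}

fn main() {
    let mut props = BTreeMap::new();
    props.insert("hoodie.table.name".to_string(), "example".to_string());
    props.insert("hoodie.table.type".to_string(), "MERGE_ON_READ".to_string());

    // The trailing '/' is lost on normalization, hence the marker object.
    println!("{}", marker_key("/data/example/"));
    print!("{}", render_properties(&props));
}
```

The marker-object route avoids touching the private raw field at all, which seems safer than the unsafe approach, at the cost of one extra object per table.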
@xushiyan
Member

We are not prioritizing write support at the moment. Let's pivot this issue to supporting CREATE EXTERNAL TABLE in DataFusion: https://datafusion.apache.org/user-guide/sql/ddl.html#create-external-table

@jonathanc-n are you interested in picking this up? would you share some high-level implementation points?
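
For context, a minimal sketch of the kind of statement this would mean, following the general shape in the DataFusion DDL docs linked above. The helper name, the table name trips, the path, and the assumption that the format is registered under the name HUDI are all illustrative, not confirmed by this thread.

```rust
/// Build a CREATE EXTERNAL TABLE statement for a table stored at `location`.
/// `format` is the name the table factory is registered under (assumed here).
fn create_external_table_sql(name: &str, format: &str, location: &str) -> String {
    // Single quotes inside the location are doubled, as in SQL string literals.
    let escaped = location.replace('\'', "''");
    format!("CREATE EXTERNAL TABLE {name} STORED AS {format} LOCATION '{escaped}'")
}

fn main() {
    // Illustrative values only.
    let sql = create_external_table_sql("trips", "HUDI", "/data/example");
    println!("{sql}");
}
```

The resulting string would then be passed to the session's SQL entry point once a matching table factory is registered.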

@xushiyan xushiyan added this to the release-0.3.0 milestone Nov 30, 2024
@xushiyan xushiyan added feature rust Related to Rust codebase labels Nov 30, 2024
@xushiyan xushiyan changed the title from "Support create hudi table." to "Support CREATE EXTERNAL TABLE using DataFusion" Nov 30, 2024
@jonathanc-n
Contributor

jonathanc-n commented Dec 1, 2024

High level implementation:

  • Have the TableProviderFactory validate the parameters used to create an instance of CreateExternalTable.
  • Since we are currently focusing on read capabilities, add read table support.

What exactly were you looking for CREATE EXTERNAL TABLE to be able to do, given that there is already some functionality for it?
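
As a rough sketch of the validation step in the first point, using std only: CreateCmd is a hypothetical stand-in for the fields a factory would read from the command, and the two checks (format name and non-empty location) are assumptions about what "validate parameters" could mean, not the hudi-rs implementation.

```rust
/// Hypothetical stand-in for the fields a table factory would inspect.
#[derive(Debug)]
struct CreateCmd {
    name: String,
    file_type: String,
    location: String,
}

#[derive(Debug, PartialEq)]
enum ValidationError {
    UnsupportedFormat(String),
    EmptyLocation,
}

/// Reject commands that do not target the expected format or lack a location.
fn validate(cmd: &CreateCmd) -> Result<(), ValidationError> {
    // Format names are compared case-insensitively (an assumption of this sketch).
    if !cmd.file_type.eq_ignore_ascii_case("HUDI") {
        return Err(ValidationError::UnsupportedFormat(cmd.file_type.clone()));
    }
    if cmd.location.trim().is_empty() {
        return Err(ValidationError::EmptyLocation);
    }
    Ok(())
}

fn main() {
    let cmd = CreateCmd {
        name: "trips".to_string(),
        file_type: "hudi".to_string(),
        location: "/data/example".to_string(),
    };
    match validate(&cmd) {
        Ok(()) => println!("{} is valid", cmd.name),
        Err(e) => println!("{} is invalid: {:?}", cmd.name, e),
    }
}
```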

@jonathanc-n
Contributor

For example, https://github.com/apache/hudi-rs/blob/main/crates/datafusion/src/lib.rs#L291 has a solid test for the functionality. What functionality would you like to target on top of this?

@xushiyan
Member

xushiyan commented Dec 2, 2024

@jonathanc-n you're right, I almost forgot this is actually a dup of #150. We should update the readme to reflect the usage, and add a module example to show this in datafusion/lib.rs

@xushiyan
Member

xushiyan commented Dec 2, 2024

Closing this due to being a dup. @jonathanc-n pls feel free to file a PR to update the docs if you're keen. I'll keep a todo for myself also for an overall docs update before working on the release.

@xushiyan xushiyan closed this as not planned Dec 2, 2024
@github-project-automation github-project-automation bot moved this from Todo to Done in hudi-rs roadmap Dec 2, 2024
@xushiyan xushiyan removed this from the release-0.3.0 milestone Dec 2, 2024