COPY format "parquet" not recognized #68

Open
dalonso-clarivate opened this issue Nov 4, 2024 · 2 comments
@dalonso-clarivate

Hi all,
I think I have successfully installed "pg_parquet" by following the instructions:

*** Installed these dependencies:

sudo yum install -y epel-release
sudo yum install -y cmake boost-devel zlib-devel
sudo yum install -y https://apache.jfrog.io/artifactory/arrow/almalinux/8/apache-arrow-release-latest.rpm
sudo yum install -y arrow-devel parquet-devel
sudo yum install openssl-devel

*** Installed cargo
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
cargo install cargo-pgrx

*** Installed pgrx

export PATH=$PATH:/var/lib/pgsql/.cargo/bin
cargo pgrx install
cargo pgrx init --pg15 /usr/pgsql-15/bin/pg_config

*** Created a new extension project
mkdir extensions
cd extensions
cargo pgrx new pg_parquet
cd pg_parquet

*** Updated Cargo.toml to match https://github.com/CrunchyData/pg_parquet/blob/main/Cargo.toml

*** Ran

cargo pgrx run

A test Postgres instance was created, and I created the extension successfully:

pg_parquet=# create extension pg_parquet;
CREATE EXTENSION
pg_parquet=# \dx+
Objects in extension "pg_parquet"
     Object description
-----------------------------
 function hello_pg_parquet()
(1 row)

pg_parquet=# SELECT hello_pg_parquet();
 hello_pg_parquet
-------------------
 Hello, pg_parquet
(1 row)

*** But I cannot export the data:
pg_parquet=# INSERT INTO test_table (id, name) VALUES (1, 'Alice'), (2, 'Bob');
INSERT 0 2
pg_parquet=#
pg_parquet=# COPY test_table TO '/tmp/test_table.parquet' (FORMAT 'parquet', COMPRESSION 'gzip');
ERROR: COPY format "parquet" not recognized
LINE 1: COPY test_table TO '/tmp/test_table.parquet' (FORMAT 'parque...

Any clue?

Thanks in advance!
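
One possible reading of the \dx+ output above is that the installed extension is still the hello-world template produced by `cargo pgrx new`, since its only object is hello_pg_parquet(). The sketch below checks a project directory for that template function. The helper name `is_pgrx_scaffold` is hypothetical and not part of pgrx, and the check assumes the template keeps its generated `fn hello_<name>()` in src/lib.rs.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of pgrx): report whether a project directory
# still looks like the hello-world template created by `cargo pgrx new <name>`.
# Returns 0 if src/lib.rs defines a `fn hello_<name>(`, 1 if it does not,
# and 2 if src/lib.rs is missing.
is_pgrx_scaffold() {
    local project_dir="$1"
    local lib_rs="$project_dir/src/lib.rs"

    [ -f "$lib_rs" ] || return 2

    # Assumption: the template's only function is named hello_<name>.
    grep -Eq 'fn[[:space:]]+hello_[A-Za-z0-9_]+[[:space:]]*\(' "$lib_rs"
}
```

Running `is_pgrx_scaffold ~/extensions/pg_parquet && echo "template only"` (the path is an assumption based on the mkdir/cd steps above) would flag a project whose source was never replaced with the real pg_parquet code.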

@aykut-bozkurt
Collaborator

aykut-bozkurt commented Nov 4, 2024

You need to add pg_parquet to shared_preload_libraries and restart the Postgres instance. (The config file should be somewhere like ~/.pgrx/data-15/postgresql.conf.)
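
As a rough way to confirm the setting before restarting, the sketch below reads a postgresql.conf and checks whether a given library is listed in shared_preload_libraries. The function name `preload_has_library` is made up for illustration. It only handles the `name = 'value'` form, and it looks at the last uncommented assignment because PostgreSQL uses the last occurrence of a parameter in the file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if LIB is listed in shared_preload_libraries
# in the given postgresql.conf.
# Usage: preload_has_library CONF LIB
preload_has_library() {
    local conf="$1" lib="$2"
    local line value entry
    local -a entries

    # Last uncommented assignment wins, mirroring how PostgreSQL reads the file.
    line=$(grep -E '^[[:space:]]*shared_preload_libraries[[:space:]]*=' "$conf" | tail -n 1) || true
    [ -n "$line" ] || return 1

    value=${line#*=}       # drop everything up to the first '='
    value=${value%%#*}     # drop a trailing comment, if any
    value=${value//\'/}    # drop single quotes

    # Split the comma-separated list and compare each trimmed entry.
    IFS=',' read -ra entries <<< "$value"
    for entry in "${entries[@]}"; do
        entry=${entry//[[:space:]]/}
        if [ "$entry" = "$lib" ]; then
            return 0
        fi
    done
    return 1
}
```

For example, `preload_has_library ~/.pgrx/data-15/postgresql.conf pg_parquet || echo "not configured"` uses the path suggested above. A change to this setting only takes effect after the instance is restarted.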

@dalonso-clarivate
Author

Thanks for your answer, aykut-bozkurt. I had already added the extension to the shared_preload_libraries parameter. However, I have finally got it running. I am pasting the steps I followed in case they help anyone else. I am using Oracle Linux 8:

*** Installed these packages

$> sudo yum install -y epel-release
$> sudo yum install -y cmake boost-devel zlib-devel
$> sudo yum install -y https://apache.jfrog.io/artifactory/arrow/almalinux/8/apache-arrow-release-latest.rpm
$> sudo yum install -y arrow-devel parquet-devel
$> sudo yum install openssl-devel
$> sudo yum install git

*** Cloned the source and installed cargo and pgrx
$> git clone https://github.com/CrunchyData/pg_parquet.git
$> cd pg_parquet
$> curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
$> . "$HOME/.cargo/env"
$> cargo install cargo-pgrx
$> cargo pgrx init --pg15 /usr/pgsql-15/bin/pg_config

*** Open Cargo.toml and set the default feature to the Postgres version you are using. In this case it is 15:
[features]
default = ["pg15"]
pg17 = ["pgrx/pg17", "pgrx-tests/pg17"]
pg16 = ["pgrx/pg16", "pgrx-tests/pg16"]
pg15 = ["pgrx/pg15", "pgrx-tests/pg15"]
pg14 = ["pgrx/pg14", "pgrx-tests/pg14"]
pg_test = []


$> cargo build --release --features pg15
$> echo "shared_preload_libraries = 'pg_parquet'" >> ~/.pgrx/data-15/postgresql.conf
$> cargo pgrx run
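
A note on the `echo ... >>` step above: running it more than once appends the same line again each time. A minimal sketch of an idempotent variant follows; `ensure_preload_line` is a hypothetical helper name. It only avoids duplicating the exact same line and, like the original command, the appended line overrides any earlier shared_preload_libraries setting in the file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append "shared_preload_libraries = '<lib>'" to CONF
# only if that exact line is not already present.
# Usage: ensure_preload_line CONF LIB
ensure_preload_line() {
    local conf="$1" lib="$2"
    local line="shared_preload_libraries = '$lib'"

    # -F: fixed string, -x: match the whole line, -q: quiet.
    if grep -Fxq "$line" "$conf" 2>/dev/null; then
        return 0
    fi
    printf '%s\n' "$line" >> "$conf"
}
```

With the paths from the steps above, this would be called as `ensure_preload_line ~/.pgrx/data-15/postgresql.conf pg_parquet` before `cargo pgrx run`.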
