Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Begin moving to postgres spilo + adding pgvector #8309

Merged
merged 25 commits into from
Nov 15, 2024

Conversation

FelixMalfait
Copy link
Member

@FelixMalfait FelixMalfait commented Nov 4, 2024

We will remove the twenty-postgres image that was used for local development and only use twenty-postgres-pilo (which we use in prod), bringing the development environment closer to prod and avoiding having to maintain 2 images.

Instead of provisioning the super user after the db initialization, we directly rely on the superuser provided by Spilo for simplicity. We also introduce a change that tries to create the right database (default or test) based on the context.

How to test:

docker build -t twentycrm/twenty-postgres-spilo:latest -f ./packages/twenty-docker/twenty-postgres-spilo/Dockerfile .
docker images --no-trunc | grep twenty-postgres-spilo
postgres-on-docker:
	docker run \
	--name twenty_pg \
	-e PGUSER_SUPERUSER=twenty \
	-e PGPASSWORD_SUPERUSER=twenty \
	-e ALLOW_NOSSL=true \
	-v twenty_db_data:/home/postgres/pgdata \
	-p 5432:5432 \
	REPLACE_WITH_IMAGE_ID

@FelixMalfait FelixMalfait changed the title Begin moving to postgres spilo + adding pgvector/hydra Begin moving to postgres spilo + adding pgvector Nov 6, 2024
Copy link

github-actions bot commented Nov 13, 2024

Warnings
⚠️ Changes were made to the environment variables, but not to the documentation - Please review your changes and check if a change needs to be documented!

Generated by 🚫 dangerJS against 0c37a83

@FelixMalfait FelixMalfait marked this pull request as ready for review November 15, 2024 08:38
@FelixMalfait FelixMalfait merged commit 736635a into main Nov 15, 2024
19 checks passed
@FelixMalfait FelixMalfait deleted the move-to-postgres-spilo branch November 15, 2024 08:38
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Summary

This PR transitions from a custom PostgreSQL image to Spilo PostgreSQL across development and production environments, standardizing database management and adding pgvector support.

  • Replaces POSTGRES_ADMIN_PASSWORD with PGUSER_SUPERUSER/PGPASSWORD_SUPERUSER across all configurations to align with Spilo's environment variables
  • Changes database volume mount path from /var/lib/postgresql/data to /home/postgres/pgdata in Docker and K8s configurations
  • Adds vector extension support in setup-db.ts for pgvector functionality
  • Modifies database connection strings to use postgres user instead of twenty across all environments
  • Preserves critical schemas (metric_helpers, user_management, public) during database truncation

28 file(s) reviewed, 20 comment(s)
Edit PR Review Bot Settings | Greptile

Comment on lines +73 to +77
- name: Server / Create DB
if: steps.changed-files.outputs.any_changed == 'true'
run: |
PGPASSWORD=twenty psql -h localhost -p 5432 -U postgres -d postgres -c 'CREATE DATABASE "default";'
PGPASSWORD=twenty psql -h localhost -p 5432 -U postgres -d postgres -c 'CREATE DATABASE "test";'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: Database creation step should be moved before worker run but after environment setup to ensure proper initialization order

Comment on lines 122 to +136
services:
postgres:
image: twentycrm/twenty-postgres
image: twentycrm/twenty-postgres-spilo
env:
POSTGRES_PASSWORD: postgres
POSTGRES_USER: postgres
PGUSER_SUPERUSER: postgres
PGPASSWORD_SUPERUSER: twenty
ALLOW_NOSSL: "true"
SPILO_PROVIDER: "local"
ports:
- 5432:5432
options: >-
--health-cmd pg_isready
--health-interval 10s
--health-timeout 5s
--health-retries 5
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: Missing database creation step in server-integration-test job that's present in server-setup job

docker compose logs db
exit 1
fi
echo "Still waiting for database... (${count}/60)"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: Message shows '(${count}/60)' but timeout is 300 - inconsistent messaging

Comment on lines 54 to 55
docker compose logs db server -f &
pid=$!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: Background process pid is never killed if containers start successfully - potential resource leak

@@ -37,16 +44,20 @@ jobs:
if: steps.changed-files.outputs.changed == 'true'
uses: ./.github/workflows/actions/yarn-install

- name: Server / Create DB
if: steps.changed-files.outputs.any_changed == 'true'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: Condition uses any_changed but previous steps use changed - this inconsistency could cause the database creation to run when it shouldn't or vice versa

Comment on lines 33 to +34
- name: PG_DATABASE_URL
value: "postgres://twenty:[email protected]/default"
value: "postgres://postgres:[email protected]/default"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: Ensure this connection string matches the credentials configured in the Spilo database deployment. The superuser password should be securely stored and accessed via a secret rather than hardcoded.

Comment on lines 51 to 52
| <a name="input_twentycrm_app_hostname"></a> [twentycrm\_app\_hostname](#input\_twentycrm\_app\_hostname) | The protocol, DNS fully qualified hostname, and port used to access TwentyCRM in your environment. Ex: https://crm.example.com:443 | `string` | n/a | yes |
| <a name="input_twentycrm_pgdb_admin_password"></a> [twentycrm\_pgdb\_admin\_password](#input\_twentycrm\_pgdb\_admin\_password) | TwentyCRM password for postgres database. | `string` | n/a | yes |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: check if pgdb_admin_password variable needs to be renamed to match new Spilo environment variable PGPASSWORD_SUPERUSER

Comment on lines +8 to +11
PGUSER=$(echo $PG_DATABASE_URL | awk -F '//' '{print $2}' | awk -F ':' '{print $1}')
PGPASS=$(echo $PG_DATABASE_URL | awk -F ':' '{print $3}' | awk -F '@' '{print $1}')
PGHOST=$(echo $PG_DATABASE_URL | awk -F '@' '{print $2}' | awk -F ':' '{print $1}')
PGPORT=$(echo $PG_DATABASE_URL | awk -F ':' '{print $4}' | awk -F '/' '{print $1}')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: URL parsing with awk is fragile - will break if credentials contain special characters or if URL format changes. Consider using a more robust parsing method.

Comment on lines +12 to +13
PGPASSWORD=${PGPASS} psql -h ${PGHOST} -p ${PGPORT} -U ${PGUSER} -d postgres -tc "SELECT 1 FROM pg_database WHERE datname = 'default'" | grep -q 1 || \
PGPASSWORD=${PGPASS} psql -h ${PGHOST} -p ${PGPORT} -U ${PGUSER} -d postgres -c "CREATE DATABASE \"default\""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: Missing error handling if psql commands fail. Script continues even if database creation fails.

Comment on lines +24 to +30
if (
schema.schema_name === 'metric_helpers' ||
schema.schema_name === 'user_management' ||
schema.schema_name === 'public'
) {
continue;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: Consider making protected schemas configurable via environment variables rather than hardcoding them

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants