Skip to content

Latest commit

 

History

History
307 lines (246 loc) · 16.4 KB

configurations.md

File metadata and controls

307 lines (246 loc) · 16.4 KB

Configuring the API service

Raw Data API can be setup using two configuration options. You can choose based on your convienience

  • config.txt : You can follow config.txt.sample in dir and documentation below to set your configurations
  • .env : Another option is from OS Environment variable , You can export all your env variables ( They are same as you put in config without blocks ) and pass it to API , API will pick it up automatically.

What you need to start?

The default configuration file is an ini-style text file named config.txt in the project root.

Users Table

Users table is present on backend/sql/users.sql Make sure you have it before moving forward

psql -a -f backend/sql/users.sql

& Add your admin's OSM ID as admin

INSERT INTO users (osm_id, role) VALUES (1234, 1);

Sections

The following sections are recognised.

  • [DB] - For database connection information. Required.
  • [OAUTH] - For connecting to OpenStreetMap using an OAuth2 app. Required.
  • [CELERY] - For task queues on Redis. Required.
  • [API_CONFIG] - API service related configuration. Required.
  • [EXPORT_UPLOAD] - For external file hosts like S3. Optional.
  • [SENTRY] - Sentry monitoring configuration. Optional.
  • [HDX] - HDX Exports related configuration. Optional.

The following are the different configuration options that are accepted.

Config option ENVVAR Section Defaults Description Required?
PGHOST PGHOST [DB] none PostgreSQL hostname or IP REQUIRED
PGPORT PGPORT [DB] 5432 PostgreSQL connection port OPTIONAL
PGUSER PGUSER [DB] none PostgreSQL user/role REQUIRED
PGPASSWORD PGPASSWORD [DB] none PostgreSQL user/role password REQUIRED
PGDATABASE PGDATABASE [DB] none PostgreSQL database name REQUIRED
OSM_CLIENT_ID OSM_CLIENT_ID [OAUTH] none Client ID of OSM OAuth2 application REQIRED
OSM_CLIENT_SECRET OSM_CLIENT_SECRET [OAUTH] none Client Secret of OSM OAuth2 application REQIRED
OSM_PERMISSION_SCOPE OSM_PERMISSION_SCOPE [OAUTH] read_prefs OSM access permission for OAuth2 application OPTIONAL
LOGIN_REDIRECT_URI LOGIN_REDIRECT_URI [OAUTH] none Redirect URL set in the OAuth2 application REQUIRED
APP_SECRET_KEY APP_SECRET_KEY [OAUTH] none High-entropy string generated for the application REQUIRED
OSM_URL OSM_URL [OAUTH] https://www.openstreetmap.org OSM instance Base URL OPTIONAL
LOG_LEVEL LOG_LEVEL [API_CONFIG] debug Application log level; info,debug,warning,error OPTIONAL
RATE_LIMITER_STORAGE_URI RATE_LIMITER_STORAGE_URI [API_CONFIG] redis://redis:6379 Redis connection string for rate-limiter data OPTIONAL
RATE_LIMIT_PER_MIN RATE_LIMIT_PER_MIN [API_CONFIG] 5 Number of requests per minute before being rate limited OPTIONAL
EXPORT_PATH EXPORT_PATH [API_CONFIG] exports? Local path to store exports OPTIONAL
EXPORT_MAX_AREA_SQKM EXPORT_MAX_AREA_SQKM [API_CONFIG] 100000 max area in sq. km. to support for rawdata input OPTIONAL
USE_CONNECTION_POOLING USE_CONNECTION_POOLING [API_CONFIG] false Enable psycopg2 connection pooling OPTIONAL
ALLOW_BIND_ZIP_FILTER ALLOW_BIND_ZIP_FILTER [API_CONFIG] true Enable zip compression for exports OPTIONAL
EXTRA_README_TXT EXTRA_README_TXT [API_CONFIG] `` Append extra string to export readme.txt OPTIONAL
ENABLE_TILES ENABLE_TILES [API_CONFIG] false Enable Tile Output (Pmtiles and Mbtiles) OPTIONAL
ENABLE_SOZIP ENABLE_SOZIP [API_CONFIG] false Enables sozip compression OPTIONAL
DEFAULT_QUEUE_NAME DEFAULT_QUEUE_NAME [API_CONFIG] raw_daemon Option to define default queue name OPTIONAL
ONDEMAND_QUEUE_NAME ONDEMAND_QUEUE_NAME [API_CONFIG] raw_ondemand Option to define daemon queue name for scheduled and long exports OPTIONAL
ENABLE_POLYGON_STATISTICS_ENDPOINTS ENABLE_POLYGON_STATISTICS_ENDPOINTS [API_CONFIG] False Option to enable endpoints related the polygon statistics about the approx buildings,road length in passed polygon OPTIONAL
ENABLE_CUSTOM_EXPORTS ENABLE_CUSTOM_EXPORTS [API_CONFIG] False Enables custom exports endpoint and imports OPTIONAL
POLYGON_STATISTICS_API_URL POLYGON_STATISTICS_API_URL [API_CONFIG] None API URL for the polygon statistics to fetch the metadata , Currently tested with graphql query endpoint of Kontour , Only required if it is enabled from ENABLE_POLYGON_STATISTICS_ENDPOINTS OPTIONAL
POLYGON_STATISTICS_API_URL POLYGON_STATISTICS_API_RATE_LIMIT [API_CONFIG] 5 Rate limit to be applied for statistics endpoint per minute, Defaults to 5 request is allowed per minute OPTIONAL
WORKER_PREFETCH_MULTIPLIER WORKER_PREFETCH_MULTIPLIER [CELERY] 1 No of tasks that worker can prefetch at a time OPTIONAL
DEFAULT_SOFT_TASK_LIMIT DEFAULT_SOFT_TASK_LIMIT [API_CONFIG] 7200 Soft task time limit signal for celery workers in seconds.It will gently remind celery to finish up the task and terminate, Defaults to 2 Hour OPTIONAL
DEFAULT_HARD_TASK_LIMIT DEFAULT_HARD_TASK_LIMIT [API_CONFIG] 10800 Hard task time limit signal for celery workers in seconds. It will immediately kill the celery task.Defaults to 3 Hour OPTIONAL
USE_DUCK_DB_FOR_CUSTOM_EXPORTS USE_DUCK_DB_FOR_CUSTOM_EXPORTS [API_CONFIG] False Enable this setting to use duckdb , By default duck db is disabled and postgres is used OPTIONAL
CELERY_BROKER_URL CELERY_BROKER_URL [CELERY] redis://localhost:6379/0 Redis connection string for the broker OPTIONAL
CELERY_RESULT_BACKEND CELERY_RESULT_BACKEND [CELERY] redis://localhost:6379/0 Redis/psotgresql connection string for the the result backend, eg : db+postgresql://username:password@localhost:5432/db_name OPTIONAL
FILE_UPLOAD_METHOD FILE_UPLOAD_METHOD [EXPORT_UPLOAD] disk File upload method; Allowed values - disk, s3 OPTIONAL
BUCKET_NAME BUCKET_NAME [EXPORT_UPLOAD] none AWS S3 Bucket name CONDITIONAL
AWS_ACCESS_KEY_ID AWS_ACCESS_KEY_ID [EXPORT_UPLOAD] none AWS Access Key ID for S3 access CONDITIONAL
AWS_SECRET_ACCESS_KEY AWS_SECRET_ACCESS_KEY [EXPORT_UPLOAD] none AWS Secret Access Key for S3 access CONDITIONAL
SENTRY_DSN SENTRY_DSN [SENTRY] none Sentry Data Source Name OPTIONAL
SENTRY_RATE SENTRY_RATE [SENTRY] 1.0 Sample rate percentage for shipping errors to sentry; Allowed values between 0 (0%) to 1 (100%) OPTIONAL
ENABLE_HDX_EXPORTS ENABLE_HDX_EXPORTS [HDX] False Enables hdx related endpoints and imports OPTIONAL
ENABLE_METRICS_APIS ENABLE_METRICS_APIS [API_CONFIG] False Enables download metrics related endpoints , Require different setup of metrics populator OPTIONAL
HDX_SITE HDX_SITE [HDX] 'demo' HDX site to point , By default demo site , use prod for production CONDITIONAL
HDX_API_KEY HDX_API_KEY [HDX] None Your API Secret key for hdx upload , should have write access and it is compulsory if ENABLE_HDX_EXPORTS is True CONDITIONAL
HDX_OWNER_ORG HDX_OWNER_ORG [HDX] None Your HDX organization ID CONDITIONAL
HDX_MAINTAINER HDX_MAINTAINER [HDX] None Your HDX Maintainer ID CONDITIONAL
DUCK_DB_MEMORY_LIMIT DUCK_DB_MEMORY_LIMIT [API_CONFIG] None Duck DB max memory limit , 80 % of your RAM eg : '5GB' CONDITIONAL
DUCK_DB_THREAD_LIMIT DUCK_DB_THREAD_LIMIT [API_CONFIG] None Duck DB max threads limit ,n of your cores eg : 2 CONDITIONAL
HDX_SOFT_TASK_LIMIT HDX_SOFT_TASK_LIMIT [HDX] 18000 Soft task time limit signal for celery workers in seconds.It will gently remind celery to finish up the task and terminate, Defaults to 5 Hour OPTIONAL
HDX_HARD_TASK_LIMIT HDX_HARD_TASK_LIMIT [HDX] 21600 Hard task time limit signal for celery workers in seconds. It will immediately kill the celery task.Defaults to 6 Hour OPTIONAL
PROCESS_SINGLE_CATEGORY_IN_POSTGRES PROCESS_SINGLE_CATEGORY_IN_POSTGRES [HDX] False Recommended for workers with low memery or CPU usage , This will process single category request like buildings only , Roads only in postgres itself and avoid extraction from duckdb OPTIONAL
PARALLEL_PROCESSING_CATEGORIES PARALLEL_PROCESSING_CATEGORIES [HDX] True Enable parallel processing for mulitple categories and export formats , Disable this if you have single cpu and limited RAM , Enabled by default OPTIONAL

Note : HDX_API_KEY

In order to generate HDX_API_KEY , You need to be logged in to https://data.humdata.org/ . Follow following navigation to generate tokens :

  • Your profile section > User settings > API Tokens

API Tokens have expiry date, It is important to update API Tokens manually each year for hosted api service !

Which Service uses which settings?

Parameter Config Section API Worker
PGHOST [DB] Yes Yes
PGPORT [DB] Yes Yes
PGUSER [DB] Yes Yes
PGPASSWORD [DB] Yes Yes
PGDATABASE [DB] Yes Yes
OSM_CLIENT_ID [OAUTH] Yes No
OSM_CLIENT_SECRET [OAUTH] Yes No
OSM_PERMISSION_SCOPE [OAUTH] Yes No
LOGIN_REDIRECT_URI [OAUTH] Yes No
APP_SECRET_KEY [OAUTH] Yes No
OSM_URL [OAUTH] Yes No
LOG_LEVEL [API_CONFIG] Yes Yes
RATE_LIMITER_STORAGE_URI [API_CONFIG] Yes No
RATE_LIMIT_PER_MIN [API_CONFIG] Yes No
EXPORT_PATH [API_CONFIG] Yes (Not needed for upload_s3) Yes
EXPORT_MAX_AREA_SQKM [API_CONFIG] Yes No
USE_CONNECTION_POOLING [API_CONFIG] Yes Yes
ENABLE_TILES [API_CONFIG] Yes Yes
ENABLE_SOZIP [API_CONFIG] Yes Yes
ALLOW_BIND_ZIP_FILTER [API_CONFIG] Yes Yes
EXTRA_README_TXT [API_CONFIG] No Yes
INDEX_THRESHOLD [API_CONFIG] No Yes
MAX_WORKERS [API_CONFIG] No Yes
DEFAULT_QUEUE_NAME [API_CONFIG] Yes No
ONDEMAND_QUEUE_NAME [API_CONFIG] Yes No
ENABLE_POLYGON_STATISTICS_ENDPOINTS [API_CONFIG] Yes Yes
POLYGON_STATISTICS_API_URL [API_CONFIG] Yes Yes
POLYGON_STATISTICS_API_RATE_LIMIT [API_CONFIG] Yes No
DEFAULT_SOFT_TASK_LIMIT [API_CONFIG] No Yes
DEFAULT_HARD_TASK_LIMIT [API_CONFIG] No Yes
USE_DUCK_DB_FOR_CUSTOM_EXPORTS [API_CONFIG] Yes Yes
DUCK_DB_MEMORY_LIMIT [API_CONFIG] Yes Yes
DUCK_DB_THREAD_LIMIT [API_CONFIG] Yes Yes
ENABLE_CUSTOM_EXPORTS [API_CONFIG] Yes Yes
ENABLE_METRICS_APIS [API_CONFIG] Yes No
CELERY_BROKER_URL [CELERY] Yes Yes
CELERY_RESULT_BACKEND [CELERY] Yes Yes
WORKER_PREFETCH_MULTIPLIER [CELERY] Yes Yes
FILE_UPLOAD_METHOD [EXPORT_UPLOAD] Yes Yes
BUCKET_NAME [EXPORT_UPLOAD] Yes Yes
AWS_ACCESS_KEY_ID [EXPORT_UPLOAD] Yes Yes
AWS_SECRET_ACCESS_KEY [EXPORT_UPLOAD] Yes Yes
SENTRY_DSN [SENTRY] Yes No
SENTRY_RATE [SENTRY] Yes No
ENABLE_HDX_EXPORTS [HDX] Yes Yes
HDX_SITE [HDX] Yes Yes
HDX_API_KEY [HDX] Yes Yes
HDX_OWNER_ORG [HDX] Yes Yes
HDX_MAINTAINER [HDX] Yes Yes
HDX_SOFT_TASK_LIMIT [HDX] No Yes
HDX_HARD_TASK_LIMIT [HDX] No Yes
PROCESS_SINGLE_CATEGORY_IN_POSTGRES [HDX] No Yes
PARALLEL_PROCESSING_CATEGORIES [HDX] No Yes

Compulsory Configuration

Create config.txt inside root directory.

It should be on the same place where config.txt.sample

Prepare your OSM Snapshot Data

Initialize rawdata from here OR Create database "raw" in your local postgres and insert sample dump from

/tests/fixtures/pokhara.sql
psql -U postgres -h localhost raw < pokhara.sql

Put your credentials on Rawdata block

[DB]
PGHOST=localhost
PGUSER=postgres
PGPASSWORD=admin
PGDATABASE=raw
PGPORT=5432

Setup Oauth for Authentication

Login to OSM , Click on My Settings and register your local galaxy app to Oauth2applications

image

Check on read user preferences and Enter redirect URI as following

http://127.0.0.1:8000/v1/auth/callback/

Grab Client ID and Client Secret and put it inside config.txt as OAUTH Block , you can generate secret key for your application by yourself

[OAUTH]
OSM_CLIENT_ID= your client id
OSM_CLIENT_SECRET= your client secret
OSM_URL=https://www.openstreetmap.org
OSM_PERMISSION_SCOPE=read_prefs
LOGIN_REDIRECT_URI=http://127.0.0.1:8000/v1/auth/callback/
APP_SECRET_KEY=your generated secret key

Configure celery and redis

API uses Celery 5 and Redis 6 for task queue management , Currently implemented for Rawdata endpoint. 6379 is the default port . if you are running redis on same machine your broker could be redis://localhost:6379/. You can change the port according to your configuration for the current docker compose use following

[CELERY]
CELERY_BROKER_URL=redis://redis:6379/0
CELERY_RESULT_BACKEND=redis://redis:6379/0

Finalizing config.txt

Insert your config blocks with the database credentials where you have underpass ,insight and rawdata in your database along with oauth block

Summary of command :

Considering You have PSQL-POSTGIS setup with user postgres host localhost on port 5432 as password admin

  export PGPASSWORD='admin';
  psql -U postgres -h localhost -p 5432 -c "CREATE DATABASE raw;"

  cd tests/fixtures/
  psql -U postgres -h localhost -p 5432 raw  < pokhara.sql

Your config.txt will look like this

[DB]
PGHOST=localhost
PGUSER=postgres
PGPASSWORD=admin
PGDATABASE=raw
PGPORT=5432

[OAUTH]
OSM_CLIENT_ID= your client id
OSM_CLIENT_SECRET= your client secret
OSM_URL=https://www.openstreetmap.org
OSM_PERMISSION_SCOPE=read_prefs
LOGIN_REDIRECT_URI=http://127.0.0.1:8000/v1/auth/callback/
APP_SECRET_KEY=jnfdsjkfndsjkfnsdkjfnskfn

[API_CONFIG]
LOG_LEVEL=debug
RATE_LIMITER_STORAGE_URI=redis://redis:6379

[CELERY]
CELERY_BROKER_URL=redis://redis:6379/0
CELERY_RESULT_BACKEND=redis://redis:6379/0

Tips : Follow .github/workflows/unit-test If you have any confusion on implementation of config file .

Optional Configuration [ You can skip this part for basic installation ]

You can further customize API if you wish with API_CONFIG Block

[API_CONFIG]
EXPORT_PATH=exports # used to store export path
EXPORT_MAX_AREA_SQKM=100000 # max area to support for rawdata input
USE_CONNECTION_POOLING=True # default it will not use connection pooling but you can configure api to use to for psycopg2 connections
LOG_LEVEL=info #options are info,debug,warning,error
ALLOW_BIND_ZIP_FILTER=true # option to configure export output zipped/unzipped Default all output will be zipped
RATE_LIMITER_STORAGE_URI=redis://localhost:6379 # API uses redis as backend for rate limiting
INDEX_THRESHOLD=5000 # value in sqkm to apply grid/country index filter
RATE_LIMIT_PER_MIN=5 # no of requests per minute - default is 5 requests per minute

Based on your requirement you can also customize rawdata exports parameter using EXPORT_UPLOAD block

[EXPORT_UPLOAD]
FILE_UPLOAD_METHOD=disk # options are s3,disk , default disk
AWS_ACCESS_KEY_ID= your id
AWS_SECRET_ACCESS_KEY= yourkey
BUCKET_NAME= your bucket name

Sentry Config :

[SENTRY]
SENTRY_DSN=
SENTRY_RATE=

You can export config variables without block as system env variables too