Failed to open X display when running in docker #1285

Open
YoelRidgway opened this issue Jun 18, 2024 · 12 comments

Comments

@YoelRidgway

YoelRidgway commented Jun 18, 2024

Hello there. I'm using tileserver-gl v4.11.1 and I'm getting the classic:

terminate called after throwing an instance of 'std::runtime_error'
   what(): Failed to open X display.

Normally this happens when the X display server is not running, but I'm running in Docker. Are there any suggestions on how I can investigate this issue? I'm not sure where to start. It happened a few times on my local machine, where restarting would do the trick, but now that it is hosted on a remote server it happens every time.

@nathanpackard

I'm having the same problem.

@acalcutt
Collaborator

acalcutt commented Jul 4, 2024

Does it work for either of you with previous versions? What is the last version that worked for you?

@nathanpackard

nathanpackard commented Jul 4, 2024

I am actually running tileserver-gl v4.4.10 and get this error.

I am new to the project I'm working on so can't comment on the last version that worked.

For me, it mostly doesn't work. However, every once in a while I'll try again and it works (where I didn't change anything). It is sort of hit and miss when it works.

It happens when I run: docker-compose up
When it works, I get:
tileservergl | Starting tileserver-gl v4.4.10
tileservergl | Using specified config file from config.json
tileservergl | Starting server
tileservergl | Listening at http://[::]:8080/
tileservergl | Style "dark_offline" changed, updating...
tileservergl | Style "dark_online" changed, updating...
tileservergl | Style "dark_terrain_offline" changed, updating...
tileservergl | Style "dark_terrain_online" changed, updating...
tileservergl | Style "legacy" changed, updating...
tileservergl | Startup complete
tileservergl | GET /health 200 2 - 2.156 ms

When it doesn't work, I get:
tileservergl | Starting tileserver-gl v4.4.10
tileservergl | Using specified config file from config.json
tileservergl | Starting server
tileservergl | Listening at http://[::]:8080/
tileservergl | Style "dark_offline" changed, updating...
tileservergl | Style "dark_online" changed, updating...
tileservergl | Style "dark_terrain_offline" changed, updating...
tileservergl | Style "dark_terrain_online" changed, updating...
tileservergl | terminate called after throwing an instance of 'std::runtime_error'
tileservergl | what(): Failed to open X display.
tileservergl exited with code 0

@acalcutt
Collaborator

acalcutt commented Jul 4, 2024

The Docker image uses Xvfb to provide the X display, so the only thing I can think of is that your CPU doesn't meet the requirements for it to emulate OpenGL. What CPU and OS are you running on?

I ran into something like this running directly on Windows 2022 on a virtual server, since it didn't support OpenGL. There I had to force it to use Mesa3D, which is an emulated OpenGL similar to Xvfb.

The crash is likely happening when you visit the index page, where it needs to render thumbnails, or when loading rendered tiles, since this is when maplibre-native needs the X display to render.
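
One hedged way to start investigating, assuming a POSIX shell is available inside the container: check whether DISPLAY is set and whether the X server lock file for that display exists. The helper name check_x_env is made up for illustration, and a lock file can be stale, so treat the result as a hint rather than proof that Xvfb is running.

```shell
# Hypothetical helper (not part of tileserver-gl): report whether the
# environment looks ready for X rendering.
# Usage: check_x_env [display]   (defaults to $DISPLAY)
check_x_env() {
  display="${1:-${DISPLAY:-}}"
  if [ -z "$display" ]; then
    echo "DISPLAY is not set"
    return 1
  fi
  # Extract the display number: ":99" -> "99", "host:99.0" -> "99"
  num="${display#*:}"
  num="${num%%.*}"
  # X servers create /tmp/.X<n>-lock while they own display <n>
  lock="/tmp/.X${num}-lock"
  if [ -e "$lock" ]; then
    echo "lock file $lock present"
    return 0
  fi
  echo "no lock file for $display"
  return 1
}
```

Running it in a shell inside the container (for example via docker exec) with no arguments checks the display the server process would inherit.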

@acalcutt
Collaborator

acalcutt commented Jul 4, 2024

We have found that when using Xvfb in maplibre-native CI workflows, it does fail with that error sometimes. It seemed to be a known Xvfb issue.

@nathanpackard

Yeah, it sounds like a resource issue. However, I have lots of resources. Here is my system summary:
image

Also, my docker-compose.yml file has a lot of resources assigned:
mem_limit: 24G
cpus: '8.0'
deploy:
  resources:
    limits:
      cpus: '8.0'
      memory: 24G
    reservations:
      cpus: '8.0'
      memory: 24G

@acalcutt
Collaborator

acalcutt commented Jul 6, 2024

Are you able to test swapping these two packages around to see if it makes a difference?
https://github.com/maptiler/tileserver-gl/blob/master/src/serve_rendered.js#L5-L11

Edit: I guess this is a slightly different error, so probably not it.

@asukachiharu

We have found that when using xvfb in maplibre-native ci workflows, it does fail with that error sometimes. it seemed to be a known xvfb issue

I found that as long as there is a style.json file in the folder, starting the Docker container will result in this error. However, this issue almost never occurs on Windows.

Specifically, this error does not appear locally. However, when previewing the raster, an error occurs: [error: failed to parse json: the document is empty. at offset 0] /GET xxxxx/xx512/0/0/0.png 500.

@docuracy

docuracy commented Jan 1, 2025

I guess this is the same problem:

2024-12-31T22:47:14.607Z | [CI] Failed to open X display, retrying...
... (20 of these retry notifications in total) ...
2024-12-31T22:47:24.115Z | [CI] Failed to open X display, retrying...
2024-12-31T22:47:24.615Z | terminate called after throwing an instance of 'std::runtime_error'
2024-12-31T22:47:24.615Z |   what():  Failed to open X display.

It's triggered by any call for a static map, for example:

curl -I "http://localhost:30080/styles/elevation/static/9.051,48.228,10/1x1.png"

I'm running the Docker v5.0.0 image in Kubernetes (which includes Xvfb, intended, as I understand it, to run as a daemon when required), with these resources:

    requests:
      memory: "2Gi"
      cpu: "2"
    limits:
      memory: "4Gi"
      cpu: "4"

Any suggestions, please?

@mloskot
Contributor

mloskot commented Jan 27, 2025

I'm experiencing the same issue as @docuracy in #1285 (comment) when running the latest maptiler/tileserver-gl:v5.1.3 container on an AKS cluster.
Preview thumbnails do not load, and an attempt to access a thumbnail URL directly crashes TileServer-GL and terminates the container.

I run the container with the following configuration:

containers:
- name: tileserver-gl
  image: maptiler/tileserver-gl:v5.1.3
  command:
  - node
  args:
  - /usr/src/app
  - "--verbose"
  - "--public_url"
  - "https://svc.example.com/test/mbtiles/"

The container has Xvfb included, but this logic seems to be skipped in that case, doesn't it?

if ! which -- "${1}"; then
  # first arg is not an executable
  if [ -e /tmp/.X99-lock ]; then rm /tmp/.X99-lock -f; fi
  export DISPLAY=:99
  Xvfb "${DISPLAY}" -nolisten unix &
  exec node /usr/src/app/ "$@"
fi

@mloskot
Contributor

mloskot commented Jan 27, 2025

Fixed

This is a quick follow-up to my previous #1285 (comment)

I think I have fixed or rather worked around the issue:

I removed the explicit execution of node (compare with my previous YAML snippet above):

containers:
- name: tileserver-gl
  image: maptiler/tileserver-gl:v5.1.3
  args:
  - "--verbose"
  - "--public_url"
  - "https://svc.example.com/test/mbtiles/"

so that this conditional logic is triggered and Xvfb is started:

if ! which -- "${1}"; then
  # first arg is not an executable
  if [ -e /tmp/.X99-lock ]; then rm /tmp/.X99-lock -f; fi
  export DISPLAY=:99
  Xvfb "${DISPLAY}" -nolisten unix &
  exec node /usr/src/app/ "$@"
fi
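
The branch above hinges on whether the first argument resolves to an executable. Here is a minimal sketch of that decision, assuming a POSIX shell and substituting `command -v` for `which` (so, unlike `which`, it also matches shell builtins and functions). The name starts_xvfb is hypothetical and not part of the image.

```shell
# Sketch only: mirrors the entrypoint's branch test, using `command -v`
# in place of `which`. starts_xvfb is an illustrative name.
starts_xvfb() {
  if ! command -v -- "$1" >/dev/null 2>&1; then
    echo "yes"   # first arg is not an executable: the Xvfb branch would run
  else
    echo "no"    # first arg is an executable: the Xvfb branch is skipped
  fi
}

starts_xvfb --verbose   # prints "yes"
starts_xvfb sh          # prints "no"
```

Note that in a Kubernetes pod spec, `command` replaces the image's entrypoint outright, so with `command: node` the script does not run at all; either way, Xvfb is never started.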

Once that tweak was deployed, I opened a shell in the TileServer-GL container and ran ps aux to verify that Xvfb :99 -nolisten unix is indeed running.

Finally, the TileServer-GL front page shows the preview thumbnail

Image

and no more crashes logged by the server

Image

Workaround

Above, I referred to the solution as a workaround because, I think, the container entrypoint could be improved to make it harder for users to trip over explicit execution of node :) The container could allow passing values for all the command-line options via env vars, or those could be made configurable in config.json; there are lots of ways. I'm happy to propose a PR, but I'd like to hear the developers' preferences first.
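
To make the env var idea concrete, here is one possible shape, purely as a sketch: a helper that an entrypoint could use to turn environment variables into command-line options. The variable names TILESERVER_VERBOSE and TILESERVER_PUBLIC_URL and the function name build_args are invented for illustration; only the --verbose and --public_url flags come from the snippets above.

```shell
# Hypothetical sketch: TILESERVER_VERBOSE, TILESERVER_PUBLIC_URL and
# build_args are invented names, not options the image supports today.
# Prints one command-line argument per line.
build_args() {
  set --   # start with an empty argument list (local to this function)
  if [ "${TILESERVER_VERBOSE:-}" = "1" ]; then
    set -- "$@" --verbose
  fi
  if [ -n "${TILESERVER_PUBLIC_URL:-}" ]; then
    set -- "$@" --public_url "$TILESERVER_PUBLIC_URL"
  fi
  if [ "$#" -gt 0 ]; then
    printf '%s\n' "$@"
  fi
}
```

With something like this, users would set variables in the pod spec instead of overriding the command, so the Xvfb startup in the entrypoint would stay on the default path.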
