JEG 3.2.3 on K3s v1.28.5 unable to start kernels. Same actions with helm chart 3.2.2 pyspark kubernetes work. #1379

Open
paf91 opened this issue Apr 2, 2024 · 6 comments

paf91 commented Apr 2, 2024

Description

Whenever I start a Kubernetes-based kernel on version 3.2.3, I get this error:

Error Starting Kernel
HTTP 500: Internal Server Error (Error from Gateway: [Error occurred creating role binding for namespace 'guest-fab3e59b-edbb-4e1d-912e-087b1798425b': module 'kubernetes.client' has no attribute 'V1Subject'] Error occurred creating role binding for namespace 'guest-fab3e59b-edbb-4e1d-912e-087b1798425b': module 'kubernetes.client' has no attribute 'V1Subject'. Ensure gateway url is valid and the Gateway instance is running.)

Reproduce

values.yaml:

service:
  type: "LoadBalancer"
  # Master public IP on which to expose EG.
  k8sMasterPublicIP: '<redacted, private ip like 10.x.x.x>'
  ports:
    - name: "http"
      port: 8888
      targetPort: 8888
    - name: "http-response"
      port: 8877
      targetPort: 8877
ingress:
  enabled: false
kernel:
  shareGatewayNamespace: false
  allowedKernels:
    - r_kubernetes
    - python_kubernetes
    - python_tf_kubernetes
    - python_tf_gpu_kubernetes
    - scala_kubernetes
    - spark_r_kubernetes
    - spark_python_kubernetes
    - spark_scala_kubernetes
    - spark_python_operator
    - python3
  defaultKernelName: python_kubernetes
kip:
  enabled: true
  serviceAccountName: 'kernel-image-puller-sa'
  criSocket: /run/containerd/containerd.sock

helm upgrade --install enterprise-gateway https://github.com/jupyter-server/enterprise_gateway/releases/download/v3.2.3/jupyter_enterprise_gateway_helm-3.2.3.tar.gz --namespace enterprise-gateway -f ~/jupyter/gateway/values-balancer.yaml

kubectl get pods -n enterprise-gateway:
(screenshot omitted)

Try to run:

`curl -X POST -i 'http://<redacted_private_ip>:8888/api/kernels' --data '{ "name": "spark_python_kubernetes", "env": { "KERNEL_USERNAME": "jovyan" }}'`

Response:

{"reason": "Error occurred creating role binding for namespace 'jovyan-99878812-28c5-49ad-8cbc-cb81713e7ba3': module 'kubernetes.client' has no attribute 'V1Subject'", "message": ""}

Enterprise gateway logs:

kubectl logs -n enterprise-gateway enterprise-gateway-cfbb54797-7dph8

[D 2024-04-02 23:17:01.296 EnterpriseGatewayApp] RemoteMappingKernelManager.start_kernel: spark_python_kubernetes, kernel_username: jovyan
[D 2024-04-02 23:17:01.298 EnterpriseGatewayApp] Instantiating kernel 'Spark - Python (Kubernetes Mode)' with process proxy: enterprise_gateway.services.processproxies.k8s.KubernetesProcessProxy
[D 2024-04-02 23:17:01.299 EnterpriseGatewayApp] Starting kernel (async): ['/usr/local/share/jupyter/kernels/spark_python_kubernetes/bin/run.sh', '--RemoteProcessProxy.kernel-id', '<redacted>', '--RemoteProcessProxy.port-range', '0..0', '--RemoteProcessProxy.response-address', '<redacted>:8877', '--RemoteProcessProxy.public-key', '<redacted>', '--RemoteProcessProxy.spark-context-initialization-mode', 'lazy']
[D 2024-04-02 23:17:01.299 EnterpriseGatewayApp] Launching kernel: 'Spark - Python (Kubernetes Mode)' with command: ['/usr/local/share/jupyter/kernels/spark_python_kubernetes/bin/run.sh', '--RemoteProcessProxy.kernel-id', '<redacted>', '--RemoteProcessProxy.port-range', '0..0', '--RemoteProcessProxy.response-address', '<redacted>:8877', '--RemoteProcessProxy.public-key', '<redacted>', '--RemoteProcessProxy.spark-context-initialization-mode', 'lazy']
[I 2024-04-02 23:17:01.336 EnterpriseGatewayApp] Created kernel namespace: jovyan-99878812-28c5-49ad-8cbc-cb81713e7ba3
[W 2024-04-02 23:17:01.367 EnterpriseGatewayApp] Deleted kernel namespace: jovyan-99878812-28c5-49ad-8cbc-cb81713e7ba3
[E 2024-04-02 23:17:01.367 EnterpriseGatewayApp] Error occurred creating role binding for namespace 'jovyan-99878812-28c5-49ad-8cbc-cb81713e7ba3': module 'kubernetes.client' has no attribute 'V1Subject'
[E 240402 23:17:01 web:2271] 500 POST /api/kernels (<redacted>) 83.95ms

Expected behavior

Kernel starts

Context

JEG 3.2.3 on K3s v1.28.5

@paf91 paf91 added the bug label Apr 2, 2024

welcome bot commented Apr 2, 2024

Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! 🤗

If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template, as it helps other community members contribute more effectively.
You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! 👋

Welcome to the Jupyter community! 🎉

@paf91 paf91 changed the title JEG 3.2.3 on K3s v1.28.5 unable to start kernels. 3.2.2 pyspark worked. JEG 3.2.3 on K3s v1.28.5 unable to start kernels. Same actions with helm chart 3.2.2 pyspark kubernetes work. Apr 2, 2024

paf91 commented Apr 4, 2024

Okay, so the culprit is the updated kubernetes pip package. EG 3.2.2 ships version 26.1.0 of the kubernetes Python client, while EG 3.2.3 ships 29.0.0:

>>> import kubernetes; print(kubernetes.__version__)
29.0.0

I hope I will find time to try to fix this.

>>> name='spark'
>>> namespace='default'
>>> from kubernetes import client
>>> client.V1Subject(
...             api_group="", kind="ServiceAccount", name=service_account_name, namespace=namespace
...         )
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'kubernetes.client' has no attribute 'V1Subject'. Did you mean: 'RbacV1Subject'?
>>>

So this code can't execute:

https://github.com/jupyter-server/enterprise_gateway/blame/d01e84a2457d44d14bd6bd3335307b9d0e3b483d/enterprise_gateway/services/processproxies/k8s.py#L352
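One way the gateway code could tolerate both client versions is to look the class up by name instead of hard-coding `client.V1Subject`. The following is only a minimal sketch, not the project's fix: the helper names `resolve_subject_class` and `build_service_account_subject` are hypothetical, and it assumes `RbacV1Subject` accepts the same keyword arguments as the old `V1Subject`, which the error hint suggests but the thread does not confirm. The stand-in modules use `dict` in place of the real model classes so the snippet runs without the kubernetes package installed.

```python
from types import SimpleNamespace


def resolve_subject_class(client_module):
    """Return the RBAC subject class from a kubernetes.client-like module.

    Tries RbacV1Subject first (the name hinted at by the 29.0.0 error
    message), then falls back to V1Subject (the name the gateway code uses).
    """
    for class_name in ("RbacV1Subject", "V1Subject"):
        subject_class = getattr(client_module, class_name, None)
        if subject_class is not None:
            return subject_class
    raise AttributeError(
        "client module exposes neither 'RbacV1Subject' nor 'V1Subject'"
    )


def build_service_account_subject(client_module, service_account_name, namespace):
    """Build the role-binding subject with whichever class is available.

    Assumption: both classes take api_group, kind, name and namespace.
    """
    subject_class = resolve_subject_class(client_module)
    return subject_class(
        api_group="",
        kind="ServiceAccount",
        name=service_account_name,
        namespace=namespace,
    )


# Stand-ins for the two client versions; dict plays the role of the model class.
old_client = SimpleNamespace(V1Subject=dict)
new_client = SimpleNamespace(RbacV1Subject=dict)

subject_from_old = build_service_account_subject(old_client, "spark", "default")
subject_from_new = build_service_account_subject(new_client, "spark", "default")
```

With the real library this would be called as `build_service_account_subject(kubernetes.client, "spark", "default")`.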


lresende commented Apr 4, 2024

Could this be related to the new Kubernetes client version, where these names have changed?


paf91 commented Apr 5, 2024

@lresende see my reply above


lresende commented Apr 5, 2024

So I would say we should cap the kubernetes client version for now.
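The cap itself would live in the package's dependency metadata, but a runtime guard can catch an image built with an uncapped client. This is a hedged sketch only: the function names are hypothetical, and the bound of 29 is an assumption drawn from this thread, which establishes only that 26.1.0 works and 29.0.0 fails. Versions in between are untested here.

```python
def parse_major(version_string):
    """Return the leading integer of a version string such as '29.0.0'."""
    return int(version_string.split(".", 1)[0])


def client_version_is_capped(version_string, first_bad_major=29):
    """True when the major version is below the first known-bad major.

    first_bad_major=29 is an assumption based on the versions reported
    in this issue, not a verified compatibility boundary.
    """
    return parse_major(version_string) < first_bad_major


# The two versions reported in this issue.
works_on_eg_322 = client_version_is_capped("26.1.0")
works_on_eg_323 = client_version_is_capped("29.0.0")
```

In a real deployment the argument would be `kubernetes.__version__`, as printed in the comment above.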


merqri commented Jun 3, 2024

@paf91
Greetings, I think the easiest and fastest way is to build a new image as I mentioned in #1382 (comment).
