
Self Hosted Chat Times Out VSCode #250

Open

stratus-ss opened this issue Dec 21, 2023 · 0 comments

Comments

@stratus-ss

LOGS: watchdog_20231221.log

I am running the containerized version of the Refact self-hosted server. When trying to use one of the models for chat (such as wizardlm), the chat always times out.

I have a system that has 2 Quadro P6000s:

[screenshot: GPU configuration]

The models load and I can see that inference is happening. However, VSCode times out before the processing has completed:

[screenshot: vscode_error]

When I view nvtop during this time, I can see that processing takes a while, but that should be acceptable for a self-hosted model:

[screenshot: nvtop]

I would expect this timeout to be configurable, or at least not so short.

The file in question is app.py: 134 lines of Python code (including comments). I understand I am not running this on an A100 card, so I expect responses to be slower. How can I reasonably interact with the chatbot in the self-hosted model?

When I uncheck the "use app.py" option and instead submit only a code snippet, reducing the amount of code to 48 lines, the VSCode extension still times out.
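To separate server-side inference latency from the extension's client-side timeout, one way to diagnose this is to time an HTTP request against the inference endpoint directly with a generous timeout. The sketch below simulates the situation with a deliberately slow local server; the handler, delay value, and URL are all illustrative stand-ins, not the real Refact API:

```python
import http.server
import threading
import time
import urllib.request


class SlowHandler(http.server.BaseHTTPRequestHandler):
    """Responds after a fixed delay, mimicking slow model inference."""

    delay = 2.0  # illustrative: seconds of simulated inference time

    def do_GET(self):
        time.sleep(self.delay)
        try:
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"ok")
        except OSError:
            pass  # client may have already timed out and disconnected

    def log_message(self, *args):
        pass  # keep request logging quiet


def timed_get(url, timeout):
    """Return (elapsed_seconds, completed) for a GET with a client timeout."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
        return time.monotonic() - start, True
    except OSError:  # URLError and socket timeouts both subclass OSError
        return time.monotonic() - start, False


def measure(timeout):
    """Spin up the slow server on a free port and time one request."""
    server = http.server.HTTPServer(("127.0.0.1", 0), SlowHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return timed_get(f"http://127.0.0.1:{server.server_port}/", timeout)
    finally:
        server.shutdown()
        server.server_close()
```

With a client timeout longer than the server's processing time the request completes; with a short one (as the extension apparently uses) it fails even though the server would eventually answer, which matches the behaviour described above.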
