
Self Hosted Chat Times Out VSCode #250

Open

stratus-ss opened this issue Dec 21, 2023 · 0 comments

Comments

@stratus-ss

LOGS: watchdog_20231221.log

I am running the containerized version of the Refact self-hosted server. When trying to use one of the models for chat (such as wizardlm), the chat always times out.

I have a system that has 2 Quadro P6000s:

[screenshot: GPU configuration]

The models load and I can see that inference is happening. However, VSCode times out before the processing has completed:

[screenshot: vscode_error]

When I view nvtop during this time, I can see that processing takes a while, but that should be acceptable for a self-hosted model:

[screenshot: nvtop]

I would expect this timeout to be configurable, or at least not so short.

The file in question is app.py: 134 lines of Python code (including comments). I understand I am not running this on an A100 card, so I expect responses to be slower. How can I reasonably interact with the chatbot in the self-hosted model?

When I uncheck the "use app.py" option and instead submit only a code snippet, reducing the amount of code to 48 lines, the VSCode extension still times out.
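To separate server-side inference latency from the extension's client-side timeout, one way to diagnose this is to time an HTTP request against the inference endpoint directly with a generous timeout. The sketch below simulates the situation with a deliberately slow local server; the handler, delay value, and URL are all illustrative stand-ins, not the real Refact API:

```python
import http.server
import threading
import time
import urllib.request


class SlowHandler(http.server.BaseHTTPRequestHandler):
    """Responds after a fixed delay, mimicking slow model inference."""

    delay = 2.0  # illustrative: seconds of simulated inference time

    def do_GET(self):
        time.sleep(self.delay)
        try:
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"ok")
        except OSError:
            pass  # client may have already timed out and disconnected

    def log_message(self, *args):
        pass  # keep request logging quiet


def timed_get(url, timeout):
    """Return (elapsed_seconds, completed) for a GET with a client timeout."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
        return time.monotonic() - start, True
    except OSError:  # URLError and socket timeouts both subclass OSError
        return time.monotonic() - start, False


def measure(timeout):
    """Spin up the slow server on a free port and time one request."""
    server = http.server.HTTPServer(("127.0.0.1", 0), SlowHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return timed_get(f"http://127.0.0.1:{server.server_port}/", timeout)
    finally:
        server.shutdown()
        server.server_close()
```

With a client timeout longer than the server's processing time the request completes; with a short one (as the extension apparently uses) it fails even though the server would eventually answer, which matches the behaviour described above.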
