add certificate doc #326
Signed-off-by: jooho <[email protected]>
✅ Deploy Preview for elastic-nobel-0aef7a ready!
mkdocs.yml
Outdated
@@ -88,6 +88,8 @@ nav:
- Inference Observability:
- Prometheus Metrics: modelserving/observability/prometheus_metrics.md
- Grafana Dashboards: modelserving/observability/grafana_dashboards.md
- Certiricate:
@Jooho Can we add this under the Model Storage section?
Also, there is a typo (`Certiricate`).
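For illustration only, the two review suggestions above (moving the entry under Model Storage and fixing the spelling) might look roughly like the fragment below. This is a sketch under assumptions: the nesting, the ordering of sections, and the certificate page path are placeholders, not taken from the repository.

```yaml
# Hypothetical mkdocs.yml nav fragment; not the actual file contents.
nav:
  - Model Storage:
      # Entry name with the spelling corrected; the .md path is an assumed placeholder.
      - Certificate: modelserving/certificate/example.md
  - Inference Observability:
      - Prometheus Metrics: modelserving/observability/prometheus_metrics.md
      - Grafana Dashboards: modelserving/observability/grafana_dashboards.md
```

Running `mkdocs serve` locally after such a change would show whether the new entry renders under the intended section.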
Signed-off-by: Dan Sun <[email protected]>
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: Jooho, yuzisun
* add certificate doc
  Signed-off-by: jooho <[email protected]>
* Update mkdocs.yml
  Signed-off-by: Dan Sun <[email protected]>
---------
Signed-off-by: jooho <[email protected]>
Signed-off-by: Dan Sun <[email protected]>
Co-authored-by: Dan Sun <[email protected]>
author Dan Sun <[email protected]> 1698039744 -0400 committer agriffith50 <[email protected]> 1716219052 -0400 parent 2257489 author Dan Sun <[email protected]> 1698039744 -0400 committer agriffith50 <[email protected]> 1716218313 -0400 parent 2257489 author Dan Sun <[email protected]> 1698039744 -0400 committer agriffith50 <[email protected]> 1716217744 -0400 Add TorchServe Huggingface accelerate example (kserve#304) * Add LLM example for huggingface accelerate Signed-off-by: Dan Sun <[email protected]> * Add inputs Signed-off-by: Dan Sun <[email protected]> * Update storage uri Signed-off-by: Dan Sun <[email protected]> * Add to LLM runtime to index Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: Dan Sun <[email protected]> 0.11 release blog (kserve#310) * Add 0.11 release blog Signed-off-by: Dan Sun <[email protected]> * Update blog Signed-off-by: Dan Sun <[email protected]> * Add vllm runtime doc Signed-off-by: Dan Sun <[email protected]> * Add vllm example doc Signed-off-by: Dan Sun <[email protected]> * Update blog link Signed-off-by: Dan Sun <[email protected]> * Add vLLM intro Signed-off-by: Dan Sun <[email protected]> * add python runtime open inference protocol tutorials Signed-off-by: Dan Sun <[email protected]> * Fix warning Signed-off-by: Dan Sun <[email protected]> * Add warning Signed-off-by: Dan Sun <[email protected]> * Address comments Signed-off-by: Dan Sun <[email protected]> * Fix newline Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: Dan Sun <[email protected]> Fix torchserve llm example link Signed-off-by: Dan Sun <[email protected]> Fixed formatting in get_started (kserve#319) Signed-off-by: Helber Belmiro <[email protected]> clarify prometheus annotation (kserve#316) Signed-off-by: JuHyung-Son <[email protected]> Document servingruntime constraint introduced by kserve/kserve#3181 (kserve#320) * Document serving runtime constraint introduced by kserve/kserve#3181 Signed-off-by: Sivanantham Chinnaiyan 
<[email protected]> * Set content type for predict/explainer curl requests Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Update docs/modelserving/servingruntimes.md Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: Sivanantham Chinnaiyan <[email protected]> Signed-off-by: Dan Sun <[email protected]> Co-authored-by: Dan Sun <[email protected]> Add kubeflow summit 2023 Jooho's presentation link (kserve#325) add kubeflow summit 2023 Jooho's presentation link Signed-off-by: jooho <[email protected]> docs: Add one related presentations from Kubeflow Summit 2023 (kserve#327) * docs: Add two new related presentations from Kubeflow Summit 2023Update presentations.md Signed-off-by: Yuan Tang <[email protected]> * Update presentations.md Signed-off-by: Yuan Tang <[email protected]> --------- Signed-off-by: Yuan Tang <[email protected]> Added example for torchserve grpc v1 and v2. (kserve#307) * Added example for torchserve grpc v1 and v2. Signed-off-by: Andrews Arokiam <[email protected]> * Schema order changed. Signed-off-by: Andrews Arokiam <[email protected]> * corrected v2 REST input. Signed-off-by: Andrews Arokiam <[email protected]> * Updated grpc-v2 protocolVersion. 
Signed-off-by: Andrews Arokiam <[email protected]> * Update README.md * Update README.md * Update README.md --------- Signed-off-by: Andrews Arokiam <[email protected]> Co-authored-by: Dan Sun <[email protected]> Add link to release process doc in developer.md (kserve#330) Signed-off-by: Yuan Tang <[email protected]> Update tranformer collocation docs for specifying storage uri (kserve#323) Signed-off-by: Sivanantham Chinnaiyan <[email protected]> Fix incorrect edit URL to docs (kserve#329) Signed-off-by: Yuan Tang <[email protected]> Set resources for inferencegraph example (kserve#322) Signed-off-by: Sivanantham Chinnaiyan <[email protected]> Fixes kserve#331 - broken link to AMD Inference Server (kserve#332) Tested locally with mkdocs serve Render KServe Python Runtime API doc with mkdoc (kserve#333) * Update KServe python sdk docs Signed-off-by: Dan Sun <[email protected]> * Update serving runtime doc Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: Dan Sun <[email protected]> Fix build: Install kserve for rendering the docstring (kserve#334) * Update KServe python sdk docs Signed-off-by: Dan Sun <[email protected]> * Install kserve sdk for mkdocstring Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: Dan Sun <[email protected]> Onnx docs update (kserve#275) * Updated Onnx example. Signed-off-by: Andrews Arokiam <[email protected]> * Reverting sklearn doc update as there is a separate PR Signed-off-by: andyi2it <[email protected]> * Added new schema in onnx example. Signed-off-by: Andrews Arokiam <[email protected]> * protocolVersion and old schema updated with onnx example. Signed-off-by: Andrews Arokiam <[email protected]> --------- Signed-off-by: Andrews Arokiam <[email protected]> Signed-off-by: andyi2it <[email protected]> Standardized schema order (kserve#318) * Standardized schema's order. 
Signed-off-by: Andrews Arokiam <[email protected]> * Fix v2 spec for torch serve --------- Signed-off-by: Andrews Arokiam <[email protected]> Signed-off-by: Dan Sun <[email protected]> Co-authored-by: Dan Sun <[email protected]> Update link to Slack instructions Signed-off-by: Yuan (Terry) Tang <[email protected]> Update README.md (kserve#344) Fix incorrect storage uri prefix Signed-off-by: zoramt <[email protected]> Added steps to delete model-store-pod (kserve#343) Signed-off-by: murata.yu <[email protected]> Update README.md Signed-off-by: Dan Sun <[email protected]> Add documentation for modelcars (kserve#337) * Add documentation for modelcars, introduced in 0.12 as experimental feature Signed-off-by: Roland Huß <[email protected]> * added some references to this feature Signed-off-by: Roland Huß <[email protected]> --------- Signed-off-by: Roland Huß <[email protected]> add certificate doc (kserve#326) * add certificate doc Signed-off-by: jooho <[email protected]> * Update mkdocs.yml Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: jooho <[email protected]> Signed-off-by: Dan Sun <[email protected]> Co-authored-by: Dan Sun <[email protected]> docs: fix the emoji deprecation message and invalid file name (kserve#348) Signed-off-by: Peter Jausovec <[email protected]> Add documentation for GCS (kserve#351) * Add documentation for GCS Signed-off-by: tjandy98 <[email protected]> * Update mkdocs to include GCS Signed-off-by: tjandy98 <[email protected]> * Fix formatting Signed-off-by: tjandy98 <[email protected]> --------- Signed-off-by: tjandy98 <[email protected]> Add ModelRegistry custom storage intializer example (kserve#346) * Add ModelRegistry custom storage intializer example Signed-off-by: Andrea Lamparelli <[email protected]> * Update docs/modelserving/storage/storagecontainers.md Co-authored-by: Dan Sun <[email protected]> Signed-off-by: Andrea Lamparelli <[email protected]> --------- Signed-off-by: Andrea Lamparelli <[email protected]> 
Co-authored-by: Dan Sun <[email protected]> Updated docs for autoscaling on gpu. (kserve#328) Signed-off-by: Andrews Arokiam <[email protected]> Update version matrix for 0.12 (kserve#353) * Update version matrix for 0.12 Signed-off-by: Dan Sun <[email protected]> * Update kubernetes_deployment.md Signed-off-by: Dan Sun <[email protected]> * Update notes for gRPC issues Signed-off-by: Dan Sun <[email protected]> * Update kserve install Signed-off-by: Dan Sun <[email protected]> * Update kubernetes_deployment.md Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: Dan Sun <[email protected]> docs: update kserve resource yaml file (kserve#356) fix docs Signed-off-by: Niels ten Boom <[email protected]> Signed-off-by: agriffith50 <[email protected]> Update serving runtime version for 0.12 release and add some notes (kserve#354) * Fix few bugs, add quick install failure note and update docs for release 0.12.0 Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Add warning about control plane namespaces Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Resolve comments Signed-off-by: Sivanantham Chinnaiyan <[email protected]> --------- Signed-off-by: Sivanantham Chinnaiyan <[email protected]> Signed-off-by: agriffith50 <[email protected]> Add Helm installation commands in get started guide Signed-off-by: Yuan Tang <[email protected]> Signed-off-by: agriffith50 <[email protected]> Revert "Add Helm installation commands in get started guide" This reverts commit bc90c25. Signed-off-by: agriffith50 <[email protected]> Add Helm installation commands in get started guide (kserve#358) Signed-off-by: Yuan Tang <[email protected]> Signed-off-by: agriffith50 <[email protected]> Update README.md (kserve#359) Fix broken link to Ray doc on fractional GPU allocation. 
Signed-off-by: zoramt <[email protected]> Signed-off-by: agriffith50 <[email protected]> Add Huggingface Serving Runtime example with Llama2 (kserve#345) * Add Huggingface Serving Runtime example with Llama2 Signed-off-by: Gavrish Prabhu <[email protected]> * Fix examples Signed-off-by: Gavrish Prabhu <[email protected]> * Fix examples Signed-off-by: Gavrish Prabhu <[email protected]> * fix review comments Signed-off-by: Gavrish Prabhu <[email protected]> * add linking Signed-off-by: Gavrish Prabhu <[email protected]> * fix comments Signed-off-by: Gavrish Prabhu <[email protected]> * Update huggingface vllm runtime doc Signed-off-by: Dan Sun <[email protected]> * Update mkdocs.yml Signed-off-by: Dan Sun <[email protected]> * Update triton doc Signed-off-by: Dan Sun <[email protected]> * Fix Hugging Face Signed-off-by: Dan Sun <[email protected]> * fix newline Signed-off-by: Dan Sun <[email protected]> * fix newline Signed-off-by: Dan Sun <[email protected]> * fix newline Signed-off-by: Dan Sun <[email protected]> * fix Hugging Face Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: Gavrish Prabhu <[email protected]> Signed-off-by: Dan Sun <[email protected]> Co-authored-by: Dan Sun <[email protected]> Signed-off-by: agriffith50 <[email protected]> Update adopters.md (kserve#361) Signed-off-by: agriffith50 <[email protected]> Point users to vLLM production server (kserve#362) The vLLM teams states that the [`vllm.entrypoints.api_server`](https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/api_server.py#L2-L6) is just to demonstrates usage of their AsyncEngine, for production use they point users to `vllm.entrypoints.openai.api_server` instead. So, I think this should be the entrypoint used in the kServe documentation too, to avoid confusing new comers. 
Signed-off-by: Pierre Dulac <[email protected]> Signed-off-by: agriffith50 <[email protected]> initial draft of kserve release blog Signed-off-by: agriffith50 <[email protected]> change title Signed-off-by: agriffith50 <[email protected]> resolving comments Signed-off-by: agriffith50 <[email protected]> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Dan Sun <[email protected]> Signed-off-by: Alexa Griffith <[email protected]> Signed-off-by: agriffith50 <[email protected]> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Dan Sun <[email protected]> Signed-off-by: Alexa Griffith <[email protected]> Signed-off-by: agriffith50 <[email protected]> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Dan Sun <[email protected]> Signed-off-by: Alexa Griffith <[email protected]> Signed-off-by: agriffith50 <[email protected]> update comment Signed-off-by: agriffith50 <[email protected]> update for vllm comment Signed-off-by: agriffith50 <[email protected]> add more info about completions endpoints Signed-off-by: agriffith50 <[email protected]> add hf img Signed-off-by: agriffith50 <[email protected]> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Signed-off-by: Alexa Griffith <[email protected]> Signed-off-by: agriffith50 <[email protected]> Sample requests update in HuggingFace runtime with vLLM support (kserve#364) Update Sample requests for HF runtime Signed-off-by: Gavrish Prabhu <[email protected]> Signed-off-by: agriffith50 <[email protected]> add new kserve img Signed-off-by: agriffith50 <[email protected]> Update future plan and other changes Signed-off-by: agriffith50 <[email protected]> Add Huggingface Serving Runtime example with Llama2 (kserve#345) * Add Huggingface Serving Runtime example with Llama2 Signed-off-by: Gavrish Prabhu <[email protected]> * Fix examples Signed-off-by: Gavrish Prabhu <[email protected]> * Fix examples Signed-off-by: Gavrish Prabhu <[email 
protected]> * fix review comments Signed-off-by: Gavrish Prabhu <[email protected]> * add linking Signed-off-by: Gavrish Prabhu <[email protected]> * fix comments Signed-off-by: Gavrish Prabhu <[email protected]> * Update huggingface vllm runtime doc Signed-off-by: Dan Sun <[email protected]> * Update mkdocs.yml Signed-off-by: Dan Sun <[email protected]> * Update triton doc Signed-off-by: Dan Sun <[email protected]> * Fix Hugging Face Signed-off-by: Dan Sun <[email protected]> * fix newline Signed-off-by: Dan Sun <[email protected]> * fix newline Signed-off-by: Dan Sun <[email protected]> * fix newline Signed-off-by: Dan Sun <[email protected]> * fix Hugging Face Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: Gavrish Prabhu <[email protected]> Signed-off-by: Dan Sun <[email protected]> Co-authored-by: Dan Sun <[email protected]> Signed-off-by: agriffith50 <[email protected]> Sample requests update in HuggingFace runtime with vLLM support (kserve#364) Update Sample requests for HF runtime Signed-off-by: Gavrish Prabhu <[email protected]> Signed-off-by: agriffith50 <[email protected]> Update huggingface triton yaml Signed-off-by: Dan Sun <[email protected]> Signed-off-by: agriffith50 <[email protected]> Update blog link Signed-off-by: agriffith50 <[email protected]> Add triton huggingface reference Signed-off-by: agriffith50 <[email protected]> resolve merge Signed-off-by: agriffith50 <[email protected]> docs: update kserve resource yaml file (kserve#356) fix docs Signed-off-by: Niels ten Boom <[email protected]> Add Helm installation commands in get started guide Signed-off-by: Yuan Tang <[email protected]> Signed-off-by: agriffith50 <[email protected]> Revert "Add Helm installation commands in get started guide" This reverts commit bc90c25. 
Signed-off-by: agriffith50 <[email protected]> Add Helm installation commands in get started guide (kserve#358) Signed-off-by: Yuan Tang <[email protected]> Signed-off-by: agriffith50 <[email protected]> Update README.md (kserve#359) Fix broken link to Ray doc on fractional GPU allocation. Signed-off-by: zoramt <[email protected]> Signed-off-by: agriffith50 <[email protected]> Add Huggingface Serving Runtime example with Llama2 (kserve#345) * Add Huggingface Serving Runtime example with Llama2 Signed-off-by: Gavrish Prabhu <[email protected]> * Fix examples Signed-off-by: Gavrish Prabhu <[email protected]> * Fix examples Signed-off-by: Gavrish Prabhu <[email protected]> * fix review comments Signed-off-by: Gavrish Prabhu <[email protected]> * add linking Signed-off-by: Gavrish Prabhu <[email protected]> * fix comments Signed-off-by: Gavrish Prabhu <[email protected]> * Update huggingface vllm runtime doc Signed-off-by: Dan Sun <[email protected]> * Update mkdocs.yml Signed-off-by: Dan Sun <[email protected]> * Update triton doc Signed-off-by: Dan Sun <[email protected]> * Fix Hugging Face Signed-off-by: Dan Sun <[email protected]> * fix newline Signed-off-by: Dan Sun <[email protected]> * fix newline Signed-off-by: Dan Sun <[email protected]> * fix newline Signed-off-by: Dan Sun <[email protected]> * fix Hugging Face Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: Gavrish Prabhu <[email protected]> Signed-off-by: Dan Sun <[email protected]> Co-authored-by: Dan Sun <[email protected]> Signed-off-by: agriffith50 <[email protected]> Update adopters.md (kserve#361) Signed-off-by: agriffith50 <[email protected]> Point users to vLLM production server (kserve#362) The vLLM teams states that the [`vllm.entrypoints.api_server`](https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/api_server.py#L2-L6) is just to demonstrates usage of their AsyncEngine, for production use they point users to `vllm.entrypoints.openai.api_server` instead. 
So, I think this should be the entrypoint used in the kServe documentation too, to avoid confusing new comers. Signed-off-by: Pierre Dulac <[email protected]> Signed-off-by: agriffith50 <[email protected]> initial draft of kserve release blog Signed-off-by: agriffith50 <[email protected]> change title Signed-off-by: agriffith50 <[email protected]> resolving comments Signed-off-by: agriffith50 <[email protected]> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Dan Sun <[email protected]> Signed-off-by: Alexa Griffith <[email protected]> Signed-off-by: agriffith50 <[email protected]> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Dan Sun <[email protected]> Signed-off-by: Alexa Griffith <[email protected]> Signed-off-by: agriffith50 <[email protected]> update comment Signed-off-by: agriffith50 <[email protected]> update for vllm comment Signed-off-by: agriffith50 <[email protected]> add hf img Signed-off-by: agriffith50 <[email protected]> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Signed-off-by: Alexa Griffith <[email protected]> Signed-off-by: agriffith50 <[email protected]> Sample requests update in HuggingFace runtime with vLLM support (kserve#364) Update Sample requests for HF runtime Signed-off-by: Gavrish Prabhu <[email protected]> Signed-off-by: agriffith50 <[email protected]> add new kserve img Signed-off-by: agriffith50 <[email protected]> Update future plan and other changes Add Huggingface Serving Runtime example with Llama2 (kserve#345) * Add Huggingface Serving Runtime example with Llama2 Signed-off-by: Gavrish Prabhu <[email protected]> * Fix examples Signed-off-by: Gavrish Prabhu <[email protected]> * Fix examples Signed-off-by: Gavrish Prabhu <[email protected]> * fix review comments Signed-off-by: Gavrish Prabhu <[email protected]> * add linking Signed-off-by: Gavrish Prabhu <[email protected]> * fix comments Signed-off-by: Gavrish Prabhu <[email protected]> * Update 
huggingface vllm runtime doc Signed-off-by: Dan Sun <[email protected]> * Update mkdocs.yml Signed-off-by: Dan Sun <[email protected]> * Update triton doc Signed-off-by: Dan Sun <[email protected]> * Fix Hugging Face Signed-off-by: Dan Sun <[email protected]> * fix newline Signed-off-by: Dan Sun <[email protected]> * fix newline Signed-off-by: Dan Sun <[email protected]> * fix newline Signed-off-by: Dan Sun <[email protected]> * fix Hugging Face Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: Gavrish Prabhu <[email protected]> Signed-off-by: Dan Sun <[email protected]> Co-authored-by: Dan Sun <[email protected]> Signed-off-by: agriffith50 <[email protected]> Sample requests update in HuggingFace runtime with vLLM support (kserve#364) Update Sample requests for HF runtime Signed-off-by: Gavrish Prabhu <[email protected]> Update huggingface triton yaml Signed-off-by: Dan Sun <[email protected]> Signed-off-by: agriffith50 <[email protected]> Update blog link Signed-off-by: agriffith50 <[email protected]> Add triton huggingface reference Signed-off-by: agriffith50 <[email protected]> resolve merge Signed-off-by: agriffith50 <[email protected]> Add Helm installation commands in get started guide Signed-off-by: Yuan Tang <[email protected]> Revert "Add Helm installation commands in get started guide" This reverts commit bc90c25. Add Helm installation commands in get started guide (kserve#358) Signed-off-by: Yuan Tang <[email protected]> Update README.md (kserve#359) Fix broken link to Ray doc on fractional GPU allocation. Signed-off-by: zoramt <[email protected]> Update adopters.md (kserve#361) Point users to vLLM production server (kserve#362) The vLLM teams states that the [`vllm.entrypoints.api_server`](https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/api_server.py#L2-L6) is just to demonstrates usage of their AsyncEngine, for production use they point users to `vllm.entrypoints.openai.api_server` instead. 
So, I think this should be the entrypoint used in the kServe documentation too, to avoid confusing new comers. Signed-off-by: Pierre Dulac <[email protected]> Sample requests update in HuggingFace runtime with vLLM support (kserve#364) Update Sample requests for HF runtime Signed-off-by: Gavrish Prabhu <[email protected]> Update huggingface triton yaml Signed-off-by: Dan Sun <[email protected]>
* parent 2257489 author Dan Sun <[email protected]> 1698039744 -0400 committer agriffith50 <[email protected]> 1716219052 -0400 parent 2257489 author Dan Sun <[email protected]> 1698039744 -0400 committer agriffith50 <[email protected]> 1716218313 -0400 parent 2257489 author Dan Sun <[email protected]> 1698039744 -0400 committer agriffith50 <[email protected]> 1716217744 -0400 Add TorchServe Huggingface accelerate example (#304) * Add LLM example for huggingface accelerate Signed-off-by: Dan Sun <[email protected]> * Add inputs Signed-off-by: Dan Sun <[email protected]> * Update storage uri Signed-off-by: Dan Sun <[email protected]> * Add to LLM runtime to index Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: Dan Sun <[email protected]> 0.11 release blog (#310) * Add 0.11 release blog Signed-off-by: Dan Sun <[email protected]> * Update blog Signed-off-by: Dan Sun <[email protected]> * Add vllm runtime doc Signed-off-by: Dan Sun <[email protected]> * Add vllm example doc Signed-off-by: Dan Sun <[email protected]> * Update blog link Signed-off-by: Dan Sun <[email protected]> * Add vLLM intro Signed-off-by: Dan Sun <[email protected]> * add python runtime open inference protocol tutorials Signed-off-by: Dan Sun <[email protected]> * Fix warning Signed-off-by: Dan Sun <[email protected]> * Add warning Signed-off-by: Dan Sun <[email protected]> * Address comments Signed-off-by: Dan Sun <[email protected]> * Fix newline Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: Dan Sun <[email protected]> Fix torchserve llm example link Signed-off-by: Dan Sun <[email protected]> Fixed formatting in get_started (#319) Signed-off-by: Helber Belmiro <[email protected]> clarify prometheus annotation (#316) Signed-off-by: JuHyung-Son <[email protected]> Document servingruntime constraint introduced by kserve/kserve#3181 (#320) * Document serving runtime constraint introduced by kserve/kserve#3181 Signed-off-by: Sivanantham Chinnaiyan <[email 
protected]> * Set content type for predict/explainer curl requests Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Update docs/modelserving/servingruntimes.md Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: Sivanantham Chinnaiyan <[email protected]> Signed-off-by: Dan Sun <[email protected]> Co-authored-by: Dan Sun <[email protected]> Add kubeflow summit 2023 Jooho's presentation link (#325) add kubeflow summit 2023 Jooho's presentation link Signed-off-by: jooho <[email protected]> docs: Add one related presentations from Kubeflow Summit 2023 (#327) * docs: Add two new related presentations from Kubeflow Summit 2023Update presentations.md Signed-off-by: Yuan Tang <[email protected]> * Update presentations.md Signed-off-by: Yuan Tang <[email protected]> --------- Signed-off-by: Yuan Tang <[email protected]> Added example for torchserve grpc v1 and v2. (#307) * Added example for torchserve grpc v1 and v2. Signed-off-by: Andrews Arokiam <[email protected]> * Schema order changed. Signed-off-by: Andrews Arokiam <[email protected]> * corrected v2 REST input. Signed-off-by: Andrews Arokiam <[email protected]> * Updated grpc-v2 protocolVersion. 
Signed-off-by: Andrews Arokiam <[email protected]> * Update README.md * Update README.md * Update README.md --------- Signed-off-by: Andrews Arokiam <[email protected]> Co-authored-by: Dan Sun <[email protected]> Add link to release process doc in developer.md (#330) Signed-off-by: Yuan Tang <[email protected]> Update tranformer collocation docs for specifying storage uri (#323) Signed-off-by: Sivanantham Chinnaiyan <[email protected]> Fix incorrect edit URL to docs (#329) Signed-off-by: Yuan Tang <[email protected]> Set resources for inferencegraph example (#322) Signed-off-by: Sivanantham Chinnaiyan <[email protected]> Fixes #331 - broken link to AMD Inference Server (#332) Tested locally with mkdocs serve Render KServe Python Runtime API doc with mkdoc (#333) * Update KServe python sdk docs Signed-off-by: Dan Sun <[email protected]> * Update serving runtime doc Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: Dan Sun <[email protected]> Fix build: Install kserve for rendering the docstring (#334) * Update KServe python sdk docs Signed-off-by: Dan Sun <[email protected]> * Install kserve sdk for mkdocstring Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: Dan Sun <[email protected]> Onnx docs update (#275) * Updated Onnx example. Signed-off-by: Andrews Arokiam <[email protected]> * Reverting sklearn doc update as there is a separate PR Signed-off-by: andyi2it <[email protected]> * Added new schema in onnx example. Signed-off-by: Andrews Arokiam <[email protected]> * protocolVersion and old schema updated with onnx example. Signed-off-by: Andrews Arokiam <[email protected]> --------- Signed-off-by: Andrews Arokiam <[email protected]> Signed-off-by: andyi2it <[email protected]> Standardized schema order (#318) * Standardized schema's order. 
Signed-off-by: Andrews Arokiam <[email protected]> * Fix v2 spec for torch serve --------- Signed-off-by: Andrews Arokiam <[email protected]> Signed-off-by: Dan Sun <[email protected]> Co-authored-by: Dan Sun <[email protected]> Update link to Slack instructions Signed-off-by: Yuan (Terry) Tang <[email protected]> Update README.md (#344) Fix incorrect storage uri prefix Signed-off-by: zoramt <[email protected]> Added steps to delete model-store-pod (#343) Signed-off-by: murata.yu <[email protected]> Update README.md Signed-off-by: Dan Sun <[email protected]> Add documentation for modelcars (#337) * Add documentation for modelcars, introduced in 0.12 as experimental feature Signed-off-by: Roland Huß <[email protected]> * added some references to this feature Signed-off-by: Roland Huß <[email protected]> --------- Signed-off-by: Roland Huß <[email protected]> add certificate doc (#326) * add certificate doc Signed-off-by: jooho <[email protected]> * Update mkdocs.yml Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: jooho <[email protected]> Signed-off-by: Dan Sun <[email protected]> Co-authored-by: Dan Sun <[email protected]> docs: fix the emoji deprecation message and invalid file name (#348) Signed-off-by: Peter Jausovec <[email protected]> Add documentation for GCS (#351) * Add documentation for GCS Signed-off-by: tjandy98 <[email protected]> * Update mkdocs to include GCS Signed-off-by: tjandy98 <[email protected]> * Fix formatting Signed-off-by: tjandy98 <[email protected]> --------- Signed-off-by: tjandy98 <[email protected]> Add ModelRegistry custom storage intializer example (#346) * Add ModelRegistry custom storage intializer example Signed-off-by: Andrea Lamparelli <[email protected]> * Update docs/modelserving/storage/storagecontainers.md Co-authored-by: Dan Sun <[email protected]> Signed-off-by: Andrea Lamparelli <[email protected]> --------- Signed-off-by: Andrea Lamparelli <[email protected]> Co-authored-by: Dan Sun <[email 
protected]> Updated docs for autoscaling on gpu. (#328) Signed-off-by: Andrews Arokiam <[email protected]> Update version matrix for 0.12 (#353) * Update version matrix for 0.12 Signed-off-by: Dan Sun <[email protected]> * Update kubernetes_deployment.md Signed-off-by: Dan Sun <[email protected]> * Update notes for gRPC issues Signed-off-by: Dan Sun <[email protected]> * Update kserve install Signed-off-by: Dan Sun <[email protected]> * Update kubernetes_deployment.md Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: Dan Sun <[email protected]> docs: update kserve resource yaml file (#356) fix docs Signed-off-by: Niels ten Boom <[email protected]> Signed-off-by: agriffith50 <[email protected]> Update serving runtime version for 0.12 release and add some notes (#354) * Fix few bugs, add quick install failure note and update docs for release 0.12.0 Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Add warning about control plane namespaces Signed-off-by: Sivanantham Chinnaiyan <[email protected]> * Resolve comments Signed-off-by: Sivanantham Chinnaiyan <[email protected]> --------- Signed-off-by: Sivanantham Chinnaiyan <[email protected]> Signed-off-by: agriffith50 <[email protected]> Add Helm installation commands in get started guide Signed-off-by: Yuan Tang <[email protected]> Signed-off-by: agriffith50 <[email protected]> Revert "Add Helm installation commands in get started guide" This reverts commit bc90c25. Signed-off-by: agriffith50 <[email protected]> Add Helm installation commands in get started guide (#358) Signed-off-by: Yuan Tang <[email protected]> Signed-off-by: agriffith50 <[email protected]> Update README.md (#359) Fix broken link to Ray doc on fractional GPU allocation. 
Signed-off-by: zoramt <[email protected]> Signed-off-by: agriffith50 <[email protected]> Add Huggingface Serving Runtime example with Llama2 (#345) * Add Huggingface Serving Runtime example with Llama2 Signed-off-by: Gavrish Prabhu <[email protected]> * Fix examples Signed-off-by: Gavrish Prabhu <[email protected]> * Fix examples Signed-off-by: Gavrish Prabhu <[email protected]> * fix review comments Signed-off-by: Gavrish Prabhu <[email protected]> * add linking Signed-off-by: Gavrish Prabhu <[email protected]> * fix comments Signed-off-by: Gavrish Prabhu <[email protected]> * Update huggingface vllm runtime doc Signed-off-by: Dan Sun <[email protected]> * Update mkdocs.yml Signed-off-by: Dan Sun <[email protected]> * Update triton doc Signed-off-by: Dan Sun <[email protected]> * Fix Hugging Face Signed-off-by: Dan Sun <[email protected]> * fix newline Signed-off-by: Dan Sun <[email protected]> * fix newline Signed-off-by: Dan Sun <[email protected]> * fix newline Signed-off-by: Dan Sun <[email protected]> * fix Hugging Face Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: Gavrish Prabhu <[email protected]> Signed-off-by: Dan Sun <[email protected]> Co-authored-by: Dan Sun <[email protected]> Signed-off-by: agriffith50 <[email protected]> Update adopters.md (#361) Signed-off-by: agriffith50 <[email protected]> Point users to vLLM production server (#362) The vLLM teams states that the [`vllm.entrypoints.api_server`](https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/api_server.py#L2-L6) is just to demonstrates usage of their AsyncEngine, for production use they point users to `vllm.entrypoints.openai.api_server` instead. So, I think this should be the entrypoint used in the kServe documentation too, to avoid confusing new comers. 
Signed-off-by: Pierre Dulac <[email protected]> Signed-off-by: agriffith50 <[email protected]> initial draft of kserve release blog Signed-off-by: agriffith50 <[email protected]> change title Signed-off-by: agriffith50 <[email protected]> resolving comments Signed-off-by: agriffith50 <[email protected]> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Dan Sun <[email protected]> Signed-off-by: Alexa Griffith <[email protected]> Signed-off-by: agriffith50 <[email protected]> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Dan Sun <[email protected]> Signed-off-by: Alexa Griffith <[email protected]> Signed-off-by: agriffith50 <[email protected]> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Dan Sun <[email protected]> Signed-off-by: Alexa Griffith <[email protected]> Signed-off-by: agriffith50 <[email protected]> update comment Signed-off-by: agriffith50 <[email protected]> update for vllm comment Signed-off-by: agriffith50 <[email protected]> add more info about completions endpoints Signed-off-by: agriffith50 <[email protected]> add hf img Signed-off-by: agriffith50 <[email protected]> Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Signed-off-by: Alexa Griffith <[email protected]> Signed-off-by: agriffith50 <[email protected]> Sample requests update in HuggingFace runtime with vLLM support (#364) Update Sample requests for HF runtime Signed-off-by: Gavrish Prabhu <[email protected]> Signed-off-by: agriffith50 <[email protected]> add new kserve img Signed-off-by: agriffith50 <[email protected]> Update future plan and other changes Signed-off-by: agriffith50 <[email protected]>
Update huggingface triton yaml Signed-off-by: Dan Sun <[email protected]> Signed-off-by: agriffith50 <[email protected]> Update blog link Signed-off-by: agriffith50 <[email protected]> Add triton huggingface reference Signed-off-by: agriffith50 <[email protected]> resolve merge Signed-off-by: agriffith50 <[email protected]>
* fix merge Signed-off-by: agriffith50 <[email protected]> * fix more merge issue Signed-off-by: agriffith50 <[email protected]> * Move up the diagram Signed-off-by: agriffith50 <[email protected]> * fix flag naming Signed-off-by: agriffith50 <[email protected]> * update slack Signed-off-by: agriffith50 <[email protected]> * Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Yuan Tang <[email protected]> Signed-off-by: Alexa Griffith <[email protected]> Signed-off-by: agriffith50 <[email protected]> * Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Yuan Tang <[email protected]> Signed-off-by: Alexa Griffith <[email protected]> Signed-off-by: agriffith50 <[email protected]> * Update docs/blog/articles/2024-05-15-Kserve-0.13-release.md Co-authored-by: Yuan Tang <[email protected]> Signed-off-by: Alexa Griffith <[email protected]> Signed-off-by: agriffith50 <[email protected]> * fix Hugging Face Signed-off-by: agriffith50 <[email protected]> --------- Signed-off-by: Dan Sun <[email protected]> Signed-off-by: Alexa Griffith <[email protected]> Signed-off-by: agriffith50 <[email protected]> Co-authored-by: Dan Sun <[email protected]> Co-authored-by: Yuan Tang <[email protected]>
"Fixes #issue-number" or "Add description of the problem this PR solves"
Proposed Changes