Update README.md #1520

Draft
wants to merge 1 commit into
base: main

Conversation

msampathkumar
Member

Description

Thank you for opening a Pull Request!
Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Follow the CONTRIBUTING Guide.
  • You are listed as the author in your notebook or README file.
    • Your account is listed in CODEOWNERS for the file(s).
  • Make your Pull Request title conform to the https://www.conventionalcommits.org/ specification.
  • Ensure the tests and linter pass (Run nox -s format from the repository root to format).
  • Appropriate docs were updated (if necessary)

Fixes #<issue_number_goes_here> 🦕
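
The title item in the checklist above can be checked mechanically. The following is a minimal sketch, not part of this repository's tooling: `is_conventional_title` is a hypothetical helper, and the list of accepted types is an assumption (the Conventional Commits specification itself only names `feat` and `fix` and allows others).

```shell
# Hypothetical helper: succeed if the title looks like
# "<type>[(scope)][!]: <description>".
# The type list is an assumption; adjust it to the project's conventions.
is_conventional_title() {
  printf '%s\n' "$1" |
    grep -Eq '^(feat|fix|docs|chore|refactor|test|ci|build|perf|style|revert)(\([^()]+\))?!?: .+'
}

# Usage: report whether each candidate title matches the pattern.
for title in "Update README.md" "docs: update README.md"; do
  if is_conventional_title "$title"; then
    echo "ok:   $title"
  else
    echo "fail: $title"
  fi
done
```

Under this pattern, a title without a `type:` prefix is rejected, while one with a recognized prefix and a description is accepted.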

Contributor

@code-review-assist code-review-assist bot left a comment

Hi team,

I'm currently reviewing this pull request and will post my detailed review in a few minutes. In the meantime, here's a quick summary for you and other reviewers to get up to speed:

This pull request updates the README.md file for the websocket-demo-app within the gemini/multimodal-live-api directory. The main changes are focused on clarifying the tutorial's instructions and modernizing the language. Specifically, the README now explicitly mentions Gemini 2.0 Flash Experimental and streamlines the setup instructions, separating the preparation steps from the local execution steps. The Cloud Shell setup instructions have also been simplified. The intent is to make the tutorial easier to follow and more up-to-date with the latest Gemini capabilities.

The changes are primarily in the gemini/multimodal-live-api/websocket-demo-app/README.md file. The updated README provides clearer instructions for setting up and running the demo application, both locally and in Cloud Shell. The descriptions of the architecture and prerequisites have also been improved for better clarity.

If there's anything unclear or requires further clarification, please let me know.


And now, a little haiku to brighten your day:

Code flows like a stream,
Readme guides the path so clear,
Joy in every line.

Contributor

@code-review-assist code-review-assist bot left a comment

This pull request updates the README file with some helpful clarifications and improvements to the setup instructions. The changes are well-written and should make it easier for users to get started with the demo app. Here are a few suggestions for improvement.

@@ -1,76 +1,79 @@
# Multimodal Live API Demo

-In this tutorial, you will be building a web application that enables you to use your voice and camera to talk to Gemini 2.0 through the [Multimodal Live API](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/multimodal-live).
+This tutorial guides you through building a web application that allows you to interact with [Gemini 2.0 Flash Experimental](https://blog.google/technology/google-deepmind/google-gemini-ai-update-december-2024/#ceo-message) using your voice and camera. This is achieved through the [Multimodal Live API](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/multimodal-live), a low-latency bidirectional streaming API that supports audio and video input and can output audio.
Contributor

It might be helpful to briefly explain what "Flash Experimental" means in this context for new users. Is it a specific version or feature set of Gemini 2.0? This would help users understand the capabilities and limitations of the demo.

Suggested change
-This tutorial guides you through building a web application that allows you to interact with [Gemini 2.0 Flash Experimental](https://blog.google/technology/google-deepmind/google-gemini-ai-update-december-2024/#ceo-message) using your voice and camera. This is achieved through the [Multimodal Live API](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/multimodal-live), a low-latency bidirectional streaming API that supports audio and video input and can output audio.
+This tutorial guides you through building a web application that allows you to interact with [Gemini 2.0 Flash Experimental](https://blog.google/technology/google-deepmind/google-gemini-ai-update-december-2024/#ceo-message) (a research prototype exploring new multimodal capabilities) using your voice and camera. This is achieved through the [Multimodal Live API](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/multimodal-live), a low-latency bidirectional streaming API that supports audio and video input and can output audio.

Comment on lines +7 to +8
* A Google Cloud project
* Foundational knowledge of Web development
Contributor

low

Consider adding links to instructions for creating a Google Cloud project and enabling billing. This would be helpful for users who are new to Google Cloud.

Suggested change
-* A Google Cloud project
-* Foundational knowledge of Web development
+* A Google Cloud project (see [instructions](https://cloud.google.com/resource-manager/docs/creating-managing-projects))
+* Foundational knowledge of Web development


-### Setup locally
+### Preparation
Contributor

medium

The title of this section should be "Preparation", per the changes made in the previous lines.

Suggested change
-### Preparation
+### Preparation

Comment on lines +35 to +36
git clone https://github.com/GoogleCloudPlatform/generative-ai.git
cd generative-ai/gemini/multimodal-live-api/websocket-demo-app
Contributor

low

The cd path should start with the directory the repository was cloned into. That directory is generative-ai by default, but it differs if the user passes a custom directory name to git clone.

Suggested change
-git clone https://github.com/GoogleCloudPlatform/generative-ai.git
-cd generative-ai/gemini/multimodal-live-api/websocket-demo-app
+git clone https://github.com/GoogleCloudPlatform/generative-ai.git
+cd generative-ai/gemini/multimodal-live-api/websocket-demo-app
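
The concern raised in this thread, that the `cd` path depends on where the repository was cloned, can be sketched as follows. This is only an illustration and not part of the PR: `demo_path` and `clone_demo` are hypothetical helper names, while the optional directory argument to `git clone` is standard Git behavior.

```shell
# Hypothetical helper: build the demo path from the clone directory.
# Defaults to "generative-ai", the directory git creates when no target is given.
demo_path() {
  printf '%s/gemini/multimodal-live-api/websocket-demo-app\n' "${1:-generative-ai}"
}

# Hypothetical helper: clone into the chosen directory, then enter the demo app.
clone_demo() {
  clone_dir="${1:-generative-ai}"
  git clone https://github.com/GoogleCloudPlatform/generative-ai.git "$clone_dir" &&
    cd "$(demo_path "$clone_dir")"
}

# Usage (not run automatically): clone_demo my-checkout
```

With the default clone directory, this reduces to the two commands already in the README, so the existing instructions are correct for anyone who does not pass a custom directory.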

2 participants