Join the Embedbase Discord Server!
- Embeddings
  - OpenAI embeddings
  - Cohere embeddings
  - Google PaLM embeddings
  - local (BERT, LLaMa, Vicuna, etc.)
- Vector database
  - Supabase
  - Postgres
  - Qdrant
  - local (memory, SQLite, etc.)
- FastAPI
- Authentication (optional)
We have a growing task list of issues. Find an issue that appeals to you and leave a comment saying that you'd like to work on it. Include a brief description of how you'll solve the problem and any open questions you want to discuss.
If the issue is currently unclear but you are interested, please post in Discord and someone can help clarify the issue in more detail.
We write the documentation using Nextra in the `docs` folder of https://github.com/different-ai/embedbase. It is automatically indexed in Embedbase Cloud whenever it changes, and provides a GPT-4 QA interface.
We're all working on different parts of Embedbase together. To make contributions smoothly we recommend the following:
- Fork this project repository and clone it to your local machine (read more about forks), or use gitpod.io.
- Before working on any changes, try to sync the forked repository to keep it up-to-date with the upstream repository.
- On a new branch in your fork (aka a "feature branch" and not `main`), work on a small, focused change that only touches a few files.
- Package up a small bit of work that solves part of the problem into a Pull Request and send it out for review.
- If you're lucky, we can merge your change into `main` without any problems. If there are changes to files you're working on, resolve them by:
  - First try to rebase as suggested in these instructions.
  - If rebasing feels too painful, merge as suggested in these instructions.
- Once you've resolved conflicts (if any), finish the review and squash and merge your PR (when squashing, try to clean up the individual commit messages into a single sensible one).
- Merge in your change and move on to a new issue or the second step of your current issue.
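The sync, feature-branch, and rebase steps above can be sketched as a few shell functions. This is a minimal sketch, not the project's official tooling: the function names (`add_upstream`, `start_feature_branch`, `rebase_on_upstream`) and the remote name `upstream` are assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helpers; run them from inside your local clone of your fork.

# Register the original repository as a remote named "upstream" (assumed name).
# An alternative URL or path can be passed as the first argument.
add_upstream() {
    git remote add upstream "${1:-https://github.com/different-ai/embedbase.git}"
}

# Bring the fork's main up to date with upstream, then branch off it.
# --ff-only refuses to merge if your local main has diverged from upstream.
start_feature_branch() {
    branch="$1"
    git fetch upstream &&
        git checkout main &&
        git merge --ff-only upstream/main &&
        git checkout -b "$branch"
}

# If upstream main moves while you work, replay your commits on top of it.
rebase_on_upstream() {
    git fetch upstream && git rebase upstream/main
}
```

A typical session would call `add_upstream` once after cloning, then `start_feature_branch my-example-feature-branch` for each new piece of work, and `rebase_on_upstream` only if conflicts with `main` appear.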
Additionally, if someone is working on an issue that interests you, ask if they need help on it or would like suggestions on how to approach the issue. If so, share wildly. If they seem to have a good handle on it, let them work on their solution until a challenge comes up.
- At any point you can compare your feature branch to the `upstream/main` of `different-ai/embedbase` by using a URL like this: https://github.com/different-ai/embedbase/compare/main...bobm4894:embedbase:my-example-feature-branch. Just replace `bobm4894` with your own GitHub user name and `my-example-feature-branch` with whatever you called the feature branch you are working on, so something like `https://github.com/different-ai/embedbase/compare/main...<your_github_username>:embedbase:<your_branch_name>`. This will show the changes that would appear in a PR, so you can check this to make sure only the files you have changed or added will be part of the PR.
- Try not to work on the `main` branch in your fork. Ideally you can keep this as just an updated copy of `main` from `different-ai/embedbase`.
- If your feature branch gets messed up, just update the `main` branch in your fork and create a fresh, clean feature branch where you can add your changes one by one in separate commits or all as a single commit.
- When working on GitHub Actions, you can test locally using act like so: `act -W .github/workflows/ci_core.yml --container-architecture linux/amd64` (`--container-architecture` is necessary if you use a Mac M series).
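The compare URL described in the tips above follows a fixed pattern, so it can be generated rather than edited by hand. A minimal sketch, where `compare_url` is a hypothetical helper name and the user and branch values are the placeholders from the example URL:

```shell
#!/bin/sh
# Hypothetical helper: print the GitHub compare URL for a feature branch
# in your fork, measured against main of different-ai/embedbase.
compare_url() {
    user="$1"    # your GitHub user name
    branch="$2"  # the feature branch in your fork
    printf 'https://github.com/different-ai/embedbase/compare/main...%s:embedbase:%s\n' \
        "$user" "$branch"
}

# Example call with the placeholder values used in the text.
compare_url bobm4894 my-example-feature-branch
```

Opening the printed URL in a browser shows the diff that a PR from that branch would contain.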
A review finishes when all blocking comments are addressed and at least one owning reviewer has approved the PR. Be sure to acknowledge any non-blocking comments either by making the requested change, explaining why it's not being addressed now, or filing an issue to handle it later.
- Bump the version in `pyproject.toml`.
- Run `make release`.

For `sdk/embedbase-py`:

- `cd sdk/embedbase-py`.
- Bump the version in `pyproject.toml`.
- Run `make release`.
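Before running `make release`, it can help to confirm that the version bump was actually saved. A minimal sketch, assuming `pyproject.toml` declares the version on a line of the form `version = "x.y.z"` (a common layout, not confirmed for this repository); `current_version` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the first version declared in a pyproject.toml.
# Assumes a line such as: version = "1.2.3"
current_version() {
    file="${1:-pyproject.toml}"
    sed -n 's/^version[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p' "$file" | head -n 1
}
```

Calling `current_version` in the directory you are releasing from prints the declared version, which you can compare with the last published release before running `make release`.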
For the JavaScript SDK, just push to `main`. We use semantic-release (https://semantic-release.gitbook.io/semantic-release/) under the hood to publish a release automatically whenever a change lands on the `main` branch.
cd hosted
make release
Just push
Just push
FYI, the documentation GPT-4 extension uses a dataset that is automatically synced with all Embedbase content (GitHub, Discord, etc.).