
Migrate from llama_eval to llama_decode (including llama_batch usage) #19

Open
BrutalCoding opened this issue Mar 3, 2024 · 1 comment

BrutalCoding commented Mar 3, 2024

Related to #15 and #17.

I'm updating aub.ai to make use of the latest version of llama.cpp. That update deprecates the llama_eval function in favor of llama_decode, which requires the use of llama_batch. While I was aware of this upcoming change a while ago, I hadn't had the time to migrate away from llama_eval yet.
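A rough sketch of what I think the core of that change looks like against the llama.cpp C API (working notes, not final code; ctx, tokens and n_past are placeholders here, and the exact llama_batch fields may differ slightly between llama.cpp versions):

```cpp
// Before (deprecated): evaluate n_tokens starting at position n_past.
//   llama_eval(ctx, tokens.data(), (int32_t) tokens.size(), n_past);

// After: describe the same work as a llama_batch and submit it via llama_decode.
llama_batch batch = llama_batch_init(/*n_tokens_max=*/512, /*embd=*/0, /*n_seq_max=*/1);

batch.n_tokens = (int32_t) tokens.size();
for (int32_t i = 0; i < batch.n_tokens; ++i) {
    batch.token[i]     = tokens[i];
    batch.pos[i]       = n_past + i;   // absolute position within the sequence
    batch.n_seq_id[i]  = 1;
    batch.seq_id[i][0] = 0;            // single conversation -> sequence id 0
    batch.logits[i]    = false;        // only the last token needs logits
}
batch.logits[batch.n_tokens - 1] = true;

if (llama_decode(ctx, batch) != 0) {
    // handle the error (e.g. batch doesn't fit in the context)
}

// Logits for sampling the next token come from the last entry of the batch.
float * logits = llama_get_logits_ith(ctx, batch.n_tokens - 1);

llama_batch_free(batch);
```

In aub.ai this goes through the Dart FFI bindings rather than C++ directly, but the shape of the calls is the same.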

This issue has my highest priority; please be patient while I work out some technical difficulties.

At the time of writing, it's the start of the evening here on a Sunday. I will continue development, but I honestly don't think I'll be able to finish the migration, compile and test for each platform, and package it all up for a release on pub.dev (as the aub_ai package), let alone as an app (e.g. on TestFlight). Each step is doable, but all of them together always take quite some time without a proper CI/CD setup (sorry, that comes later!). Please bear with me while I go through these steps; you can follow some of this work in this branch: https://github.com/BrutalCoding/aub.ai/tree/feature/sync-with-latest-llamacpp

Challenges:

  • I've updated my code to use llama_decode and llama_batch, but the AI model is now outputting strange Unicode characters. This indicates an incorrect implementation on my side.

Tasks:

  • Review example code utilizing llama_decode and llama_batch within the llama.cpp repository or related projects.
  • Carefully analyze the differences between how I used llama_eval previously and the expected input/output structures for llama_decode.
  • Debug and adjust my code to ensure correct tokenization, batching, and handling of model output (a rough sketch of the detokenization side is below).
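On the "strange Unicode characters" symptom: my current suspicion is the detokenization side. llama_token_to_piece returns raw UTF-8 bytes and reports a negative value when the supplied buffer is too small, and a multi-byte character can even be split across two tokens, so truncating pieces or converting them to text one by one can easily produce mojibake. A sketch of how I think this should be handled (model is a placeholder for the loaded llama_model; the exact signature of llama_token_to_piece differs between llama.cpp versions):

```cpp
#include <string>
#include <vector>
#include "llama.h"

// Convert one token to its piece of raw UTF-8 bytes. A negative return value
// from llama_token_to_piece means the buffer was too small; retry with the
// required size instead of silently truncating.
static std::string token_to_piece(const llama_model * model, llama_token token) {
    std::vector<char> buf(8);
    int32_t n = llama_token_to_piece(model, token, buf.data(), (int32_t) buf.size());
    if (n < 0) {
        buf.resize(-n);
        n = llama_token_to_piece(model, token, buf.data(), (int32_t) buf.size());
    }
    return std::string(buf.data(), n);
}
```

On the Dart side, the accumulated bytes should only be converted to a String once they form valid UTF-8 (or via a decoder that tolerates incomplete sequences), not piece by piece.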
BrutalCoding added the work in progress label on Mar 3, 2024
BrutalCoding self-assigned this on Mar 3, 2024
BrutalCoding pinned this issue on Mar 3, 2024
BrutalCoding (Owner, Author) commented:

Status update time!

Good news:

  • Fixed the compiler issues; the code has been migrated to llama_decode etc. 😄
  • Gemma works

Bad news:

  • I lied, kinda. I did migrate the code, but I'm missing a critical step somewhere.
  • The assistant no longer generates an answer. The prompt tokenizes properly, and the "answer" (the exact same prompt/convo) gets decoded back with text_to_sentence_piece too, but I am missing a step somewhere (my current guess is sketched below).
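My current guess at the missing step: after decoding the prompt I still have to sample the next token from the logits and feed it back as a new single-token batch, otherwise all I can ever decode back is the prompt itself. A rough sketch with greedy sampling just to get something working (ctx, model, batch, n_prompt_tokens, max_new_tokens and the token_to_piece helper above are placeholders from my notes, not final code):

```cpp
#include <algorithm>  // std::max_element

std::string output;
int n_cur = n_prompt_tokens;  // position of the next generated token

for (int i = 0; i < max_new_tokens; ++i) {
    // Logits of the last decoded token (its batch entry must have logits = true).
    const float * logits = llama_get_logits_ith(ctx, batch.n_tokens - 1);

    // Greedy: pick the most likely token. A proper sampler comes later.
    const int n_vocab = llama_n_vocab(model);
    const llama_token new_token =
        (llama_token) (std::max_element(logits, logits + n_vocab) - logits);

    if (new_token == llama_token_eos(model)) break;
    output += token_to_piece(model, new_token);

    // Feed the sampled token back as a single-token batch at the next position.
    batch.n_tokens     = 1;
    batch.token[0]     = new_token;
    batch.pos[0]       = n_cur++;
    batch.n_seq_id[0]  = 1;
    batch.seq_id[0][0] = 0;
    batch.logits[0]    = true;

    if (llama_decode(ctx, batch) != 0) break;
}
```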

I'll jump on this project again this weekend; let's see if I can solve it.
