The discussion below is about "hard-coded" categories (meaning we coerce all events into them), but it would be interesting to instead pass in an array of documents (range-queried by geo bounds) and get "dynamic tags", where we ask the API to sort the documents into exactly 10 categories.
This would be non-deterministic, and the categories would adapt to the available content, so you're not "spraying and praying" by clicking "Live music" in the UI. If there is no live music, it will not show up as a category. If that category is present (optionally coerced into our defined categories), it will show in the UI.
Again, the below conversation was about "after event insert" post tagging / processing of events.
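To make the "only show categories that actually have content" idea concrete, here is a minimal sketch in Go. It does not do the dynamic tagging itself (that part is left to whatever API sorts the documents); it only derives the list of categories the UI would display from documents that are already tagged. The `Event` type, the `visibleCategories` name, and the ordering by count are all assumptions made for illustration.

```go
package main

import "sort"

// Event is a hypothetical tagged document; only Category matters here.
type Event struct {
	ID       string
	Category string
}

// visibleCategories returns the categories that have at least one event,
// most populated first (ties broken alphabetically), capped at limit.
// A negative limit means "no cap". Untagged events are ignored.
func visibleCategories(events []Event, limit int) []string {
	counts := make(map[string]int)
	for _, e := range events {
		if e.Category != "" {
			counts[e.Category]++
		}
	}

	names := make([]string, 0, len(counts))
	for name := range counts {
		names = append(names, name)
	}

	sort.Slice(names, func(i, j int) bool {
		if counts[names[i]] != counts[names[j]] {
			return counts[names[i]] > counts[names[j]]
		}
		return names[i] < names[j]
	})

	if limit >= 0 && len(names) > limit {
		names = names[:limit]
	}
	return names
}
```

With a limit of 10, a category such as "Live music" simply never appears in the result when no event in the queried geo bounds carries it.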
There's a hacky way of doing this that might make more sense than spinning up a second index to do the tagging.
Marqo exposes an endpoint /embed that lets you embed an array of documents. You could do the following.
1. Every minute, run a search with a filter to find uncategorized items. The starting point is just a random []float32 of the embedding dimension. Importantly, expose_facets should be true.
2. The service should cache the embedding of each category, e.g. /embed [ "dancing", "sports", "singing", ... ]
3. For each item returned by the search, find the closest embedding (compared by cosine similarity) in the embedded categories array.
4. After every calculation is complete, call update_documents with the result batch.
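The matching step (finding the closest category embedding by cosine similarity) could be sketched in Go roughly as follows. This is only an illustration: `cosineSimilarity` and `closestCategory` are hypothetical helper names, and the sketch assumes the category embeddings have already been fetched from /embed and cached as parallel slices of names and vectors.

```go
package main

import (
	"errors"
	"math"
)

// cosineSimilarity returns the cosine of the angle between a and b.
// It returns 0 if the lengths differ, the vectors are empty,
// or either vector has zero norm.
func cosineSimilarity(a, b []float32) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y
		normA += x * x
		normB += y * y
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

// closestCategory returns the name of the cached category whose embedding
// is most similar to item. names[i] must correspond to embeddings[i].
// On a tie, the earliest category in the slice wins.
func closestCategory(item []float32, names []string, embeddings [][]float32) (string, error) {
	if len(names) == 0 || len(names) != len(embeddings) {
		return "", errors.New("names and embeddings must be non-empty and the same length")
	}
	best, bestScore := 0, math.Inf(-1)
	for i, emb := range embeddings {
		if score := cosineSimilarity(item, emb); score > bestScore {
			best, bestScore = i, score
		}
	}
	return names[best], nil
}
```

Each item's result would then be collected into the batch passed to update_documents. Parallel slices are used instead of a map so that ties resolve the same way on every run.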
Notes:
We choose a random starting point to reduce odds of working on the same item if two batches wind up running at the same time. There's no issue with the updates, just inefficiency.
This could probably run in a Lambda function.
It seems like Go has some reasonable packages for working with vectors.
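For the random starting point mentioned in the notes, the Go standard library is enough. The sketch below is one possible way to build it: `randomQueryVector` is a hypothetical helper, and normalizing the vector to unit length is an assumption added here, not something the thread specifies. The embedding dimension depends on the model the index uses, so it is passed in as a parameter.

```go
package main

import (
	"math"
	"math/rand"
)

// randomQueryVector returns a pseudo-random vector of length dim,
// scaled to unit length. Components are drawn from a normal
// distribution so the direction is uniformly distributed.
// The same seed always yields the same vector.
func randomQueryVector(seed int64, dim int) []float32 {
	rng := rand.New(rand.NewSource(seed))
	v := make([]float32, dim)

	var sumSq float64
	for i := range v {
		x := rng.NormFloat64()
		v[i] = float32(x)
		sumSq += x * x
	}

	norm := math.Sqrt(sumSq)
	if norm == 0 {
		return v // dim == 0, or an all-zero draw
	}
	for i := range v {
		v[i] = float32(float64(v[i]) / norm)
	}
	return v
}
```

In a scheduled job, the seed would typically come from something that differs between runs (for example the current time in nanoseconds), so that two overlapping batches are unlikely to start from the same point.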
The idea here is that we explore using the /embed endpoint exposed by Marqo to pass documents into a request and have a defined embedding output. (Related: #146)
From Robertson Taylor, Sales Engineer at Marqo (conversation here: https://meetnear.slack.com/archives/C07KQCLMQG7/p1726530661764979?thread_ts=1726527232.633069&cid=C07KQCLMQG7):