State introspection #467
Replies: 2 comments
-
What do you want to look at the current state for? Operational monitoring? Debugging? Business-level analytics? Depending on the purpose, we have different directions planned.

For operational monitoring, we expose metrics you can scrape with your metrics infrastructure: things like state size, key count, per-operator throughput, etc. Some of these are already exposed and you can read about them in our Metrics documentation, but there are more we would like to add eventually.

We do not plan on exposing direct access to the recovery SQLite DBs because they contain implementation details of how we do our progress tracking and transactional semantics, and they need to be interpreted along with any durable backups of them. Interpreting their contents directly requires deep coupling to Bytewax's implementation, backup, and progress model. We'd like to eventually provide some offline introspection tools to assist with low-level debugging, but this would not be a general "dashboard".

It is also important to note that the correct definition of "current state" is somewhat complicated. For the entire system to know some state is "true" / "committed", there needs to be consensus that all outputs of the dataflow and the recovery backups have completed. This needs more thinking, but I think a lightweight, basic way of querying the "current state" could be possible in a future version of Bytewax, though not in the current version.

Even in a glorious future with a lightweight state querying interface, if you are hoping for a production-quality way to query state for business-level questions, the Bytewax recovery DBs are not going to be optimized for all the kinds of BI-like queries you might want to perform. I would instead recommend explicitly routing your state changes in your dataflow to a separate querying system that you can design to support the exact queries you need.
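To make the "route your state changes to a separate querying system" suggestion concrete, here is a minimal sketch of the pattern in plain Python. It deliberately does not use any Bytewax API: `update_running_total`, `open_query_store`, `write_changes`, `run`, and the `state_snapshot` table are all hypothetical names made up for illustration, and SQLite only stands in for whatever query store you would actually pick. The idea is that the stateful logic returns both the new state and a change record, and the change record is written downstream to a store you control.

```python
import json
import sqlite3
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical event shape for this sketch: (key, amount).
Event = Tuple[str, float]


def update_running_total(state: Optional[float], amount: float) -> Tuple[float, dict]:
    """Stateful logic: return the new state plus a change record to emit."""
    new_state = (state or 0.0) + amount
    change = {"total": new_state, "last_amount": amount}
    return new_state, change


def open_query_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the separate query store (not the Bytewax recovery DB)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS state_snapshot ("
        "key TEXT PRIMARY KEY, value_json TEXT NOT NULL)"
    )
    return conn


def write_changes(conn: sqlite3.Connection, changes: Iterable[Tuple[str, dict]]) -> None:
    """Upsert the latest change per key so the table mirrors the emitted state."""
    conn.executemany(
        "INSERT OR REPLACE INTO state_snapshot (key, value_json) VALUES (?, ?)",
        [(key, json.dumps(change)) for key, change in changes],
    )
    conn.commit()


def run(events: Iterable[Event], conn: sqlite3.Connection) -> None:
    """Stand-in for the dataflow: keep per-key state and route every change out."""
    states: Dict[str, float] = {}
    for key, amount in events:
        states[key], change = update_running_total(states.get(key), amount)
        write_changes(conn, [(key, change)])
```

In a real dataflow, the update function would presumably live in a stateful step and the writer in an output sink, and a debugging page or dashboard would read from `state_snapshot` rather than from the recovery DBs. Note that what lands in the store is "state as emitted", which is not necessarily "committed" in the sense described above.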
-
Thanks for your answer. The main purpose is for debugging.
-
In stateful processing with some rich values, I would like to see the contents of the current state. Ideally, I would like to create a web page with stats, details, and charts about the gathered state for all instances. Is this possible? Or am I supposed to query SQLite directly with a third-party service?
For reference, this is possible with Faust, and I found it pretty powerful.