Package: Analytics Engine #399
base: master
Conversation
@CraigglesO Hey, you were asking about how
Hey! 👋 Thanks so much for doing this again! 😄 Sorry I've only just got to this now. I've added some comments. I'm trying to get someone on the Analytics Engine team to look at the SQL stuff too. 👍
const _blobs = blobs.map((blob) => {
  if (blob instanceof ArrayBuffer) {
    return decoder.decode(new Uint8Array(blob));
  }
  return blob;
Following on from the comment above about using type `BLOB` instead of `TEXT`, is it possible to do the reverse here, and convert everything to `ArrayBuffer`s? If `blob` contains non-UTF-8 data, `decode` will fail.
I can switch to `BLOB`s no problem. The change is fine, but:
- If I store an ArrayBuffer it will fail (better-sqlite only accepts string, null, or Uint8Array)
- If I store a Uint8Array, searches return a Buffer when they should return strings
- https://developers.cloudflare.com/analytics/analytics-engine/sql-api/#table-structure -> the table is actually text, which is why I assumed it should be treated as such
blobsKeys.length > 0 ? `, ${blobsKeys}` : ""
}) VALUES (@dataset, @index1${
doublesValues.length > 0 ? `, ${doublesValues}` : ""
}${blobsValues.length > 0 ? `, ${blobsValues}` : ""})`
This looks like an SQL injection vulnerability. 🙁 Rather than passing values in directly, could you insert placeholders and then fill in the actual values in the `this.#db.prepare()` call? This would also mean blob values aren't affected by the `replaceAll` calls below.
I'm confused on this, if you don't mind expounding. As far as I'm aware, I'm only inserting `blob1, blob2, ...` and `@blob1, @blob2, ...` (and the same for doubles), depending on length.
Hi @CraigglesO, thanks for contributing this. We've been talking internally about whether SQLite is 'good enough' as a model for how Analytics Engine will behave. I think it's fine: the behaviour might not quite match production, but this will allow people to do testing locally, and that's definitely a good thing. Happy to merge once some of the above issues are addressed :)
Resolves cloudflare/workers-sdk#4383
TODO:
NOTES:
- `async getAnalyticsEngine(name: string): Promise<AnalyticsEngine>`
- `async queryMiniflareAnalyticsEngine(query: string): Promise<any>` for jest and vitest
- `createRequestListener` includes `/cdn-cgi/mf/analytics_engine/sql`
- `QUANTILEWEIGHTED` is pretty weak against large-scale data.