sidecar: Add /api/v1/flush endpoint #7358
Closed
Signed-off-by: mluffman <[email protected]>
Signed-off-by: mluffman <[email protected]>
Signed-off-by: mluffman <[email protected]>
If the Prometheus that belongs to a sidecar is down we don't need to query the sidecar. This PR makes it so that we take the sidecar out of the endpoint set then. We do the same for all other store APIs by returning an error in the info/Info gRPC call if they are marked as not ready. Signed-off-by: Michael Hoffmann <[email protected]> Signed-off-by: mluffman <[email protected]>
Signed-off-by: Nicolas Takashi <[email protected]> Signed-off-by: mluffman <[email protected]>
…nos-io#7305) * Query|Receiver|Store: Do not log full request on ProxyStore by default We had a problem on our production where a sudden increase in requests with long matchers was putting our receivers under a lot of pressure. Upon checking profiles we saw that the problem was calls to Log() Signed-off-by: Pedro Tanaka <[email protected]> * Adding changelog Signed-off-by: Pedro Tanaka <[email protected]> --------- Signed-off-by: Pedro Tanaka <[email protected]> Signed-off-by: mluffman <[email protected]>
* *: Updating hashicorp LRU cache to v2 Signed-off-by: Pedro Tanaka <[email protected]> * Adding some new comments regarding removing complexity of TTL Signed-off-by: Pedro Tanaka <[email protected]> * Using new version everywhere Signed-off-by: Pedro Tanaka <[email protected]> * rephrase the comment Signed-off-by: Pedro Tanaka <[email protected]> --------- Signed-off-by: Pedro Tanaka <[email protected]> Signed-off-by: mluffman <[email protected]>
Remove a long-standing TODO item in the code - let's use the great loser tree implementation by Bryan. It is faster than the heap because fewer comparisons are needed. Should be a nice improvement given that the heap is used in a lot of hot paths.

Since Prometheus also uses this library, it's tricky to import the "any" version. I tried doing bboreham/go-loser#3 but it's still impossible to do that. Let's just copy/paste the code, it's not a lot.

Bench:

```
goos: linux
goarch: amd64
pkg: github.com/thanos-io/thanos/pkg/store
cpu: Intel(R) Core(TM) i9-10885H CPU @ 2.40GHz
             │   oldkway   │               newkway               │
             │   sec/op    │    sec/op     vs base               │
KWayMerge-16   2.292m ± 3%   2.075m ± 15%  -9.47% (p=0.023 n=10)

             │   oldkway    │               newkway               │
             │     B/op     │     B/op      vs base               │
KWayMerge-16   1.553Mi ± 0%   1.585Mi ± 0%  +2.04% (p=0.000 n=10)

             │   oldkway   │              newkway               │
             │  allocs/op  │  allocs/op   vs base               │
KWayMerge-16   27.26k ± 0%   26.27k ± 0%  -3.66% (p=0.000 n=10)
```

Signed-off-by: Giedrius Statkevičius <[email protected]> Signed-off-by: mluffman <[email protected]>
Batch TSDB Infos for bucket store for blocks with overlapping ranges. Signed-off-by: Michael Hoffmann <[email protected]> Signed-off-by: mluffman <[email protected]>
Signed-off-by: Michael Hoffmann <[email protected]> Signed-off-by: mluffman <[email protected]>
…io#7310) * Proxy: acceptance test for proxy store with replica labels Signed-off-by: Michael Hoffmann <[email protected]> * Stores: handle replica labels in label_value and label_names grpcs Signed-off-by: Michael Hoffmann <[email protected]> --------- Signed-off-by: Michael Hoffmann <[email protected]> Signed-off-by: mluffman <[email protected]>
Signed-off-by: Saswata Mukherjee <[email protected]> Signed-off-by: mluffman <[email protected]>
Signed-off-by: Saswata Mukherjee <[email protected]> Signed-off-by: mluffman <[email protected]>
Signed-off-by: Saswata Mukherjee <[email protected]> Signed-off-by: mluffman <[email protected]>
Signed-off-by: Kartikay <[email protected]> Signed-off-by: mluffman <[email protected]>
This commit adds a resource_attributes field to the OTLP tracing configuration. Signed-off-by: Filip Petkovski <[email protected]> Signed-off-by: mluffman <[email protected]>
Signed-off-by: Filip Petkovski <[email protected]> Signed-off-by: mluffman <[email protected]>
thanos-io#7301) Signed-off-by: mluffman <[email protected]>
For thanos-io#6775, it would be useful to know the exact block IDs to aid debugging. Signed-off-by: Giedrius Statkevičius <[email protected]> Signed-off-by: mluffman <[email protected]>
Signed-off-by: Saswata Mukherjee <[email protected]> Signed-off-by: mluffman <[email protected]>
Signed-off-by: Michael Hoffmann <[email protected]> Signed-off-by: mluffman <[email protected]>
Adding a minimal test case for issue thanos-io#6775 - reproduces the panic in the compactor. Signed-off-by: Giedrius Statkevičius <[email protected]> Signed-off-by: mluffman <[email protected]>
This commit adds a new tracing span for remotely delegated queries with attributes related to the query and remote engine. Signed-off-by: Filip Petkovski <[email protected]> Signed-off-by: mluffman <[email protected]>
Signed-off-by: Filip Petkovski <[email protected]> Signed-off-by: mluffman <[email protected]>
* Adding repro case for broken query with distributed engine Signed-off-by: Pedro Tanaka <[email protected]> * Fixing problem with distributed queries and xfunctions Signed-off-by: Pedro Tanaka <[email protected]> * Adding support for extended functions in tenancy enforcement Signed-off-by: Pedro Tanaka <[email protected]> * Moving custom parser to new package Signed-off-by: Pedro Tanaka <[email protected]> * fixing go-lint Signed-off-by: Pedro Tanaka <[email protected]> * Using same opts and reorganize imports Signed-off-by: Pedro Tanaka <[email protected]> * fixing problem with query format Signed-off-by: Pedro Tanaka <[email protected]> * fixing flaky tests Signed-off-by: Pedro Tanaka <[email protected]> * removing extra test Signed-off-by: Pedro Tanaka <[email protected]> * yet another flaky test Signed-off-by: Pedro Tanaka <[email protected]> --------- Signed-off-by: Pedro Tanaka <[email protected]> Signed-off-by: mluffman <[email protected]>
Signed-off-by: Vanshikav123 <[email protected]> Signed-off-by: mluffman <[email protected]>
* rule Signed-off-by: Vanshikav123 <[email protected]> * rule-changes Signed-off-by: Vanshikav123 <[email protected]> * prettier Signed-off-by: Vanshikav123 <[email protected]> * Rebuild Signed-off-by: Vanshikav123 <[email protected]> * changes after make react-app Signed-off-by: Vanshikav123 <[email protected]> --------- Signed-off-by: Vanshikav123 <[email protected]> Signed-off-by: mluffman <[email protected]>
Signed-off-by: Pedro Tanaka <[email protected]> Signed-off-by: mluffman <[email protected]>
When using the exemplars proxy to search for exemplars on receivers, if one receiver had tenants that did not match the selector on the external label it would get skipped completely even if it had a tenant that actually matched Signed-off-by: Pedro Tanaka <[email protected]> Signed-off-by: mluffman <[email protected]>
Signed-off-by: Pedro Tanaka <[email protected]> Signed-off-by: mluffman <[email protected]>
Signed-off-by: Pedro Tanaka <[email protected]> Signed-off-by: mluffman <[email protected]>
* Update minio-go to v7.0.70 Add support for EKS Pod Identity fix issue: thanos-io#7157 Signed-off-by: farhad <[email protected]> * Changelog - support for EKS Pod Identity Updated changelog Signed-off-by: farhad <[email protected]> --------- Signed-off-by: farhad <[email protected]> Signed-off-by: mluffman <[email protected]>
thanos-io#7338) * fixing extended functions support in more places Signed-off-by: Pedro Tanaka <[email protected]> * Adding new faillint for the Parse() method Signed-off-by: Pedro Tanaka <[email protected]> * Adding new method for ParseMetricSelector Signed-off-by: Pedro Tanaka <[email protected]> * Fixing missing imports Extending test to check behavior More missing imports Signed-off-by: Pedro Tanaka <[email protected]> * Fixing method name Signed-off-by: Pedro Tanaka <[email protected]> * Solving references to forbidden functions Signed-off-by: Pedro Tanaka <[email protected]> * Treating promql validation from ParseExpr Signed-off-by: Pedro Tanaka <[email protected]> * fixing funcs Signed-off-by: Pedro Tanaka <[email protected]> --------- Signed-off-by: Pedro Tanaka <[email protected]> Signed-off-by: Pedro Tanaka <[email protected]> Signed-off-by: mluffman <[email protected]>
Bumps [webpack](https://github.com/webpack/webpack) from 5.70.0 to 5.91.0. - [Release notes](https://github.com/webpack/webpack/releases) - [Commits](webpack/webpack@v5.70.0...v5.91.0) --- updated-dependencies: - dependency-name: webpack dependency-type: indirect ... Signed-off-by: dependabot[bot] <[email protected]> Signed-off-by: mluffman <[email protected]>
Signed-off-by: Filip Petkovski <[email protected]> Signed-off-by: mluffman <[email protected]>
* Align tenant pruning according to wall clock. Pruning a tenant currently acquires a lock on the tenant's TSDB, which blocks reads from incoming queries. We have noticed spikes in query latency when tenants get decommissioned since each receiver will prune the tenant at a different time. To reduce the window where queries get degraded, this commit makes sure that pruning happens at predictable intervals by aligning it to the wall clock, similar to how head compaction is aligned. The commit also changes the tenant deletion condition to look at the duration from the min time of the tenant, rather than from the last append time. Signed-off-by: Filip Petkovski <[email protected]> * Improve tests Signed-off-by: Filip Petkovski <[email protected]> --------- Signed-off-by: Filip Petkovski <[email protected]> Signed-off-by: mluffman <[email protected]>
Bumps [ip](https://github.com/indutny/node-ip) from 1.1.5 to 1.1.9. - [Commits](indutny/node-ip@v1.1.5...v1.1.9) --- updated-dependencies: - dependency-name: ip dependency-type: indirect ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Signed-off-by: mluffman <[email protected]>
…hanos-io#7348) Bumps [webpack-dev-middleware](https://github.com/webpack/webpack-dev-middleware) from 5.3.1 to 5.3.4. - [Release notes](https://github.com/webpack/webpack-dev-middleware/releases) - [Changelog](https://github.com/webpack/webpack-dev-middleware/blob/v5.3.4/CHANGELOG.md) - [Commits](webpack/webpack-dev-middleware@v5.3.1...v5.3.4) --- updated-dependencies: - dependency-name: webpack-dev-middleware dependency-type: indirect ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Signed-off-by: mluffman <[email protected]>
Signed-off-by: mluffman <[email protected]>
Signed-off-by: mluffman <[email protected]>
Signed-off-by: mluffman <[email protected]>
Changes
Adds a sidecar API with one endpoint, `/api/v1/flush`, which calls the TSDB snapshot endpoint on the Prometheus instance and then uploads every block in the snapshot that is not already present in object storage.

There are a few issues that explain the motivation:
Essentially, if this is the last time the sidecar will run (i.e. the cluster is being deleted, a shard is being removed, etc.), then without some flushing mechanism you will permanently lose up to 2 hours of data.
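The flow described above (take a snapshot, then upload only the blocks the bucket does not already have) can be illustrated with a minimal Go sketch. This is not the code from this PR: `parseSnapshotName` and `blocksToUpload` are hypothetical helper names, and the block IDs are placeholders. The JSON shape is the one Prometheus documents for its `POST /api/v1/admin/tsdb/snapshot` admin endpoint; the actual HTTP call and the object storage upload are left out.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// snapshotResponse mirrors the JSON body that Prometheus returns from
// POST /api/v1/admin/tsdb/snapshot, e.g.
// {"status":"success","data":{"name":"20171210T211224Z-2be650b6d019eb54"}}.
type snapshotResponse struct {
	Status string `json:"status"`
	Data   struct {
		Name string `json:"name"`
	} `json:"data"`
}

// parseSnapshotName (hypothetical helper) extracts the snapshot directory
// name from the response body and rejects non-success responses.
func parseSnapshotName(body []byte) (string, error) {
	var r snapshotResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return "", fmt.Errorf("decode snapshot response: %w", err)
	}
	if r.Status != "success" || r.Data.Name == "" {
		return "", fmt.Errorf("snapshot failed: status=%q", r.Status)
	}
	return r.Data.Name, nil
}

// blocksToUpload (hypothetical helper) returns, in sorted order, the block
// IDs found in the snapshot that are not already present in the bucket.
func blocksToUpload(inSnapshot, inBucket []string) []string {
	present := make(map[string]struct{}, len(inBucket))
	for _, id := range inBucket {
		present[id] = struct{}{}
	}
	var missing []string
	for _, id := range inSnapshot {
		if _, ok := present[id]; !ok {
			missing = append(missing, id)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	body := []byte(`{"status":"success","data":{"name":"20171210T211224Z-2be650b6d019eb54"}}`)
	name, err := parseSnapshotName(body)
	if err != nil {
		panic(err)
	}
	fmt.Println("snapshot:", name)

	// Placeholder block IDs; real ones would be the ULID directory names
	// inside the snapshot directory and the bucket.
	fmt.Println("upload:", blocksToUpload(
		[]string{"blockA", "blockB", "blockC"},
		[]string{"blockB"},
	))
}
```

Skipping blocks that already exist in the bucket is what would make such a flush safe to call repeatedly, since blocks uploaded by the regular shipping path are not uploaded a second time.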
Verification
Besides the unit tests, running Prometheus locally and calling the endpoint works as expected.