(feat) central cluster ADR #1

Open · wants to merge 5 commits into `main`
@@ -35,6 +35,10 @@ During development the question was raised whether it is a good decision/archi
> User story: Customer onboards a newly created cluster and requires an Ingress to expose applications. Via Greenhouse the Ingress Plugin can be configured which results in a deployment of the ingress controller within the customer cluster.
> The PluginConfig and the dashboard reflect the current status of relevant underlying resources.

## Related Decision Records

Superseded by [Greenhouse-ADR-6-central_cluster.md](Greenhouse-ADR-6-central_cluster.md)

## Decision Drivers

* Should work with/focus on the applications in scope for the MVP
@@ -0,0 +1,78 @@
# ADR-6 Central cluster

## Decision Contributors

- Arno Uhlig
- Ivo Gosemann
- David Rochow
- Martin Vossen
- David Gogl
- Fabian Ruff
- Richard Tief
- Tommy Sauer
- Timo Johner

## Status

- Proposed

## Context and Problem Statement

The central cluster in Greenhouse hosts non-organization specific core components as well as organization-specific metadata and configuration.
Organizations are isolated by namespaces, and permissions (RBAC) are restricted to Greenhouse resources.
Granting more permissions would increase the attack surface and introduce additional risks.
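
As an illustration of this isolation model, a per-organization `Role` could look roughly like the sketch below. The API group and resource names are assumptions for illustration, not taken from the actual Greenhouse manifests:

```yaml
# Hypothetical Role for an organization namespace: permissions are
# limited to Greenhouse resources; no access to pods, secrets or other
# core resources is granted.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: org-admin
  namespace: my-organization          # one namespace per organization
rules:
  - apiGroups: ["greenhouse.sap"]     # assumed Greenhouse API group
    resources: ["pluginconfigs", "teams", "clusters"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
```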

Another aspect to consider is billing.
The shared nature of the central cluster and underlying infrastructure does not allow tenant-specific measurement and billing of consumed resources.
Thus, workload in the central cluster is charged to the provider.

Moreover, workload within the central cluster is neither transparent nor accessible to the customer.
It cannot be configured, its metrics, logs, etc. are not exposed and access (kubectl exec/delete pod) is restricted.


**Suggested change:**
- It cannot be configured, its metrics, logs, etc. are not exposed and access (kubectl exec/delete pod) is restricted.
+ It cannot be configured, and its metrics and logs are not exposed. Access to operations like 'kubectl exec' or 'kubectl delete pods' is restricted in the central cluster.

Thus, operating all workload within the central cluster falls to the provider.

From a network perspective and as documented in the security concept, communication is only uni-directional from the central to the remote clusters.
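
The document does not prescribe how this is enforced; as an assumed illustration, a default-deny ingress policy per organization namespace would ensure that connections can only be initiated from the central cluster outwards:

```yaml
# Illustrative default-deny: no inbound traffic reaches workloads in an
# organization namespace, so traffic flows only from the central cluster
# towards the remote clusters (requires a CNI that enforces NetworkPolicy).
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all-ingress
  namespace: my-organization
spec:
  podSelector: {}        # applies to all pods in the namespace
  policyTypes:
    - Ingress            # no ingress rules listed, so all inbound is denied
```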

Currently, the central Prometheus Alertmanager (AM) is being run within the central cluster for each organization as part of the alerts plugin.
Since Prometheus servers push alerts to the AM, it is exposed via an ingress resource, including TLS certificates and DNS records.
While this contributes to simplicity and ease of use, it violates the security concept and introduces additional costs for the provider.
Moreover, it assumes the network zone of the central Greenhouse cluster is a good fit across all organizations and cloud providers.
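
For illustration, the current per-organization exposure amounts to roughly the Ingress below; the hostname, issuer annotation and service name are placeholders, not the actual configuration:

```yaml
# Illustrative Ingress exposing a per-organization Alertmanager so that
# Prometheus servers in remote clusters can push alerts inbound --
# precisely the inbound path the security concept disallows.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: alertmanager
  namespace: my-organization
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt   # assumed TLS certificate handling
spec:
  tls:
    - hosts:
        - alertmanager.my-organization.example.com
      secretName: alertmanager-tls
  rules:
    - host: alertmanager.my-organization.example.com  # requires a public DNS record
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: alertmanager        # Alertmanager service (placeholder)
                port:
                  number: 9093            # Alertmanager default port
```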

Use cases being:
1) Prometheus Alertmanager for holistic alerting capabilities
2) Thanos query and ruler component for organization-wide access to decentralized metric stores
3) Grafana/Plutono for holistic dashboards
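
As a sketch for use case 2, a single Thanos Query layer could fan out to decentralized metric stores in the onboarded clusters; the image version and store endpoints below are placeholders:

```yaml
# Illustrative Thanos Query deployment: one query layer fanning out to
# per-cluster metric stores (endpoints are placeholders).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: thanos-query
  namespace: my-organization
spec:
  replicas: 1
  selector:
    matchLabels:
      app: thanos-query
  template:
    metadata:
      labels:
        app: thanos-query
    spec:
      containers:
        - name: thanos-query
          image: quay.io/thanos/thanos:v0.34.1
          args:
            - query
            - --http-address=0.0.0.0:10902
            # each onboarded cluster exposes a store endpoint (placeholders)
            - --endpoint=thanos-store.cluster-a.example.com:10901
            - --endpoint=thanos-store.cluster-b.example.com:10901
```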

## Related Decision Records

Supersedes [Greenhouse-ADR-3-location_of_plugins.md](Greenhouse-ADR-3-location_of_plugins.md)

## Decision Drivers

* **Network Compatibility**
The current approach assumes that the network zone of the central Greenhouse cluster is suitable for all organizations and cloud providers.

> **Member Author:** Explicit mention: this would enable use cases residing in different hyperscalers.


* **Security aspects**
Increased permissions and capabilities enlarge the attack surface, introducing risks.

* **Operational concerns**
User-configurable workloads in the central cluster are not transparent to customers and must be managed by the Greenhouse team.

* **Billing**
Tenant-specific resources must be charged to the respective tenant.

* **Ease of use**
Greenhouse should offer an easy way to manage operational aspects with a low entry barrier.

## Decision

* No user-configurable plugins should be allowed in the Greenhouse central cluster.
* Maintain restrictive permissions within the central cluster limited to Greenhouse resources.
* Introduce `AdminPlugins` to utilize the plugin concept for handling core responsibilities.

> **Contributor:** Adding context from Slack DMs: AdminPlugins in this case could be Plugins such as IdP integration, Cluster Registry, Greenhouse Teams to Slack syncing, etc. These would all be Plugins which are close to the backend (e.g. use Greenhouse CRDs) but are developed separately from the Core Operators.

> **Member Author (@auhlig, Jul 11, 2024):** TODO: Sharpen the definition of AdminPlugins. Examples: kubeconfig generator, CAM integration; things not directly configurable by the user.

They cannot be configured by a user and are fully managed by Greenhouse (see the sketch after this list).
* A customer has to onboard at least one cluster to instantiate plugins with a backend.
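
To make the `AdminPlugins` decision more tangible, a hypothetical resource is sketched below; the API group, version and fields are assumptions and not part of this decision:

```yaml
# Hypothetical AdminPlugin: instantiated and managed by the Greenhouse
# team only, never configurable by organization users.
apiVersion: greenhouse.sap/v1alpha1     # assumed API group/version
kind: AdminPlugin
metadata:
  name: kubeconfig-generator            # example from the review discussion
  namespace: greenhouse
spec:
  # reference to a plugin definition handling a core responsibility,
  # e.g. IdP integration, cluster registry or kubeconfig generation
  pluginDefinition: kubeconfig-generator
  disabled: false
```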

---

## Evaluated options, technical details, etc.

N/A