Commit 903249c

razvan and sbernauer authored
feat: automatic cluster detection (#1068)

* added kubelet.rs
* fetch kubelet config when initializing operators
* deserialize proxy response once
* don't fetch cluster domain from kubelet if the user has set it already
* remove comment with typo
* revert unintended auto-format
* Apply suggestions from code review
* better error messages
* move kubelet query to cluster_info mod
* review feedback
* Update crates/stackable-operator/CHANGELOG.md
* fix md lint

---------

Co-authored-by: Sebastian Bernauer <sebastian.bernauer@stackable.de>

1 parent a8e9dbc commit 903249c

8 files changed: +127 -17 lines changed

Cargo.lock

Lines changed: 1 addition & 0 deletions
Some generated files are not rendered by default.

Cargo.toml

Lines changed: 1 addition & 0 deletions
@@ -25,6 +25,7 @@ educe = { version = "0.6.0", default-features = false, features = ["Clone", "De
 either = "1.13.0"
 futures = "0.3.30"
 futures-util = "0.3.30"
+http = "1.3.1"
 indexmap = "2.5.0"
 indoc = "2.0.6"
 insta = { version= "1.40", features = ["glob"] }

crates/stackable-operator/CHANGELOG.md

Lines changed: 22 additions & 4 deletions
@@ -4,6 +4,23 @@ All notable changes to this project will be documented in this file.

 ## [Unreleased]

+### Added
+
+- The default Kubernetes cluster domain name is now fetched from the kubelet API unless explicitly configured ([#1068])
+  This requires operators to have the RBAC permission to `get` `nodes/proxy` in the apiGroup "", an example RBAC rule could look like:
+
+  ```yaml
+  ---
+  apiVersion: rbac.authorization.k8s.io/v1
+  kind: ClusterRole
+  metadata:
+    name: operator-cluster-role
+  rules:
+  - apiGroups: [""]
+    resources: [nodes/proxy]
+    verbs: [get]
+  ```
+
 ### Changed

 - Update `kube` to `1.1.0` ([#1049]).
@@ -23,6 +40,7 @@ All notable changes to this project will be documented in this file.
 [#1058]: https://github.com/stackabletech/operator-rs/pull/1058
 [#1060]: https://github.com/stackabletech/operator-rs/pull/1060
 [#1064]: https://github.com/stackabletech/operator-rs/pull/1064
+[#1068]: https://github.com/stackabletech/operator-rs/pull/1068

 ## [0.93.2] - 2025-05-26

@@ -148,7 +166,7 @@ All notable changes to this project will be documented in this file.
 ### Added

 - Add Deployments to `ClusterResource`s ([#992]).
-- Add `DeploymentConditionBuilder` ([#993]).
+- Add `DeploymentConditionBuilder` ([#993]).

 ### Changed

@@ -369,7 +387,7 @@ All notable changes to this project will be documented in this file.
 ### Fixed

 - BREAKING: `KeyValuePairs::insert` (as well as `Labels::`/`Annotations::` via it) now overwrites
-  the old value if the key already exists. Previously, `iter()` would return *both* values in
+  the old value if the key already exists. Previously, `iter()` would return _both_ values in
   lexicographical order (causing further conversions like `Into<BTreeMap>` to prefer the maximum
   value) ([#888]).

@@ -634,7 +652,7 @@ All notable changes to this project will be documented in this file.

 ### Changed

-- Implement `PartialEq` for most *Snafu* Error enums ([#757]).
+- Implement `PartialEq` for most _Snafu_ Error enums ([#757]).
 - Update Rust to 1.77 ([#759])

 ### Fixed
@@ -1385,7 +1403,7 @@ This is a rerelease of 0.25.1 which some last-minute incompatible API changes to
 ### Changed

 - Objects are now streamed rather than polled when waiting for them to be deleted ([#452]).
-- serde\_yaml 0.8.26 -> 0.9.9 ([#450])
+- serde_yaml 0.8.26 -> 0.9.9 ([#450])

 [#450]: https://github.com/stackabletech/operator-rs/pull/450
 [#452]: https://github.com/stackabletech/operator-rs/pull/452

crates/stackable-operator/Cargo.toml

Lines changed: 1 addition & 0 deletions
@@ -28,6 +28,7 @@ dockerfile-parser.workspace = true
 either.workspace = true
 educe.workspace = true
 futures.workspace = true
+http.workspace = true
 indexmap.workspace = true
 json-patch.workspace = true
 k8s-openapi.workspace = true

crates/stackable-operator/src/client.rs

Lines changed: 8 additions & 1 deletion
@@ -84,6 +84,11 @@ pub enum Error {

     #[snafu(display("unable to create kubernetes client"))]
     CreateKubeClient { source: kube::Error },
+
+    #[snafu(display("unable to fetch cluster information from kubelet"))]
+    NewKubeletClusterInfo {
+        source: crate::utils::cluster_info::Error,
+    },
 }

 /// This `Client` can be used to access Kubernetes.
@@ -651,7 +656,9 @@ pub async fn initialize_operator(
         .context(InferKubeConfigSnafu)?;
     let default_namespace = kubeconfig.default_namespace.clone();
     let client = kube::Client::try_from(kubeconfig).context(CreateKubeClientSnafu)?;
-    let cluster_info = KubernetesClusterInfo::new(cluster_info_opts);
+    let cluster_info = KubernetesClusterInfo::new(&client, cluster_info_opts)
+        .await
+        .context(NewKubeletClusterInfoSnafu)?;

     Ok(Client::new(
         client,
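
The new `NewKubeletClusterInfo` variant wraps the lower-level error as its `source`, so a startup failure reports both layers. The diff does this with `snafu`. The following is only a std-only sketch of the same idea: the two struct names and the `error_chain` helper are hypothetical stand-ins, and only the message strings are taken from the diff.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical stand-in for the inner "kubelet config" error in the diff.
#[derive(Debug)]
pub struct KubeletConfigError;

impl fmt::Display for KubeletConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("unable to fetch kubelet config")
    }
}

impl Error for KubeletConfigError {}

/// Hypothetical stand-in for the outer `NewKubeletClusterInfo` variant.
#[derive(Debug)]
pub struct NewKubeletClusterInfoError {
    pub source: KubeletConfigError,
}

impl fmt::Display for NewKubeletClusterInfoError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("unable to fetch cluster information from kubelet")
    }
}

impl Error for NewKubeletClusterInfoError {
    // Expose the wrapped error so callers can walk the chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}

/// Collects the message of `err` and of every error reachable via `source()`.
pub fn error_chain(err: &(dyn Error + 'static)) -> Vec<String> {
    let mut messages = vec![err.to_string()];
    let mut current = err.source();
    while let Some(source) = current {
        messages.push(source.to_string());
        current = source.source();
    }
    messages
}
```

Calling `error_chain` on the outer error yields the outer message first, followed by the wrapped one, which is roughly what an operator log would show when the kubelet query fails.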
crates/stackable-operator/src/utils/cluster_info.rs

Lines changed: 19 additions & 12 deletions
@@ -1,10 +1,14 @@
-use std::str::FromStr;
+use kube::Client;
+use snafu::{ResultExt, Snafu};

-use crate::commons::networking::DomainName;
+use crate::{commons::networking::DomainName, utils::kubelet};

-const KUBERNETES_CLUSTER_DOMAIN_DEFAULT: &str = "cluster.local";
+#[derive(Debug, Snafu)]
+pub enum Error {
+    #[snafu(display("unable to fetch kubelet config"))]
+    KubeletConfig { source: kubelet::Error },
+}

-/// Some information that we know about the Kubernetes cluster.
 #[derive(Debug, Clone)]
 pub struct KubernetesClusterInfo {
     /// The Kubernetes cluster domain, typically `cluster.local`.
@@ -21,25 +25,28 @@ pub struct KubernetesClusterInfoOpts {
 }

 impl KubernetesClusterInfo {
-    pub fn new(cluster_info_opts: &KubernetesClusterInfoOpts) -> Self {
+    pub async fn new(
+        client: &Client,
+        cluster_info_opts: &KubernetesClusterInfoOpts,
+    ) -> Result<Self, Error> {
         let cluster_domain = match &cluster_info_opts.kubernetes_cluster_domain {
             Some(cluster_domain) => {
                 tracing::info!(%cluster_domain, "Using configured Kubernetes cluster domain");

                 cluster_domain.clone()
             }
             None => {
-                // TODO(sbernauer): Do some sort of advanced auto-detection, see https://github.com/stackabletech/issues/issues/436.
-                // There have been attempts of parsing the `/etc/resolv.conf`, but they have been
-                // reverted. Please read on the linked issue for details.
-                let cluster_domain = DomainName::from_str(KUBERNETES_CLUSTER_DOMAIN_DEFAULT)
-                    .expect("KUBERNETES_CLUSTER_DOMAIN_DEFAULT constant must a valid domain");
-                tracing::info!(%cluster_domain, "Defaulting Kubernetes cluster domain as it has not been configured");
+                let kubelet_config = kubelet::KubeletConfig::fetch(client)
+                    .await
+                    .context(KubeletConfigSnafu)?;
+
+                let cluster_domain = kubelet_config.cluster_domain;
+                tracing::info!(%cluster_domain, "Using Kubernetes cluster domain from the kubelet config");

                 cluster_domain
             }
         };

-        Self { cluster_domain }
+        Ok(Self { cluster_domain })
     }
 }
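
The precedence in the hunk above (an explicitly configured domain wins, and the kubelet is only queried otherwise) can be sketched without a cluster. This is a simplified, synchronous illustration, not the crate's API: `resolve_cluster_domain`, `FetchError`, and the closure standing in for the kubelet query are all hypothetical names.

```rust
use std::fmt;

/// Hypothetical stand-in for the error returned when the kubelet query fails.
#[derive(Debug, PartialEq)]
pub struct FetchError(pub String);

impl fmt::Display for FetchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "unable to fetch kubelet config: {}", self.0)
    }
}

impl std::error::Error for FetchError {}

/// Returns the configured domain if one is set. Only when none is configured
/// is `fetch` (a stand-in for the kubelet query) called, and its error is
/// propagated instead of falling back to a hard-coded default.
pub fn resolve_cluster_domain<F>(
    configured: Option<&str>,
    fetch: F,
) -> Result<String, FetchError>
where
    F: FnOnce() -> Result<String, FetchError>,
{
    match configured {
        Some(domain) => Ok(domain.to_owned()),
        None => fetch(),
    }
}
```

Passing the lookup as a closure keeps the sketch testable: a caller can verify that the lookup is skipped entirely when a domain is configured, which matches the commit message item "don't fetch cluster domain from kubelet if the user has set it already".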
crates/stackable-operator/src/utils/kubelet.rs

Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
+use http;
+use k8s_openapi::api::core::v1::Node;
+use kube::{
+    Api,
+    api::{ListParams, ResourceExt},
+    client::Client,
+};
+use serde::Deserialize;
+use snafu::{OptionExt, ResultExt, Snafu};
+
+use crate::commons::networking::DomainName;
+
+#[derive(Debug, Snafu)]
+pub enum Error {
+    #[snafu(display("failed to list nodes"))]
+    ListNodes { source: kube::Error },
+
+    #[snafu(display("failed to build request for url path \"{url_path}\""))]
+    BuildConfigzRequest {
+        source: http::Error,
+        url_path: String,
+    },
+
+    #[snafu(display("failed to fetch kubelet config from node {node:?}"))]
+    FetchNodeKubeletConfig { source: kube::Error, node: String },
+
+    #[snafu(display("failed to fetch `kubeletconfig` JSON key from configz response"))]
+    KubeletConfigJsonKey,
+
+    #[snafu(display("failed to deserialize kubelet config JSON"))]
+    KubeletConfigJson { source: serde_json::Error },
+
+    #[snafu(display(
+        "empty Kubernetes nodes list. At least one node is required to fetch the cluster domain from the kubelet config"
+    ))]
+    EmptyKubernetesNodesList,
+}
+
+#[derive(Debug, Deserialize)]
+#[serde(rename_all = "camelCase")]
+struct ProxyConfigResponse {
+    kubeletconfig: KubeletConfig,
+}
+
+#[derive(Debug, Deserialize)]
+#[serde(rename_all = "camelCase")]
+pub struct KubeletConfig {
+    pub cluster_domain: DomainName,
+}
+
+impl KubeletConfig {
+    /// Fetches the kubelet configuration from the "first" node in the Kubernetes cluster.
+    pub async fn fetch(client: &Client) -> Result<Self, Error> {
+        let api: Api<Node> = Api::all(client.clone());
+        let nodes = api
+            .list(&ListParams::default())
+            .await
+            .context(ListNodesSnafu)?;
+        let node = nodes.iter().next().context(EmptyKubernetesNodesListSnafu)?;
+        let node_name = node.name_any();
+
+        let url_path = format!("/api/v1/nodes/{node_name}/proxy/configz");
+        let req = http::Request::get(url_path.clone())
+            .body(Default::default())
+            .context(BuildConfigzRequestSnafu { url_path })?;
+
+        let resp = client
+            .request::<ProxyConfigResponse>(req)
+            .await
+            .context(FetchNodeKubeletConfigSnafu { node: node_name })?;
+
+        Ok(resp.kubeletconfig)
+    }
+}
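
The node selection and URL construction in `KubeletConfig::fetch` can be looked at in isolation. The sketch below is std-only and makes no API calls: `configz_path_for_first_node` and `EmptyNodesList` are hypothetical names, and node names are passed in as plain strings instead of being listed from the cluster.

```rust
/// Hypothetical stand-in for the `EmptyKubernetesNodesList` error in the diff.
#[derive(Debug, PartialEq)]
pub struct EmptyNodesList;

/// Picks the first node name and builds the API server proxy path to that
/// node's kubelet `configz` endpoint, using the same path format as the diff.
pub fn configz_path_for_first_node(node_names: &[&str]) -> Result<String, EmptyNodesList> {
    // An empty list is an error: without a node there is no kubelet to ask.
    let node_name = node_names.first().ok_or(EmptyNodesList)?;
    Ok(format!("/api/v1/nodes/{node_name}/proxy/configz"))
}
```

Because the request goes through the `nodes/proxy` subresource, this path is what the RBAC rule in the CHANGELOG entry (`get` on `nodes/proxy`) has to permit.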

crates/stackable-operator/src/utils/mod.rs

Lines changed: 1 addition & 0 deletions
@@ -1,6 +1,7 @@
 pub mod bash;
 pub mod cluster_info;
 pub mod crds;
+pub mod kubelet;
 pub mod logging;
 mod option;
 mod url;
