6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -4,6 +4,12 @@ All notable changes to this project will be documented in this file.

## [Unreleased]

### Added

- Added permissions required by Topology Provider ([#738]).

[#738]: https://github.com/stackabletech/hdfs-operator/pull/738

## [25.11.0] - 2025-11-07

## [25.11.0-rc1] - 2025-11-06
9 changes: 9 additions & 0 deletions deploy/helm/hdfs-operator/templates/roles.yaml
@@ -206,10 +206,19 @@ rules:
    verbs:
      - get
      - list
      # needed for pod informer
      - watch
  - apiGroups:
      - listeners.stackable.tech
    resources:
      - listeners
    verbs:
      - get
      - list
  # needed to query the crd version (v1alpha1 etc.) before fetching listeners
  - apiGroups:
      - apiextensions.k8s.io
    resources:
      - customresourcedefinitions
    verbs:
      - get
@@ -1,6 +1,6 @@
= HDFS Rack Awareness
:rack-awareness-docs: https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/RackAwareness.html
:hdfs-topology-provider: https://github.com/stackabletech/hdfs-topology-provider
:hdfs-topology-provider: https://github.com/stackabletech/hdfs-utils/blob/main/src/main/java/tech/stackable/hadoop/StackableTopologyProvider.java

{rack-awareness-docs}[Rack awareness] is a feature in Apache Hadoop that allows users to define a cluster's node topology.
Hadoop uses that topology to distribute block replicas in a way that maximizes fault tolerance.
@@ -35,3 +35,7 @@ This creates an internal topology label by combining the values of the `topology
To gather this information, the Hadoop images include the {hdfs-topology-provider}[hdfs-topology-provider] on the classpath, which can be configured to read labels from Kubernetes objects.

The operator deploys ClusterRoles and ServiceAccounts with the relevant RBAC rules to allow the Hadoop Pod to access the necessary Kubernetes objects.
Topologies and other metadata, such as Node and Pod IPs and endpoints, are held in separate caches so that they can be refreshed independently of one another.
The {hdfs-topology-provider}[hdfs-topology-provider] is namespace-scoped: Pods in the active namespace are watched so that changes can be propagated to the internal cache, which minimises cache misses.
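
The idea of keeping topologies and address metadata in separate caches can be illustrated with a minimal sketch.
This is not the topology provider's actual code (which is written in Java); the `TtlCache` class, the injectable clock and the lifetimes mentioned below are all assumptions made purely for illustration.

```python
import time
from typing import Callable, Dict, Generic, Optional, Tuple, TypeVar

K = TypeVar("K")
V = TypeVar("V")


class TtlCache(Generic[K, V]):
    """Hypothetical time-based cache: an entry expires ttl_seconds after it is stored."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so that expiry can be tested deterministically
        self._entries: Dict[K, Tuple[float, V]] = {}

    def put(self, key: K, value: V) -> None:
        # Remember when the value was stored so that get() can judge its age.
        self._entries[key] = (self._clock(), value)

    def get(self, key: K) -> Optional[V]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            # Expired: drop the entry and report a miss so the caller re-fetches it.
            del self._entries[key]
            return None
        return value

    def invalidate(self, key: K) -> None:
        # Called when a watch event reports that the underlying object changed.
        self._entries.pop(key, None)
```

With one instance per kind of data (for example a long-lived cache for topology labels and a short-lived one for Pod IPs), an expiry or a watch event that invalidates an address leaves the cached topology untouched, which is the independence the text describes.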

NOTE: Rack awareness will not work on clusters such as `kind` or `k3s` that configure IP-masquerading differently to production-ready distributions.