Ideal state instance partitions metadata #17515
Codecov Report
@@ Coverage Diff @@
## master #17515 +/- ##
============================================
- Coverage 63.25% 63.18% -0.07%
+ Complexity 1477 1476 -1
============================================
Files 3170 3172 +2
Lines 189469 189919 +450
Branches 28988 29063 +75
============================================
+ Hits 119840 120002 +162
- Misses 60339 60590 +251
- Partials 9290 9327 +37
…ed during segment assignment
public TableRebalancer(HelixManager helixManager) {
-  this(helixManager, null, null, null, null, null);
+  this(helixManager, null, null, null, null, null, true);
Should this be true or false?
"Cannot rebalance disabled table without downtime", null, null, null, null, null);
}

// Wipe out ideal state instance partitions metadata
We shouldn't wipe it until a rebalance is actually required.
E.g. when segmentAssignmentUnchanged is true, we should check whether the instance partitions changed and modify the metadata accordingly.
If we wipe it here and the following code throws an exception, we might end up with an IS without instance partitions.
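The ordering suggested above can be sketched as a small decision function that runs before anything is mutated. This is only an illustration, not the PR's code: `MetadataUpdateDecision`, `Action`, and the map-based stand-in for instance partitions are all hypothetical.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: decide what to do with the instance partitions metadata before mutating anything. */
final class MetadataUpdateDecision {
  enum Action { NO_OP, UPDATE_METADATA_ONLY, REBALANCE }

  /**
   * @param segmentAssignmentUnchanged whether the target segment assignment equals the current one
   * @param storedPartitions instance partitions currently recorded in the ideal state (stand-in type)
   * @param targetPartitions instance partitions computed for this run (stand-in type)
   */
  static Action decide(boolean segmentAssignmentUnchanged,
      Map<String, List<String>> storedPartitions, Map<String, List<String>> targetPartitions) {
    if (!segmentAssignmentUnchanged) {
      // Segments must move: only now would the metadata be wiped, as part of the rebalance itself.
      return Action.REBALANCE;
    }
    // Nothing moves, so touch the metadata only if it is out of date.
    return storedPartitions.equals(targetPartitions) ? Action.NO_OP : Action.UPDATE_METADATA_ONLY;
  }

  public static void main(String[] args) {
    Map<String, List<String>> stored = Map.of("0_0", List.of("server1"));
    Map<String, List<String>> target = Map.of("0_0", List.of("server2"));
    System.out.println(decide(true, stored, stored));
    System.out.println(decide(true, stored, target));
    System.out.println(decide(false, stored, target));
  }
}
```

Because the decision is made up front, an exception thrown later in the rebalance path cannot leave the ideal state with its metadata already removed.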
Map<String, List<String>> idealStateListFields = currentIdealState.getRecord().getListFields();
InstancePartitionsUtils.replaceInstancePartitionsInIdealState(currentIdealState, instancePartitionsList);

return HelixHelper.updateIdealState(_helixManager, tableNameWithType, is -> {
We shouldn't retry here. The update needs to be a version-checked update to ensure the consistency of the IS.
We should wipe the IP with the first IS change and restore it with the last IS change. Replacing the IP as a separate step can cause inconsistency.
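To make the two points above concrete, here is a minimal sketch of a version-checked write that never retries on its own, with the metadata wipe and restore folded into the first and last segment changes. It uses a hypothetical in-memory class (`VersionedIdealStateSketch`) and a hypothetical `IP_KEY`; it does not call any Helix API and is not how the PR implements it.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for a versioned ideal state; it does not use any Helix API. */
final class VersionedIdealStateSketch {
  // Hypothetical key under which the instance partitions metadata is stored.
  static final String IP_KEY = "instancePartitions";

  private Map<String, String> _fields = new HashMap<>();
  private int _version = 0;

  synchronized int getVersion() {
    return _version;
  }

  synchronized Map<String, String> read() {
    return new HashMap<>(_fields);
  }

  /** Applies the write only if nobody else wrote since expectedVersion; never retries by itself. */
  synchronized boolean writeIfVersionMatches(int expectedVersion, Map<String, String> newFields) {
    if (expectedVersion != _version) {
      return false;
    }
    _fields = new HashMap<>(newFields);
    _version++;
    return true;
  }

  public static void main(String[] args) {
    VersionedIdealStateSketch idealState = new VersionedIdealStateSketch();
    Map<String, String> initial = new HashMap<>();
    initial.put("segment1", "server1");
    initial.put(IP_KEY, "old");
    idealState.writeIfVersionMatches(0, initial);

    // First step: change the segment assignment and wipe the metadata in the same versioned write.
    int version = idealState.getVersion();
    Map<String, String> firstStep = idealState.read();
    firstStep.put("segment1", "server1,server2");
    firstStep.remove(IP_KEY);
    boolean firstOk = idealState.writeIfVersionMatches(version, firstStep);

    // Last step: finish the move and restore the metadata in the same versioned write.
    version = idealState.getVersion();
    Map<String, String> lastStep = idealState.read();
    lastStep.put("segment1", "server2");
    lastStep.put(IP_KEY, "new");
    boolean lastOk = idealState.writeIfVersionMatches(version, lastStep);

    // A writer still holding the old version is rejected rather than silently retried.
    boolean staleOk = idealState.writeIfVersionMatches(version, firstStep);
    System.out.println(firstOk + " " + lastOk + " " + staleOk);
  }
}
```

Since every write carries both the segment change and the metadata change, no reader can observe one without the other, and a rejected write surfaces the conflict to the caller.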
if (_updateIdealStateInstancePartitions) {
  // Rebalance completed successfully, so we can update the instance partitions in the ideal state to reflect
  // the new set of instance partitions.
  List<InstancePartitions> instancePartitionsList = new ArrayList<>(instancePartitionsMap.values());
Consider making the order of this list deterministic, so that we can check whether it is identical to the existing one.
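One way to get a deterministic order is to sort by key before taking the values. The sketch below is an assumption about how that could look: `DeterministicOrderSketch` and `valuesSortedByKey` are hypothetical names, and plain strings stand in for `InstancePartitions`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: derive a list whose order does not depend on the map implementation. */
final class DeterministicOrderSketch {
  /** Returns the map's values ordered by key, so two equal maps always yield equal lists. */
  static <V> List<V> valuesSortedByKey(Map<String, V> map) {
    // TreeMap orders entries by the natural ordering of the String keys.
    return new ArrayList<>(new TreeMap<>(map).values());
  }

  public static void main(String[] args) {
    Map<String, String> byName = new HashMap<>();
    byName.put("table_OFFLINE", "offline partitions");
    byName.put("table_CONSUMING", "consuming partitions");
    System.out.println(valuesSortedByKey(byName));
  }
}
```

With a stable order, the new list can be compared to the stored one with a plain `equals` call.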
Map<String, List<String>> idealStateListFields = idealState.getRecord().getListFields();
for (InstancePartitions instancePartitions : instancePartitionsList) {
  String instancePartitionsName = instancePartitions.getInstancePartitionsName();
  for (String partitionReplica : instancePartitions.getPartitionToInstancesMap().keySet()) {
(minor) We can iterate over entrySet() to avoid the extra map lookup.
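A minimal sketch of the entrySet() variant, using plain maps instead of the PR's types. The class name, the method name, and the `"<name>/<partitionReplica>"` key layout are assumptions made for illustration; the PR's actual key format may differ.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: copy partition-to-instances entries into list fields without per-key lookups. */
final class EntrySetSketch {
  // Assumed key layout "<instancePartitionsName>/<partitionReplica>", chosen only for this example.
  static Map<String, List<String>> toListFields(String instancePartitionsName,
      Map<String, List<String>> partitionToInstancesMap) {
    Map<String, List<String>> listFields = new LinkedHashMap<>();
    // entrySet() yields key and value together, so no get(key) call is needed inside the loop.
    for (Map.Entry<String, List<String>> entry : partitionToInstancesMap.entrySet()) {
      listFields.put(instancePartitionsName + "/" + entry.getKey(), entry.getValue());
    }
    return listFields;
  }

  public static void main(String[] args) {
    Map<String, List<String>> partitionToInstances = Map.of("0_0", List.of("server1", "server2"));
    System.out.println(toListFields("table_OFFLINE", partitionToInstances));
  }
}
```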
Integer replicaGroup = Integer.parseInt(key.substring(separatorIndex + 1));
listFields.getValue().forEach(value -> {
  if (serverToReplicaGroupMap.containsKey(value)) {
    LOGGER.warn("Server {} assigned to multiple replica groups ({}, {})", value, replicaGroup,
Is it possible for one server to be assigned to multiple replicas? If so, will this break routing?
Should we consider throwing an exception and falling back when this happens?
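The fail-fast alternative raised here could look roughly like the sketch below. It assumes keys end in `_<replicaGroup>` (as in the "0_0" example elsewhere in this PR) and treats only assignment to two different replica groups as a conflict; `ReplicaGroupConflictSketch` and its method are hypothetical, and the fallback in `main` is a placeholder.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: fail fast when a server shows up in two different replica groups. */
final class ReplicaGroupConflictSketch {
  /**
   * Builds a server to replica group map from keys assumed to end in "_<replicaGroup>", e.g. "0_0".
   * Throws instead of logging when a server appears in two different replica groups.
   */
  static Map<String, Integer> buildServerToReplicaGroupMap(Map<String, List<String>> listFields) {
    Map<String, Integer> serverToReplicaGroupMap = new HashMap<>();
    for (Map.Entry<String, List<String>> entry : listFields.entrySet()) {
      String key = entry.getKey();
      int separatorIndex = key.lastIndexOf('_');
      Integer replicaGroup = Integer.parseInt(key.substring(separatorIndex + 1));
      for (String server : entry.getValue()) {
        // putIfAbsent returns the existing mapping, if any, without overwriting it.
        Integer previous = serverToReplicaGroupMap.putIfAbsent(server, replicaGroup);
        if (previous != null && !previous.equals(replicaGroup)) {
          throw new IllegalStateException(
              "Server " + server + " assigned to multiple replica groups (" + previous + ", " + replicaGroup + ")");
        }
      }
    }
    return serverToReplicaGroupMap;
  }

  public static void main(String[] args) {
    Map<String, List<String>> conflicting = Map.of("0_0", List.of("server1"), "0_1", List.of("server1"));
    try {
      buildServerToReplicaGroupMap(conflicting);
    } catch (IllegalStateException e) {
      // Placeholder for falling back to routing that ignores the ideal state metadata.
      System.out.println("Falling back: " + e.getMessage());
    }
  }
}
```

Throwing lets the caller choose the fallback explicitly, rather than continuing with a map in which one of the two assignments was silently dropped.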
private final String _instance;
private final boolean _online;
private final int _pool;
private final int _replicaGroupId;
(minor) Rename it to _replicaId
/// Given a partition ID and replica group ID like "0_0", return the list of instances belonging to that instance
/// partition
public List<String> getInstances(String partitionReplica) {
(minor) This method is intentionally not provided, to reduce map access. Callers should use entrySet() of _partitionToInstancesMap instead of looking up each key.
for (InstancePartitions instancePartitions : instancePartitionsMap.values()) {
  if (!instancePartitions.equals(
      idealStateInstancePartitions.get(instancePartitions.getInstancePartitionsName()))) {
    LOGGER.warn(
We should add a table-level gauge to reflect whether the IP is wiped for an IP-enabled table.
// Assign instances
- assignInstances(tableConfig, true);
+ assignInstances(tableConfig, idealState, true);
Should we revert the changes to instance assignment?
We should modify the IS when assigning segments, not instances.
MultiStageReplicaGroupSelector (as well as some differences in metadata maintained in inconsistent transient states). This ideal state metadata will be leveraged by future changes for better replica group routing.