HBASE-29776: Log filtering in IncrementalBackupManager can lead to data loss #7582

hgromer · 2025-12-29T13:09:18Z

Apache-HBase · 2025-12-30T15:32:18Z

🎊 +1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 29s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 0s		No case conflicting files found.
+0 🆗	codespell	0m 0s		codespell was not available.
+0 🆗	detsecrets	0m 0s		detect-secrets was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
+1 💚	hbaseanti	0m 0s		Patch does not have any anti-patterns.
			_ master Compile Tests _
+1 💚	mvninstall	3m 39s		master passed
+1 💚	compile	0m 31s		master passed
+1 💚	checkstyle	0m 13s		master passed
+1 💚	spotbugs	0m 31s		master passed
+1 💚	spotless	0m 51s		branch has no errors when running spotless:check.
			_ Patch Compile Tests _
+1 💚	mvninstall	3m 4s		the patch passed
+1 💚	compile	0m 29s		the patch passed
-0 ⚠️	javac	0m 29s	/results-compile-javac-hbase-backup.txt	hbase-backup generated 1 new + 127 unchanged - 0 fixed = 128 total (was 127)
+1 💚	blanks	0m 0s		The patch has no blanks issues.
+1 💚	checkstyle	0m 11s		the patch passed
+1 💚	spotbugs	0m 35s		the patch passed
+1 💚	hadoopcheck	12m 13s		Patch does not cause any errors with Hadoop 3.3.6 3.4.1.
+1 💚	spotless	0m 45s		patch has no errors when running spotless:check.
			_ Other Tests _
+1 💚	asflicense	0m 11s		The patch does not generate ASF License warnings.
		31m 23s

Subsystem	Report/Notes
Docker	ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7582/2/artifact/yetus-general-check/output/Dockerfile
GITHUB PR	#7582
JIRA Issue	HBASE-29776
Optional Tests	dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless
uname	Linux 27ce6787059f 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/hbase-personality.sh
git revision	master / `25d095b`
Default Java	Eclipse Adoptium-17.0.11+9
Max. process+thread count	82 (vs. ulimit of 30000)
modules	C: hbase-backup U: hbase-backup
Console output	https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7582/2/console
versions	git=2.34.1 maven=3.9.8 spotbugs=4.7.3
Powered by	Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

Apache-HBase · 2025-12-30T15:35:01Z

🎊 +1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 29s		Docker mode activated.
-0 ⚠️	yetus	0m 3s		Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck
			_ Prechecks _
			_ master Compile Tests _
+1 💚	mvninstall	3m 39s		master passed
+1 💚	compile	0m 19s		master passed
+1 💚	javadoc	0m 16s		master passed
+1 💚	shadedjars	6m 21s		branch has no errors when building our shaded downstream artifacts.
			_ Patch Compile Tests _
+1 💚	mvninstall	3m 11s		the patch passed
+1 💚	compile	0m 19s		the patch passed
+1 💚	javac	0m 19s		the patch passed
+1 💚	javadoc	0m 13s		the patch passed
+1 💚	shadedjars	6m 13s		patch has no errors when building our shaded downstream artifacts.
			_ Other Tests _
+1 💚	unit	11m 58s		hbase-backup in the patch passed.
		34m 1s

Subsystem	Report/Notes
Docker	ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7582/2/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile
GITHUB PR	#7582
JIRA Issue	HBASE-29776
Optional Tests	javac javadoc unit compile shadedjars
uname	Linux c8fccd520408 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/hbase-personality.sh
git revision	master / `25d095b`
Default Java	Eclipse Adoptium-17.0.11+9
Test Results	https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7582/2/testReport/
Max. process+thread count	3869 (vs. ulimit of 30000)
modules	C: hbase-backup U: hbase-backup
Console output	https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7582/2/console
versions	git=2.34.1 maven=3.9.8
Powered by	Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

krconv

Looks great; and wanted to note, this has resolved a number of inconsistencies for us. Nearly all of our backups with a host that was taken offline recently (i.e. last ~30 days) was hitting some variation of this issue, where old WALs would be replayed and overwrite newer edits in the incremental backups.

Apache-HBase · 2026-01-05T14:21:23Z

🎊 +1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 13s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 0s		No case conflicting files found.
+0 🆗	codespell	0m 0s		codespell was not available.
+0 🆗	detsecrets	0m 0s		detect-secrets was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
+1 💚	hbaseanti	0m 0s		Patch does not have any anti-patterns.
			_ master Compile Tests _
+1 💚	mvninstall	3m 59s		master passed
+1 💚	compile	0m 36s		master passed
+1 💚	checkstyle	0m 13s		master passed
+1 💚	spotbugs	0m 34s		master passed
+1 💚	spotless	0m 50s		branch has no errors when running spotless:check.
			_ Patch Compile Tests _
+1 💚	mvninstall	2m 58s		the patch passed
+1 💚	compile	0m 31s		the patch passed
-0 ⚠️	javac	0m 31s	/results-compile-javac-hbase-backup.txt	hbase-backup generated 1 new + 127 unchanged - 0 fixed = 128 total (was 127)
+1 💚	blanks	0m 0s		The patch has no blanks issues.
+1 💚	checkstyle	0m 12s		the patch passed
+1 💚	spotbugs	0m 36s		the patch passed
+1 💚	hadoopcheck	12m 16s		Patch does not cause any errors with Hadoop 3.3.6 3.4.1.
+1 💚	spotless	0m 43s		patch has no errors when running spotless:check.
			_ Other Tests _
+1 💚	asflicense	0m 11s		The patch does not generate ASF License warnings.
		30m 28s

Subsystem	Report/Notes
Docker	ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7582/3/artifact/yetus-general-check/output/Dockerfile
GITHUB PR	#7582
JIRA Issue	HBASE-29776
Optional Tests	dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless
uname	Linux de11d54be049 6.8.0-1024-aws #26~22.04.1-Ubuntu SMP Wed Feb 19 06:54:57 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/hbase-personality.sh
git revision	master / `d45cb7b`
Default Java	Eclipse Adoptium-17.0.11+9
Max. process+thread count	83 (vs. ulimit of 30000)
modules	C: hbase-backup U: hbase-backup
Console output	https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7582/3/console
versions	git=2.34.1 maven=3.9.8 spotbugs=4.7.3
Powered by	Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

Apache-HBase · 2026-01-05T14:25:30Z

🎊 +1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 29s		Docker mode activated.
-0 ⚠️	yetus	0m 3s		Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck
			_ Prechecks _
			_ master Compile Tests _
+1 💚	mvninstall	3m 42s		master passed
+1 💚	compile	0m 19s		master passed
+1 💚	javadoc	0m 16s		master passed
+1 💚	shadedjars	6m 23s		branch has no errors when building our shaded downstream artifacts.
			_ Patch Compile Tests _
+1 💚	mvninstall	3m 11s		the patch passed
+1 💚	compile	0m 19s		the patch passed
+1 💚	javac	0m 19s		the patch passed
+1 💚	javadoc	0m 14s		the patch passed
+1 💚	shadedjars	6m 17s		patch has no errors when building our shaded downstream artifacts.
			_ Other Tests _
+1 💚	unit	12m 13s		hbase-backup in the patch passed.
		34m 33s

Subsystem	Report/Notes
Docker	ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7582/3/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile
GITHUB PR	#7582
JIRA Issue	HBASE-29776
Optional Tests	javac javadoc unit compile shadedjars
uname	Linux fbc84f21bee9 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/hbase-personality.sh
git revision	master / `d45cb7b`
Default Java	Eclipse Adoptium-17.0.11+9
Test Results	https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7582/3/testReport/
Max. process+thread count	3797 (vs. ulimit of 30000)
modules	C: hbase-backup U: hbase-backup
Console output	https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7582/3/console
versions	git=2.34.1 maven=3.9.8
Powered by	Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

DieterDP-ng · 2026-01-02T19:17:12Z

hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/IncrementalBackupManager.java

Hi @hgromer, I agree HBASE-29776 is an issue (sorry for not responding sooner there, I was on vacation the past weeks), but I'm not yet convinced this is the right approach to fix it. It feels very complex to reason about, so I wonder if there isn't a simpler approach. Already wanted to give some intermediate feedback while I think a bit more about it.

Since newTimestamps never is pruned, the entry in the backup table will keep growing over time.

newTimestamps will end up being written in BackupSystemTable#createPutForWriteRegionServerLogTimestamp, but these change no longer match the javadoc of that method.

My suggestion is to simplify all changes in this file to (the fix for the excluded log files is also needed):

LOG.info("Execute roll log procedure for incremental backup ..."); long rollStartTs = EnvironmentEdgeManager.currentTime(); BackupUtils.logRoll(conn, backupInfo.getBackupRootDir(), conf); Map<String, Long> rollTimestamps = readRegionServerLastLogRollResult(); Map<String, Long> newTimestamps = rollTimestamps.entrySet().stream() // Region servers that are offline since the last backup will have old roll timestamps, // prune their information here, as it is not relevant to be stored or used for finding // the relevant logs. .filter(entry -> entry.getValue() > rollStartTs) .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue)); // This method needs to be adjusted to use "rollStartTs" if an entry is not found in newTimestamps. // Or alternatively: getLogFilesForNewBackup(previousTimestampMins, // DefaultedMap.defaultedMap(newTimestamps, rollStartTs), conf, savedStartCode); logList = getLogFilesForNewBackup(previousTimestampMins, newTimestamps, conf, savedStartCode);

Then when finding which logs to include, these are the options:

server found in both previous and newTimestamps: a region server that is unchanged, include logs older than previous and newer than newTimestamps

server found in only previous: a region server that has gone offline, all logs will be older than rollStartTs and should be included

server found in only newTimestamps: a new region server, include all logs that are older than the corresponding newTimestamp

server found in neither: a region server that was started and went back offline in between the previous and current backup, all logs will be older than rollStartTs and should be included

This approach will keep newTimestamps limited to the relevant entries. We could consider cleaning up the entries for readRegionServerLastLogRollResult as well, but left that out of scope for now.

Similar code suffices in the FullTableBackupClient.

No worries, appreciate you taking the time to look here. We've found a slew of bugs in the WAL retention system and I think it's important to get this right, so happy to iterate on feedback.

Since newTimestamps never is pruned, the entry in the backup table will keep growing over time.

Agree with this. It's something we should take a look at.

To your point about WAL retention and boundaries, conceptually, I've been trying to think about it from the perspective of "which WAL files have been backed up". Otherwise you run into issues when a host goes offline.

For example, in the case where

We have a Server A with row X

An incremental backup is taken, A is rolled

A writes more WAL files

Row X is deleted

A major compaction happens

A full backup is taken, WALs are rolled, but we don't update the timestamp for A. Row X is not included in the full backup

An incremental backup is taken, we still have a very old roll time for this host. Row X is backed up again, and shows up in the backup even though we had previously deleted (but the tombstone no longer exists).

So we've resurfaced dead data that shouldn't be included. It's problematic to back up WALs that are very old. So this is the main culprit for the added complexity here

Additionally, I'm weary of comparing timestamps across hosts, which is why I was wary of doing something like generating a boundary timestamp in the backup process, which happens client side and opted to compare WAL timestamps which are generated by the same host.

server found in both previous and newTimestamps: a region server that is unchanged, include logs older than previous and newer than newTimestamps

server found in only previous: a region server that has gone offline, all logs will be older than rollStartTs and should be included

If I understand correctly, run into this issue

server found in only newTimestamps: a new region server, include all logs that are older than the corresponding newTimestamp

server found in neither: a region server that was started and went back offline in between the previous and current backup, all logs will be older than rollStartTs and should be included

Agree here on the first backup this happens, but we never update the host TS and so we'll continue to backup the WAL files and run into the issue mentioned above.

I'd be more than happy to find a simpler solution though, I really don't love how complex this WAL retention boundary logic is; but struggled to do so and also avoid corrupting the data

Map<String, Long> newTimestamps = rollTimestamps.entrySet().stream() // Region servers that are offline since the last backup will have old roll timestamps, // prune their information here, as it is not relevant to be stored or used for finding // the relevant logs. .filter(entry -> entry.getValue() > rollStartTs) .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));

I'm not sure I necessarily agree with this logic. B/c the RS not being rolled this go around doesn't mean we've backed up all the files from the RS we need to backup. It just means the host doesn't exist on the cluster at the moment.

Server A is backed up

Server A generate more WAL files

Server A is removed from cluster

New backup occurs, but we don't get a roll time for Server A so we ignore its files

We need to backup the files that were generated between the last backup, and this backup

No worries, appreciate you taking the time to look here. We've found a slew of bugs in the WAL retention system and I think it's important to get this right, so happy to iterate on feedback.

Since newTimestamps never is pruned, the entry in the backup table will keep growing over time.

Agree with this. It's something we should take a look at.

To your point about WAL retention and boundaries, conceptually, I've been trying to think about it from the perspective of "which WAL files have been backed up". Otherwise you run into issues when a host goes offline.

For example, in the case where

We have a Server A with row X

1. An incremental backup is taken, A is rolled 2. A writes more WAL files 3. Row X is deleted 4. A major compaction happens 5. A full backup is taken, WALs are rolled, but we don't update the timestamp for A. Row X is _not_ included in the full backup 6. An incremental backup is taken, we still have a very old roll time for this host. Row X is backed up again, and shows up in the backup even though we had previously deleted (but the tombstone no longer exists).

So we've resurfaced dead data that shouldn't be included. It's problematic to back up WALs that are very old. So this is the main culprit for the added complexity here

I agree with your example, and agree that the change to FullTableBackupClient would fix this. It also shrinks the newTimestamps, which nice.

Additionally, I'm weary of comparing timestamps across hosts, which is why I was wary of doing something like generating a boundary timestamp in the backup process, which happens client side and opted to compare WAL timestamps which are generated by the same host.

I see, I originally thought it might be less code to generate a client pre-roll timestamp, but it doesn't really simplify things. For the FullTableBackupClient at least, the current code is simple enough. I would suggest to split off a dedicated calculateNewTimestamps method with some proper javadoc. (Still thinking about the incremental case.)

One advantage of not basing newTimestamps on the logs, but on the roll timestamps would be that the same logic could be used for full and incremental backups.

I'm not sure I necessarily agree with this logic. B/c the RS not being rolled this go around doesn't mean we've backed up all the files from the RS we need to backup. It just means the host doesn't exist on the cluster at the moment.

1. Server A is backed up 2. Server A generate more WAL files 3. Server A is removed from cluster 4. New backup occurs, but we don't get a roll time for Server A so we ignore its files

We need to backup the files that were generated between the last backup, and this backup

Not sure we're on the same page here. (Did you miss that I mentioned that getLogFilesForNewBackup needs some adjustments?)

For the example you gave (quoted above), with the code block I suggested, the following would happen:

when the logs are rolled, rollTimestamps would not have been updated, since the RS A was offline

so the rollTimestamps value for A would be < rollStartTs, and A would not be present as a key in newTimestamps

in the updated getLogFilesForNewBackup, we'd fall in case server found in only previous: a region server that has gone offline, all logs will be older than rollStartTs and should be included, so all logs of A would be included.

DieterDP-ng · 2026-01-02T19:17:47Z

hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/IncrementalBackupManager.java

    BackupUtils.logRoll(conn, backupInfo.getBackupRootDir(), conf);

-    newTimestamps = readRegionServerLastLogRollResult();
+    Map<String, Long> newTimestamps = readRegionServerLastLogRollResult();


This method does an unnecessary scan, since you override all entries in the code you add below.

DieterDP-ng · 2026-01-02T19:18:26Z

hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/IncrementalBackupManager.java

+        long logTs = BackupUtils.getCreationTime(logPath);
+        Long latestTimestampToIncludeInBackup = newTimestamps.get(logHost);
+        if (latestTimestampToIncludeInBackup == null || logTs > latestTimestampToIncludeInBackup) {
+          LOG.info("Updating backup boundary for inactive host {}: timestamp={}", logHost, logTs);


Is this log line correct? Are we dealing with an inactive host if latestTimestampToIncludeInBackup != null?

DieterDP-ng · 2026-01-02T19:19:27Z

hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/IncrementalBackupManager.java

    logList = excludeProcV2WALs(logList);
    backupInfo.setIncrBackupFileList(logList);

+    // Update boundaries based on WALs that will be backed up


For my understanding, is this code block an optimization, or a necessary fix for a specific case of appearing/disappearing region servers?

Figured it out, it is to update the newTimestamps entries for regionservers that have since gone offline, but for which the logs are now backed up.

DieterDP-ng · 2026-01-02T19:20:29Z

hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/FullTableBackupClient.java

Haven't gotten around to processing the changes in this file, but can you sketch why they are needed? Since your original ticket only discusses an issue with incremental backups.

Figured it out, it's to ensure the newTimestamps for no longer active region servers are updated.

DieterDP-ng · 2026-01-06T15:46:46Z

hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/IncrementalBackupManager.java

+      Path logPath = new Path(logFile);
+      String logHost = BackupUtils.parseHostFromOldLog(logPath);
+      if (logHost == null) {
+        logHost = BackupUtils.parseHostNameFromLogFile(logPath.getParent());


This method seems to support parsing old log names as well, is it possible to merge with the parsing 2 lines above? Though I am confused as to why the former uses logPath and the latter logPath.getParent()

DieterDP-ng · 2026-01-06T16:15:32Z

hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/IncrementalBackupManager.java

        resultLogFiles.add(currentLogFile);
      }
-
-      // It is possible that a host in .oldlogs is an obsolete region server


I think removing this block entirely is wrong. I believe the semantics of newestTimestamps is "ensure we have everything backed up to this timestamp". So if currentLogTS > newTimestamp is true, we should indeed skip this file.

So I think this block should be kept, but adjusted to:

if (newTimestamp != null && currentLogTS > newTimestamp) { newestLogs.add(currentLogFile); }

I also think a similar issue is present for the .logs in this method.

From your comment in HBASE-29776:

newTimestamp represents the last time a backup rolled the WAL on the RS. If the RegionServer isn't running and therefore isn't able to roll the WAL again, then this timestamp will be in the past, and we end up filtering out all WAL files that were updated since then. Why would we filter out oldWALs that have been created since then? That seems wrong as well

Your comment is correct, but I think the better fix is to ensure the newTimestamps are correctly updated (as you do in your other changes). Removing this block would result in too many logs being included in the backup.

if (newTimestamp != null && currentLogTS > newTimestamp) { newestLogs.add(currentLogFile); }

I don't think so, this would exclude all WAL files between last backup (previousTimestamps) and the current log roll (newTimestamp). Unless I'm misunderstanding

I think the relevant filtering out for very old log files happens here

So this logic is safe to remove imo

I don't think so, this would exclude all WAL files between last backup (previousTimestamps) and the current log roll (newTimestamp). Unless I'm misunderstanding

A WAL log from between last backup and current log roll would have a currentLogTS <= newTimestamp, no? If currentLogTS > newTimestamp it means it was created after the log roll that just happened, and should be excluded from the backup (so it should be in newestLogs)

DieterDP-ng · 2026-01-09T15:24:16Z

hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/FullTableBackupClient.java

+        }
+        allLogs.addAll(Arrays.asList(fs.listStatus(hostLogDir.getPath())));
+      }
+      allLogs.addAll(Arrays.asList(fs.listStatus(oldLogDir)));


There's apparently a config key hbase.separate.oldlogdir.by.regionserver that adds the server address as a subfolder, so this line will not work as expected when that's set to true. (Solution is to also check files that are one folder deeper.)

Question: when exactly (and by what) are WALs moved to the oldWals folder (oldLogDir)?

DieterDP-ng · 2026-01-09T16:42:31Z

hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/FullTableBackupClient.java

      List<BulkLoad> bulkLoadsToDelete = backupManager.readBulkloadRows(tableList);
+      Map<String, Long> previousLogRollsByHost = backupManager.readRegionServerLastLogRollResult();

+      // We roll log here before we do the snapshot. It is possible there is duplicate data


Nit: this comment block can be improved a bit.

Won't there always be data duplication between the rolled log and the snapshot? I don't see a reason to mention this.

I would mention that we roll the logs to create a boundary for WALs that need to be tracked for upcoming incremental backups.

A better approach is -> ...would be (to avoid confusion that you're talking about what is actually happen

DieterDP-ng · 2026-01-09T19:02:17Z

hbase-backup/src/test/java/org/apache/hadoop/hbase/backup/TestBackupOfflineRS.java

We should also have a testcases for:

a RS goes offline after a full backup was taken, before an incremental is taken

a RS comes online & goes back offline before a full backup

a RS comes online & goes back offline after a full backup was taken, before an incremental backup

At the same time, I realize the test code for each of these is quite elaborate. So I'm thinking it might be better to write a unit-type test for IncrementalBackupManager where the logs are abstracted away in an interface. Test code could then easily define which logs are present in a given scenario, and production code uses a file-based implementation of the logs.

If you want, I could give a shot at refactoring the code for this.

This comment has been minimized.

Sign in to view

hgromer force-pushed the HBASE-29776 branch from 7ad60d2 to 25d095b Compare December 30, 2025 14:45

krconv approved these changes Jan 5, 2026

View reviewed changes

Log filtering in IncrementalBackupManager can lead to data loss

d45cb7b

hgromer force-pushed the HBASE-29776 branch from 25d095b to d45cb7b Compare January 5, 2026 13:38

DieterDP-ng suggested changes Jan 6, 2026

View reviewed changes

DieterDP-ng reviewed Jan 9, 2026

View reviewed changes

HBASE-29776: Log filtering in IncrementalBackupManager can lead to data loss #7582

Are you sure you want to change the base?

HBASE-29776: Log filtering in IncrementalBackupManager can lead to data loss #7582

Uh oh!

Conversation

hgromer commented Dec 29, 2025

Uh oh!

This comment has been minimized.

This comment has been minimized.

Apache-HBase commented Dec 30, 2025

Uh oh!

Apache-HBase commented Dec 30, 2025

Uh oh!

krconv left a comment

Choose a reason for hiding this comment

Uh oh!

Apache-HBase commented Jan 5, 2026

Uh oh!

Apache-HBase commented Jan 5, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

hgromer Jan 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

hgromer Jan 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

DieterDP-ng Jan 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

DieterDP-ng Jan 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

hgromer Jan 7, 2026 •

edited

Loading

hgromer Jan 7, 2026 •

edited

Loading

DieterDP-ng Jan 9, 2026 •

edited

Loading

DieterDP-ng Jan 9, 2026 •

edited

Loading