HDDS-14183. Attempted to decrement available space to a negative value #9655
...tainer-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java
Could you please add test coverage to verify that both usedSpace and committedBytes are restored?
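As a rough illustration of what such a check might look like, here is a minimal sketch on a stand-in class. `WriteRollbackSketch` and all of its methods are hypothetical and are not the Ozone `HddsVolume` API. It also assumes that a failed write should return both counters to their pre-write values, which may not match the PR's exact semantics.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a volume's space counters; not the Ozone API. */
public class WriteRollbackSketch {
    private final AtomicLong used = new AtomicLong();
    private final AtomicLong committed = new AtomicLong();

    public long usedSpace() { return used.get(); }
    public long committedBytes() { return committed.get(); }

    /** Reserve space ahead of a write. */
    public void reserve(long bytes) { committed.addAndGet(bytes); }

    /**
     * Simulated write path: the chunk write moves the reservation into used
     * space, then a later step may fail. On failure the move is undone.
     */
    public void write(long bytes, boolean failAfterChunkWrite) throws IOException {
        boolean chunkCommitted = false;
        try {
            used.addAndGet(bytes);        // chunk write succeeded
            committed.addAndGet(-bytes);  // reservation consumed
            chunkCommitted = true;
            if (failAfterChunkWrite) {
                throw new IOException("simulated failure after chunk write");
            }
        } catch (IOException e) {
            if (chunkCommitted) {
                // Roll back: restore the reservation and the used-space counter.
                committed.addAndGet(bytes);
                used.addAndGet(-bytes);
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        WriteRollbackSketch volume = new WriteRollbackSketch();
        volume.reserve(64);
        try {
            volume.write(64, true);
        } catch (IOException expected) {
            // failure is the scenario under test
        }
        // prints "0 64": both counters are back at their pre-write values
        System.out.println(volume.usedSpace() + " " + volume.committedBytes());
    }
}
```

A real test would drive the actual handler with a failing downstream step and assert on the real volume's counters; the sketch only shows the shape of the assertion.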
/**
 * Commit space reserved for write to usedSpace when write operation succeeds.
 */
private void commitSpaceReservedForWrite(HddsVolume volume, boolean spaceReserved, long bytes) {
Can a commit happen when space is not reserved?
private void commitSpaceReservedForWrite(HddsVolume volume, boolean spaceReserved, long bytes) {
  if (spaceReserved) {
    volume.releaseReservedSpaceForWrite(bytes);
    volume.incrementUsedSpace(bytes);
Since release and increment are not inside a critical section, it's better to call incrementUsedSpace first and then release the reserved space. As currently implemented, if another thread checks free space between these two calls, it could try to use free space that is already used by this write.
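The effect of the ordering can be sketched with a small stand-in class. `CommitOrderSketch` is hypothetical, and it assumes available space is computed as capacity minus used minus reserved bytes; the real volume accounting may differ. Each method captures the value a concurrent reader would see between the two updates.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical model of volume counters; not the Ozone HddsVolume API. */
public class CommitOrderSketch {
    private final long capacity;
    private final AtomicLong used = new AtomicLong();
    private final AtomicLong reserved = new AtomicLong();

    public CommitOrderSketch(long capacity) { this.capacity = capacity; }

    public void reserve(long bytes) { reserved.addAndGet(bytes); }

    /** Assumed formula: space that is neither used nor reserved. */
    public long available() { return capacity - used.get() - reserved.get(); }

    /** Release first, then increment; returns what a reader sees in between. */
    public long midpointReleaseFirst(long bytes) {
        reserved.addAndGet(-bytes);
        long seen = available();   // overestimates: the bytes look free
        used.addAndGet(bytes);
        return seen;
    }

    /** Increment first, then release; returns what a reader sees in between. */
    public long midpointIncrementFirst(long bytes) {
        used.addAndGet(bytes);
        long seen = available();   // underestimates: the bytes are counted twice
        reserved.addAndGet(-bytes);
        return seen;
    }

    public static void main(String[] args) {
        CommitOrderSketch a = new CommitOrderSketch(100);
        a.reserve(40);
        System.out.println(a.midpointReleaseFirst(40));   // prints 100

        CommitOrderSketch b = new CommitOrderSketch(100);
        b.reserve(40);
        System.out.println(b.midpointIncrementFirst(40)); // prints 20
    }
}
```

Both orderings end with 60 bytes available. Under the assumed formula, the increment-first ordering errs on the conservative side in the window between the two calls, while release-first briefly advertises space that the write has already consumed.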
What changes were proposed in this pull request?
Saw this warning when datanode disk was nearly full:
Prior to this message, there were many failed writes. Perhaps it needs to increment the value when the write fails.
The fix adds rollback logic in KeyValueHandler.handleWriteChunk() that tracks when a chunk write succeeds and increments the usedSpace counter. If any subsequent operation fails, the exception handler calls volume.decrementUsedSpace() to restore the counter.
What is the link to the Apache JIRA
HDDS-14183
How was this patch tested?
CI: https://github.com/sarvekshayr/ozone/actions/runs/21200210393