feat(net): reduce sync memory via lazy parsing and throttling #6717
Open
xxo1shine wants to merge 1 commit into tronprotocol:develop
What does this PR do?
Cap memory usage during block sync by deferring block deserialization and bounding in-flight block requests.
- **New `UnparsedBlock` carrier** (`framework/.../net/service/sync/UnparsedBlock.java`): a lightweight `(BlockId, byte[])` pair. It holds the parsed block id (cheap) plus the raw protobuf bytes; the block body is not deserialized until the worker thread is ready to process it. Implements `equals`/`hashCode` on `blockId` so it works as a map key.
- **Defer deserialization in the sync queues**, in `SyncService`:
  - `blockWaitToProcess` and `blockJustReceived` change from `Map<BlockMessage, PeerConnection>` to `Map<UnparsedBlock, PeerConnection>`.
  - `processBlock(peer, blockMessage)` builds the `UnparsedBlock` from `blockMessage.getData()` instead of inserting the fully parsed message.
  - `handleSyncBlock` parses the bytes back into a `BlockMessage` only when the block is about to be applied.
- **Throttle in-flight block requests via `node.maxPendingBlockNum`**: a new config option controlling the total budget of `requested + justReceived + waitToProcess` blocks across all peers.
  - Default `500`, clamped to `[50, 2000]` in `NodeConfig.postProcess()`; wired through `NodeConfig.maxPendingBlockNum` → `Args.applyNodeConfig` → `CommonParameter.maxPendingBlockNum`. The default lives in `reference.conf`, and an example is documented in `config.conf`.
  - `startFetchSyncBlock` becomes budget-aware: it computes `remainNum = maxPendingBlockNum - requested - justReceived - waitToProcess`. Once the budget is exhausted it stops requesting new heights, with one exception: blocks at or below `maxRequestedBlockNum` (the previously requested ceiling) are still allowed through, so a slow peer can be retried without stalling the whole sync.
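A sketch of how the new knob might appear in `config.conf` (HOCON); the key name follows the PR description, but the surrounding block layout here is illustrative:

```hocon
node {
  # Total in-flight budget: requested + justReceived + waitToProcess
  # blocks across all peers. Out-of-range values are clamped to
  # [50, 2000] by NodeConfig.postProcess() at startup.
  maxPendingBlockNum = 500
}
```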
- **Tests**: `SyncServiceTest` updated to construct `UnparsedBlock` carriers in place of raw `BlockMessage` puts, plus boundary-condition coverage for the throttle path.
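The carrier described above can be sketched roughly as follows. The class name and the `blockId`-keyed `equals`/`hashCode` come from the PR description; the field types shown here (a `String` id standing in for TRON's `BlockId`) are simplifying assumptions, not the actual implementation:

```java
import java.util.Objects;

// Hypothetical sketch of the UnparsedBlock carrier: it keeps only the
// cheaply parsed block id and the raw protobuf bytes, deferring full
// deserialization until a worker is ready to apply the block.
class UnparsedBlock {
  private final String blockId;  // stand-in for the real BlockId type
  private final byte[] rawData;  // undeserialized protobuf bytes

  UnparsedBlock(String blockId, byte[] rawData) {
    this.blockId = blockId;
    this.rawData = rawData;
  }

  String getBlockId() {
    return blockId;
  }

  // Callers invoke this only at processing time, where the bytes are
  // parsed back into a full BlockMessage.
  byte[] getRawData() {
    return rawData;
  }

  // Equality is keyed on blockId alone, so two carriers for the same
  // block collide in a Map regardless of their byte payloads.
  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof UnparsedBlock)) {
      return false;
    }
    return blockId.equals(((UnparsedBlock) o).blockId);
  }

  @Override
  public int hashCode() {
    return Objects.hash(blockId);
  }
}
```

Keying equality on the id alone is what allows the maps to deduplicate a block re-sent by a second peer without comparing payload bytes.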
Why are these changes required?
- **Sync-time heap was unbounded.** Every received block was eagerly parsed into a `BlockMessage` and held in two maps until processing finished. Each `BlockMessage` keeps the parsed `Block` proto with every transaction expanded into Java objects, which inflates the on-heap footprint to several times the wire size. During catch-up sync from many peers, this routinely caused multi-GB heap growth and GC pressure, with documented OOM incidents on smaller-RAM nodes.
- **`UnparsedBlock` collapses the worst case.** Raw bytes are 1–2 MB per block; the parsed in-memory representation is significantly larger. Keeping the bytes raw until processing defers (and amortizes) the parse cost to the worker that actually needs the block, and lets in-flight memory grow in proportion to raw bytes rather than parsed object graphs.
- **Concurrent peer fetching had no global cap.** A peer aggressively shipping inventory could push the `blockJustReceived` map far past safe levels, because the receiver only enforced the per-peer `MAX_BLOCK_FETCH_PER_PEER`. Bounding `requested + justReceived + waitToProcess` makes the worst-case in-flight memory predictable: `maxPendingBlockNum × avg block size`.
- **The `maxRequestedBlockNum` exemption avoids deadlock.** A pure hard cap could stall sync if every peer holding a needed block goes idle while the budget is full: the only way to make progress is to retry the height, which requires re-requesting a block already counted against the budget. Allowing retries within the existing ceiling keeps sync moving forward under transient peer flakiness.
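The budget arithmetic and the retry exemption described above can be sketched as follows. The names `maxPendingBlockNum` and `maxRequestedBlockNum` mirror the PR description, while the helper class itself is hypothetical, not the actual `SyncService` code:

```java
// Illustrative sketch of the in-flight budget check, assuming the
// counts are gathered elsewhere from the three sync-side collections.
class SyncBudget {
  final int maxPendingBlockNum;  // node.maxPendingBlockNum, default 500
  long maxRequestedBlockNum;     // highest block height already requested

  SyncBudget(int maxPendingBlockNum) {
    this.maxPendingBlockNum = maxPendingBlockNum;
  }

  // remainNum from the PR description: budget left across all peers.
  int remaining(int requested, int justReceived, int waitToProcess) {
    return maxPendingBlockNum - requested - justReceived - waitToProcess;
  }

  // A height may be fetched while budget remains, OR when it retries a
  // height at or below the previously requested ceiling. The second
  // clause is the deadlock escape: a stalled height can always be
  // re-requested even with the budget exhausted.
  boolean mayFetch(long height, int requested, int justReceived, int waitToProcess) {
    return remaining(requested, justReceived, waitToProcess) > 0
        || height <= maxRequestedBlockNum;
  }
}
```

With the budget fully consumed, `mayFetch` rejects any height above the ceiling but still admits retries at or below it, matching the "slow peer can be retried" behavior described above.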
- **Operators need the knob.** The default `500` is conservative for nodes with ≤ 8 GB of heap; large or dedicated nodes can raise it for higher sync throughput, and constrained environments can lower it. Surfacing it as `node.maxPendingBlockNum` (rather than hardcoding a constant) lets operators tune it without a new release.

This PR has been tested by:
Follow up
Extra details
Fixes #6685