-
Notifications
You must be signed in to change notification settings - Fork 3.9k
[move compiler] [CSE Step 2] common subexpression elimination #17989
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: jun/reach-def
Are you sure you want to change the base?
Conversation
|
Warning This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
This stack of pull requests is managed by Graphite. Learn more about stacking. |
b6f8d5d to
97a2b06
Compare
97a2b06 to
e009ba3
Compare
ca16b0f to
6dd56f1
Compare
e009ba3 to
6eb7704
Compare
6dd56f1 to
d9b6d27
Compare
8b88a87 to
3066301
Compare
a6827da to
69c4e71
Compare
894fa74 to
4e29fd9
Compare
69c4e71 to
54a1873
Compare
4e29fd9 to
947e5c9
Compare
285eb08 to
054714b
Compare
cde52be to
88ac450
Compare
8bbbf38 to
4f00c2e
Compare
88ac450 to
2d543b1
Compare
9e565bb to
50f2a44
Compare
50f2a44 to
d43320c
Compare
2d543b1 to
22fffeb
Compare
e920739 to
3752618
Compare
22fffeb to
cadaebb
Compare
3752618 to
116067d
Compare
cadaebb to
68e406f
Compare
116067d to
5b4d51b
Compare
68e406f to
27dfb30
Compare
5b4d51b to
2b2c67f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is being reviewed by Cursor Bugbot
Details
Your team is on the Bugbot Free tier. On this plan, Bugbot will review limited PRs each billing cycle for each member of your team.
To receive Bugbot reviews on all of your PRs, visit the Cursor dashboard to activate Pro and start your 14-day free trial.
third_party/move/move-compiler-v2/src/pipeline/common_subexp_elimination.rs
Show resolved
Hide resolved
2b2c67f to
8679d38
Compare
| } | ||
| } | ||
| globals | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: collect_resources misses Exists operation for safety check
The collect_resources function only collects resources from Operation::BorrowGlobal but not from Operation::Exists, despite the documentation stating that Condition 5 ensures both BorrowGlobal and Exists operations are safe to reuse. Since BytecodeSanitizer::Exists allows Exists operations to be CSE candidates in aggressive mode, but collect_resources doesn't include them, the resources_safe_to_reuse check will operate on empty collections for Exists expressions. This could allow incorrect CSE replacement when an Exists check is repeated but the underlying resource state changed between the two calls (via MoveTo or MoveFrom).

Description
This PR implements the "common subexpression elimination" (CSE) transformation. This is the second PR on the stack to introduce CSE optimizations to Move.
Motivating Example:
At the stackless bytecode level,
data.xis translated into a seq ofBorrowLoc+BorrowField+ReadRefinstructions.Without CSE, all occurance of the same expression
data.x(line 2, line 3, line 5) will be translated into the seq above, despitedata.xat line 3 and line 5 share the same result of line 2 and the computations are not necessary.CSE aims to eliminate such redundant computations by reusing the result of previous computations.
Specifically, in the example above, assuming the
BorrowLoc+BorrowField+ReadRefsequence at line 2 is assigned to tempt1,then the occurrences at line 3 and line 5 can both be replaced by
t1, eliminating the redundant computations.The optimized bytecode would look like:
============================ Implementation Details ============================
Step 1: Build the Control Flow Graph (CFG) and Domination Tree of a target function.
Step 2: Traverse the Domination Tree in preorder, and for each basic block, for each instruction:
ExprKeystructureExprKeycontains the operation and its arguments, represented asExpArg,ExpArgcan be either a constant, a variable (temp), or anotherExprKeyto nest expressions recursivelyReadRef(BorrowField(BorrowLoc(x))), we want torepresent it as a single expression rather than three separate ones, so that we can eliminate
the entire sequence at once.
t1 = Op1(t0); t2 = Op2(t1);asOp2(Op1(t0)):Op1is the only definition of oft1that can reach the instruction ofOp2t1is only used once and exactly byOp2.hencing not missing opportunities for replacement
Step 3: Check if the
ExprKeyfrom Step 2 has been seen before in a dominating block.Given a seen-before
ExprKey(annotated assrc_expr) for the current expression (annotated asdest_expr),and assuming the two expressions have the following formats:
src_expr:(src_temp1, src_temp2, ...) = src_op(src_ope1, src_ope2, ...)defined atsrc_inst, wheresrc_ope1andsrc_ope2can be nested expressions.dest_expr:(dest_temp1, dest_temp2, ...) = dest_op(dest_ope1, dest_ope2, ...)defined atdest_inst, wheredest_ope1anddest_ope2can be nested expressions.we take a set of conservative conditions to check safety of the replacement:
Condition 1.
src_exprdominatesdest_exprsrc_expris always executed beforedest_exprCondition 2: type safety
src_tempsanddest_tempsshare the same typessrc_temptodest_tempstc_tempis not mutably borrowedsrc_tempis mutably borrowedCondition 3:
src_tempsare copyablesrc_tempstodest_tempsdoes not violate ability constraintsCondition 4:
src_tempsatsrc_exprare the only definitions ofsrc_tempsthat can reachdest_expr:dest_tempsCondition 5: Resources used in
src_exprare not changed atdest_expr:BorrowGlobalandExistsoperations are safe to reuse atdest_exprBorrowGlobalandExistsare involved insrc_expranddest_exprCondition 6: Operands used in
src_exprare safe to reuse atdest_expr:src_exprare identical to those used indest_exprsrc_exprare possibly re-defined in a path betweensrc_expranddest_expr(without going throughsrc_expragain)src_instremain unchanged when reachingdest_instsrc_exprare mutably borrowed elsewhereCondition 7: The replacement will bring performance gains! See comments above
gain_perffor detailsStep 4: for each
src_exprpassing the conditions to replacedest_exprin Step 3, we check gather necessary information to perform replacement like below:Example:
==>
Step 5: After processing all basic blocks, we perform the recorded replacements and eliminate the marked code.
============================ Extensions ============================
In principle, the algorithm above is designed to handle PURE instructions, defined as blow
memory(including write via references), control flow (includingabort), or external state (global storage)Yet, we found that some non-pure instructions can be safely handled under certain conditions.
Group 1: operations that are pure if no arithmetic errors like overflows happen (
+,-,*,/,%, etc):aggressivemodesrc_instGroup 2: operations that are pure if no type errors happen (
UnpackVariant):aggressivemodesrc_instGroup 3:
BorrowLoc,BorrowField,BorrowVariantFieldsrc_instanddst_inst, we can treat them as pure.Group 4:
AssignCopyorInferred(TODO([Move compiler V2] master issue tracking Common Subexpression Elimination todos #18203): reasoning more aboutInferred)Group 5:
readrefreadrefis not pure as it depends on memory states.src_instanddst_inst, we can treat them as pure.Group 6:
FunctioncallsGroup 7:
BorrowGlobalandExistssrc_instanddst_instStats by applying CSE on
frameworkcodeCall Operation::Functionas 3 bytecodes. Yet, the stats only counts it as one. This further reduces the reduction shown up in the stats.aptos-experimentalCall Operation::Functionis only counted as one bytecodeaptos-frameworkaptos-experimental::order_book_types.move)Call Operation::Functionis only counted as one bytecode in stats)Call Operation::Functionis only counted as one bytecode in stats)aptos-stdlibaptos-experimental::order_book_types.move)move-stdlibaptos-tokensTODOs
TODO(#18203).How Has This Been Tested?
Expected Result Changes
Type of Change
Which Components or Systems Does This Change Impact?
Note
Introduce a CSE optimization pass to the stackless bytecode pipeline, gated by an experiment flag, with supporting analysis/plumbing updates and test baselines.
CommonSubexpEliminationpass, integrated before ability checks; wired withLiveVar,FlushWrites,ReferenceSafety(v2/v3), andReachingDefanalyses.common-subexp-elimination; register reaching-def formatter.DomRelation, add preorder traversal; CFG: addsuccessor_insts()andedges().is_pure,pure_if_no_*),is_commutative, borrowing detectors, and commutativity handling; extend function target to get loc by offset.estimate_sizeheuristic; well-known vector names extended; minor cost model scaffolding in CSE.Written by Cursor Bugbot for commit 8679d38. This will update automatically on new commits. Configure here.