feat: Add experimental closures support with feature flag #2964
base: main
Adds closure environment allocation, variable capture analysis, and code generation for accessing and storing captured variables in closures. Updates the compiler to prescan function bodies for captured variables, allocate and initialize closure environments, and handle closure function creation and invocation. Extends Flow and Function/Local classes to track closure-related metadata. Includes new tests for closure behavior.
Corrects the calculation of environment slot offsets for captured variables in closures, ensuring proper byte offset handling and consistent environment setup. Updates test WAT files to reflect the new closure environment layout and stack management, improving correctness and coverage for closure, function expression, return, ternary, and typealias scenarios.
Enhances closure support by properly aligning captured local offsets, caching the closure environment pointer in a local to prevent overwrites from indirect calls, and updating environment size calculations. Also adds comprehensive AST node coverage for captured variable analysis and updates related tests to reflect the new closure environment management.
Adds logic to prescan constructor arguments of 'new' expressions for function expressions. This ensures that any function expressions passed as arguments are properly processed during compilation.
Introduce new test files for closure class functionality in the compiler, including TypeScript source, expected JSON output, and debug/release WebAssembly text formats.
Introduces a new 'closures' feature flag to the compiler, updates feature enumeration, and adds checks to ensure closures are only used when the feature is enabled. Test configurations are updated to enable the closures feature for relevant tests.
Refactored the compiler to only emit closure environment setup code when the closures feature is enabled. For builds without closures, indirect calls now use a simpler code path, resulting in smaller and cleaner generated code. Updated numerous test outputs to reflect the reduced stack usage and removed unnecessary closure environment handling.
Reserve slot 0 in closure environments for the parent environment pointer, ensuring correct alignment and traversal for nested closures. Track the owning function for each captured local, update environment access logic to traverse parent chains, and initialize the parent pointer when allocating environments. This enhances support for deeply nested closures and corrects environment memory layout.
Adjusts allocation sizes and field offsets for closure environments in multiple .wat test files, changing from 4 to 8 bytes (and similar increases for larger environments) and updating i32.store/load instructions to use the correct offsets. This aligns the test code with a new closure environment memory layout, likely reflecting changes in the compiler's closure representation.
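As a rough illustration of the layout described in the two commits above, here is a minimal sketch, not the compiler's actual code. It assumes a 32-bit target (4-byte pointers) and natural alignment of each slot to its own byte size; `CapturedSlot`, `EnvLayout`, and `layoutEnvironment` are hypothetical names.

```typescript
// Hypothetical sketch of a closure environment layout on a 32-bit target.
// Slot 0 holds the parent environment pointer; captured locals follow,
// each aligned to its own byte size (an assumption made for illustration).

interface CapturedSlot {
  localName: string;
  byteSize: number; // 1, 2, 4 or 8
}

interface EnvLayout {
  offsets: Map<string, number>;
  totalSize: number;
}

const POINTER_SIZE = 4; // wasm32 assumption

// Rounds offset up to the next multiple of a power-of-two alignment.
function alignUp(offset: number, alignment: number): number {
  return (offset + alignment - 1) & ~(alignment - 1);
}

function layoutEnvironment(slots: CapturedSlot[]): EnvLayout {
  const offsets = new Map<string, number>();
  let offset = POINTER_SIZE; // bytes 0..3 are reserved for the parent pointer
  for (const slot of slots) {
    offset = alignUp(offset, slot.byteSize);
    offsets.set(slot.localName, offset);
    offset += slot.byteSize;
  }
  return { offsets, totalSize: alignUp(offset, POINTER_SIZE) };
}

const single = layoutEnvironment([{ localName: "x", byteSize: 4 }]);
console.log(single.offsets.get("x"), single.totalSize); // prints "4 8"
```

Under these assumptions a single captured i32 lands at offset 4 in an 8-byte environment, which is consistent with the 4-to-8-byte change mentioned for the test files.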
Updated the NOTICE file to include Anakun <[email protected]> as a contributor.
Adds logic to properly capture and reference 'this' in closures and methods, ensuring 'this' is stored in the closure environment when needed. Updates compiler and resolver to support lookup and environment slot assignment for captured 'this', improving closure support for methods referencing 'this'.
Removed unnecessary 'self = this' assignments in all closure-returning methods, replacing references to 'self' with 'this'. This simplifies the code and improves readability by leveraging direct 'this' capture in arrow functions.
feat: Add experimental closures support with feature flag
Please review.
Was this primarily written using an LLM? (This itself doesn't make this PR bad, but this is code we'd have to maintain for quite a while after merging, and the existence of a human who fully understands what's going on here is ideal.)
This code was written primarily by a human. The only part where an LLM was used is in the unit tests, to generate a unique set of easy to complex cases with edge cases a human might not think of. You can check my org btc-vision and what we do. We do not use LLMs in critical features. AI is trash at coding in general.
Yes, I did use the Copilot auto feature to give a summary of the PR content, if you wonder. Everything said here is accurate.
CountBleck
left a comment
Thanks. It seemed somewhat high-quality for an LLM, and the PR description threw me off.
This is a big change, so I'm not too confident in merging it immediately. I'll be slow to review, especially because I'm busy until next week or so.
Could you try running some of the larger, existing compiler tests with this feature on, to see if they run correctly? (Obviously, the WAT output will be different, so use --create and discard the changes afterward.) Also, did you make sure that all locals stored in closure environments are visited by the GC? (Sprinkling in a few __collect() calls might reveal some issues. Running those aforementioned tests should also do the same.)
Anyway, some additional things from taking a cursory glance at the PR...
```ts
this.currentType = signatureReference.type.asNullable();
return options.isWasm64 ? module.i64(0) : module.i32(0);
/** Scans a node and its children for captured variables from outer scopes. */
private scanNodeForCaptures(
```
This is a big switch-case, but I suppose it makes sense. It might warrant some scrutiny to make sure there's nothing missing.
Should information gathering of this kind be integrated into the resolve step perhaps, as it basically is the pre-pass?
> This is a big switch-case, but I suppose it makes sense. It might warrant some scrutiny to make sure there's nothing missing.
To help you understand what is supported and what is NOT supported, I made a little table:
Fully supported
| Node Type | Notes |
|---|---|
| All expression nodes | Identifier, Binary, Call, New, Ternary, PropertyAccess, ElementAccess, Assertion, InstanceOf, Parenthesized, UnaryPrefix, UnaryPostfix, Comma |
| All control flow | If, While, Do, For, ForOf, Switch (with cases), Block |
| Exception handling | Try/Catch/Finally, Throw |
| Literals | Array, Object, Template (elements/values scanned) |
| Variable declarations | Variable, VariableDeclaration |
| Function expressions | Including nested closures with proper parameter shadowing |
| Parameter default values | (x: i32 = captured) => x works correctly |
| this capture | Properly detected and stored in environment (view tests) |
Explicitly not supported (errors before reaching closure analysis)
| Feature | Error |
|---|---|
| Class expressions | "Not implemented: Block-scoped class declarations or expressions" |
| Computed property keys | "Identifier expected" (parse error) |
| super in closures | "'super' can only be referenced in a derived class" |
| Regular expressions | "Not implemented: Regular expressions" |
And here are the node kinds that are intentionally skipped, since no capture is possible:
- Leaf nodes: Null, True, False, Super, Constructor, Break, Continue, Empty, Omitted, Comment
- Type nodes: TypeName, NamedType, FunctionType, TypeParameter, Parameter
- Top-level declarations: Source, ClassDeclaration, EnumDeclaration, FunctionDeclaration, InterfaceDeclaration, NamespaceDeclaration, TypeDeclaration, Import, Export, etc.
So I changed the previous behavior to safety-first checks. The default is now to throw "(scanNodeForCaptures/collectDeclaredVariables/prescanNodeForFunctionExpressions): unhandled node kind: ...", so any unhandled node type fails immediately rather than silently skipping captures.
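To make the throw-by-default idea concrete, here is a minimal sketch over a heavily simplified, hypothetical AST. `AstNode` and `scanForCapturedNames` are illustrative names, not the AssemblyScript node types or the PR's functions.

```typescript
// Hypothetical, heavily simplified AST; not the AssemblyScript node types.
type AstNode =
  | { kind: "Identifier"; text: string }
  | { kind: "Binary"; left: AstNode; right: AstNode }
  | { kind: "Null" }
  | { kind: "Regexp"; pattern: string };

// Collects identifiers that refer to names declared in an outer scope.
// Any node kind without an explicit case throws instead of being skipped.
function scanForCapturedNames(
  node: AstNode,
  outerNames: Set<string>,
  found: Set<string>
): void {
  switch (node.kind) {
    case "Identifier":
      if (outerNames.has(node.text)) found.add(node.text);
      break;
    case "Binary":
      scanForCapturedNames(node.left, outerNames, found);
      scanForCapturedNames(node.right, outerNames, found);
      break;
    case "Null":
      break; // leaf node, nothing to capture
    default:
      throw new Error("scanForCapturedNames: unhandled node kind: " + node.kind);
  }
}
```

The trade-off is that a newly added node kind breaks compilation loudly until someone decides how it should be scanned, instead of quietly producing a closure with a missing capture.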
> Should information gathering of this kind be integrated into the resolve step perhaps, as it basically is the pre-pass?
Basically, I strongly believe that the current architecture is correct, since locals don't exist during resolve. Also, capture analysis needs flow.lookupLocal() and flow.lookupLocalInOuter(), which require the parent flow chain to be established. Lastly, the environment layout needs local.type.byteSize, which requires compiled types.
However, compileFunctionBody uses prescanForClosures(), which is a name-based analysis (Map<string, null>), and compileFunctionExpression uses analyzeCapturedVariables(), which is a local-based analysis (Map<Local, i32>).
I strongly believe I should refactor this and create a function scanForCaptures that accepts a "capture collector" interface supporting multiple modes:
- Name collection mode (used during prescan when Locals don't exist yet)
- Local resolution mode (used when compiling function expressions)
This will be a better design. I will update the PR shortly. It's a big refactor but doable.
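A minimal sketch of what such a collector interface could look like, assuming the scan has already been reduced to a flat list of identifier names. `CaptureCollector`, `NameCollector`, `LocalCollector`, and the stand-in `Local` class are hypothetical and do not reflect the final code in the PR.

```typescript
// Hypothetical stand-in for the compiler's Local class.
class Local {
  readonly localName: string;
  constructor(localName: string) {
    this.localName = localName;
  }
}

interface CaptureCollector {
  /** Called for every identifier found inside a function expression body. */
  visitIdentifier(identifierName: string): void;
}

/** Name collection mode: records names only, usable before Locals exist. */
class NameCollector implements CaptureCollector {
  readonly names = new Set<string>();
  visitIdentifier(identifierName: string): void {
    this.names.add(identifierName);
  }
}

/** Local resolution mode: resolves names to outer Locals and assigns slot indices. */
class LocalCollector implements CaptureCollector {
  readonly captured = new Map<Local, number>();
  private readonly outerLocals: Map<string, Local>;
  constructor(outerLocals: Map<string, Local>) {
    this.outerLocals = outerLocals;
  }
  visitIdentifier(identifierName: string): void {
    const local = this.outerLocals.get(identifierName);
    if (local && !this.captured.has(local)) {
      this.captured.set(local, this.captured.size); // next free slot index
    }
  }
}

// One traversal, two behaviours, depending on the collector passed in.
function scanForCaptures(identifiers: string[], collector: CaptureCollector): void {
  for (const identifierName of identifiers) {
    collector.visitIdentifier(identifierName);
  }
}
```

The point of the design is that the AST traversal exists once, and only the bookkeeping differs between the prescan and the function-expression compile step.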
Managed to refactor this section and remove about 400 lines of code. This is way better!
(Heh, started typing this before I saw your second comment)
Fwiw, looking through the existing code again, I think the previous plan was something along these lines:
As it stands, it is already possible to resolve an outer local without doing a separate pre-scan. In compileAssignment for example, that's the target local, where a captured local lives in a different (here: outer) flow. Right now this errors as it's not implemented.
Idea there was, that at the moment the compiler sees such a local, it can mark the outer local as captured, create or update an append-only closure environment of the respective outer function, and emit an env load or store instead.
When seeing such a local again, i.e. it is marked captured, the compiler can emit the env load or store directly.
This way, the discovery can happen as part of compilation without a separate pre-scan, perhaps even without recompiling anything and instead modifying the signature of closure-producing functions plus prepending a prologue to allocate the respective closure environment - or whatever fits best.
Also, one assumption there is that compilation of the inner function happens and completes mid-compilation of the outer function, producing the information the outer function needs to be finalized before it is finalized - so it can either do something clever to upgrade to a closure-producing function mid-compilation, or be fixed up without a full recompile. Well possible that this assumption still holds.
So, in summary, closure implementation was planned in a way that avoids doing pre-passes, with some support functionality to build it this way already present. Would be cool if there's a way to reuse some of that!
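The demand-driven idea described in this comment can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the existing compiler code: `Flow`, `FunctionInfo`, `LocalVar`, and `compileIdentifierLoad` are hypothetical stand-ins, and the returned strings only name the kind of instruction that would be emitted.

```typescript
// Hypothetical stand-ins for the compiler's function, local and flow objects.
class FunctionInfo {
  readonly envSlots: string[] = []; // append-only list of captured local names
}

class LocalVar {
  isCaptured = false;
  envSlot = -1;
}

class Flow {
  readonly locals = new Map<string, LocalVar>();
  readonly func: FunctionInfo;
  readonly outer: Flow | null;
  constructor(func: FunctionInfo, outer: Flow | null) {
    this.func = func;
    this.outer = outer;
  }
  addLocal(localName: string): void {
    this.locals.set(localName, new LocalVar());
  }
}

// Resolves an identifier while compiling. If it belongs to an outer function,
// the local is marked captured and appended to that function's environment.
function compileIdentifierLoad(flow: Flow, identifierName: string): string {
  for (let current: Flow | null = flow; current; current = current.outer) {
    const local = current.locals.get(identifierName);
    if (!local) continue;
    if (current.func !== flow.func && !local.isCaptured) {
      local.isCaptured = true;
      local.envSlot = current.func.envSlots.length;
      current.func.envSlots.push(identifierName);
    }
    return local.isCaptured
      ? "env.load slot=" + local.envSlot
      : "local.get $" + identifierName;
  }
  throw new Error("unresolved identifier: " + identifierName);
}
```

Once a local is marked captured, later accesses from any flow take the environment path directly; what the sketch leaves open is what to do about accesses that were already emitted as plain local.get/set.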
So if I understand correctly, this is a more "discover as you go" architecture. Let me give it a try. I will probably be unable to finish this today but let me give it a try!
So I guess the tricky part would be about code that already compiled before we discovered the capture?
For example:
```ts
function outer() {
  let x = 1;        // Compiles: local.set $x
  x = x + 1;        // Compiles: local.get $x, local.set $x
  let f = () => x;  // NOW we discover x is captured!
  x = x + 1;        // Should use: env.load, env.store
}
```
When we hit the inner function, the outer function has already emitted local.get/set for x. But for correctness, all accesses to captured variables (even in the declaring function) must go through the environment.
So I guess we could prepend a prologue that copies locals to the env, but this only works if the outer function doesn't modify the captured variable after creating the closure.
Should we take a more hybrid approach, where the existing support (flow.outer, lookupLocalInOuter, declaredByFlow) lets us detect outer accesses? Let me try this...
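The correctness concern with copying locals into the environment can be reproduced in plain TypeScript by modelling the environment as an object. This is only an analogy for the generated code, with hypothetical function names:

```typescript
// Variant 1: copy x into the environment when the closure is created.
function copyAtCreation(): number {
  let x = 1;
  x = x + 1;             // plain local access
  const env = { x };     // snapshot taken at closure creation
  const f = () => env.x;
  x = x + 1;             // updates the local only; env.x is now stale
  return f();
}

// Variant 2: x lives in the environment from the start.
function sharedEnvironment(): number {
  const env = { x: 1 };
  env.x = env.x + 1;
  const f = () => env.x;
  env.x = env.x + 1;     // visible to the closure
  return f();
}

console.log(copyAtCreation(), sharedEnvironment()); // prints "2 3"
```

Only the second variant matches the semantics of the example above, which is why accesses in the declaring function also have to go through the environment.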
So the algorithm flow would be something like this?
```mermaid
flowchart TD
A[compileFunctionBody instance] --> B{bodyContainsFunctionExpressions?}
B -->|yes| C[instance.mayHaveClosures = true]
B -->|no| D[numLocalsBefore = localsByIndex.length]
C --> D
D --> E[compileStatements body]
E --> F[compileExpression identifier]
F --> G{local.isCaptured?}
G -->|no| H[local.wasAccessedAsLocal = true]
G -->|yes| I[skip]
H --> J
I --> J
E --> J[compileFunctionExpression]
J --> K[analyzeCapturedVariables]
K --> L[for each capture]
L --> M[local.isCaptured = true]
M --> N{local.wasAccessedAsLocal?}
N -->|yes| O[local.envOwner.needsCaptureRecompile = true]
N -->|no| P[allocate envSlotIndex]
O --> P
P --> Q{instance.needsCaptureRecompile?}
Q -->|yes| R[reset locals to numLocalsBefore]
R --> S[clear wasAccessedAsLocal flags]
S --> T[instance.needsCaptureRecompile = false]
T --> A
Q -->|no| U[emit environment allocation if needed]
U --> V[done]
```
Some edge cases with this solution, and whether each would still need recompilation:
| Case | Recompile? | Notes |
|---|---|---|
| let x=1; let f=()=>x; x=2 | YES | x accessed before closure |
| let f=()=>x; let x=1; x=2 | NO | x declared after closure |
| let x=1; let f=()=>x; | YES | x initialized before closure |
| Nested closures | Each level handles independently | |
| Multiple closures | capturedLocals accumulates all | |
Is this a cleaner approach?
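The recompile loop in the flowchart can be simulated with a small sketch, assuming a function body is reduced to an ordered list of "access" and "capture" events. `BodyEvent` and `compileWithRecompile` are hypothetical names; the real compiler would work on statements and Locals rather than strings.

```typescript
type BodyEvent =
  | { kind: "access"; target: string }   // outer function reads or writes a local
  | { kind: "capture"; target: string }; // an inner function captures a local

interface CompileResult {
  passes: number;
  captured: string[];
}

function compileWithRecompile(events: BodyEvent[]): CompileResult {
  const captured = new Set<string>(); // capture marks survive across passes
  let passes = 0;
  let needsRecompile = false;
  do {
    passes++;
    needsRecompile = false;
    const accessedAsLocal = new Set<string>(); // cleared on every pass
    for (const event of events) {
      if (event.kind === "access") {
        // Already-captured locals go through the environment instead.
        if (!captured.has(event.target)) accessedAsLocal.add(event.target);
      } else if (!captured.has(event.target)) {
        captured.add(event.target);
        // local.get/set was already emitted for this local: start over.
        if (accessedAsLocal.has(event.target)) needsRecompile = true;
      }
    }
  } while (needsRecompile);
  return { passes, captured: Array.from(captured) };
}
```

Because every recompile is triggered by at least one newly captured local, the loop terminates after a bounded number of passes. The first table row (access, capture, access) takes two passes in this model, while the second row (capture first) takes one.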
Would say that possible solutions there could be to a) emit fixup code that lifts the local to the environment at the time it's needed, b) to walk already generated code backwards and replace local.get/set of the local in question with environment accesses or c) to schedule a recompile of affected functions. Might well be that c) is the way to go if it significantly reduces complexity.
Apart from that, one interesting case is compile-time branch elimination:
```ts
function add<T>(a: T, b: T): T {
  let c: T;
  if (isString<T>()) {
    let concat = () => a + b;
    c = concat();
  } else {
    c = a + b;
  }
  return c;
}
```
where pre-scanning would detect a closure, even though for certain T there is none. Iirc that was part of the motivation for a demand-driven design.
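The difference between the two strategies on this example can be modelled with a toy statement tree. This is a sketch under the assumption that a statically evaluable condition is represented as a predicate on the type argument; `Stmt`, `prescanFindsClosure`, and `demandDrivenFindsClosure` are hypothetical names.

```typescript
type Stmt =
  | { kind: "closure" }
  | { kind: "other" }
  | {
      kind: "staticIf";
      condition: (typeName: string) => boolean; // decidable at compile time
      thenBody: Stmt[];
      elseBody: Stmt[];
    };

// A syntactic pre-scan looks into both branches, whatever T is.
function prescanFindsClosure(body: Stmt[]): boolean {
  for (const stmt of body) {
    if (stmt.kind === "closure") return true;
    if (stmt.kind === "staticIf") {
      if (prescanFindsClosure(stmt.thenBody) || prescanFindsClosure(stmt.elseBody)) {
        return true;
      }
    }
  }
  return false;
}

// A demand-driven compile only visits the branch that survives elimination.
function demandDrivenFindsClosure(body: Stmt[], typeName: string): boolean {
  for (const stmt of body) {
    if (stmt.kind === "closure") return true;
    if (stmt.kind === "staticIf") {
      const taken = stmt.condition(typeName) ? stmt.thenBody : stmt.elseBody;
      if (demandDrivenFindsClosure(taken, typeName)) return true;
    }
  }
  return false;
}

// Toy model of add<T>: the closure only exists in the isString<T>() branch.
const addBody: Stmt[] = [
  {
    kind: "staticIf",
    condition: (typeName) => typeName === "string",
    thenBody: [{ kind: "closure" }],
    elseBody: [{ kind: "other" }],
  },
];
```

In this model the pre-scan reports a closure for every instantiation, while the demand-driven walk reports one only when the type argument is "string", so an instantiation such as add<i32> would not pay for an environment it never uses.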
Hey, sorry for the delay! I will review all the comments now.
Extended closure variable capture logic in the compiler to handle while, do-while, for, for-of, switch, try/catch/finally, and various expression nodes. Updated test cases to cover closure captures in these new contexts, ensuring correct environment handling and variable scoping for closures in complex control flow and expression scenarios. Found unhandled case of closure with nested arrays.
Replaces all occurrences of the closure environment global variable from '$$~lib/__closure_env' to '$~lib/__closure_env' in the compiler source and test files. This change ensures consistency in global naming and avoids the use of double dollar signs.
Ensure that default values of function parameters are scanned for variable captures before parameter names are added to the inner function scope. This fixes issues where closures in default parameter expressions could not capture outer variables. Adds new tests for closures in default parameter values.
Replaces the collectCapturedNames function with a unified scanNodeForCaptures that supports both local and name modes for closure variable capture analysis. Updates all relevant call sites and improves handling of various node kinds, making closure capture logic more robust and maintainable. Also updates test WAT files to reflect changes in closure environment global naming.
Refactor closure function creation logic
Removed unused parameters and redundant local set in array rest parameter initialization. Simplified compileClosureFunctionCreation by removing the staticOffset argument, as it is no longer used.
Refactor closure capture analysis and remove collectCapturedNames
I need to refactor the code again a bit since
Looks like most of these can be fixed by modifying the pattern
```ts
if (foo.bar && foo.bar.baz) { ... }
```
to
```ts
let bar = foo.bar;
if (bar && bar.baz) { ... }
```
Has something to do with side-effects, where if
Refactors closure variable capture logic in compiler.ts for clarity and robustness, including more explicit local variable usage and conversion of capturedNames to a Set. Updates prescanNodeForFunctionExpressions to use an iterative approach to avoid stack overflows. Adds a 'closures' feature entry to tests/features.json for testing closure support.
Refactor closure capture logic and add closures test config
Yeah, I got rid of the Map<string, null> and used a Set instead. I'm also using Map<Local, i32> for capturedLocals now instead of generic maps, since I have to assign it to a variable first to avoid the error. I mean, it's more type-strict, so it's good I guess. I preferred the previous implementation because it was shorter, but yeah, my changes are "safer" for null checking now I guess.
Simplifies and streamlines the allocation of environment slots for captured locals and 'this' references by removing unnecessary variable assignments and directly assigning results. This improves code readability and reduces redundancy in the closure environment setup.
Refactored analyzeCapturedVariablesWithDeclared to remove the unused outerFunc parameter and updated its call sites. Improved closure environment setup logic, fixed minor code style issues, and removed redundant code. Also made minor whitespace and formatting adjustments throughout the file.
Simplify capture merging logic in Compiler
Refactored the merging of captures to remove redundant check for undefined captureIndex, as the value is always present when setting in existingCaptures.
Refactor closure capture analysis and minor cleanups
Fixes #798.
Related: #173, #563, #2054, #2753.
Changes proposed in this pull request:
⯈ Added experimental closures feature - Closures can now capture variables from their enclosing scope. This includes support for:
- … let and var)
- … this directly in class methods
⯈ Implemented as an opt-in feature flag - Closures are disabled by default to maintain backwards compatibility and expected behavior. Users must explicitly enable the feature with --enable closures. This ensures:
⯈ Added compile-time constant ASC_FEATURE_CLOSURES - Allows conditional compilation based on whether closures are enabled
⯈ Added comprehensive test suites:
- closure.ts - Basic closure patterns (captures, mutations, shared environments)
- closure-stress.ts - Stress tests covering many edge cases (616 lines)
- closure-class.ts - Complex class patterns with closures (1000+ lines) including:
Implementation Details
The implementation follows the approach discussed in #798:
Closure Environment:
First-Class Functions:
- … _env pointer
- … _env is loaded and made available
- … _env = 0, avoiding overhead when closures aren't used
Core changes in src/compiler.ts:
- … prescanForClosures) identifies closures and captured variables before compilation
Supporting changes:
- std/assembly/shared/feature.ts - Added Feature.Closures enum value
- src/common.ts - Added ASC_FEATURE_CLOSURES constant name
- src/program.ts - Registered compile-time constant
- src/index-wasm.ts - Exported FEATURE_CLOSURES for CLI
- src/flow.ts - Added flow tracking for captured variables
Limitations
This is an experimental implementation. Known limitations:
Usage