Conversation
- Added ESM implementation research summary detailing testing evidence and technical findings. - Created test suite for pure ESM modules in `test/test-50-esm-pure`, including: - README.md outlining the problem with ESM packages. - main.js to run tests and package ESM modules. - test-x-index.js to validate ESM module loading. - package.json and package-lock.json for dependencies. - Developed test suite for UUID v10+ in `test/test-50-uuid-v10`, including: - README.md explaining the test objectives. - main.js to execute tests and validate outputs. - test-x-index.js to test various UUID functionalities. - Established a strategy for transforming ESM to CJS using Babel to ensure compatibility with existing pkg infrastructure.
- Created lib/resolver.ts with ESM-aware resolution using resolve.exports - Enhanced lib/follow.ts to try ESM resolution first for non-relative imports - Fixed package.json inclusion for ESM packages with exports field - Added synthetic 'main' field to package.json when only exports exists - Ensures all discovered package.json files are included in snapshot - Both test-50-esm-pure (nanoid) and test-50-uuid-v10 now pass - No regressions in existing tests This completes Sprint 2 of the ESM implementation plan.
…xports - Remove test output artifacts (test-output, README.md files) - Add TypeScript type declarations for resolve.exports package - All changes address PR review comments
…ackage names - Add npm package name validation to skip non-literal require aliases - Only use ESM resolution callbacks for actual ESM packages - Fall back to CommonJS resolution for CJS packages to preserve marker state - Fixes test-50-public-packages, test-50-should-disclose-package, test-50-package-json-6c - ESM tests (test-50-esm-pure, test-50-uuid-v10, test-01-hybrid-esm) still passing
…etection - Add test-50-aedes-esm validating aedes MQTT broker v1.0.0 - Test uses new async createBroker() API from v1.0.0 - Fix package.json path resolution for exports-based packages - isESMPackage now checks correct path for single-file exports - Test passes with host and node20 targets
There was a problem hiding this comment.
Pull request overview
This PR adds first-class support for packaging ESM-only dependencies by introducing an ESM-aware resolver (including package.json#exports handling) and transforming ESM sources to CommonJS prior to bytecode compilation, plus new tests and design docs to validate/describe the approach.
Changes:
- Added an ESM-aware module resolver (
resolve.exports-based) and integrated it into dependency following. - Added ESM detection utilities and an ESM→CJS transform step (Babel) before bytecode compilation.
- Added new ESM-focused tests (nanoid/uuid) and supporting documentation/plans; updated dependencies/lockfile.
Reviewed changes
Copilot reviewed 18 out of 21 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
package.json |
Adds resolver/transform dependencies required for ESM handling (e.g., resolve.exports, Babel transform plugin). |
yarn.lock |
Locks new dependency graph for the ESM resolver/transform toolchain. |
lib/resolver.ts |
New resolver implementing exports-aware resolution with fallback to resolve. |
lib/follow.ts |
Uses the new resolver for bare specifiers; attempts to ensure relevant package.json files are captured/patched. |
lib/common.ts |
Adds ESM detection via file extension and nearest package.json#type. |
lib/esm-transformer.ts |
New Babel-based ESM→CJS transformer used to enable bytecode compilation of ESM sources. |
lib/resolve.exports.d.ts |
Adds TypeScript types for resolve.exports. |
lib/walker.ts |
Includes discovered package.json files and applies package.json patching + ESM→CJS transform before bytecode compilation. |
test/test-50-esm-pure/* |
New test fixture intended to validate packaging a pure-ESM dependency (nanoid). |
test/test-50-uuid-v10/* |
New test fixture intended to validate packaging an ESM-only uuid version. |
.github/copilot-instructions.md |
Notes the new plans/ directory and clarifies Yarn usage in project guidance. |
plans/ESM_RESEARCH_SUMMARY.md |
Design/research notes documenting why transform-based ESM support is used. |
plans/ESM_IMPLEMENTATION_PLAN.md |
Detailed implementation plan and rationale for ESM support approach. |
Files not reviewed (1)
- test/test-50-esm-pure/package-lock.json: Language not supported
… to respect ignoreFile option
…malizing outputs in tests
…on and adding error handling for ESM to CJS transformation
…ut handling in CI
…nhancing package.json inclusion logic
| const result = resolveModule(x, { | ||
| basedir: opts.basedir || process.cwd(), | ||
| extensions, | ||
| }); | ||
|
|
||
| // Only use ESM resolution result if it's an actual ESM package | ||
| // For CJS packages, fall through to standard CommonJS resolution | ||
| // to ensure all callbacks (catchReadFile, catchPackageFilter) are handled correctly | ||
| if (result.isESM) { | ||
| // This is a real ESM package, handle it here |
There was a problem hiding this comment.
follow() only uses resolveModule() when result.isESM is true. If resolveModule() resolved a package via exports to a CommonJS entry (e.g. exports.require -> .cjs), isESM will be false and the code falls back to resolve.sync, bypassing the ESM/exports-aware resolution. Consider using resolveModule() for any successful exports-based resolution (not only ESM), while still invoking catchReadFile/catchPackageFilter so markers and package.json inclusion remain correct.
| // If package has "type": "module", we need to change it to "commonjs" | ||
| // because we transform all ESM files to CJS before bytecode compilation | ||
| if (pkgContent.type === 'module') { |
There was a problem hiding this comment.
package.json is rewritten from "type": "module" to "commonjs" for STORE_CONTENT package.json files, but ESM→CJS transformation is only applied when store === STORE_BLOB. If a package is forced into STORE_CONTENT (e.g. marker.public / marker.hasDictionary paths earlier in this function), its .js sources may remain ESM while Node will now treat them as CommonJS, causing runtime syntax errors. Either transform ESM sources in STORE_CONTENT as well, or only rewrite type when you can guarantee all relevant .js files have been transformed/bytecode’d.
| // If package has "type": "module", we need to change it to "commonjs" | |
| // because we transform all ESM files to CJS before bytecode compilation | |
| if (pkgContent.type === 'module') { | |
| // If package has "type": "module" and is stored as a BLOB, we need to | |
| // change it to "commonjs" because we transform all ESM files to CJS | |
| // before bytecode compilation for STORE_BLOB records | |
| if (pkgContent.type === 'module' && record.store === STORE_BLOB) { |
| // Check all JS-like files (.js, .mjs, .cjs) but only transform ESM ones | ||
| if ( | ||
| store === STORE_BLOB && | ||
| record.body && | ||
| (isDotJS(record.file) || record.file.endsWith('.mjs')) |
There was a problem hiding this comment.
Transforming .mjs files to CommonJS here likely won’t make them loadable: Node’s CommonJS loader throws on require('*.mjs') regardless of the package.json type, and prelude/bootstrap.js doesn’t appear to add a .mjs extension handler. If exports resolves to a .mjs entrypoint, the packaged runtime may still fail even after transformation. Consider avoiding .mjs entrypoints by preferring exports.require/CJS targets, or renaming/rewriting .mjs to a CJS-loadable extension inside the snapshot (or adding a runtime loader hook).
| // Check all JS-like files (.js, .mjs, .cjs) but only transform ESM ones | |
| if ( | |
| store === STORE_BLOB && | |
| record.body && | |
| (isDotJS(record.file) || record.file.endsWith('.mjs')) | |
| // Only transform JS files that are loadable by the CommonJS loader | |
| if ( | |
| store === STORE_BLOB && | |
| record.body && | |
| isDotJS(record.file) |
| const result = babel.transformSync(code, { | ||
| filename, | ||
| plugins: [ | ||
| [ | ||
| '@babel/plugin-transform-modules-commonjs', | ||
| { | ||
| strictMode: true, | ||
| allowTopLevelThis: true, | ||
| }, | ||
| ], | ||
| ], | ||
| sourceMaps: false, | ||
| compact: false, | ||
| // Don't modify other syntax, only transform import/export | ||
| presets: [], | ||
| // Prevent Babel from loading user config files | ||
| babelrc: false, | ||
| configFile: false, | ||
| sourceType: 'module', | ||
| }); |
There was a problem hiding this comment.
The Babel module transform here only addresses import/export. ESM features that don’t have a safe CJS equivalent (notably top-level await and import.meta) can survive the transform and then become invalid when executed via the CommonJS wrapper / vm.Script, leading to runtime syntax errors. It’d be safer to explicitly detect these constructs and either fail fast with a clear error, or fall back to a supported execution mode (e.g. skip bytecode and keep ESM without rewriting package.json type).
|
Does this add native support for executing an ESM entrypoint and packages without transforming to CJS, but also without currently bytecode transfomation? P.S. I think there is bytenode that might work for producing bytecode with vanilla Node.js, which can reduce the scope of patches. As as I said before, if pkg can be rewritten to use the builtin SEA mode with the current bootstrap logic adapted so all existing features like the virtual file system work, it might be possible to get it fully working on vanilla Node.js binaries. |
Check: https://github.com/yao-pkg/pkg/blob/main/plans/ESM_RESEARCH_SUMMARY.md TLDR; When an ESM module is detected I use ESBUILD to compile it to CJS and then compile it to bytecode. I checked many alternatives but right now I ended up with this as it's the most reliable one. SEA it's promising but very limited ATM, some limitations:
Building ESM to bytecode directly has some issues that would require a big refactor of pkg, see here Considering all this I don't think we can get rid of nodejs patches very easily. The most promising thing IMO is bun compile, I know there is also a deno alternative but it's not as good as bun one yet, I think once they reach a full compatibility with nodejs code that could be the way |
pkg already makes everything a single file which self extracts the stuff that can be used in-memory (
Does it really not support native addon modules even if you extract them to disk via a bootstrap script monkey patch and load them from there just like pkg already does? It is supposed to be the same Node.js executable not some custom static build that can't load addon modules.
Yeah, that will have to be kept around, possibly extracted to its own library which pkg just bundles in, as long as Node.js doesn't implement a different convention for package resource files. e.g. The
Yeah, but you could make the entrypoint a bootstrap script that will load bytecode from a resource blob you include with it. Which is how pkg works anyhow no?
That's the biggest problem
Yeah, but that's bun, not Node.js. Bun also doesn't handle fs related virtualization, and actually introduces its own bun specific convention for resources that will work with its SEA mode. |
Nope, it doesn't do that, it traverse the source code to check all the files that needs to be bundled, compiles the to bytecode (the js files only) and then injects all the files into the binary, the patches are used then to make node serve the bytecoded files when requiring them
They should be extracted to disk and opened with process.dlopen like we do now in pkg
Kind of but this requires some research and tests, not that easy to do |
|
@segevfiner Let's continue the discussion here: #204 |
This pull request introduces enhanced support for ESM (ECMAScript Module) resolution and handling throughout the codebase, improving compatibility with modern Node.js packages and workflows. The changes include new ESM detection utilities, an ESM-aware module resolver, Babel-based ESM-to-CommonJS transformation, and updates to the file walker logic to properly include ESM-related package files. Additionally, documentation and development workflow instructions have been updated to reflect new practices and requirements.
ESM Support and Module Resolution Enhancements:
isESMFile,isESMPackage, and related caching) tolib/common.tsfor determining if files or packages use ESM syntax.lib/resolver.tsthat handles package.json "exports" field and distinguishes between ESM and CommonJS packages.lib/follow.tsto use the new resolver, including logic to filter valid npm package names and properly handle ESM and CJS resolution paths; resolves ESM packages using exports field and invokes callbacks for package.json handling. [1] [2] [3]ESM Transformation:
lib/esm-transformer.tswith Babel-based transformation to convert ESM code to CommonJS for runtime compatibility, including error handling and logging.lib/walker.tsto ensure ESM files are properly processed during packaging.Walker Logic and Package Inclusion:
lib/walker.tsto ensure all relevant package.json files (especially those discovered via ESM resolution) are included if they are dependencies or part of the application base, preventing omission of critical metadata.Documentation and Workflow Updates:
.github/copilot-instructions.mdto include new directory (plans/), revised commit/push workflow, and expanded important notes for Copilot Coding Agent, emphasizing ESM resolution, linting, test artifact cleanup, and user approval requirements. [1] [2] [3]Encoded references:
[1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12]
Fixes #16