Last week, the Yarn developers announced a new feature: Plug'n'Play installation. It lets us run Node.js projects without the node_modules folder, which normally holds the project's installed dependencies. According to the feature description, node_modules will no longer be needed because modules will be loaded from the package manager's shared cache. I decided to test this promising feature and share my experience and findings.


Why we don't need node_modules

The Node.js module system is based entirely on the file system: every call to require() is mapped to a path on disk. The node_modules folder was invented for third-party modules. Packages are downloaded and installed there, and we can then include their code and reuse it going forward. This is also a good way to manage build dependencies.
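To make the file-system mapping concrete, here is a minimal sketch of the lookup Node.js performs for a bare package name: it walks up from the requiring file's directory and checks a node_modules folder at each level. The function name nodeModulesCandidates and the example path are made up for illustration, and the sketch is simplified (it ignores the global folders Node.js may also consult).

```javascript
const path = require('path');

// List the node_modules directories that would be checked for a bare
// require('some-package') issued from a file inside `fromDir`,
// nearest directory first. POSIX-style paths are used for simplicity.
function nodeModulesCandidates(fromDir) {
  const candidates = [];
  let dir = path.posix.resolve(fromDir);
  while (true) {
    // A node_modules folder is never searched inside another node_modules.
    if (path.posix.basename(dir) !== 'node_modules') {
      candidates.push(path.posix.join(dir, 'node_modules'));
    }
    const parent = path.posix.dirname(dir);
    if (parent === dir) break; // reached the file system root
    dir = parent;
  }
  return candidates;
}

// Hypothetical project location, used only for the demonstration.
console.log(nodeModulesCandidates('/home/user/dashboard/src'));
```

Every level of that walk is a potential file-system probe, which is part of why a lookup table can be attractive.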

What if we have several projects using the same third-party module? Then each project receives its own separate set of dependencies and potentially copies the same package across projects, wasting disk space. 

Installing dependencies also takes up most of the build time in CI systems, so speeding up this step will have a positive impact on the build time as a whole.

Module installation includes the following steps:

  • Dependency ranges are resolved into pinned versions
  • Packages for each version are downloaded from the repository and stored in the local cache
  • Cached packages are copied to the project's node_modules folder

The first two steps are already well optimized and run quickly when the modules are already cached, but the third step has remained almost unchanged since the first versions of Node and npm.

The new approach proposes to get rid of the third step and replace the actual copying of files with a mapping table from the requested modules to their copies in the local cache.
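The idea of a mapping table can be sketched in a few lines. This is only an illustration under stated assumptions, not Yarn's actual data format: the package names, versions, cache paths, and the locatePackage helper are all hypothetical.

```javascript
// Pinned versions, as they would come out of dependency resolution (step 1).
// Names and versions are invented for this sketch.
const pinnedVersions = new Map([
  ['example-lib', '1.2.3'],
  ['example-util', '0.4.0'],
]);

// Mapping table: "name@version" -> location in the shared cache (step 2).
// The cache paths are invented as well.
const packageLocations = new Map([
  ['example-lib@1.2.3', '/home/user/.cache/pm/example-lib-1.2.3/'],
  ['example-util@0.4.0', '/home/user/.cache/pm/example-util-0.4.0/'],
]);

// Resolve a package name to its cache location without copying any files.
function locatePackage(name) {
  const version = pinnedVersions.get(name);
  if (version === undefined) {
    throw new Error(`No pinned version for "${name}"`);
  }
  const location = packageLocations.get(`${name}@${version}`);
  if (location === undefined) {
    throw new Error(`"${name}@${version}" is not in the cache`);
  }
  return location;
}

console.log(locatePackage('example-lib'));
```

Building such a table is a matter of writing one file, whereas filling node_modules means copying every file of every package.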

Symbolic link

Instead of actually copying modules, we could add symlinks to their locations in the cache. This is what the alternative package manager pnpm does. The approach can work well, but symlinks bring many issues related to a file having two locations, the search for adjacent modules, and several others. In addition, creating a symlink is still a file operation, which ideally we want to avoid.
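The "dual location" issue can be reproduced with a small experiment. The sketch below builds a throwaway cache and project under the system temp directory (all directory and package names are made up) and links a package instead of copying it. It assumes Node.js is run with its default settings, that is, without the --preserve-symlinks option.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Build a throwaway "cache" and "project" under the system temp directory.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'symlink-demo-'));
const cachedPkg = path.join(root, 'cache', 'demo-pkg');
const projectModules = path.join(root, 'project', 'node_modules');
fs.mkdirSync(cachedPkg, { recursive: true });
fs.mkdirSync(projectModules, { recursive: true });

// The demo package simply reports the file name it believes it has.
fs.writeFileSync(path.join(cachedPkg, 'index.js'), 'module.exports = __filename;\n');

// Link the cached package into node_modules instead of copying it.
// The 'junction' type only matters on Windows and is ignored elsewhere.
const linkedPkg = path.join(projectModules, 'demo-pkg');
fs.symlinkSync(cachedPkg, linkedPkg, 'junction');

// Load the package through the link and compare the two locations.
const seenFilename = require(linkedPkg);
const realFilename = fs.realpathSync(path.join(cachedPkg, 'index.js'));
console.log('requested via:', linkedPkg);
console.log('module sees:  ', seenFilename);
console.log('same as cache location:', seenFilename === realFilename);

// Remove the temporary files again.
fs.rmSync(root, { recursive: true, force: true });
```

Because the module sees itself at its real location in the cache, any further lookups for neighbouring modules start from the cache directory rather than from the project, which is one source of the problems mentioned above.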

Plug'n'Play

More information about this feature can be found in the official description. Below is a description of how I ran my local tests.

First, I cloned the repository with the feature branch:

git clone git@github.com:yarnpkg/yarn.git --branch yarn-pnp

Then I changed into the cloned yarn folder and ran the following three commands to build the new Yarn and create an alias, yarn-local, for it:

yarn 
yarn build
alias yarn-local="node $PWD/lib/cli/index.js"

Now yarn-local is available in our session. The feature is enabled either by passing the --pnp flag to yarn or by adding the following configuration to package.json: "installConfig": {"pnp": true}. The Yarn developers also prepared a demo project with Webpack, Babel, and other typical front-end tools, so we can install it and see the difference this feature makes.

I tested the feature on one of our actual projects, Dashboard, doing a fresh cold setup for every attempt and enabling Plug'n'Play with the flag.

yarn: 24.6 seconds
yarn-local --pnp: 17.04 seconds
yarn-local --pnp with everything already in the local cache: 14.7 seconds

It is also worth noting that the size of the project directory went down from 192 MB to 19.8 MB.
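To put the measurements above in relative terms, the short snippet below computes the percentage reduction for each pair of numbers. The figures are the ones reported in this post; the helper name percentReduction is just for illustration.

```javascript
// Relative reduction between a baseline value and an improved value.
function percentReduction(before, after) {
  return ((before - after) / before) * 100;
}

// Measurements taken from the test runs described in this post.
const measurements = [
  { label: 'install time, --pnp', before: 24.6, after: 17.04 },
  { label: 'install time, --pnp with warm cache', before: 24.6, after: 14.7 },
  { label: 'project directory size', before: 192, after: 19.8 },
];

for (const { label, before, after } of measurements) {
  console.log(`${label}: ${percentReduction(before, after).toFixed(1)}% less`);
}
// install time, --pnp: 30.7% less
// install time, --pnp with warm cache: 40.2% less
// project directory size: 89.7% less
```

These are single runs on one project, so the percentages are indicative rather than a benchmark.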

Now let's see how Plug'n'Play installation works. An additional .pnp.js file is created in the project root; it overrides the native resolution logic of the built-in Node.js Module class. Loading this file gives require() the ability to fetch modules from the global cache instead of searching for them in node_modules. All built-in Yarn commands, such as yarn start or yarn test, preload this file by default, so no code changes are required if we already use Yarn.
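The general technique of overriding module resolution can be sketched as follows. This is not the contents of the real .pnp.js, which is far more elaborate. It is a minimal illustration that assumes patching Module._resolveFilename, an internal and undocumented Node.js function, is an acceptable hook point. The package name hypothetical-pkg and the temporary cache directory are invented for the demonstration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const Module = require('module');

// Stand-in for the package manager cache, holding one made-up package.
const cacheDir = fs.mkdtempSync(path.join(os.tmpdir(), 'pnp-cache-'));
const cachedEntry = path.join(cacheDir, 'hypothetical-pkg', 'index.js');
fs.mkdirSync(path.dirname(cachedEntry), { recursive: true });
fs.writeFileSync(cachedEntry, 'module.exports = "loaded from cache";\n');

// Mapping table: bare package name -> entry file inside the cache.
const packageMap = new Map([['hypothetical-pkg', cachedEntry]]);

// Assumption: Module._resolveFilename is internal to Node.js, so this
// patch relies on behavior that is not part of the public API.
const originalResolve = Module._resolveFilename;
Module._resolveFilename = function (request, ...rest) {
  if (packageMap.has(request)) {
    return packageMap.get(request); // answer from the table, no directory walk
  }
  return originalResolve.call(this, request, ...rest); // default behavior
};

// There is no node_modules folder anywhere, yet the require succeeds.
const loaded = require('hypothetical-pkg');
console.log(loaded); // prints "loaded from cache"

// Undo the patch and remove the temporary cache.
Module._resolveFilename = originalResolve;
fs.rmSync(cacheDir, { recursive: true, force: true });
```

A file like this can be preloaded with node's -r (--require) option so that the patch is in place before any application code runs.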

In addition to mapping modules, .pnp.js validates dependencies. If we call require('test') without declaring the dependency in package.json, we get the following error: Error: You can not require a package ("test") that is not declared in your dependencies. This check should make the code more reliable and predictable.

There are two limitations to this approach. First, tools that work with the node_modules directory directly, bypassing the built-in Node.js resolution mechanism, will require additional integration. For example, Webpack and other front-end bundlers will need additional plugins so that they can find the files to bundle. Second, postinstall scripts that change a package's state (by downloading additional files, for example) can damage the cache and break other projects, because the modified module remains in the cache. The Yarn developers recommend disabling script execution with the --ignore-scripts flag. They experimented with enabling this flag by default for all projects inside Facebook and did not find any problems.
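Returning to the dependency validation described above, the check itself is simple to picture. The sketch below is an assumption about how such a check could look, not Yarn's implementation: the manifest object stands in for package.json, the package names are invented, and assertDeclared is a hypothetical helper. Only the error text is taken from the message quoted in this post.

```javascript
// Hypothetical manifest, standing in for the project's package.json.
const manifest = {
  dependencies: { 'example-lib': '^1.2.0' },
  devDependencies: { 'example-bundler': '^4.0.0' },
};

// Throw if `packageName` is not listed in the manifest's dependencies.
function assertDeclared(packageName, pkg) {
  const declared = new Set([
    ...Object.keys(pkg.dependencies || {}),
    ...Object.keys(pkg.devDependencies || {}),
  ]);
  if (!declared.has(packageName)) {
    throw new Error(
      `You can not require a package ("${packageName}") that is not declared in your dependencies`
    );
  }
}

assertDeclared('example-lib', manifest); // declared, so nothing happens

try {
  assertDeclared('test', manifest); // not declared anywhere
} catch (error) {
  console.log(error.message);
}
```

With node_modules, an undeclared package can still load by accident if some other dependency happened to pull it in. A table-based resolver has the information needed to refuse.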

NPM Tink

Interestingly, at almost the same time, the NPM developers announced a similar solution to the problem: https://blog.npmjs.org/post/178027064160/next-generation-package-management. The NPM solution, tink, comes as a separate module, independent of NPM itself. Tink takes package-lock.json as input, which is generated automatically by npm install. Based on the lock file, tink generates the file node_modules/.package-map.json, which stores the mapping of local modules to their real locations in the cache.

Unlike Yarn, there is no hook file, and we use the tink command instead of node to get the right environment. This approach is less ergonomic, since it will require modifications to our project to make it work. That is fine for a proof of concept, though, and I will be watching for new releases.

Conclusion

Getting rid of the node_modules directory is a logical step in Node.js development, considering the experience of other languages, where this approach is nothing new. It will have a positive impact on build speed in CI systems that can preserve the package cache between builds. In addition, if we transfer the package cache and the .pnp.js file from one computer to another, we can run the environment without even launching Yarn. This can be useful in container build systems: mount the directory with the cache, add the .pnp.js file, and run the tests.

As of today, the Yarn Plug'n'Play pull request is still open, so we can expect changes. I was interested in trying out the alpha version of the feature to make sure that this approach really works, and it does!
