Understanding the basic webpack process and some important concepts

1 Introduction

webpack is a static module bundler for modern JavaScript applications. When webpack processes an application, it recursively builds a dependency graph of every module the application needs, and then packages all of those modules into one or more bundles.

Four [core concepts](https://www.webpackjs.com/concepts/):

  • entry: the module where webpack starts building its internal dependency graph.
  • output: where to write the bundles webpack creates, and how to name them.
  • loader: lets webpack process non-JavaScript files (webpack itself only understands JavaScript), for example converting files from another language (e.g. TypeScript) to JavaScript, or converting inline images to data URLs.
  • plugins: handle a wide range of other tasks, including tree shaking, compression, and redefining variables in the environment.

2 Basic Flow

2.1 Basic Process

  1. entry-option: initializes the options.
  2. run: starts compilation.
  3. make: recursively analyzes dependencies starting from the entry and builds each dependent module.
  4. before-resolve, after-resolve: resolve the location of a module.
  5. build-module: starts building the module, which is loaded with the loaders configured for that file.
  6. normal-module-loader: compiles the module (a piece of JS code) with its loaders, then parses the result with acorn to generate an abstract syntax tree (AST).
  7. program: starts traversing the AST. When it encounters a call expression such as require, it triggers the handler for the "call require" event and collects the dependencies, e.g. via AMDRequireDependenciesBlockParserPlugin.
  8. seal: all dependencies have been built; webpack now optimizes the chunks, for example by merging them, extracting shared modules, and adding hashes.
  9. bootstrap: generates the bootstrap code.
  10. emit: writes each chunk to its output file.

See webpack source code analysis for detailed event flow.
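
Plugins observe these stages by tapping hooks on the compiler. The sketch below assumes the webpack 4+ hook names entryOption, run, make and emit, which correspond to four of the stages listed above. LifecycleLogPlugin and createStubCompiler are hypothetical names; the stub only exists so the example runs without webpack installed and does not reproduce webpack's real hook behavior.

```javascript
// Hypothetical plugin that records when a few of the stages above are reached.
class LifecycleLogPlugin {
  constructor(log = console.log) {
    this.log = log;
  }

  apply(compiler) {
    const name = 'LifecycleLogPlugin';
    // Block bodies return undefined, so none of these taps short-circuits a hook.
    compiler.hooks.entryOption.tap(name, () => { this.log('entry-option'); });
    compiler.hooks.run.tap(name, () => { this.log('run'); });
    compiler.hooks.make.tap(name, () => { this.log('make'); });
    compiler.hooks.emit.tap(name, () => { this.log('emit'); });
  }
}

// Stand-in for webpack's compiler object, used only to exercise the plugin.
function createStubCompiler() {
  const handlers = {};
  const hooks = {};
  for (const hookName of ['entryOption', 'run', 'make', 'emit']) {
    handlers[hookName] = [];
    hooks[hookName] = { tap: (pluginName, fn) => handlers[hookName].push(fn) };
  }
  return {
    hooks,
    // Fire every hook once, in the order the stages are listed above.
    fireAll() {
      for (const hookName of Object.keys(handlers)) {
        handlers[hookName].forEach((fn) => fn());
      }
    },
  };
}

const seen = [];
const stub = createStubCompiler();
new LifecycleLogPlugin((stage) => seen.push(stage)).apply(stub);
stub.fireAll();
console.log(seen.join(' -> ')); // entry-option -> run -> make -> emit
```

In a real build the plugin instance would simply be added to the plugins array of the configuration, and webpack would call apply with its own compiler.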

2.2 Internal dependency graph

Starting from the entry point, webpack works out which modules and libraries the entry point depends on, directly and indirectly. How does it find the require statements? With a regular expression? A regex would also match a require written inside a comment, and expressions such as require('a'+'b') are hard to handle with regexes. Instead, webpack uses a JS parser (such as esprima or acorn; webpack's Parser.js uses [acorn](https://github.com/acornjs/acorn)), which converts JS code into an abstract syntax tree (AST). It then traverses the AST to find require expressions, collects the dependencies, and constructs the dependency graph.

JS engines such as JavaScriptCore and V8 also use a parser to build an abstract syntax tree. The process is: source code => abstract syntax tree => bytecode. P.S. V8 used to compile source code directly to machine code; it later switched to generating bytecode first (the Ignition interpreter, enabled by default in 2017), largely because of memory usage.

A piece of webpack pseudocode found on GitHub: parse.js
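
To make the AST approach concrete, here is a minimal sketch of the traversal step. It is not webpack's Parser.js. To stay free of dependencies it walks a hand-written AST in the ESTree shape that acorn produces, instead of calling acorn; collectRequires is a hypothetical helper name.

```javascript
// Hand-written ESTree-style AST for this source:
//   const a = require('a');
//   // require('ignored')   <- a comment, so it never appears in the AST
//   load(require('./b'));
const ast = {
  type: 'Program',
  body: [
    {
      type: 'VariableDeclaration',
      kind: 'const',
      declarations: [
        {
          type: 'VariableDeclarator',
          id: { type: 'Identifier', name: 'a' },
          init: {
            type: 'CallExpression',
            callee: { type: 'Identifier', name: 'require' },
            arguments: [{ type: 'Literal', value: 'a' }],
          },
        },
      ],
    },
    {
      type: 'ExpressionStatement',
      expression: {
        type: 'CallExpression',
        callee: { type: 'Identifier', name: 'load' },
        arguments: [
          {
            type: 'CallExpression',
            callee: { type: 'Identifier', name: 'require' },
            arguments: [{ type: 'Literal', value: './b' }],
          },
        ],
      },
    },
  ],
};

// Depth-first walk that collects the argument of every require('<literal>') call.
function collectRequires(node, found = []) {
  if (Array.isArray(node)) {
    node.forEach((child) => collectRequires(child, found));
  } else if (node && typeof node === 'object') {
    const isRequireCall =
      node.type === 'CallExpression' &&
      node.callee.type === 'Identifier' &&
      node.callee.name === 'require' &&
      node.arguments.length === 1 &&
      node.arguments[0].type === 'Literal';
    if (isRequireCall) found.push(node.arguments[0].value);
    // Keep descending: a require call can be nested inside other expressions.
    Object.values(node).forEach((child) => collectRequires(child, found));
  }
  return found;
}

console.log(collectRequires(ast)); // [ 'a', './b' ]
```

The commented-out require never shows up because comments are not part of the tree, which is exactly the case a regex gets wrong.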

2.3 Module Resolution

The resolver is a library that helps find the absolute path of a module. For every require/import statement, it helps webpack locate the module code that needs to be included in the bundle. When bundling modules, webpack uses enhanced-resolve to resolve file paths (absolute, relative, and module paths).

const resolve = require("enhanced-resolve");

// Resolve the request "module/dir" as seen from the folder /some/path/to/folder.
resolve("/some/path/to/folder", "module/dir", (err, result) => {
    result; // === "/some/path/node_modules/module/dir/index.js"
});

  • Relative paths
    
      import '../src/file1';
    

    In this case, the directory of the resource file that contains the import or require is taken as the context directory. The relative path given in the import/require is joined with this context path to produce the module's absolute path.

  • Module path
    
      import 'module';
      import 'module/lib/file';
    

    The module is searched for in all directories specified in resolve.modules. You can replace the initial module path by creating an alias with the resolve.alias configuration option.
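
The rules above can be sketched as a small function that lists the paths a resolver would try. This is a simplified illustration, not how enhanced-resolve is implemented: it never touches the file system, does not try file extensions or index files, and does not walk up parent directories. candidatePaths and the example paths are hypothetical; the alias and modules options mirror resolve.alias and resolve.modules.

```javascript
const path = require('path').posix; // POSIX separators keep the output identical on every OS

// List the absolute paths a resolver would try for a request, in order.
function candidatePaths(contextDir, request, options = {}) {
  const { alias = {}, modules = ['node_modules'] } = options;

  // 1. Alias: swap the leading module name if an alias is configured for it.
  const [head, ...rest] = request.split('/');
  if (Object.prototype.hasOwnProperty.call(alias, head)) {
    request = [alias[head], ...rest].join('/');
  }

  // 2. Absolute path: nothing left to resolve.
  if (path.isAbsolute(request)) return [request];

  // 3. Relative path: join it with the directory of the importing file.
  if (request.startsWith('./') || request.startsWith('../')) {
    return [path.join(contextDir, request)];
  }

  // 4. Module path: try every directory listed in modules.
  return modules.map((dir) =>
    path.isAbsolute(dir) ? path.join(dir, request) : path.join(contextDir, dir, request)
  );
}

console.log(candidatePaths('/app/src/pages', '../src/file1'));
// [ '/app/src/src/file1' ]
console.log(candidatePaths('/app/src', 'module/lib/file'));
// [ '/app/src/node_modules/module/lib/file' ]
console.log(candidatePaths('/app/src', 'Utilities/format', { alias: { Utilities: '/app/src/utilities' } }));
// [ '/app/src/utilities/format' ]
```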

2 manifest

In a typical application or site built with webpack, there are three main types of code:

  • Source code written by you or your team.
  • Any third-party libraries or “vendor” code that your source code will depend on.
  • webpack’s runtime and manifest, which manage all module interactions.

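With CommonsChunkPlugin (webpack 3 and earlier; webpack 4+ uses optimization.splitChunks and optimization.runtimeChunk instead) you can split the vendor code and the manifest into separate files to make full use of the browser cache.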

2.1 Runtime

The runtime, together with its accompanying manifest data, is essentially all the code that webpack uses to connect your modularized application while it runs in the browser. The runtime contains the loading and resolving logic needed to connect modules as they interact with each other. This covers both connecting modules that are already loaded in the browser and the execution logic for lazy-loaded modules.

Module loading and resolution is implemented mainly by the __webpack_require__ function. __webpack_require__ can be understood as webpack's counterpart to the require function that Node.js implements for CommonJS modules.

2.2 Manifest

When the compiler starts executing, resolving and mapping the application, it keeps detailed notes on all of the modules. This collection of data is called the "manifest".

Once the bundle has been built and sent to the browser, modules are resolved and loaded at runtime through the manifest. Regardless of which module syntax you choose (ES6 or CommonJS), the import and require statements have by then been converted into __webpack_require__ calls that point to module identifiers. Using the data in the manifest, the runtime can look up a module identifier and retrieve the module behind it.

For example, in a single-page application, when you click a link that navigates to another route, you will notice that the browser automatically downloads the chunk files for that module. The runtime knows which files to fetch from the data in the manifest.

3 Hot module replacement

3.1 Concepts

Hot Module Replacement (HMR) replaces, adds or removes modules while the application is running, without reloading the entire page (unlike live reload). It speeds up development significantly in several ways:

  • Preserve application state that is lost when a page is completely reloaded.
  • Save valuable development time by only updating changes.
  • Adjust styles more quickly - almost equivalent to changing styles in the browser debugger.

3.2 Basic Processes

The main process is as follows:

  1. webpack-dev-server starts a local server, the client opens a long-lived websocket connection to it, and the client requests the initial resources.
  2. webpack-dev-server watches the source files. When the developer modifies code and saves it, webpack recompiles and generates new files, including:
     • a new hash value
     • an updated manifest (JSON), which contains the new compilation hash and a list of all chunks to be updated
     • one or more updated chunks (JavaScript)
  3. The server pushes the hash of the current compilation to the client via websocket.
  4. The client receives the pushed hash over the websocket and compares it with the previous one. If they match, the cached code is used. If they differ, the client checks whether hot update is supported: if so, it fires the webpackHotUpdate event; if not, it refreshes the browser directly.
  5. The relevant webpack modules listen for the webpackHotUpdate event and call the module.hot.check method. The HMR runtime then requests the manifest and the chunk files.
  6. The HMR runtime calls hotAddUpdateChunk to dynamically update the module code, and then calls hotApply to perform the hot update.
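
The client-side decision in this flow (compare the pushed hash with the previous one, then either do nothing, hot update, or reload) can be sketched as follows. This is an illustration of the logic described above, not webpack-dev-server's actual client code; createHmrClient and its callbacks are hypothetical names.

```javascript
// Hypothetical client that reacts to compilation hashes pushed by the dev server.
function createHmrClient({ hotSupported, onHotUpdate, onReload }) {
  let currentHash = null;

  return {
    receiveHash(newHash) {
      // Same hash as before: nothing changed, keep using what is already loaded.
      if (newHash === currentHash) return 'up-to-date';

      const isFirstBuild = currentHash === null;
      currentHash = newHash;
      // The first hash only records the state of the initial page load.
      if (isFirstBuild) return 'initial';

      if (hotSupported) {
        // Corresponds to firing webpackHotUpdate; the HMR runtime takes over from here.
        onHotUpdate(newHash);
        return 'hot-update';
      }
      // No HMR support: fall back to refreshing the whole page.
      onReload();
      return 'full-reload';
    },
  };
}

const client = createHmrClient({
  hotSupported: true,
  onHotUpdate: (hash) => console.log('apply hot update for', hash),
  onReload: () => console.log('reload page'),
});

console.log(client.receiveHash('abc123')); // initial
console.log(client.receiveHash('abc123')); // up-to-date
console.log(client.receiveHash('def456')); // logs the hot update, then: hot-update
```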

webpack-dev-server is a small Node.js Express server that uses webpack-dev-middleware to serve the webpack bundle. In practice, it starts an Express static-resource web server at localhost:8080 (or another port), runs webpack in watch mode automatically, pushes resource changes to the client in real time via the socket.io service, and refreshes the page automatically (or applies a hot update). HMR can be used as an alternative to LiveReload during development. webpack-dev-server supports a hot mode, which tries to update via HMR before falling back to reloading the entire page.

4 Code Splitting

There are three common methods of code splitting:

  • Entry points: manually split code using the entry configuration.
  • Prevent duplication: use the CommonsChunkPlugin (webpack 3 and earlier) or optimization.splitChunks (webpack 4+) to de-duplicate and split chunks. The optimization.splitChunks.maxSize option addresses the problem of one particularly large chunk.

    CommonsChunkPlugin avoids duplicated dependencies across chunks, but it cannot optimize any further. Starting with webpack v4, it was removed in favor of optimization.splitChunks.

  • Dynamic import: split code through inline function calls inside modules (for example, lazy loading components with react-loadable).
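
The three methods can appear together in one configuration. The sketch below assumes webpack 4 or later; the entry file names, the vendors cache group and the maxSize value are illustrative choices, not recommendations.

```javascript
const splitConfig = {
  // Entry points: each key becomes its own bundle.
  entry: {
    index: './src/index.js',
    another: './src/another-module.js',
  },
  optimization: {
    // Keep the webpack runtime and manifest in one separate chunk.
    runtimeChunk: 'single',
    splitChunks: {
      // Consider both initial and dynamically loaded chunks for splitting.
      chunks: 'all',
      // Hint to split chunks that grow beyond roughly this many bytes.
      maxSize: 244000,
      cacheGroups: {
        // Collect everything that comes from node_modules in a "vendors" chunk.
        vendors: {
          test: /[\\/]node_modules[\\/]/,
          name: 'vendors',
          chunks: 'all',
        },
      },
    },
  },
};

module.exports = splitConfig;

// Dynamic import: webpack emits a separate chunk for the imported module and
// loads it on demand. './chart.js' is a hypothetical module, so the call is
// left commented out here:
//   import('./chart.js').then(({ default: drawChart }) => drawChart());
```

Splitting the runtime out with runtimeChunk keeps the vendors chunk's content, and therefore its hash, stable when only application code changes.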

4.1 DllPlugin and externals

The DllPlugin and externals are essentially solving the same problem: avoiding packaging certain external dependency libraries into our business code, and instead providing those dependencies at runtime.

DllPlugin

  • Fits the idea of front-end modularization.
  • The webpack configuration is slightly more complex: you need to pre-build the required dll resources and configure the matching plugin at build time.
  • Using dlls presumes that these external dependencies rarely change. If one does change someday, the project needs to be rebuilt, which is more cumbersome.
  • Watch out for manifest.json naming conflicts.

externals

  • Does not quite fit the idea of front-end modularization: the required external libraries must be accessible in the browser's global environment.
  • External libraries can be upgraded without rebuilding the project as long as they stay compatible with the previous API; you only need to update the links.
  • The webpack configuration is a little simpler, but you still need to provide the external libraries in the required format and reference them at runtime (if the library offers a CDN address, you can use it directly).
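
For the externals side, a minimal configuration sketch looks like this. It assumes jQuery is loaded separately (for example from a CDN script tag) and exposes the global variable jQuery; the entry path is illustrative.

```javascript
const externalsConfig = {
  entry: './src/index.js',
  externals: {
    // import $ from 'jquery' is kept out of the bundle and is read from the
    // global variable jQuery at runtime instead.
    jquery: 'jQuery',
  },
};

module.exports = externalsConfig;
```

The key is the module name used in import/require statements, and the value is the name of the global variable that must exist when the bundle runs.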

5 tree shaking

Tree shaking is a term commonly used for removing unused code (dead code) in a JavaScript context. It relies on the static structure of the ES2015 module system, i.e. import and export.

webpack 4 extends this capability by using the "sideEffects" property of package.json as a flag that tells the compiler which files in the project are "pure" (ES2015 modules without side effects), so that unused parts of those files can be safely removed.

To use tree-shaking, you need to do the following:

  • Use ES2015 module syntax (i.e. import and export).
  • Ensure that no compiler converts ES2015 module syntax to CommonJS modules (this conversion is the default behavior of the popular Babel preset @babel/preset-env; see its documentation).
  • Add a "sideEffects" property to the project's package.json file. If none of the code contains side effects, simply set the property to false. If some files do, provide an array instead, as in antd's package.json.

    
      {
        "sideEffects": [
          "dist/*",
          "es/**/style/*",
          "lib/**/style/*",
          "*.less"
        ]
      }
    

    A side effect is code that performs special behavior when imported, beyond exposing one or more exports; for example, a polyfill, which affects the global scope and usually does not provide an export.

  • Enable minification (code compression) and tree shaking by setting the mode option to production.
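
Because import and export are static, a bundler can tell which exports are never imported without running any code. The toy model below illustrates that usage analysis only; it is not webpack's implementation, and the graph format, module names and unusedExports helper are hypothetical.

```javascript
// Toy model: each module lists what it exports and what it imports from others.
const graph = {
  './math.js': { exports: ['square', 'cube'], imports: {} },
  './index.js': { exports: [], imports: { './math.js': ['cube'] } },
};

// Return, per module, the exports that no other module imports.
function unusedExports(moduleGraph) {
  // Pass 1: record every imported name, grouped by the module it comes from.
  const used = {};
  for (const { imports } of Object.values(moduleGraph)) {
    for (const [source, names] of Object.entries(imports)) {
      used[source] = used[source] || new Set();
      names.forEach((name) => used[source].add(name));
    }
  }

  // Pass 2: an export is unused if it was never recorded in pass 1.
  const result = {};
  for (const [id, { exports }] of Object.entries(moduleGraph)) {
    const unused = exports.filter((name) => !(used[id] && used[id].has(name)));
    if (unused.length > 0) result[id] = unused;
  }
  return result;
}

console.log(unusedExports(graph)); // { './math.js': [ 'square' ] }
```

Whether an unused export such as square can really be dropped still depends on side effects, which is the information the "sideEffects" property provides.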

6 Optimization strategies

What webpack can do:

  • Code compression (uglify)
  • Code splitting (separate entries for multi-page applications, splitChunks to prevent duplication, dynamic imports)
  • Tree shaking
  • Packaging infrequently updated modules separately (DllPlugin), or loading them from a CDN (externals)

Other methods:

  • Code optimization: put <link> stylesheets in the <head>, <script> tags at the bottom of the <body>, etc.
  • Reduce the number of requests; merge requests
  • Enable gzip in the Nginx configuration
  • SSR (server-side rendering)

This post is licensed under CC BY 4.0 by the author.