Bundling is an indispensable part of building a modern JavaScript app. Webpack, Rollup, and Parcel-bundler are some of the big-name bundlers. For the most part, bundling has been a magical process: just give the bundler the entry, the output, add some other config, and POOF! - suddenly your bundle.js is ready.
In this post, I will explain what a bundler is and why it is a good thing to use one - we will do it by creating one from scratch.
A bundler is a tool that puts your entry code along with all its dependencies together in one JS file.
Why would we want to use it? Can't we just upload all the files and directories of our project and skip the extra step?
Here are two reasons:
import and export syntax is a recent convention in ES6. Not all browsers support it yet.
I hope these are enough reasons to want to use a bundler. Let's move on to understanding how a bundler works.
The best way to understand how something works is to build it and tinker with it.
Before we start, let's go through the basics of what our project will look like.
Introducing Bandler. The tiniest, cutest, awesomest bundler you have ever seen (ok, you can name it whatever. That's just what I named my bundler).
Bandler will have a structure like this:
entry.js
-> module1.js
-> module2.js
The entry will be called entry.js. It will have one dependency, module1.js, which has a dependency, module2.js.
Our project will use ES6 module syntax (import/export). Our task is to extend module support to older browsers. We have to transpile the ES6 syntax into something all, or at least most, browsers can understand.
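The post does not show the three source files themselves, so here is a sketch that writes a plausible version of them to a scratch directory. The file contents are an assumption, inferred from the bundle output shown later in the post (where entry.js also imports module2.js directly); `demoFiles` and `writeDemoProject` are hypothetical names used only for this illustration.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Assumed file contents, inferred from the bundle output later in the post.
const demoFiles = {
  "entry.js": [
    'import module1 from "./module1.js";',
    'import module2 from "./module2.js";',
    "",
    "module1();",
    "module2();",
    ""
  ].join("\n"),
  "module1.js": [
    'import module2 from "./module2.js";',
    "",
    "const module1 = () => {",
    "  module2();",
    '  console.log("hello from module1!");',
    "};",
    "",
    "export default module1;",
    ""
  ].join("\n"),
  "module2.js": [
    "const module2 = () => {",
    '  console.log("Hello from module2!");',
    "};",
    "",
    "export default module2;",
    ""
  ].join("\n")
};

// Hypothetical helper: write the demo files into `dir` and return their paths.
function writeDemoProject(dir) {
  fs.mkdirSync(dir, { recursive: true });
  return Object.entries(demoFiles).map(([name, source]) => {
    const filePath = path.join(dir, name);
    fs.writeFileSync(filePath, source, "utf-8");
    return filePath;
  });
}

// Write into a fresh temp directory so nothing in the project is overwritten.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "bandler-demo-"));
const written = writeDemoProject(demoDir);
console.log(written.map(p => path.basename(p)).join(", "));
```

To follow along with the rest of the post, the same three files could simply be created by hand next to index.js.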
Here are the 8 steps we will follow:
entry.js
import declarations
(for example, if we use import module1 from './module1.js' in entry, ./module1.js is a dependency and we will map this with a unique ID)
If it looks complicated, don't worry, because it is not.
In this section we'll do the setup: start a new directory for our project, cd into it, and install some libraries.
mkdir bundler-playground && cd $_
Start an npm project.
npm init -y
Install some additional libraries:
@babel/parser to parse our code and return an AST object
@babel/traverse to traverse/walk through our AST object; this will help us look for all import declarations
@babel/core to transpile ES6 -> ES5
@babel/preset-env, the preset our Babel options will reference when transpiling
resolve to get the full path of each dependency (ex: turn ./module1.js into something like /User/iggy/project/bundler-playground/module1.js)
npm install --save @babel/parser @babel/traverse @babel/core @babel/preset-env resolve
Create a new index.js in the project root, and import these:
const fs = require("fs");
const path = require("path");
const parser = require("@babel/parser");
const traverse = require("@babel/traverse").default;
const babel = require("@babel/core");
const resolve = require("resolve").sync;
In this section, we will:
Assign each filePath a unique ID (to be referenced later)
Collect the file's dependencies (the imports used)
Here is the code for this section.
let ID = 0;
function createModuleInfo(filePath) {
  // read the file and parse it as an ES module
  const content = fs.readFileSync(filePath, "utf-8");
  const ast = parser.parse(content, {
    sourceType: "module"
  });
  // collect the specifier of every import declaration
  const deps = [];
  traverse(ast, {
    ImportDeclaration: ({ node }) => {
      deps.push(node.source.value);
    }
  });
  const id = ID++;
  // transpile the module down from ES6
  const { code } = babel.transformFromAstSync(ast, null, {
    presets: ["@babel/preset-env"]
  });
  return {
    id,
    filePath,
    deps,
    code
  };
}
We got the file content using readFileSync(). Then we parsed the content to get the AST. Once we had the AST, we traversed it and looked for every import using the ImportDeclaration visitor. Lastly, we transpiled our code from ES6 using Babel core's transformFromAstSync.
For the ID, we used a simple incrementing number (a random GUID would be better, but since this is a demo, ID++ will do).
With this, we have ourselves a nifty module information object consisting of a unique ID, a list of all dependencies (all imports), and the code inside that module. Next, we repeat the process for all relevant modules to create a dependency graph.
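To make the contents of deps concrete without installing anything, here is a dependency-free sketch that pulls import specifiers out of a source string. It swaps the AST walk for a regular expression, so it is a simplified stand-in and not what @babel/traverse does: it can be fooled by comments and strings, and it ignores dynamic imports. `listImportSpecifiers` is a hypothetical name.

```javascript
// Simplified stand-in for the ImportDeclaration visitor: a regular expression
// instead of a real AST walk. It is only meant to show the shape of `deps`.
function listImportSpecifiers(source) {
  // Matches `import x from "spec"`, `import { a } from 'spec'` and `import "spec"`
  // when the statement starts a line.
  const importPattern = /^\s*import\s+(?:[^"']*?\s+from\s+)?["']([^"']+)["']/gm;
  const deps = [];
  let match;
  while ((match = importPattern.exec(source)) !== null) {
    deps.push(match[1]); // the captured specifier, e.g. "./module1.js"
  }
  return deps;
}

const sampleSource = [
  'import module1 from "./module1.js";',
  "import { helper } from './helper.js';",
  'import "./side-effect.js";',
  "module1();"
].join("\n");

console.log(listImportSpecifiers(sampleSource));
```

The specifiers come back exactly as written in the source (still relative), which is why the next step has to resolve them to full paths.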
A dependency graph is a collection of interrelated modules used in our app, starting from the entry point.
Here is the code for this section.
function createDependencyGraph(entry) {
  const entryInfo = createModuleInfo(entry);
  const graphArr = [];
  graphArr.push(entryInfo);
  // modules pushed inside the loop are visited later by the same loop
  for (const module of graphArr) {
    module.map = {};
    module.deps.forEach(depPath => {
      const baseDir = path.dirname(module.filePath);
      // resolve's option is spelled `basedir` (all lowercase)
      const moduleDepPath = resolve(depPath, { basedir: baseDir });
      const moduleInfo = createModuleInfo(moduleDepPath);
      graphArr.push(moduleInfo);
      module.map[depPath] = moduleInfo.id;
    });
  }
  return graphArr;
}
We will be using an array type for our dependency graph. We start by pushing our entry info first.
Then we iterate through dependency graph elements (starting with entry).
const baseDir = path.dirname(module.filePath);
const moduleDepPath = resolve(depPath, { basedir: baseDir });
const moduleInfo = createModuleInfo(moduleDepPath);
graphArr.push(moduleInfo);
Here we use path.dirname and resolve to get the full path of each module, get the info using the full path, and push that info into our dependency graph array.
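As a rough illustration of what "getting the full path" means for a relative import, here is a sketch that uses only Node's built-in path module. It is a stand-in for the resolve package, which follows Node's fuller module resolution rules (node_modules lookups, for instance); `resolveRelative` is a hypothetical helper, and path.posix is used so the result looks the same on every platform.

```javascript
const path = require("path");

// Hypothetical helper: turn a relative specifier into an absolute path,
// resolved against the directory of the file that contains the import.
function resolveRelative(importerPath, specifier) {
  const baseDir = path.posix.dirname(importerPath);
  return path.posix.resolve(baseDir, specifier);
}

const importer = "/User/iggy/project/bundler-playground/entry.js";
console.log(resolveRelative(importer, "./module1.js"));
// /User/iggy/project/bundler-playground/module1.js
```

The important detail is that each specifier is resolved relative to the importing file's directory, not relative to wherever the bundler script happens to run.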
Note these lines:
module.map = {};
...
module.map[depPath] = moduleInfo.id;
Here we add an additional attribute, map, to our moduleInfo object. This attribute will be used in the next step as a lookup to map each module to its unique identifier. For example:
module | ID |
---|---|
entry.js | 0 |
module1.js | 1 |
module2.js | 2 |
etc | n |
In the end, we have an array of module infos for every dependency used in the entire project.
Now that we have the dependency graph, the last step is to pack the modules together.
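The traversal itself can be tried without Babel. The sketch below keeps the same loop but replaces file reading and parsing with a hypothetical in-memory table (`fakeFiles`), whose import lists are assumed to match the demo project. `createFakeModuleInfo` and `createFakeDependencyGraph` are illustration-only names.

```javascript
// Hypothetical in-memory "file system": file name -> list of import specifiers.
const fakeFiles = {
  "entry.js": ["./module1.js", "./module2.js"],
  "module1.js": ["./module2.js"],
  "module2.js": []
};

let nextId = 0;

// Stand-in for createModuleInfo: no parsing, just a lookup in fakeFiles.
function createFakeModuleInfo(fileName) {
  return { id: nextId++, filePath: fileName, deps: fakeFiles[fileName] };
}

function createFakeDependencyGraph(entry) {
  const graphArr = [createFakeModuleInfo(entry)];
  // for...of also visits elements pushed during iteration,
  // so the array doubles as a work queue.
  for (const module of graphArr) {
    module.map = {};
    module.deps.forEach(depPath => {
      const depInfo = createFakeModuleInfo(depPath.replace("./", ""));
      graphArr.push(depInfo);
      module.map[depPath] = depInfo.id;
    });
  }
  return graphArr;
}

const graph = createFakeDependencyGraph("entry.js");
graph.forEach(({ id, filePath, map }) => {
  console.log(id, filePath, JSON.stringify(map));
});
// 0 entry.js {"./module1.js":1,"./module2.js":2}
// 1 module1.js {"./module2.js":3}
// 2 module2.js {}
// 3 module2.js {}
```

Note that module2.js shows up twice, once per import, because nothing checks whether a file was already visited. For the same reason, a pair of files that import each other would keep this loop running forever.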
function pack(graph) {
  // one "id: { factory, map }" entry per module
  const moduleArgArr = graph.map(module => {
    return `${module.id}: {
      factory: (exports, require) => {
        ${module.code}
      },
      map: ${JSON.stringify(module.map)}
    }`;
  });
  // the runtime that receives the modules object and starts at the entry (ID 0)
  const iifeBundler = `(function(modules){
    const require = id => {
      const {factory, map} = modules[id];
      const localRequire = requireDeclarationName => require(map[requireDeclarationName]);
      const module = {exports: {}};
      factory(module.exports, localRequire);
      return module.exports;
    }
    require(0);
  })({${moduleArgArr.join()}})
  `;
  return iifeBundler;
}
First, we create a factory pattern over the code of each module. Each factory takes two arguments, exports and require. Keep these 2 arguments in mind. We also keep the map from the previous step.
return `${module.id}: {
  factory: (exports, require) => {
    ${module.code}
  },
  map: ${JSON.stringify(module.map)}
}`;
Second, we created an IIFE to run the entire dependency graph together. The next part might be confusing - I struggled to understand this part initially, but with patience, it will make sense!
const iifeBundler = `(function(modules){
  const require = id => {
    const {factory, map} = modules[id];
    const localRequire = requireDeclarationName => require(map[requireDeclarationName]);
    const module = {exports: {}};
    factory(module.exports, localRequire);
    return module.exports;
  }
  require(0);
})({${moduleArgArr.join()}})
`;
The IIFE receives the object we built with ${moduleArgArr.join()} as its modules argument.
Inside the IIFE, we define a require(id) function. This function has two effects:
It recursively loads the module's own dependencies through require(map[requireDeclarationName]). Recalling the mapping from earlier, a call like require('./module1.js') in the module's code turns into something like require(1).
It runs the module's code through factory(module.exports, localRequire).
Finally, it returns module.exports - although it is initially empty ({exports: {}}), after running factory(), the value of this module.exports is the exports value inside the factory we created earlier (think about it).
The final code for this blog can be found here to compare code.
The full code will look something like this:
const fs = require("fs");
const path = require("path");
const parser = require("@babel/parser"); // parses and returns AST
const traverse = require("@babel/traverse").default; // AST walker
const babel = require("@babel/core"); // main babel functionality
const resolve = require("resolve").sync; // get full path to dependencies
let ID = 0;
/*
 * Given filePath, return module information
 * Module information includes:
 * module ID
 * module filePath
 * all dependencies used in the module (in array form)
 * code inside the module
 */
function createModuleInfo(filePath) {
  const content = fs.readFileSync(filePath, "utf-8");
  const ast = parser.parse(content, {
    sourceType: "module"
  });
  const deps = [];
  traverse(ast, {
    ImportDeclaration: ({ node }) => {
      deps.push(node.source.value);
    }
  });
  const id = ID++;
  const { code } = babel.transformFromAstSync(ast, null, {
    presets: ["@babel/preset-env"]
  });
  return {
    id,
    filePath,
    deps,
    code
  };
}
/*
 * Given entry path,
 * returns an array containing information from each module
 */
function createDependencyGraph(entry) {
  const entryInfo = createModuleInfo(entry);
  const graphArr = [];
  graphArr.push(entryInfo);
  for (const module of graphArr) {
    module.map = {};
    module.deps.forEach(depPath => {
      const baseDir = path.dirname(module.filePath);
      // resolve's option is spelled `basedir` (all lowercase)
      const moduleDepPath = resolve(depPath, { basedir: baseDir });
      const moduleInfo = createModuleInfo(moduleDepPath);
      graphArr.push(moduleInfo);
      module.map[depPath] = moduleInfo.id;
    });
  }
  return graphArr;
}
/*
 * Given an array containing information from each module
 * return a bundled code to run the modules
 */
function pack(graph) {
  const moduleArgArr = graph.map(module => {
    return `${module.id}: {
      factory: (exports, require) => {
        ${module.code}
      },
      map: ${JSON.stringify(module.map)}
    }`;
  });
  const iifeBundler = `(function(modules){
    const require = id => {
      const {factory, map} = modules[id];
      const localRequire = requireDeclarationName => require(map[requireDeclarationName]);
      const module = {exports: {}};
      factory(module.exports, localRequire);
      return module.exports;
    }
    require(0);
  })({${moduleArgArr.join()}})
  `;
  return iifeBundler;
}
console.log("***** Copy code below and paste into browser *****");
/* create dependency graph */
const graph = createDependencyGraph("./entry.js"); // wherever your entry is
/* create bundle based on dependency graph */
const bundle = pack(graph);
console.log(bundle);
console.log("***** Copy code above and paste into browser *****");
If we run node ./index.js, we'll get something like:
(function(modules){
const require = id => {
const {factory, map} = modules[id];
const localRequire = requireDeclarationName => require(map[requireDeclarationName]);
const module = {exports: {}};
factory(module.exports, localRequire);
return module.exports;
}
require(0);
})({0: {
factory: (exports, require) => {
"use strict";
var _module = _interopRequireDefault(require("./module1.js"));
var _module2 = _interopRequireDefault(require("./module2.js"));
function _interopRequireDefault(obj) { return obj && obj.__esModule ? obj : { "default": obj }; }
(0, _module["default"])();
(0, _module2["default"])();
},
map: {"./module1.js":1,"./module2.js":2}
},1: {
factory: (exports, require) => {
"use strict";
Object.defineProperty(exports, "__esModule", {
value: true
});
exports["default"] = void 0;
var _module = _interopRequireDefault(require("./module2.js"));
function _interopRequireDefault(obj) { return obj && obj.__esModule ? obj : { "default": obj }; }
var module1 = function module1() {
(0, _module["default"])();
console.log("hello from module1!");
};
var _default = module1;
exports["default"] = _default;
},
map: {"./module2.js":3}
},2: {
factory: (exports, require) => {
"use strict";
Object.defineProperty(exports, "__esModule", {
value: true
});
exports["default"] = void 0;
var module2 = function module2() {
console.log("Hello from module2!");
};
var _default = module2;
exports["default"] = _default;
},
map: {}
},3: {
factory: (exports, require) => {
"use strict";
Object.defineProperty(exports, "__esModule", {
value: true
});
exports["default"] = void 0;
var module2 = function module2() {
console.log("Hello from module2!");
};
var _default = module2;
exports["default"] = _default;
},
map: {}
}})
Copy and paste that into your browser's console and you'll see:
Hello from module2!
hello from module1!
Hello from module2!
Congratulations! We have just built an entire bundler... from scratch!!
In addition to creating an ES6 bundler, I attempted to create a bundler that bundles both CJS and ES6: Bandler (NPM).
I won't go too deep here, but in addition to using Babel parser and Babel traverse, I used the detective library, which specifically searches for and lists all CJS require instances (ex: require('./your/lib.js')) in a project. I saw that Babel does not have a CJS syntax declaration here.
Can you think of some other ways to make a CJS and ES6 bundler?