Writing a JavaScript codemod with JSCodeShift

A codemod is a script that automates code refactoring so that a change can be applied at scale. With a codemod, a single engineer can rewrite code across an entire codebase, replacing what would otherwise be many hours of tedious manual work. The React team, for example, wrote a codemod script that converted Facebook's thousands of ES6 class components into functional components, saving thousands of engineering hours.

Requirements

The basic requirement for running a codemod script is a robust test suite, which lets you verify that the codemod is safe and does not introduce regressions.

A monitoring platform is also essential for catching unforeseen errors caused by a faulty deployment. The ability to quickly revert deployments is good to have as well, since it limits the fallout of a bad deployment.

How a codemod script works

A codemod script works by manipulating the Abstract Syntax Tree (AST) of the code you are going to rewrite. The example below is the AST for console.log("hello world");:

{
  "type": "Program",
  "body": [
    {
      "type": "ExpressionStatement",
      "expression": {
        "type": "CallExpression",
        "callee": {
          "type": "MemberExpression",
          "object": {
            "type": "Identifier",
            "name": "console"
          },
          "property": {
            "type": "Identifier",
            "name": "log"
          },
          "computed": false,
          "optional": false
        },
        "arguments": [
          {
            "type": "Literal",
            "value": "hello world",
            "raw": "\"hello world\""
          }
        ],
        "optional": false
      }
    }
  ],
  "sourceType": "module"
}

You may need a different parser depending on the flavor of JavaScript you are writing. For instance, you would use the TypeScript parser for TypeScript and Babel for modern ECMAScript.

Once we know the structure of the AST of the code that we want to modify, the codemod script will traverse it and modify it based on our requirements. Once the script finishes traversing the AST, it will convert the AST back into text form and overwrite the existing code.

In pseudo code form, the codemod script is simply doing the following:

function codemod(code) {
  const ast = parse(code);
  manipulate(ast);
  return ast.toString();
}
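The pseudocode above can be made concrete without any parser. The following is a minimal sketch, assuming a hand-written AST for console.log("hello world"); (trimmed to the fields used here) in place of the parse step. The traverse and print helpers are hypothetical and only understand the node types in this example; a real codemod would rely on a parser and printer instead.

```javascript
// Hand-written AST for console.log("hello world"); standing in for parse(code).
const ast = {
  type: "Program",
  body: [
    {
      type: "ExpressionStatement",
      expression: {
        type: "CallExpression",
        callee: {
          type: "MemberExpression",
          object: { type: "Identifier", name: "console" },
          property: { type: "Identifier", name: "log" },
          computed: false,
        },
        arguments: [
          { type: "Literal", value: "hello world", raw: '"hello world"' },
        ],
      },
    },
  ],
};

// Depth-first walk: call visit(node) for every object that has a string "type".
function traverse(node, visit) {
  if (Array.isArray(node)) {
    node.forEach((child) => traverse(child, visit));
  } else if (node && typeof node === "object") {
    if (typeof node.type === "string") visit(node);
    Object.values(node).forEach((child) => traverse(child, visit));
  }
}

// Hypothetical printer that converts the AST back into text.
function print(node) {
  switch (node.type) {
    case "Program":
      return node.body.map(print).join("\n");
    case "ExpressionStatement":
      return print(node.expression) + ";";
    case "CallExpression":
      return print(node.callee) + "(" + node.arguments.map(print).join(", ") + ")";
    case "MemberExpression":
      return print(node.object) + "." + print(node.property);
    case "Identifier":
      return node.name;
    case "Literal":
      return node.raw;
    default:
      throw new Error("Unsupported node type: " + node.type);
  }
}

// The "manipulate" step: rename console.log to console.info.
traverse(ast, (node) => {
  if (
    node.type === "MemberExpression" &&
    node.object.name === "console" &&
    node.property.name === "log"
  ) {
    node.property.name = "info";
  }
});

const output = print(ast);
console.log(output); // console.info("hello world");
```

Writing the printer by hand quickly becomes impractical for real code, which is one reason to reach for a tool that bundles parsing and printing.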

Codemod with JSCodeShift

JSCodeShift is a tool to help with running codemod scripts by bundling up many parsers and providing a more convenient API.

Before diving in, let's take a look at the three classes we need to be familiar with.

Nodes

A Node is a tree node of an AST, the most basic building block we will be working with. A Node contains information on what type it is and its value. For example, the string "foo" can be represented as such:

{
  "type": "StringLiteral",
  "value": "foo",
  "raw": "\"foo\""
}

Documentation for Node depends on the parser. Using https://astexplorer.net/ is a good way to explore the shape of various nodes.

NodePath

A NodePath is a wrapper around a Node that provides helper methods for manipulation, such as traversing up the AST. These methods are not documented by JSCodeShift because NodePath comes from the ast-types library. For example, the NodePath for the string "foo" can be manipulated like this:

fooNodePath.node; // gets the Node
fooNodePath.parentPath; // gets the parent NodePath
fooNodePath.prune(); // removes the NodePath
fooNodePath.replace(jscodeshift.stringLiteral("bar")); // replaces "foo" with "bar"

Documentation for NodePath can be found in https://github.com/benjamn/ast-types#nodepath
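To see why a wrapper is needed at all, note that a bare Node has no link back to its parent, so it cannot replace itself in the tree. The following toy class is a rough sketch of that idea only. MiniPath and its key field are hypothetical names for illustration and are not the ast-types implementation.

```javascript
// Toy stand-in for a NodePath; NOT how ast-types implements it.
class MiniPath {
  constructor(node, parentPath = null, key = null) {
    this.node = node; // the wrapped Node
    this.parentPath = parentPath; // wrapper around the parent Node, if any
    this.key = key; // property name under which the parent holds this Node
  }

  // Swap this Node for another one by writing into the parent Node.
  replace(newNode) {
    if (this.parentPath) {
      this.parentPath.node[this.key] = newNode;
    }
    this.node = newNode;
  }
}

const literal = { type: "StringLiteral", value: "foo" };
const statement = { type: "ExpressionStatement", expression: literal };

const statementPath = new MiniPath(statement);
const fooPath = new MiniPath(literal, statementPath, "expression");

fooPath.replace({ type: "StringLiteral", value: "bar" });

console.log(statement.expression.value); // bar
console.log(fooPath.parentPath.node.type); // ExpressionStatement
```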

Collections

A Collection wraps an array of NodePaths. It exposes even higher-level methods to help with manipulation, such as filtering, mapping, and finding elements. This is the highest level of encapsulation that you can work with.

collection.find(jscodeshift.Identifier); // find all identifier NodePaths
collection.forEach((nodePath) => nodePath.prune());
collection.closestScope(); // finds the scope for paths in the collection

Documentation for Collections can be found in the following files:

https://github.com/facebook/jscodeshift/blob/master/src/Collection.js

https://github.com/facebook/jscodeshift/blob/master/src/collections/Node.js

https://github.com/facebook/jscodeshift/blob/master/src/collections/VariableDeclarator.js

https://github.com/facebook/jscodeshift/blob/master/src/collections/JSXElement.js

Writing a basic codemod

Now, let's try to write a codemod that removes React.memo from React functional components.

// before
const App = React.memo(() => <div>hello world</div>);

// after
const App = () => <div>hello world</div>;

We will be using astexplorer, a web-based REPL environment, to iterate on this codemod: https://astexplorer.net/#/gist/4ba4f7a9ea987d152fb6165c6464e06a/7b720fd6939f954387d8bc626025272b25db9cc4

Astexplorer's UI may look confusing at first. Going clockwise from the top-left, we have the input code, the AST viewer, the output code, and the codemod script. You will spend most of your time editing the codemod script in the bottom-left pane.

The template script looks something like this:

export default function transformer(file, api) {
  const j = api.jscodeshift;
  const root = j(file.source);
  // TODO: manipulate ast
  return root.toSource();
}

j is a common abbreviation of the jscodeshift namespace. Calling j(file.source) parses the code into a manipulatable Collection. Once the Collection has been transformed, root.toSource() writes it back as human-readable code.

Finding React.memo

Using astexplorer's AST viewer, we find that React.memo calls are nodes of type CallExpression. First, let's find all the CallExpressions using the Collection's API:

const callExpressions = root.find(j.CallExpression);

This gets us all of the CallExpressions in the file, but we only want React.memo calls, so let's filter this down by using the property callee on CallExpressions.

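const reactMemoCalls = root.find(j.CallExpression).filter((nodePath) => {
  const callee = nodePath.node.callee;
  // Guard: plain calls such as foo() have no callee.object or callee.property.
  if (callee.type !== 'MemberExpression') return false;
  return callee.object.name === 'React' && callee.property.name === 'memo';
});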
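The filter can also be pulled out into a standalone predicate, which makes it easy to check against edge cases without running the whole codemod. The following is a sketch under that assumption. isReactMemoCall and the id, member, and call helpers are hypothetical names, and the nodes are hand-written objects trimmed to the fields the predicate reads.

```javascript
// Returns true only for calls written exactly as React.memo(...).
function isReactMemoCall(node) {
  if (node.type !== "CallExpression") return false;
  const callee = node.callee;
  return (
    callee.type === "MemberExpression" &&
    !callee.computed && // skip React[memo](...), where "memo" is a variable
    callee.object.type === "Identifier" &&
    callee.object.name === "React" &&
    callee.property.type === "Identifier" &&
    callee.property.name === "memo"
  );
}

// Small builders for hand-written test nodes.
const id = (name) => ({ type: "Identifier", name });
const member = (object, property, computed = false) => ({
  type: "MemberExpression",
  object,
  property,
  computed,
});
const call = (callee) => ({ type: "CallExpression", callee, arguments: [] });

const results = {
  reactMemo: isReactMemoCall(call(member(id("React"), id("memo")))), // React.memo()
  bareMemo: isReactMemoCall(call(id("memo"))), // memo()
  otherMember: isReactMemoCall(call(member(id("console"), id("log")))), // console.log()
  computed: isReactMemoCall(call(member(id("React"), id("memo"), true))), // React[memo]()
};

console.log(results);
// { reactMemo: true, bareMemo: false, otherMember: false, computed: false }
```

In the codemod itself this could be used as .filter((nodePath) => isReactMemoCall(nodePath.node)). Note that a bare memo(...) call, as written after import { memo } from 'react', is deliberately not matched here; handling it would require inspecting the file's imports.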

Removing React.memo calls

Now that we have the collection that we want, the next step is to replace the React.memo calls with the functional component inside it. We know from React.memo's API that the component is the first argument. This can be found in the arguments property on CallExpressions.

reactMemoCalls.forEach((nodePath) => {
  const component = nodePath.node.arguments[0];
  nodePath.replace(component);
});
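One case the loop above does not consider is React.memo's optional second argument, a custom props comparison function. Unwrapping such a call silently discards that function, so it may be safer to leave those calls for manual review. The following is a sketch of that design choice, not part of the original codemod. componentToUnwrap is a hypothetical helper, tested here against hand-written nodes.

```javascript
// Returns the node that should replace a React.memo(...) call,
// or null when the call should be left alone.
function componentToUnwrap(callNode) {
  const args = callNode.arguments;
  // Zero arguments: nothing to unwrap. Two arguments: a custom comparison
  // function is present, which suggests a deliberate optimization.
  if (args.length !== 1) return null;
  // React.memo(...args): the component cannot be identified statically.
  if (args[0].type === "SpreadElement") return null;
  return args[0];
}

// Hand-written nodes, trimmed to the fields the helper reads.
const component = { type: "Identifier", name: "App" };
const areEqual = { type: "Identifier", name: "areEqual" };
const spread = { type: "SpreadElement", argument: { type: "Identifier", name: "args" } };

const simple = componentToUnwrap({ type: "CallExpression", arguments: [component] });
const withComparator = componentToUnwrap({ type: "CallExpression", arguments: [component, areEqual] });
const withSpread = componentToUnwrap({ type: "CallExpression", arguments: [spread] });

console.log(simple === component, withComparator, withSpread); // true null null

// Possible use inside the forEach above:
//   const replacement = componentToUnwrap(nodePath.node);
//   if (replacement) nodePath.replace(replacement);
```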

And that is it! You should see the transformed code in the bottom-right pane of astexplorer, which is the result we wanted from the start.

The full code can be found in https://astexplorer.net/#/gist/4ba4f7a9ea987d152fb6165c6464e06a/89f4c6594dcc01f1daf5fde56c1d2b178dac7baa.

Conclusion

As we can see, codemods are not very hard to write, and they get easier with practice. Once mastered, they become a powerful tool for refactoring code at scale, helping you improve code quality and increase your impact as an engineer.