Building Modern Cross Browser Web Extensions: Introduction (Part 1)

Published: January 20, 2025 Last Updated: January 20, 2025

If you are already familiar with web extensions and want to jump directly to the implementation, you can skip this post and go to the next one.

Introduction

Extensions have become an integral part of modern browsers. While web development has evolved vastly over the last few years, development of extensions has mostly remained the same. This multi-post series will explore the development of cross browser web extensions using modern tools and frameworks.

Why Build Web Extensions?

Web extensions offer a unique blend of platform independence, a large user base, and feature-rich APIs. They have become as significant as mobile apps in today’s world. The following are some of the reasons why you should consider building web extensions:

Platform Independence: Web extensions are built using web technologies, which makes them platform independent. With modern tools, you can build an extension once and deploy it across multiple browsers.
Large User Base: Browsers have a large user base, which means your extension can reach a large number of users. Chrome alone has over 3 billion active users.
Feature Rich APIs: Browsers provide a rich set of APIs that can be used to build powerful extensions. You get a well-established and robust platform - no need to worry about the backend or the infrastructure.
Persistent Availability: Extensions are installed by the user and stay with them (even across devices), so you can build a personalized experience for the user. This is roughly equivalent to an installed desktop or mobile app.
Browser Integration: Browser extensions can act as a window to your core software. Todoist and Grammarly are really good examples of this - users can easily access functionalities of the core software from the browser itself.
Monetization: Extensions can be monetized in various ways, from ads to premium features, or as a paid add-on to the core software.

Development Challenges

Extensions have been around for a long time, but the development process has not changed much. It’s not a surprise that most of the material on the web for extension development focuses on building from scratch. This is not necessarily a bad thing, but it makes things difficult down the road. While it may work for simple extensions, any sufficiently complex extension can quickly become unmanageable without better tools and practices.

The transition between manifest versions adds complexity. Chrome’s push toward Manifest V3 requires significant architectural changes - from replacing background scripts with service workers to adapting to stricter API limitations. With Firefox maintaining Manifest V2 support, developers must maintain compatibility across different specifications.
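To make the background change concrete, here is a minimal TypeScript sketch of how the `background` key of the manifest differs between the two versions, following the shapes in Chrome’s documentation. The helper name `backgroundEntry` and the file name are illustrative assumptions, not part of any framework, and Firefox’s handling of MV3 backgrounds may differ, so check its documentation.

```typescript
type ManifestVersion = 2 | 3;

// The two shapes the "background" manifest key can take in this sketch.
type BackgroundEntry =
  | { service_worker: string } // Manifest V3 (Chrome): a single service worker file
  | { scripts: string[]; persistent: boolean }; // Manifest V2: one or more background scripts

// Hypothetical helper: returns the "background" entry for a given manifest version.
function backgroundEntry(version: ManifestVersion, file: string): BackgroundEntry {
  if (version === 3) {
    return { service_worker: file };
  }
  // persistent: false asks for a non-persistent (event) background page in MV2.
  return { scripts: [file], persistent: false };
}

const mv3Background = backgroundEntry(3, "background.js");
const mv2Background = backgroundEntry(2, "background.js");
```

The same source file can be referenced in both cases, but the code inside it has to cope with the service worker being stopped and restarted by the browser.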

Browser-specific implementations create additional hurdles. Most APIs are similar across the popular browsers, but they differ in places, and each extension store has its own review process. An extension working perfectly in Chrome might need substantial modifications for Firefox, while Safari’s extension support is more limited. Cross-browser support is often not considered from the start, which either makes the transition difficult or requires multiple separate versions.
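One small example of such a difference is the API namespace: Firefox exposes a promise-based `browser` object (alongside `chrome`), while Chrome exposes `chrome`. The sketch below shows one way to detect which is available. The names `ExtensionGlobals` and `detectNamespace` are hypothetical and only for illustration; in practice a framework or polyfill usually hides this detail.

```typescript
// Minimal structural view of the globals we care about (an assumption for this sketch).
interface ExtensionGlobals {
  browser?: object;
  chrome?: object;
}

type ApiNamespace = "browser" | "chrome" | "none";

// Prefer the `browser` namespace when present, then fall back to `chrome`.
function detectNamespace(globals: ExtensionGlobals): ApiNamespace {
  if (globals.browser !== undefined) return "browser";
  if (globals.chrome !== undefined) return "chrome";
  return "none";
}

// In an extension context this would typically be called with globalThis.
const detectedNamespace = detectNamespace(globalThis as unknown as ExtensionGlobals);
```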

Besides these, conveniences like hot reloading and automated publishing workflows are often missing from traditional extension development.

Moving Forward With Modern Frameworks

Several modern frameworks are available for building extensions. These include CRXJS, which takes a Vite plugin approach and supports React, Solid, Vue and vanilla JavaScript. However, it requires more manual configuration than other available options, and provides just enough tooling to get started.

Plasmo offers a more opinionated approach, with out-of-the-box support for TypeScript, React and other frameworks. It has a long list of features and is a good option to check out.

WXT, a relatively new framework, is quickly gaining popularity. WXT provides a good developer experience, with some wrappers around core browser APIs. We will be building our extension using WXT in this series. The comparison of these three can be found here.

While each framework offers its own advantages, WXT provides the best balance of features and flexibility for our needs. Its comprehensive tooling, framework-agnostic approach, and growing ecosystem make it an excellent choice for modern extension development. Let’s first understand some key terminology before diving into implementation.

Understanding The Terminology

There are a few terms that you should be familiar with before we dive into building extensions:

1. Manifest

The manifest file contains the configuration and metadata of the extension. It includes the name, version, permissions, and other details of the extension. It is the only required file, and must have the name manifest.json. MV2 (Manifest V2) and MV3 (Manifest V3) are successive versions of the browser extension manifest format, with MV3 being the latest iteration, introducing significant changes in security, permissions, and functionality.
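As a rough illustration, the sketch below models a minimal Manifest V3 file as a TypeScript object and serializes it to the JSON that would go into manifest.json. The `ExtensionManifest` interface is a simplified assumption covering only a handful of keys, and the extension name and description are made up.

```typescript
// Simplified view of a manifest; real manifests support many more keys.
interface ExtensionManifest {
  manifest_version: 2 | 3;
  name: string;
  version: string;
  description?: string;
  permissions?: string[];
}

const exampleManifest: ExtensionManifest = {
  manifest_version: 3, // selects the Manifest V3 format
  name: "Hello Extension", // hypothetical extension name
  version: "1.0.0",
  description: "A minimal example manifest.",
  permissions: ["storage"], // APIs the extension asks to use
};

// The text that would be written to manifest.json.
const manifestJson = JSON.stringify(exampleManifest, null, 2);
```

Only `manifest_version`, `name` and `version` are required here; everything else is optional and depends on what the extension does.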

2. Background Script / Service Worker

This is a script that runs in the background and can be used to listen to events, make network requests, and perform other tasks. Background scripts have access to sensitive browser APIs, although they do not have direct access to the DOM. Manifest V3 uses the term ‘service worker’ for this component, while Manifest V2 calls it a ‘background script’.
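A minimal sketch of a background script that answers messages might look like the following. The message shapes (`PING`, `ADD`) and the names `handleMessage` and `ChromeLike` are assumptions made up for this example; only `chrome.runtime.onMessage.addListener` is the real extension API. The handler is kept as a plain function so it can be exercised outside the browser, and the listener is registered only when the `chrome` global exists.

```typescript
// Hypothetical messages the rest of the extension might send.
type ExtensionMessage =
  | { type: "PING" }
  | { type: "ADD"; a: number; b: number };

// Pure handler: decides the response for each message.
function handleMessage(message: ExtensionMessage): string | number {
  if (message.type === "PING") {
    return "PONG";
  }
  return message.a + message.b;
}

type MessageListener = (
  message: ExtensionMessage,
  sender: unknown,
  sendResponse: (response: unknown) => void,
) => void;

// Minimal structural type so the sketch compiles without browser typings.
interface ChromeLike {
  runtime?: { onMessage?: { addListener(listener: MessageListener): void } };
}

// Register the listener only when running inside an extension.
const extensionApi = (globalThis as unknown as { chrome?: ChromeLike }).chrome;
extensionApi?.runtime?.onMessage?.addListener((message, _sender, sendResponse) => {
  sendResponse(handleMessage(message));
});
```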

3. Content Script

Content scripts run in the context of web pages, much like a script injected into the page itself. They can manipulate the DOM, interact with the page, and so on, but have limited access to browser APIs, so they use message passing to communicate with the rest of the extension. One important point is that content scripts run in an ‘isolated world’ by default (they can also run in the ‘main’ world), meaning the JavaScript environments of the page and the extension are separate: they share the DOM, but not variables or functions. Thus, the web page and the content scripts of each extension run in isolation and cannot access each other’s context.
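The message passing mentioned above can be sketched roughly as follows. The `PAGE_INFO` message and the names `RuntimeLike`, `buildPageInfo` and `reportPage` are hypothetical; in a real content script the `runtime` argument would be `chrome.runtime`, whose `sendMessage` delivers the message to the background script.

```typescript
// Minimal structural type for the part of the runtime API this sketch uses.
interface RuntimeLike {
  sendMessage(message: unknown): Promise<unknown>;
}

interface PageInfoMessage {
  type: "PAGE_INFO";
  title: string;
  url: string;
}

// Builds the message from data a content script can read from the page.
function buildPageInfo(title: string, url: string): PageInfoMessage {
  return { type: "PAGE_INFO", title, url };
}

// Sends the page details to the rest of the extension.
function reportPage(runtime: RuntimeLike, title: string, url: string): Promise<unknown> {
  return runtime.sendMessage(buildPageInfo(title, url));
}

// Inside a real content script the call would look like:
// reportPage(chrome.runtime, document.title, location.href);
```

Passing the runtime in as a parameter is only a design choice for this sketch; it keeps the logic independent of the browser global.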

4. Popup

The popup is a small window that appears when the user clicks on the extension icon. It can be used to display information, settings, or other UI elements.

5. Options Page and Browser Action

The options page is a full HTML page that displays the extension’s settings and lets the user configure preferences. Browser Action is the button that the extension adds to the browser toolbar (Manifest V3 renames it to simply Action). It can be used to trigger actions, open the popup, etc.
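For reference, the popup and the options page are both wired up through the manifest. The fragment below is a sketch of the relevant Manifest V3 keys written as a TypeScript object; the HTML file names and the title are assumptions chosen for illustration.

```typescript
// Manifest V3 fragment: toolbar button with a popup, plus an options page.
const uiManifestEntries = {
  action: {
    default_popup: "popup.html", // shown when the user clicks the toolbar icon
    default_title: "Open popup", // tooltip for the toolbar button
  },
  options_ui: {
    page: "options.html", // the extension's settings page
    open_in_tab: true, // open it as a full tab instead of an embedded panel
  },
};

const uiManifestJson = JSON.stringify(uiManifestEntries, null, 2);
```

In Manifest V2 the `action` key would be `browser_action` instead.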

Both Chrome and Firefox have excellent documentation on building extensions. You can refer to the Chrome Extension Docs and Firefox Extension Docs for more details.

Anatomy Of An Extension

The following diagram shows the anatomy of an extension:

It illustrates how the different components of an extension interact with each other, as previously discussed.

In the next few posts, we will explore:

Project setup using WXT, TailwindCSS and Shadcn, along with the project structure and configuration
Content Scripts and building isolated UIs for the extension
Background scripts and messaging
Storage and Permissions

We will build an extension along the way using the concepts as we learn.

Building Modern Cross Browser Web Extensions: Project Setup (Part 2)