Tag: Chrome

  • Invoker Commands in WordPress

    Recently, I looked more closely into the new Invoker Commands API and what it means for WordPress.

    There is a lot happening on the web platform right now, with tons of new features and capabilities landing in all browsers every month. For example, this January saw Promise.try and a new Device Posture API to detect devices with foldable screens. Interop 2024 also just concluded, with exciting new additions such as text-wrap: balance for balanced headlines and Popover for creating overlays declaratively using HTML. Interop 2025 is coming soon, and I am looking forward to seeing more features that help move the web forward.

    The Invoker Commands API is one of these new web platform APIs. It provides a way to declaratively assign behaviors to buttons to control interactive elements on the page. Think of a button that allows you to play or pause a video somewhere else, simply by adding some HTML attributes. Naturally, I set out to see how I could use this intriguing new API in WordPress.

    The Invoker Commands API

    The Invoker Commands API is declarative, which means you use HTML attributes to use it, rather than writing imperative JavaScript code. It introduces two new HTML attributes:

    1. commandfor
      Add this to a <button> element and pass the ID of the interactive element to control.
    2. command
      Specifies the action to perform on the interactive element.

    Browsers already support some commands out of the box for native elements like <video>, <dialog> or Popover.

    Here are a couple of examples:

    Controlling a dialog

    <button commandfor="mydialog" command="show-modal">Show modal dialog</button>
    
    <dialog id="mydialog">
      <button commandfor="mydialog" command="close">Close</button>
      Dialog Content
    </dialog>

    Controlling a video

    <button type="button" commandfor="my-video" command="play-pause">Play/Pause</button>
    <button type="button" commandfor="my-video" command="toggle-muted">Mute/Unmute</button>
    
    <video id="my-video"></video>

    Custom commands

    The Invoker Commands API also enables you to add custom commands for your own interactive components on a page. They work essentially the same as built-in ones, with the exception that they must start with a double dash (--). Example:

    <button commandfor="my-img" command="--rotate-left">Rotate left</button>
    <button commandfor="my-img" command="--rotate-right">Rotate right</button>
    <img id="my-img" src="photo.jpg" alt="[add appropriate alt text here]" />

    But how do you tell the browser what --rotate-left and --rotate-right should do? Here’s where JavaScript comes in.

    When you click on the button with such a command, a JavaScript event is dispatched on the controlled element (the invokee), in this case the <img> element. You can then listen to this event and perform the desired action:

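    // Look up the controlled element (the invokee) by its ID.
    const myImg = document.getElementById("my-img");

    // The browser dispatches a "command" event on the invokee when an invoker button is clicked.
    myImg.addEventListener("command", (event) => {
      if (event.command === "--rotate-left") {
        myImg.style.rotate = "-90deg";
      } else if (event.command === "--rotate-right") {
        myImg.style.rotate = "90deg";
      }
    });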
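
    The listener above always sets an absolute angle, so clicking the same button twice has no further effect. As a minimal sketch, assuming you want every click to rotate by another 90 degrees, the angle can be tracked separately. ROTATION_STEPS and nextRotation are hypothetical helper names and not part of the Invoker Commands API.

```javascript
// Hypothetical mapping from custom command names to rotation steps in degrees.
const ROTATION_STEPS = new Map([
  ["--rotate-left", -90],
  ["--rotate-right", 90],
]);

// Returns the new angle (0 to 359) after applying a command.
// Unknown commands leave the angle unchanged.
function nextRotation(currentAngle, command) {
  const step = ROTATION_STEPS.get(command);
  if (step === undefined) {
    return currentAngle;
  }
  return (((currentAngle + step) % 360) + 360) % 360;
}

// Browser-only wiring; skipped when no DOM is available (for example in Node.js).
if (typeof document !== "undefined") {
  const img = document.getElementById("my-img");
  let angle = 0;
  img?.addEventListener("command", (event) => {
    angle = nextRotation(angle, event.command);
    img.style.rotate = `${angle}deg`;
  });
}
```

    Keeping the angle calculation in a pure function makes it easy to check without a browser.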

    Browser support

    A quick note on browser support: at the time of writing, this feature is available behind a flag in Chrome, Firefox, and Safari.

    A polyfill is also available, and it’s very small (~2KB minified + gzipped).
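
    If you only want to load a polyfill where it is needed, a feature check along these lines might work. This is a sketch, assuming the browser exposes the command and commandForElement properties on HTMLButtonElement when the feature is enabled. The loadPolyfill callback is a hypothetical parameter, for example a dynamic import of whichever polyfill package you use.

```javascript
// Returns true when the given global object exposes the Invoker Commands
// properties on HTMLButtonElement. The global is a parameter so the check
// can be exercised outside of a browser.
function supportsInvokerCommands(globalObject = globalThis) {
  const buttonPrototype = globalObject.HTMLButtonElement?.prototype;
  return (
    Boolean(buttonPrototype) &&
    "command" in buttonPrototype &&
    "commandForElement" in buttonPrototype
  );
}

// Calls the (hypothetical) loader only when native support is missing.
async function ensureInvokerCommands(loadPolyfill, globalObject = globalThis) {
  if (supportsInvokerCommands(globalObject)) {
    return "native";
  }
  await loadPolyfill();
  return "polyfill";
}
```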

    Introducing Block Invokers

    With the basics covered, let’s switch to WordPress. When I first learned about this API and saw the examples, I automatically thought of a button block that controls a video block or a details block. It struck me as a really obvious use case, so I was keen to figure out how to make it happen. Additionally, I was curious to learn how this new feature relates to the Interactivity API in WordPress, which also uses a declarative way to achieve interactivity on a web page.

    I created an experimental Block Invokers WordPress plugin and all my findings can be found on GitHub. There you can also find a link to directly test it yourself using WordPress Playground.

    Providing invoker commands in a block

    I wanted a way to provide information about supported commands during block registration. Imagine this being part of the block.json metadata:

    {
        "$schema": "https://schemas.wp.org/trunk/block.json",
        "apiVersion": 3,
        "name": "my-plugin/awesome-video",
        "title": "Awesome Video",
        "category": "media",
        "parent": [ "core/group" ],
        // ...
        "commands": {
            "play-pause": {
                "label": "Play/Pause",
                "description": "If the video is not playing, plays the video. Otherwise pauses it."
            },
            "toggle-muted": {
                "label": "Toggle Muted",
                "description": "If the video is muted, it unmutes the video. Otherwise it mutes it."
            }
        }
    }

    To achieve this, I used the block_type_metadata filter to declare the supported commands for all suitable blocks. For instance:

    • The core/video and core/audio blocks support play-pause, play, pause, and toggle-muted commands.
    • The core/details block supports the toggle, open, and close commands.
    • The core/image block supports custom --show-lightbox and --hide-lightbox commands.
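
    The plugin does this in PHP through the block_type_metadata filter. The following JavaScript sketch only illustrates the shape of such a mapping. SUPPORTED_COMMANDS and addCommandsToMetadata are hypothetical names, and the command name is used as a placeholder label.

```javascript
// Hypothetical registry mirroring the commands listed above.
const MEDIA_COMMANDS = ["play-pause", "play", "pause", "toggle-muted"];
const SUPPORTED_COMMANDS = new Map([
  ["core/video", MEDIA_COMMANDS],
  ["core/audio", MEDIA_COMMANDS],
  ["core/details", ["toggle", "open", "close"]],
  ["core/image", ["--show-lightbox", "--hide-lightbox"]],
]);

// Returns a copy of the block metadata with a "commands" entry added
// for blocks that have declared commands; other blocks are returned as-is.
function addCommandsToMetadata(metadata) {
  const commands = SUPPORTED_COMMANDS.get(metadata.name);
  if (!commands) {
    return metadata;
  }
  return {
    ...metadata,
    // Placeholder labels: a real implementation would use translated strings.
    commands: Object.fromEntries(
      commands.map((command) => [command, { label: command }])
    ),
  };
}
```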

    In the editor, I then added a new panel to the button block to configure commands. You can select from a list of all existing blocks on the page and their supported commands (if they have any). This way you can say “I want to perform command x on this particular block y.”

    Block Invokers plugin UI in the WordPress block editor. It shows a first dropdown to select the block to control. The second dropdown allows you to select the invoker command to perform on the block.
    Block Invokers editor UI
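
    The data behind the two dropdowns could be derived roughly as follows. This is a sketch under assumptions: the input shape (clientId, name, and a commands array per block) and the function name getInvokerOptions are hypothetical and not taken from the plugin.

```javascript
// Builds dropdown options from the blocks on the page.
// Blocks without any declared commands are left out, because there is
// nothing a button could invoke on them.
function getInvokerOptions(blocks) {
  return blocks
    .filter((block) => Array.isArray(block.commands) && block.commands.length > 0)
    .map((block) => ({
      value: block.clientId,
      label: block.name,
      commands: block.commands.map((command) => ({
        value: command,
        label: command,
      })),
    }));
}
```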

    Now, the crucial part is to add a unique ID to both the chosen target block and the button block, as the commandfor attribute requires it to establish the connection between the two elements. The editor stores the ID in a custom block attribute. On the frontend, the plugin adds it to the right elements using the HTML API.
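
    To make the wiring concrete, here is a small sketch of which attributes end up on which element. The plugin does this server-side with the HTML API. The helpers buildInvokerAttributes and renderAttributes below are hypothetical and only illustrate the resulting markup.

```javascript
// Returns the HTML attributes for the button (invoker) and the target (invokee).
function buildInvokerAttributes(targetId, command) {
  if (!targetId || !command) {
    throw new Error("Both a target ID and a command are required.");
  }
  return {
    button: { commandfor: targetId, command },
    target: { id: targetId },
  };
}

// Renders an attribute object as an HTML attribute string with basic escaping.
function renderAttributes(attributes) {
  return Object.entries(attributes)
    .map(([name, value]) => {
      const escaped = String(value).replace(/&/g, "&amp;").replace(/"/g, "&quot;");
      return `${name}="${escaped}"`;
    })
    .join(" ");
}
```

    For example, passing "my-video" and "play-pause" yields the same attributes as the video example earlier in this post.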

    Because Invoker Commands is a declarative API using only HTML attributes, this is already enough to achieve basic interactivity like playing a video, without any JavaScript required.

    Reducing the amount of JavaScript needed on websites is a great example of democratizing performance.

    Supporting custom invoker commands

    However, I also wanted to support custom commands, and for those you need JavaScript event listeners. The image lightbox functionality was an ideal candidate to try this out with dedicated --show-lightbox and --hide-lightbox commands.

    Here’s where the Interactivity API comes into play. The image block has an extensive and complex store configuration with lots of state, actions, and callbacks. I was able to tap into that and add a new event listener using another HTML attribute: data-wp-on--command="actions.handleCommand". The action in the block’s view.js script is then very minimalistic as it can reuse existing logic to perform the right actions depending on the received command:

    handleCommand( event ) {
      switch ( event.command ) {
        case '--show-lightbox':
          actions.showLightbox();
          break;
        case '--hide-lightbox':
          actions.hideLightbox();
          break;
      }
    }

    This is like the myImg.addEventListener example earlier, just written in the Interactivity API way.
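
    To try out such a handler outside of WordPress, the command event can be simulated. The sketch below uses stand-in actions instead of the real Interactivity API store, and a plain Event with a command property added by hand in place of the browser's command event. All names here are hypothetical.

```javascript
// Stand-in for the image block's store actions; the real ones live in the
// Interactivity API store and do far more than record a call.
const calls = [];
const actions = {
  showLightbox: () => calls.push("show"),
  hideLightbox: () => calls.push("hide"),
  handleCommand(event) {
    const handlers = new Map([
      ["--show-lightbox", actions.showLightbox],
      ["--hide-lightbox", actions.hideLightbox],
    ]);
    // Unknown commands are ignored.
    handlers.get(event.command)?.();
  },
};

// The invokee is simulated with a bare EventTarget.
const invokee = new EventTarget();
invokee.addEventListener("command", (event) => actions.handleCommand(event));

// Dispatches a simulated command event on the target.
function dispatchCommand(target, command) {
  const event = new Event("command");
  event.command = command; // set manually here; browsers provide it on real command events
  target.dispatchEvent(event);
}
```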

    Now, one could probably implement the previous examples with only the Interactivity API. However, that would require you to load additional JavaScript and write some JavaScript yourself for something that the browser supports out of the box.

    Also, it would require you to interact with another block’s data store, which isn’t really meant to be extended by other plugins or provide backward compatibility.

    The way I see it, the Invoker Commands API and the Interactivity API can actually work together very well. Together, they allow a block to define a public API of supported interactions that other blocks or components can tap into. These can be either custom actions or default ones supported by the browser. Again, the latter would then work with zero additional JavaScript.

    Conclusion

    While the Invoker Commands API is still experimental, it already works very well and is very intuitive to use, even in a WordPress context. Blocks are the perfect place to make use of this new technology and I think this should be further explored. The Interactivity API is a great companion to Invoker Commands, so the two are not mutually exclusive.

    Browser support is very promising as well, and with the polyfill there is no reason to not look into this new feature already.

    My Block Invokers plugin is on GitHub if you want to check it out in more depth.

  • Web AI for WordPress

    The web platform team in Chrome is working on built-in AI features, where the browser provides AI models, including large language models (LLMs), to enable on-device AI for browser features and web platform APIs. This is a game changer and a huge opportunity for WordPress to democratize AI-assisted publishing. Let me tell you why.

    If WordPress does not want to fall behind its competitors, it must seamlessly provide typical AI features users nowadays expect from a publishing platform. There are already various AI WordPress plugins, but they all come at a cost—both literally and metaphorically. These plugins all rely on third-party server-side solutions, which impacts both your privacy and your wallet.

    Web AI has several key benefits over a server-side approach. It brings models to the browser, protecting sensitive data and improving latency.

    Web AI is the overarching term for the ability to run AI solutions in the browser using JavaScript, WebAssembly, and WebGPU. This space was pioneered by libraries such as TensorFlow.js and Transformers.js. Using those tools, websites can download models and run tasks of their choice directly in the browser.

    Chrome’s built-in AI

    If everyone has to download and update these models all the time, this doesn’t really scale and isn’t really sustainable. That’s where Chrome’s built-in AI steps in. It is just one form of client-side AI or Web AI.

    With the built-in AI, your site or web app will be able to run various AI tasks against foundation and expert models without having to worry about deploying and managing said models. Chrome achieves this by making Gemini Nano available through dedicated web platform APIs, running locally on most modern computers.

    Note: This functionality is currently only available in Chrome Canary. Join the early preview program to learn how to access those early-stage built-in AI features and provide feedback.

    At the moment, the Chrome team expects the built-in AI to be beneficial for both content creation and content consumption. During creation, this could include use cases such as writing assistance, proofreading, or rephrasing. On the consumption side, typical examples are summarization, translation, categorization, or answering questions about some content.

    Early preview program participants will receive more detailed information about the following APIs:

    To give you an example, using the prompt API is pretty straightforward:

    const session = await ai.languageModel.create({
      systemPrompt: "You are a friendly, helpful assistant specialized in clothing choices."
    });
    
    const result = await session.prompt(`
      What should I wear today? It's sunny and I'm unsure between a t-shirt and a polo.
    `);
    
    console.log(result);
    
    const result2 = await session.prompt(`
      That sounds great, but oh no, it's actually going to rain! New advice??
    `);

    The other APIs are similarly straightforward to use. By the way, if you are an avid TypeScript user, there are already type definitions which I helped write.
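
    Since the API is only available in some browsers, it makes sense to check for it before use. The following is a minimal sketch that relies only on the ai.languageModel.create and session.prompt calls shown above. The function name promptIfAvailable is hypothetical, and the ai object is passed in explicitly so that the function can be exercised with a stub.

```javascript
// Prompts the on-device model if it is available; resolves to null otherwise.
async function promptIfAvailable(ai, systemPrompt, text) {
  if (typeof ai?.languageModel?.create !== "function") {
    return null;
  }
  const session = await ai.languageModel.create({ systemPrompt });
  return session.prompt(text);
}
```

    In a browser, you would pass globalThis.ai as the first argument and fall back to another solution when the result is null.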

    Eager to build something with this new API? The Google Chrome Built-in AI Challenge invites developers to explore new ground by creating solutions that leverage Chrome’s built-in AI APIs and models, including Gemini Nano. Cash prizes totaling $65,000 will be awarded to winners.

    Web AI Advantages

    • The browser takes care of model distribution and updates for you, significantly reducing the complexity and overhead.
    • Everything is processed locally on-device, without sending your sensitive data elsewhere, keeping it safe and private.
    • No server round trips means you can get near-instant results.
    • You can access AI features even if you’re offline or have bad connectivity.
    • Save money by not having to use expensive third-party services or sending large models over a network.

    This is not the first time that doing things client-side is better than on the server.[1] You could of course consider a hybrid approach where you handle most use cases on-device and then leverage a server-side implementation for the more complex use cases.
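
    Such a hybrid approach could look roughly like this. It is a sketch under assumptions: the character threshold is an arbitrary stand-in for "complexity", and onDevice and server are hypothetical callbacks that you would implement yourself.

```javascript
// Hypothetical threshold: longer inputs are treated as too complex for on-device processing.
const MAX_ON_DEVICE_CHARS = 4000;

// Tries the on-device callback first and falls back to the server callback
// when on-device processing is unavailable, unsuitable, or fails.
async function runHybrid(text, { onDevice, server, maxChars = MAX_ON_DEVICE_CHARS }) {
  if (onDevice && text.length <= maxChars) {
    try {
      return { source: "on-device", result: await onDevice(text) };
    } catch {
      // Fall through to the server when the local model fails.
    }
  }
  return { source: "server", result: await server(text) };
}
```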

    Using built-in AI in WordPress

    If WordPress core wants to offer AI capabilities to each and every one of its users, it can’t jeopardize users’ privacy by relying on expensive third-party services. The project also does not have the resources or infrastructure to maintain its own API, and running AI inference in PHP on a shared hosting provider is not really feasible.

    That’s why Web AI and particularly Chrome’s built-in AI are a perfect match for WordPress. With it, users benefit from a powerful Gemini Nano model that helps them accomplish everyday tasks.

    The modern and extensible architecture of the WordPress post editor (Gutenberg) makes it a breeze to leverage Chrome’s built-in AI. To demonstrate this, I actually built several AI experiments. They enhance the user experience and accessibility for both editors and readers.

    Here’s an overview:

    Provide tl;dr to visitors

    Screenshot of a "Summarize post content" button on a blog post, powered by Web AI.
    Web AI makes it easy to summarize content with the click of a button — all done on-device.

    Uses Chrome’s built-in summarization API to provide readers a short summary of the post content. The UI is powered by WordPress’ new Interactivity API.

    Writing meta descriptions based on the content

    Using the dedicated summarizer API, the content can be easily summarized in only a few sentences.
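
    A generated summary may still be longer than what you want in a meta description. As a small sketch, the text could be trimmed at a word boundary afterwards. The 160 character default is an assumption rather than a fixed rule, and toMetaDescription is a hypothetical helper name.

```javascript
// Collapses whitespace and trims the summary to a maximum length,
// cutting at a word boundary and appending an ellipsis when shortened.
function toMetaDescription(summary, maxLength = 160) {
  const text = summary.replace(/\s+/g, " ").trim();
  if (text.length <= maxLength) {
    return text;
  }
  // Reserve one character for the ellipsis.
  const cut = text.slice(0, maxLength - 1);
  const lastSpace = cut.lastIndexOf(" ");
  const shortened = lastSpace > 0 ? cut.slice(0, lastSpace) : cut;
  return `${shortened.trimEnd()}…`;
}
```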

    “Help me write”

    Block toolbar in the WordPress editor with "Help me write" features, using Web AI to do things like rephrasing or shortening text.
    AI-powered “Help me write” quickly became an essential feature for many users.

    Options for rewriting individual paragraphs à la Google Docs, like rephrasing, shortening or elaborating.

    Generate image captions / alternative text

    Uses Transformers.js and Florence-2 to generate image captions and alternative text for images directly in the editor. Also integrated into Media Experiments, which supports video captioning too.

    Generate a memorable quote

    A slight variation on the summarization use case, this extracts a memorable quote from the article and gives it some visual emphasis.

    Assigning tags & categories to blog posts

    Suggests matching tags and categories based on the content. It grabs a list of existing terms from the site and passes it to the prompt together with the post content.
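
    Building such a prompt and cleaning up the answer could look like the following sketch. The prompt wording and the helper names buildTermPrompt and parseTermSuggestions are hypothetical. Filtering the answer against the existing terms guards against the model inventing new ones.

```javascript
// Combines the existing terms and the post content into a single prompt.
function buildTermPrompt(content, existingTerms) {
  return [
    "Suggest up to five matching terms for the following post.",
    "Only choose from this list and answer with a comma-separated list:",
    existingTerms.join(", "),
    "",
    "Post content:",
    content,
  ].join("\n");
}

// Keeps only suggestions that match an existing term (case-insensitive),
// without duplicates, using the site's original spelling.
function parseTermSuggestions(response, existingTerms) {
  const known = new Map(existingTerms.map((term) => [term.toLowerCase(), term]));
  const matches = response
    .split(/[,\n]/)
    .map((part) => part.trim().toLowerCase())
    .filter((part) => known.has(part))
    .map((part) => known.get(part));
  return [...new Set(matches)];
}
```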

    Sentiment analysis for content / comments

    Uses a simple prompt to determine whether the text is positive or negative. This could be used to suggest rephrasing the text à la Grammarly, or to identify negative comments.
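
    Because a language model answers in free-form text, the result needs to be normalized before it can be used. Here is a minimal sketch, with a hypothetical prompt wording and hypothetical helper names:

```javascript
// Builds a simple classification prompt for the given text.
function buildSentimentPrompt(text) {
  return `Is the sentiment of the following text positive or negative? Answer with one word.\n\n${text}`;
}

// Normalizes a free-form model answer into "positive", "negative", or "unknown".
// Answers that mention both or neither label are treated as "unknown".
function parseSentiment(answer) {
  const normalized = answer.toLowerCase();
  const positive = /\bpositive\b/.test(normalized);
  const negative = /\bnegative\b/.test(normalized);
  if (positive === negative) {
    return "unknown";
  }
  return positive ? "positive" : "negative";
}
```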

    Summary

    From summarizing content to more complex examples like automatically categorizing posts, the currently explored use cases offer a peek at what’s possible with Web AI in WordPress. Their main advantage lies in their combined strength and the deep integration into WordPress, making for a seamless UX. Chrome’s built-in AI also makes this functionality ubiquitous, as every WordPress user could leverage it without any browser extension, plugin, or API. This is just the beginning. In the future, more complex AI features could push the boundaries of interacting with WordPress even further.

    1. I’m of course talking about client-side media processing.