Client-side media processing in WordPress

AI-generated image of a robot compressing images

At WordCamp US 2024 I gave a presentation about client-side media processing, which is all about bringing WordPress’ media uploading and editing capabilities from the server to the browser. Watch the recording or check out the slides. This blog post is a written adaption of this talk.

This was a long overdue step after my announcement tweet in December 2023 went viral. After that, I talked about it on the WP Tavern Jukebox podcast and in an interview with WPShout. You might also remember an old post I wrote about client-side video optimization. This is a 10x evolution of that.

A lot has changed since then. Not only have I built new features, but I also completely refactored the Media Experiments plugin that was all part of. For WordCamp US, I chose to put more focus on the technical aspects of media handling in WordPress and the benefits of a new browser-based approach.

If you haven’t seen it before, here’s a quick glimpse of what browser-based media processing allows us to do:

Contributors wanted

You can find everything covered in this article in the Media Experiments GitHub repository. It contains a working WordPress plugin that you can easily install on your site or even just test with one click using WordPress Playground.

The goal is to eventually bring parts of this into WordPress core itself, which is something I am currently working on.

To make this project a reality, I need your help! Please try the plugin and provide feedback for anything that catches your eye. Or even batter, check out the source code and help tackle some of the open issues.

Let WordPress help you

The WordPress project has a clear philosophy, with pillars such as “Design for the majority” and “Decisions, not options”. There is one section in that philosophy which particularly stands out to me:

The average WordPress user simply wants to be able to write without problems or interruption. These are the users that we design the software for

WordPress philosophy

In my experience, when it comes to uploading images and other types of media, WordPress sometimes falls short of that. There are still many problems and interruptions. This is even more problematic nowadays as the web is more media-rich than ever before.

Minimizing frustration

Who hasn’t experienced issues when uploading media to WordPress?

Perhaps an image was too large and uploading takes forever, maybe even resulting in a timeout. And if it worked, the image was so large that it degraded your site’s performance.

Maybe you were trying to upload a photo from your phone, but WordPress doesn’t support it and tells you to please use another file. Or even worse, you upload a video, it succeeds, but then you realize none of the browsers actually support the video format.

In short: uploading media is a frustrating experience.

To work around these issues, you start googling how to manually convert and compress images before uploading them. Maybe even reducing the dimensions to save some bandwidth.

If you use videos, maybe you upload them to YouTube because you don’t want to bother with video formats too.

Maybe you switch your hosting provider because your server still takes too long to generate all those thumbnails or because it doesn’t support the newest image formats.

This is tedious and time consuming. WordPress should be taking work off your shoulders, not making your lives harder. So I set out to make this better. I wanted to turn this around and let WordPress help you.

At State of the Word 2023, WordPress co-founder Matt Mullenweg said the following about WordPress’ mission to Democratize Publishing:

We take things that used to require advanced technical knowledge and make it accessible to everyone.

Matt Mullenweg

And I think media uploads is a perfect opportunity for us to apply this. And the solution for that lies in the browser.

WebAssembly

Your server might not be capable to generate all those thumbnails or to convert a specific image format to something usable. But thanks to your own device’s computing power and technologies such as WebAssembly, we can fix this for you.

With WebAssembly you can compile code written in a language like Rust or C++ for running it in browsers with near-native performance. In the browser you can load WebAssembly modules via JavaScript and seamlessly send data back and forth.

At the core of my what I am showing you here is one such WebAssembly solution called wasm-vips. It is a port of the powerful libvips image processing library. That means any image operation that you can do with vips, you can now do in the browser.

Vips vs. ImageMagick

Vips is similar to ImageMagick, which WordPress typically uses, but has some serious advantages. For example, when WordPress loads vips in the browser it can always use the latest version. Whereas on the server we have to use whatever version that is available.

Sometimes, those are really old versions that have certain bugs or don’t support more modern image formats like AVIF. For hosts it can be challenging to upgrade, as they don’t want to break any sites. And even if ImageMagick already supports a format like AVIF, it could be very slow. Vips on the other hand is more performant, has more features, and even for older formats like JPEG it uses newer encoders with better results.

Client-side vs. server-side media processing

Sequence diagram for server-side media processing
Sequence diagram for server-side media processing

Traditionally, when you drop an image into the editor or the media library, it is sent to WordPress straight away. There, ImageMagick creates thumbnails for every registered image size one by one. That means a lot of waiting until WordPress can proceed. Here is where timeouts usually happen.

Eventually, once all the thumbnails are generated, WordPress creates a new attachment and sends it back to the editor. There, the editor can swap out the file you originally dropped with the final one returned by the server.

Compare this to the client-side approach using the vips image library:

Sequence diagram for client-side media processing
Sequence diagram for client-side media processing

Once you drop an image into the editor, a web worker creates thumbnails of it. A web worker runs in a separate thread from the editor, so none of the image processing affects your workflow. Plus, the cropping happens in parallel, which makes it a super fast process. Every thumbnail is then uploaded separately to the server. The server only has to do little work, just storing the file and returning the attachment data.

You immediately see all the updates in the editor after every step, so you have a much faster feedback loop. With this approach, the chances for errors, timeouts or memory issues are basically zero.

New use cases

The Media Experiments plugin contains tons of media-related features and enhancements. In this section I want to highlight some of them to better demonstrate what this new technology unlocks in WordPress.

Image compression

As shown in the demo at the beginning of the article, a key feature is the ability to compress or convert images directly in the browser. This works for existing images as well as new ones. All the thumbnails are generated in the browser as well.

Bonus: Did you see it? The plugin automatically adds AI-generated image captions and alt text for the image. This simply wouldn’t be possible on a typical WordPress server, but thanks to WebAssembly we can easily use AI models for such a task in the browser.

You can also compress all existing images in a blog post at once. The images can come from all sorts of sources too, for example from the image block, gallery block, or the post’s featured image.

In theory you could even do this for the whole media library. The tricky part of course is that your browser needs to be running. So that idea isn’t fully fleshed out yet.

Smart thumbnails

By default, when WordPress creates those 150×150 thumbnails it does a hard crop in the center of the image. For some photos that will lead to poor results where for example it cuts off the most relevant part of the picture, like a person’s head.

Vips supports saliency-aware image cropping out of the box, which looks for things like color saturation to determine a better crop.

Comparison of default vs. smart image cropping
Comparison of default vs. smart image cropping

At first you might think it is just a minor detail, but it’s actually really impactful. It just works, and it works for everybody! You will never have to worry about accidentally cropping off someone’s face again.

HEIC Images

If you use an iPhone you might have seen HEIC/HEIF images before, as it uses that format by default. It is a format with strong compression, but only Safari fully supports it.

Thanks to WebAssembly, WordPress can automatically convert such images to something usable. In this demo you will first notice a broken preview, as the (Chrome) browser doesn’t support the file format. But then it swiftly converts it to a JPEG, fixing the preview, and then uploads it to the server.

Bonus: this also works for JPEG XL, which is another format that only Safari supports.

Upload from your phone

In the above video I used an HEIC image which I previously took on my iPhone and then transferred to my computer. And from my computer I then uploaded it to WordPress. But what if you cut out the middleman?

In the editor, you can generate a QR code that you scan with your camera, or a URL that you can share with a colleague. Here, I am opening this URL in another browser, but let’s pretend it’s my phone. On your phone you then choose the image you want to upload. After that, it magically appears in the editor on your computer.

Hat tip to John Blackbourn for the idea!

Video compression

Media compression and conversion also works great for videos. When I record screencasts for this post, they will be in the MOV format, which doesn’t work in all browsers.

Thanks to ffmpeg.wasm, a WebAssembly port of the powerful FFmpeg framework, WordPress can convert them to a more universal format like MP4 or WebM. The same works for audio files as well.

This solution also generates poster images out of the box, which is important for user experience and performance.

Bonus: just like for image captions, AI can automatically subtitles for any video.

Animated GIFs

Sometimes you’re not dealing with videos though, but with GIFs. Who doesn’t like GIFs?

Well, the thing is, GIFs are actually really bad for user experience and performance. Not only are they usually very bad quality, they can also be huge in file size. Most likely you should not be using animated GIFs.

Every time you use the GIF format, a kitten dies
As Paul Bakaus once said: Gifs must die

The good news is that animated GIFs are nothing but videos with a few key characteristics:

  • They play automatically.
  • They loop continuously
  • They’re silent.

By converting large GIFs to videos, you can save big on users’ bandwidth. And that’s exactly what WordPress can and should do for you.

In the following demo, I am dragging and dropping a GIF file from my computer to the editor. Since it is an image file, WordPress first creates an image block and starts the upload process.

Then, it detects that it is an animated GIF, and uses FFmpeg to convert it to an MP4 video. This happens in the blink of an eye. As it’s now a video, WordPress replaces the image block with a special GIF variation of the video block that’s looping and autoplaying. And of course the video is multiple times smaller than the original image file. As a user, you can’t tell the difference. It just works.

Media recording

Compressing videos and converting GIFs is cool, but one of my personal favorites is the ability to record videos or take still pictures directly in the editor, and then upload them straight to WordPress.

So if you’re writing some sort of tutorial and want to accompany it with a video, or if you are building the next TikTok competitor, you could do that with WordPress.

Bonus: You probably don’t see it well in the demo, but thanks to AI you can even blur your background for a little more privacy. Super cool!

Challenges

Client-side media processing adds a pretty powerful new dimension to WordPress, but it isn’t always as easy as it looks!

Cross-origin isolation

On the implementation side, cross-origin isolation is a tricky topic.

So, WebAssembly libraries like vips or ffmpeg use multiple threads to speed up the processing, which means they require shared memory. Shared memory means you need SharedArrayBuffer.

For security reasons, enabling SharedArrayBuffer requires a special configuration called cross-origin isolation. That puts a web page into a special state that enforces some restrictions when loading resources from other origins.

In the WordPress editor, I tried to implement this as smoothly as possible. Normally, you will not even realize that cross-origin isolation is in effect. However, some things in the editor might not work as expected anymore.

The most common issue I encountered is with embed previews in the editor.

So in Chrome, all your embed previews in the editor continue to work, while in Firefox or Safari they don’t because they do not support iframe credentialless when isolation is in effect.

I hope that Firefox and Safari remedy this in the future. Chrome is also working on an alternative proposal called Document-Isolation-Policy which would help resolve this as well. But that might still be years in the future.

Open source licenses (GPL compatibility)

Another unfortunate thing is that open source licenses aren’t always compatible with each other. This is the case with the HEIC conversion for those iPhone photos.

Being able to convert those iPhone photos directly in the browser before sending them to the server just makes so much sense. Unfortunately, it’s a very proprietary file format. The only open source implementation (libheif) is licensed under the LGPL 3.0, which is only compatible with GPL v3. However, WordPress’ license is GPLv2 or later.

That means we can’t actually use it 🙁

The good news is that we found another way, and it’s even already part of the next WordPress release!

However, this happens on the server again instead of the browser.

This is possible because on the server the conversion happens in ImageMagick (when compiled with libheif), and not in core itself, so there’s no license concern for WordPress.

The downside of this approach is that it will only work for very few WordPress sites, as it again depends on your PHP and ImageMagick versions. So while this is a nice step into the right direction, only with the client-side approach can we truly support this for everyone.

The next steps

All of these challenges simply mean there is still some work to do before it can be put into the hands of millions of WordPress users.

While this project started as a separate plugin, I am currently in the process of contributing these features step by step to Gutenberg, where we can further test them behind an experimental flag.

We start with the fundamental rewrite of the upload logic, adding support for image compression and thumbnail conversion. After that, we can look into format conversion, making it easier to use more modern image formats and choosing the format that is most suitable for any given image. From there, we can expand this to videos and audio files.

Finally and ideally, we expand beyond the post editor and make this available to the rest of WordPress, like the media gallery or anywhere else where one would upload files.

I am also really excited about the possibility of making this available to everyone building a block editor outside of WordPress, like Tumblr for example.

Democratizing publishing

With client-side media processing we make a giant leap forward when it comes to democratizing publishing.

As mentioned at the beginning, the average WordPress user simply wants to be able to write without problems or interruption. By eliminating all these problems related to media, users will be able to create media-rich content much easier and faster.

Thanks to client-side media processing, we can greatly improve the user experience around uploads. You benefit from faster uploads, fewer headaches, smaller images, and less overloaded servers. Also, you no longer need to worry about server support or switch hosting providers. Smaller images and more modern image formats help make your site load faster too, which is a nice little bonus.

Convinced? Check out the GitHub repository and the proposed roadmap for the Gutenberg integration.

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *