Using PDF.js web worker cross domain (CORS)

August 7, 2014 by Richard

I have recently started doing a lot of work with emscripten as a possible pure JavaScript solution to allow GradeCam’s technology to work on browsers without NPAPI or ActiveX support. Part of this effort has required putting a lot of code into web workers so as not to destroy the responsiveness of the web page. This has actually worked much better than I anticipated, but it raised an interesting challenge: GradeCam licenses its technology to third parties by means of a JavaScript API, which means we need to be able to load all of our tech from our own servers, even if the code is running on another domain. This sounds like “business as usual”, except for one problem:

Web Workers don’t work cross-domain on many browsers!

PDF.js

One of the pieces we will be using is PDF.js, which we use to convert a PDF to an image, one page at a time. It’s a great little library, and I’m really impressed with it! However, we ran into the same problem again: PDF.js doesn’t work if I load it from a different domain, because the web worker won’t load!

Blob URLs

The solution comes in the form of the “Blob” object, which I had not previously used. With fairly little code, you can create a Blob URL that can be used to create a web worker from any arbitrary JavaScript you have in memory. This works for any web worker, PDF.js included. So all we need to do is load the file via AJAX and create a web worker from it. The following example code makes use of the built-in Promise objects in PDFJS.


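    var cachedJSDfd = null;
    function loadWorkerURL(url) {
        // Reuse the pending or finished request. Note that the cache ignores
        // the url argument, so this helper handles a single worker script.
        if (cachedJSDfd) { return cachedJSDfd.promise; }
        cachedJSDfd = PDFJS.createPromiseCapability();
        var xmlhttp = new XMLHttpRequest();

        // Called each time the state of the AJAX request changes
        xmlhttp.onreadystatechange = function() {
            if (xmlhttp.readyState !== 4) { return; }
            if (xmlhttp.status === 200) {
                var workerJSBlob = new Blob([xmlhttp.responseText], {
                    type: "text/javascript"
                });
                cachedJSDfd.resolve(window.URL.createObjectURL(workerJSBlob));
            } else {
                // Reject so callers are not left waiting on a failed download
                cachedJSDfd.reject(new Error("Could not load " + url +
                    " (HTTP " + xmlhttp.status + ")"));
            }
        };
        xmlhttp.open("GET", url, true);
        xmlhttp.send();
        return cachedJSDfd.promise;
    }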

This function accepts the URL of the web worker JavaScript and returns a Promise that resolves to a Blob URL referring to the loaded script. We can then use that somewhere else to point PDFJS at the Blob URL instead of letting it try to load the web worker itself:


    function initWebWorker() {
        return loadWorkerURL('http://www.domain.com/path/to/worker.js').then(function(blob) {
            PDFJS.workerSrc = blob;
            return PDFJS;
        });
    }

When you go to open your PDF document, just do something like this:


    function openPdf(url) {
        return initWebWorker().then(function(PDFJS) {
            return PDFJS.getDocument(url);
        });
    }

Call that instead of calling PDFJS.getDocument itself and it will load your web worker before trying to use PDFJS. Mission accomplished!
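
Since the worker script now arrives over the network before the document does, it is worth handling failure at the call site. Here is a minimal sketch of how a caller might do that. The helper name `reportPageCount` is hypothetical, and it takes the opener as a parameter (you would pass in `openPdf`) purely so the snippet stands on its own. It assumes the resolved document exposes `numPages`, as the PDF.js document object does.

```javascript
// Hypothetical helper: opens a PDF through an opener such as openPdf
// and resolves to a short status message either way.
function reportPageCount(open, url) {
    return open(url).then(function (pdf) {
        // numPages is the page count on the PDF.js document object
        return url + ": " + pdf.numPages + " page(s)";
    }, function (err) {
        // Reached if the worker script or the PDF itself fails to load
        var reason = (err && err.message) ? err.message : String(err);
        return url + ": failed to open (" + reason + ")";
    });
}

// Usage, assuming openPdf from this post is in scope:
// reportPageCount(openPdf, pdfUrl).then(function (msg) { console.log(msg); });
```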

Using the trick with other Web Workers

Other web workers will just accept the URL returned by loadWorkerURL (stored in ‘blob’ above) in place of the normal URL to a JavaScript resource, for example new Worker(blob), and will otherwise work as normal.
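
If PDF.js isn't on the page, its promise helper isn't available either. As a rough sketch of the same trick without it, the version below swaps in the browser's native `fetch` and `Promise` in place of XMLHttpRequest. The function names and the `fetchImpl` parameter are my own inventions for illustration, and it assumes a browser that supports `fetch`, `Blob` and `URL.createObjectURL`.

```javascript
// Fetch worker source from another origin (that server must send CORS
// headers) and resolve to a blob: URL usable with new Worker().
// fetchImpl is a hypothetical parameter that defaults to the global fetch;
// it exists only so the function can be exercised without a network.
function fetchWorkerURL(url, fetchImpl) {
    var doFetch = fetchImpl || fetch;
    return doFetch(url).then(function (response) {
        if (!response.ok) {
            throw new Error("Failed to load " + url + ": HTTP " + response.status);
        }
        return response.text();
    }).then(function (source) {
        var blob = new Blob([source], { type: "text/javascript" });
        return URL.createObjectURL(blob);
    });
}

// Resolve to a running worker built from the fetched source.
function startCrossDomainWorker(url) {
    return fetchWorkerURL(url).then(function (blobURL) {
        return new Worker(blobURL);
    });
}
```

One thing to watch for: a worker started this way has a blob: URL as its base, so as far as I know any relative importScripts() paths inside the worker script will no longer resolve and need to be made absolute.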

Configuring CORS (Cross Origin Resource Sharing)

For the AJAX request for the JavaScript source on your other domain to succeed, you need to configure the CORS headers on your web server. Here are the headers that I’ve added for my resources, since I want them accessible from any location:

{
    "access-control-allow-headers":"Content-Type, If-Modified-Since",
    "access-control-allow-methods":"GET, HEAD",
    "access-control-allow-origin":"*"
}
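
How you set these depends entirely on your web server. Purely as an illustration, here is a sketch of a request handler in the shape Node's `http.createServer` expects, sending the same three headers along with the worker script. The names `CORS_HEADERS` and `makeWorkerHandler` are hypothetical, and it assumes the worker source has already been read into a string.

```javascript
// The three headers listed above, applied to every response.
var CORS_HEADERS = {
    "Access-Control-Allow-Headers": "Content-Type, If-Modified-Since",
    "Access-Control-Allow-Methods": "GET, HEAD",
    "Access-Control-Allow-Origin": "*"
};

// Returns a (req, res) handler that serves workerSource with CORS headers.
function makeWorkerHandler(workerSource) {
    return function (req, res) {
        Object.keys(CORS_HEADERS).forEach(function (name) {
            res.setHeader(name, CORS_HEADERS[name]);
        });
        if (req.method === "OPTIONS") {
            // CORS preflight: reply with the headers only
            res.statusCode = 204;
            res.end();
            return;
        }
        if (req.method !== "GET" && req.method !== "HEAD") {
            res.statusCode = 405;
            res.end();
            return;
        }
        res.statusCode = 200;
        res.setHeader("Content-Type", "text/javascript");
        if (req.method === "HEAD") {
            res.end();
        } else {
            res.end(workerSource);
        }
    };
}

// Usage (Node): require("http").createServer(makeWorkerHandler(source)).listen(8080);
```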
