Modern Browsers

Today Google released a doodle. It was well executed and fun, and I think that Robert Moog would have enjoyed it. But this isn’t about the doodle; it is about a small piece of text that Google shows just beneath it.

Upgrade to a modern browser and see what this doodle can really do.

This piece of text (and a link to the Google Chrome download page) shows up in all non-Chrome browsers, implying that they are not ‘modern’.

I have been guilty of using the term as well, once during feature sniffing, mostly to mean ‘not Internet Explorer older than X’ and I am sorry about that, especially since Microsoft seems to be taking development of Internet Explorer 10 very seriously.

But we should investigate what ‘modern’ means to Google. Someone hinted that you need to support the non-standard Web Audio API to be detected as ‘modern’, so I booted up WebKit Nightly (built with the Web Audio API enabled), went to Google, and got the same message.

Guessing that it was a lot simpler, I switched back to Firefox and changed the UA string to something that looked like Chrome and suddenly the message is gone, switch back to the default UA string and it shows up again.

Just to make sure that I didn’t do anything wrong, and because I was curious, I investigated other browsers. Internet Explorer does get the message and the doodle, and so does Opera. The iPhone and most silly UA strings do not get the doodle, and therefore do not get the message either. I don’t have any Android devices, but I assume that you would either get both the message and the doodle, or neither, depending on whether your device supports Flash.

In the end, the conclusion is that a ‘modern browser’ according to Google is a browser which sends ‘Chrome’ as its UA string and supports Flash or the Web Audio API.

Can we instead, on production sites, standardize on something like “this site requires (experimental) features not yet present in your browser” (thanks @getify for the idea), with a link to instructions on how to update the browser or, if it is a browser-specific feature, information about the feature and why it isn’t yet supported in their browser of choice?
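A minimal sketch of what such a message could be driven by: checking for the features the page actually needs instead of looking at the UA string. The helper names (`missingFeatures`, `upgradeMessage`) and the stand-in environment objects are made up for illustration; in a real page `env` would be `window`, and `webkitAudioContext` is only used as an example feature name.

```javascript
// Hypothetical helper: list the required globals that are missing from `env`.
// `env` stands in for the global object (window) so the sketch runs anywhere.
function missingFeatures(env, required) {
  return required.filter(function (name) {
    return typeof env[name] === "undefined";
  });
}

// Hypothetical helper: build the message, or return null if nothing is missing.
function upgradeMessage(env, required) {
  var missing = missingFeatures(env, required);
  if (missing.length === 0) {
    return null; // Everything is there, so show no message at all.
  }
  return "This site requires (experimental) features not yet present in " +
    "your browser: " + missing.join(", ");
}

// Stand-in environments; neither one looks at a UA string.
var withWebAudio = { webkitAudioContext: function () {} };
var withoutWebAudio = { Audio: function () {} };

console.log(upgradeMessage(withWebAudio, ["webkitAudioContext"])); // null
console.log(upgradeMessage(withoutWebAudio, ["webkitAudioContext"]));
```

With this approach, a WebKit Nightly build with the Web Audio API enabled would get no message, whatever its UA string says.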

Note

If you are trying to reproduce results, make sure that you’re using google.com in English, the text about a ‘modern browser’ doesn’t show up otherwise.

Problems with a ‘pure Javascript’ implementation of H.264

I have written a lot of audio decoders in Javascript, and helped write a few more. I have never tackled video for a few reasons, and I’ll try to sum up why there will probably never be one implemented in ‘pure’ Javascript, and the methods with which I think it will be implemented instead.

Even the most high-end audio codecs are designed to also work on really low-end DSP devices. ALAC (Apple Lossless Audio Codec), for example, decodes stereo fine in software on one of the 90 MHz ARM7TDMI cores in the original iPod. AAC requires a bit more, but it is still within the reach of software on a relatively slow processor, like a Pentium or G3. A modern ARM processor can decode MP3 at a clock speed of a mere 10 MHz and, with a bit more, AAC, which is essentially the most demanding audio codec that you’ll meet on the web.

Video codecs on the other hand are an entirely different story. The 2.4GHz Core 2 Duo in my laptop (a Macbook Pro) has serious problems decoding high-end (1080p Hi10P for example) H.264 with FFmpeg. My desktop, a reasonably modern Xeon quad-core, handles these videos fine using FFmpeg, but with significant load. Note that this is with an implementation that is hand-optimized with assembly. We cannot depend on hardware support to improve the situation either, because it is often out of date. No graphics card in my collection supports this profile in hardware yet, for example.

On top of these problems, there are some serious limitations in Javascript/ECMAScript that make it a bad platform for video decoding. And while it is a very cool demo of emscripten, these are some of the reasons why I don’t think that Broadway.js will ever be able to decode H.264 in any sort of sane capacity using merely emscripten and some minor optimizations done by hand, without a radically different Javascript engine to support it.

Floating-point

Essentially all operations in Javascript operate on floating-point numbers, and this is not likely to change in the future. For audio codecs, this is not really a problem since they tend to be designed in a way that you can implement them as both fixed-point and floating-point.

Video codecs on the other hand tend to rely a lot on fixed-point for optimization; H.264 is even optimized to avoid needing floating-point as much as possible. Even the discrete cosine transform and motion compensation in H.264 are modified to operate on fixed-point numbers instead of floating-point.

The reason for this is that modern processors can often process fixed-point operations much faster, especially the 8 and 16 bit operations that are the most common. These short integer instructions often have at least 4 times the throughput of double precision floating-point. Certain complex instructions like division make the difference irrelevant, and in many cases require fallback to floating-point, but these operations are extremely uncommon in H.264.
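To make the fixed-point design concrete, here is a sketch of one one-dimensional pass of the 4x4 inverse integer transform that H.264 uses in place of a floating-point DCT. It needs only additions, subtractions and shifts. The function name is made up, and a real decoder runs this over rows and columns and then rounds and shifts the result, which is left out here. The `| 0` idiom is a common hint to Javascript engines that a value is a 32-bit integer, but how well an engine exploits it is an assumption.

```javascript
// One 1-D pass of the H.264 4x4 inverse integer transform.
// Input: four coefficients; output: four values (before final rounding).
function inverseTransform1D(w) {
  var e0 = (w[0] + w[2]) | 0;
  var e1 = (w[0] - w[2]) | 0;
  var e2 = ((w[1] >> 1) - w[3]) | 0; // the "multiply by 1/2" is a shift
  var e3 = (w[1] + (w[3] >> 1)) | 0;
  return new Int32Array([e0 + e3, e1 + e2, e1 - e2, e0 - e3]);
}

// A block with only a DC coefficient becomes a flat signal.
console.log(Array.from(inverseTransform1D([4, 0, 0, 0]))); // [ 4, 4, 4, 4 ]
```

In a native decoder, several of these 16-bit values are processed per instruction, which is where the throughput advantage over double precision arithmetic comes from.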

SIMD

This is before the SIMD penalty is added for Javascript. Because current Javascript engines utilize only scalar operations, a significant part of the execution hardware (1/2 to 1/4) spends most of the time idling.

Most decoders utilize these SIMD instructions, which gives them access to 8-16 times more throughput per core for simple operations. And on top of that, there are special instructions for optimizing MPEG codecs, giving a quite measurable speedup on top of that, which you are unlikely to be able to utilize without hand-optimized code.

Threads

To provide the final blow against current Javascript, there is very little possibility for shared-memory multicore programming in a browser. Workers are not good enough to do this. I haven’t actually measured this, as I do not plan to implement a video decoder with workers, but I think that the cost of communication and latency is currently too high for it to make sense.

Only using a single core on a processor that has 2-8 is another problem that would keep a Javascript implementation from ever competing with a native implementation.

Solution

There are two obvious solutions to all of these problems that are being prototyped on the web right now: WebCL and River Trail. Both of these are mainly designed to solve the threading problem, which is likely not the biggest issue, but it is still significant.

River Trail could solve most of these problems since it is currently based on Intel’s OpenCL runtime, which has good optimizers. It isn’t designed for this very specialized task, and while it does allow you to reduce precision, it doesn’t allow direct access to integer or media instructions. It is still a much better option than pure Javascript, and with the addition of an integer API to Javascript, it could easily turn into the preferred method.

WebCL (OpenCL for the web) on the other hand, already solves most or all of these problems since it is essentially a massively parallel C with SIMD and device-specific extensions. It even allows for the GPU to pick up most of the burden, which is in many cases preferable to running on the CPU due to the extra computational power available.

There are probably other solutions as well, but just hoping that single-threaded Javascript with double precision floating-point will ever be enough is naïve and counter-productive in my opinion. Especially on mobile devices, which have special concerns, WebCL is in a great position to solve these too in the future.

And while I would love to be proved wrong about this, I don’t think I will be for a long while, and at that point, there will be more advanced codecs and higher resolutions around to target.

Music Hack Day Amsterdam

I think I might have one of the best jobs in the world. Last weekend the Official.fm Labs team went to Music Hack Day: Amsterdam, and I must say, I had a most wonderful time.

If you ever have a chance to visit a MHD, you really should. The investment is only time and the rewards can be immense; for me, that reward was meeting some really nice people.

I met some old and new friends. Per and Kalle from Youtify were there hacking and made some great progress adding profile pages and allowing you to follow other people’s playlists. If you haven’t been using Youtify for your Youtube music needs, you should try it, it is really nice.

I also met the guys from Zvooq, who, in addition to being awesome people, seem to have great ideas on how to improve the situation for artists and music consumers. I hope I’ll be able to test Zvooq soon, but for now it seems to only be available in Russia and a few other countries.

And a lot of other people, I unfortunately didn’t get the name of most of them, but they were really nice and one of the most interesting things about Music Hack Day is that people of significantly different skill sets interact with each other. I learned a bit about music theory, and I think I gave some nice hints about what I know.

The location, Nederlands Instituut voor Mediakunst, was a fantastic venue, I didn’t see all of the exhibits, but they seem to be doing some serious hacking even when we’re not around so if you have the opportunity, visit them and take the tour.

The organizers, Roeland P. Landegent and his team, deserve credit as well. It was a well organized event, and all our hacker needs were fulfilled.

The only bad thing about the whole event is that I got stuck digging a deep rabbit hole for myself, so I won’t release my work yet (I plan to convert it into an XPCOM component as soon as I get some time away from schoolwork). It is essentially an implementation of the ‘Simple Audio’ API that I wrote earlier, now renamed Mio (澪) because ‘Simple Audio’ didn’t give the right vibe: the API isn’t designed to be simple, it is designed to be flexible and powerful.

The rest of the Official.fm Labs team were working on a sort of voice controller for games, where through pitch and intensity you controlled your character. They were having some problems with the short-time Fourier transform, but otherwise I think they got most of it nailed down now, they managed to get a demo working at least.

A final open question, Spotify, when is there going to be a Music Hack Day Sweden?

Tweeting about Computer Science education

I sent out a tweet today, and I realized only in retrospect that some people reacted very negatively to it. I am sorry about that; my intention was not to insult anyone, or their education.

This was the tweet, modified for formatting; the parentheses contain a clarification that I also posted on Twitter.

Why do we educate computer scientists to get (obtain) developers?

We wouldn’t educate structural engineers to get (obtain) masons…

I did not intend to insult anyone, but I can see why it did and how I did it.

It was not an attack on computer science education at all. It was merely a comment on the fact that a lot of people seem to think that you should (and need to) study computer science to become a developer.

Dijkstra once commented,

Computer science is no more about computers than astronomy is about telescopes.

And he summarized my intention in a way that is a bit less hostile.

What I meant was that computer science does not teach you how to code, how to write documentation, how to use source control, write good issues, etc. I am not good at a lot of these things, and I doubt that if I wrote a thesis in computer science and got a degree in CS, I would magically learn these skills.

No, you need to learn that somewhere else. And no amount of data structures or natural language processing or theory about compilers and so on will ever make you good at these skills, which I consider essential before I would ever consider myself a good software developer.

I am not even sure if I am good enough to call myself a developer; I am at most a hacker. I might be naïve, but I think that if I practice, I will get better. With practice I think I can get to the level that I can call myself a software developer without feeling that I have serious holes in my skillset.

Even if I magically learned what was required for every computer science course here in Lund, I think I would still feel that I have those same holes. I would definitely be a better hacker and I would certainly be a much better computer scientist, but I still would not be able to write awesome documentation, or write eloquent code.

But computer science lecturers teach you about computer science, and computer science is not only about computers and code, as Dijkstra said.

Computer science is a wonderful branch of science that has produced an immense amount of value, and do not even consider that learning things will make you a worse developer, especially not computer science. Learning things will always make you better, especially as a software developer, and learning computer science is awesome.

But what Dijkstra said about Computer Science is not true about software development. Software development is about computers and people.

For me, a software developer is someone that produces tools that turn people into better people, more productive people, happy people; they turn theory into actual working programs, they are people who generate value. And only to some extent is it through writing code, some developers design the structure, some write documentation, some test applications, and so on.

And in a lot of these situations, a computer science education can be really helpful.

A small part of those skills are picked up at university studying computer science (or anything else) and if I could pick up the rest while studying, I would really love to, but I do not think that the current system of education is good at teaching all of these skills that are necessary for being a good software developer.

I am sorry if I offended anyone, if you think I am wrong, please leave a comment.

Note

  1. Computer Science is a horrible name, in Swedish we sometimes call it ‘Datalogi’ which is a less horrible name.

Testing numerical accuracy of browsers

According to the standard, only the arithmetic operations in Javascript need to be correctly rounded; the functions in Math do not have any accuracy requirements.

But out in the real world, browsers are a bit better than that, we have a feeling that the functions in Math are reasonably accurate, but if you need to be convinced (like me) then you should look at https://github.com/JensNockert/accuracy.js which fuzz tests most of the operations in Math that have a tendency to be inaccurate.

If you want to be even more convinced, generate more test cases using generate.rb.
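As an illustration of what such a test measures, here is a minimal sketch that counts how many representable doubles lie between a computed result and a reference value (the error in ulps). This is not taken from accuracy.js; the function names are made up, and the reference values below are perfect squares, chosen because their square roots are exactly representable. Real test cases need reference values from a trusted higher-precision source.

```javascript
var f64 = new Float64Array(1);
var i64 = new BigInt64Array(f64.buffer);

// Map a finite double to an integer so that adjacent doubles map to
// adjacent integers (negative doubles map to negative integers).
function orderedBits(x) {
  f64[0] = x;
  var bits = i64[0];
  return bits < 0n ? -(bits & 0x7fffffffffffffffn) : bits;
}

// Distance between two finite doubles, counted in representable values.
function ulpDistance(a, b) {
  var d = orderedBits(a) - orderedBits(b);
  return d < 0n ? -d : d;
}

// Worst error of `fn` over a list of { input, expected } cases.
function maxUlpError(fn, cases) {
  var worst = 0n;
  cases.forEach(function (c) {
    var d = ulpDistance(fn(c.input), c.expected);
    if (d > worst) { worst = d; }
  });
  return worst;
}

var sqrtCases = [
  { input: 4, expected: 2 },
  { input: 0.25, expected: 0.5 },
  { input: 1048576, expected: 1024 }
];

console.log(maxUlpError(Math.sqrt, sqrtCases));
```

A correctly rounded function never differs from a correctly rounded reference value, so any non-zero result points at an inaccurate implementation (or a bad reference).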

Ps. sin, cos and tan are missing, their periodicity makes them hard to fuzz using this technique.

Update

  1. I fuzzed on Windows as well, and Chrome on Windows does not provide sqrt with correct rounding, a bug has been filed. Firefox and Opera provide as much precision on Windows as on OS X.

Simple Audio

We have been discussing audio a lot at Official.fm Labs, and since we’re working with audio in different ways and have different views on what should be a first step, I am throwing out a proposal for them, and for you.

In addition to this one, there are at least two more proposals (which are a lot less sketchy and have partial implementations) for real-time audio on the web, https://dvcs.w3.org/hg/audio/raw-file/tip/streams/StreamProcessing.html from Mozilla’s Robert O’Callahan and https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html from Google’s Chris Rogers.

Both proposals are designs based on graphs, containing various nodes. But their proposals are many pages long and I don’t have the time or energy for that, so I will try to show you that audio in the browser can be described on a napkin (using both sides).

My idea is that audio is not really that complicated. You do not need a significant amount of routing or specialist code in the browser, because you can do all of that in Javascript once you have a few of the basic building blocks; then, when there are performance issues or other limitations, you add the more advanced features.

A first implementation should provide a seed, not a forest. To do that, we need a way to read and/or write audio samples to a stream and receive events relating to that stream.

This is essentially what the Mozilla Audio Data API does, but with a more expressive API.

Note that I wrote this in a few hours, so nothing is fixed, especially not names. Also, as an OS X zealot, I have been inspired quite a bit by Core Audio and any similarities are probably not coincidence.

Usage Scenarios

  • Playing short sounds with low latency and accurate timing: Useful for games and similar applications where sounds react to user interaction.
  • Playing longer audio segments: Useful for example music players, or other streaming uses.
  • Capture audio from a microphone: Useful for all sorts of audio conferencing needs, for example something like Teamspeak or Skype in a browser.
  • Bypassing the lack of codec support in the HTML5 media elements: Useful for anyone with files in any format that is not supported in all browsers, like MP3, AAC, Vorbis, ALAC, FLAC, etc.

These are the usage scenarios that I am considering at first, and I think they cover 90% of the applications that need audio (right now). If we look at which traditional applications that include audio are most popular, we notice that teleconferencing, games and music will almost certainly come out on top for almost every user.

And while I think the last 10% are absolutely awesome as well, (an HTML5 digital audio workstation for example) I think that they can wait a bit until we have the basics before we go on to the really mind-blowing stuff.

Features

  1. Writing audio to devices.
  2. Reading audio from devices (adding support for audio or video elements should be trivial).
  3. Accurate timing, relatively low latency.
  4. Events from the audio subsystem.
  5. Easy to implement.
  6. Designed for future extensibility instead of providing a kitchen sink now.

Accurate timing and low latency are very important for certain kinds of games. If you need to wait 100ms from Mario hitting the coin until the sound starts playing, players will be confused and the experience will be bad. For multimedia, the audio needs to be in sync with whatever other things are happening.

Events are required, all applications should be able to act correctly on hot-plugging events and so on. For example, if a user uses an application where you can call landlines, without a microphone, then plugging in a microphone should directly enable audio input without a reload.

The same thing should be supported if for example a USB headset is connected while on a music site, the site should be able to react to this, and play through the headset instead.

It should also be relatively easy to implement, and in the future extend for more advanced functionality.

API Overview

  1. An AudioContext, referring to the whole state of the audio subsystem of the browser.
  2. An AudioStream interface, representing a single audio stream, which can support both input and output.
  3. An AudioStreamDescription interface, a description of the data flowing in the stream.
  4. An AudioBuffer interface, contains data for a set of channels.
  5. An AudioTimeStamp interface, contains data detailing a specific point in time, relative to the clock driving a specific stream.

The Audio Context

The audio context is essentially a singleton. You can create multiple contexts, but they just masquerade for the global state; the streams available should be the same in all contexts.

interface AudioContext {
    readonly attribute AudioStream[] streams;
    readonly attribute AudioStream defaultInputStream;
    readonly attribute AudioStream defaultOutputStream;
};

Create a context via

var audio = new AudioContext()

Attributes

  • streams: An array of AudioStreams that are available for output or input.
  • defaultInputStream: An AudioStream representing the default input device, or null if there are no input devices.
  • defaultOutputStream: An AudioStream representing the default output device, or null if there are no output devices.

Events

  • NewStreamAvailable: Contains the new stream.
  • DefaultInputStreamChanged: Contains the new default input stream, and the old default input stream. It is triggered when for example, the user plugs in a microphone.
  • DefaultOutputStreamChanged: Contains the new default output stream, and the old default output stream. It is triggered when for example, the user plugs in a pair of headphones.
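A sketch of how an application might react to one of these events. Since the API is only a proposal, `FakeAudioContext` below is a stand-in built on the standard `EventTarget`, and the event property names `oldStream` and `newStream` are assumptions; the proposal only says that the event contains the old and the new stream.

```javascript
// Stand-in for the proposed AudioContext, only here to make the sketch run.
class FakeAudioContext extends EventTarget {
  constructor(output) {
    super();
    this.defaultOutputStream = output;
  }

  // Simulates the browser noticing a new default output device.
  simulateOutputChange(newStream) {
    var event = new Event("DefaultOutputStreamChanged");
    event.oldStream = this.defaultOutputStream; // hypothetical property name
    event.newStream = newStream;                // hypothetical property name
    this.defaultOutputStream = newStream;
    this.dispatchEvent(event);
  }
}

var audio = new FakeAudioContext({ name: "speakers" });
var current = audio.defaultOutputStream;

audio.addEventListener("DefaultOutputStreamChanged", function (event) {
  // A music site would move its playback to the new stream here.
  current = event.newStream;
});

audio.simulateOutputChange({ name: "usb-headset" });
console.log(current.name); // usb-headset
```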

Discussion

To begin with, the only exposed streams would be the defaultInputStream and the defaultOutputStream, but more advanced applications, like a web-based digital audio workstation, could require additional streams to support a large number of channels, or to provide, for example, a DJ with two different outputs.

The AudioContext should be accessible from a worker, allowing audio processing to be done in a separate context from the rest of the application to provide latency sensitive applications with a more stable environment, less affected by garbage collection pauses.

There are a lot more things that would be interesting to send events about from an audio perspective, but which possibly should not be in the audio context, the first thing that pops to mind is an event when a device returns from deep sleep, to allow applications to prevent the accidental output when resuming for example a laptop.

In addition, there should be a method that allows you to create streams from media elements, allowing the programmer to post-process the audio in for example a video.

Audio Stream Description

An audio stream description is a description of the current (or, in the future, the desired) state of an audio stream, and is designed to hold a lot of information that is useless in the normal case of uncompressed linear PCM.

interface AudioStreamDescription {
    readonly attribute DOMString identifier;
    readonly attribute double sampleRate;
    readonly attribute DOMString[] channels;
    readonly attribute short bitsPerChannel;

    /* Only for formats with a fixed frame-size */
    readonly attribute long bytesPerFrame;

    /* PCM specific attributes */
    readonly attribute DOMString sampleType;
    readonly attribute DOMString endian;
    readonly attribute DOMString aligned;
    readonly attribute boolean interleaved;
};

Attributes

  • identifier: Always ‘Linear PCM’ for now
  • sampleRate: The sampling rate in samples per second
  • channels: The canonical name of each channel
  • bitsPerChannel: The number of useful bits in each sample
  • bytesPerFrame: The number of bytes in each frame, including padding

PCM specific attributes

  • sampleType: ‘float’, ‘signed-integer’, ‘unsigned-integer’
  • endian: ‘big’, ‘little’
  • aligned: ‘packed’, ‘high’, ‘low’
  • interleaved: boolean

Notes

Different codecs need different attributes, and if an attribute does not make sense for a specific stream, then it should not include it in the description.

To begin with, there is only need for Linear PCM, since that is the format of almost all modern hardware. But in the future, more complex format descriptions would be required, especially to describe more complex formats that could be extracted from for example a media element.

Another thing that might need a change is the channel descriptions, some sort of location or something could be useful, so maybe they should be changed from strings to objects.

In some future, with bitstreaming of audio, or codecs exposed to Javascript, more complex features might be required from the stream description.

Audio Time Stamp

A time stamp object simply represents a point in time, relative to the clock for a specific direction in a stream.

interface AudioTimeStamp {
    readonly attribute AudioStream stream;
    readonly attribute DOMString direction;
    readonly attribute Date hostTime;
    readonly attribute double sampleTime;
};

Attributes

  • stream: the audio stream that this time stamp is relative to
  • direction: ‘output’ or ‘input’, the direction of the stream that the timestamp is relative to
  • hostTime: a date, the time when the first sample will be played, or when the first sample was captured
  • sampleTime: Number of samples passed since the stream started (as a double, since a Javascript integer would only hold precision for 3 hours at 192kHz, a double on the other hand is fine for 1500 years).
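The numbers in the sampleTime description can be checked with a few lines of arithmetic. This sketch assumes that ‘a Javascript integer’ means a signed 32-bit integer (as produced by the bitwise operators) and that a double counts exactly up to 2^53; the helper name is made up.

```javascript
var SECONDS_PER_HOUR = 3600;
var SECONDS_PER_YEAR = 365.25 * 24 * 3600;

// How long a sample counter lasts before it reaches `maxCount`.
function secondsUntilOverflow(maxCount, sampleRate) {
  return maxCount / sampleRate;
}

// Signed 32-bit integer: counts up to 2^31 samples.
var int32Hours =
  secondsUntilOverflow(Math.pow(2, 31), 192000) / SECONDS_PER_HOUR;

// Double: every integer up to 2^53 is exactly representable.
var doubleYears =
  secondsUntilOverflow(Math.pow(2, 53), 192000) / SECONDS_PER_YEAR;

console.log(int32Hours.toFixed(1) + " hours");  // 3.1 hours
console.log(Math.round(doubleYears) + " years"); // 1487 years
```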

Notes

Some additional time measurement systems could be included, like the relative time in seconds, etc. But that feels unnecessary for a first implementation.

Audio Buffer

The simplest object, you do not create these yourself. They are passed to you when you need them.

interface AudioBuffer {
    readonly attribute ArrayBuffer data;
    readonly attribute AudioTimeStamp timeStamp;
    readonly attribute long channels;
};

Attributes

  • data: An ArrayBuffer containing data, or into which you need to write data.
  • timeStamp: The time at which the buffer ‘starts’, or null.
  • channels: The number of channels interleaved in this buffer.

Notes

This is the construction I am least sure about. Currently the channels could be inferred from the stream description, and the entire object could be replaced by a simple ArrayBuffer. I am not sure if there are any requirements for extensibility either; Google has some extra properties for these that, while useful, could also be inferred from the stream description.

Audio Streams

An audio stream represents a stream of audio data to/from a device or media element. The simplest way to get a stream is to get the default ones,

interface AudioStream {
    readonly attribute AudioStreamDescription input;
    readonly attribute AudioStreamDescription output;
};

Attributes

There are two interesting attributes on the stream,

  • input: The input format.
  • output: The output format.

For most streams, only one is not null.

Events

  • inputDescriptionUpdated: contains the stream, the new AudioStreamDescription.
  • outputDescriptionUpdated: contains the stream, the new AudioStreamDescription.
  • processAudio: contains the stream, an array of AudioBuffers to read from, and an array of AudioBuffers to write to.
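A sketch of what a processAudio handler could do with one of the buffers it is asked to write to, here filling it with a sine wave. It assumes 32-bit float samples, interleaved channels and native byte order; a real handler would read all of that from the stream description. `makeSineWriter` and the stand-in buffer object are made up for illustration and only mirror the AudioBuffer attributes described earlier.

```javascript
// Returns a function that writes a continuous sine wave into AudioBuffer-like
// objects ({ data: ArrayBuffer, channels: number }), keeping phase in between.
function makeSineWriter(frequency, sampleRate) {
  var phase = 0;
  var step = 2 * Math.PI * frequency / sampleRate;

  return function writeSine(buffer) {
    var samples = new Float32Array(buffer.data); // assumes 32-bit float PCM
    var frames = samples.length / buffer.channels;

    for (var frame = 0; frame < frames; frame++) {
      var value = Math.sin(phase);
      // Interleaved layout: the same value goes to every channel.
      for (var channel = 0; channel < buffer.channels; channel++) {
        samples[frame * buffer.channels + channel] = value;
      }
      phase += step;
    }
  };
}

// Stand-in for a buffer delivered with a processAudio event:
// 8 frames, 2 channels, 4 bytes per sample.
var fakeBuffer = { data: new ArrayBuffer(8 * 2 * 4), channels: 2, timeStamp: null };
var writeSine = makeSineWriter(440, 44100);
writeSine(fakeBuffer);
```

Because the writer keeps its phase between calls, consecutive buffers join without a click, which is the kind of bookkeeping the proposal leaves to Javascript.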

Notes

There is a lot of room for improvements here, reconfiguring the stream is an obvious first step. Another thing is actually exposing the device that the stream belongs to, which could include a device name and so on, but there are possible privacy aspects concerning that.

It is possible that output and input streams should be split into two different interfaces, but both APIs are essentially equivalent for most purposes.

Garbage Collector

It can be important that the processAudio event is triggered just before garbage collection (depending on the length of the collection, and the buffer states) to allow the application to fill all buffers before the collection pause to minimize the risk for underflow.

If the application allocates significant amounts of memory during this callback, the garbage collector could trigger anyhow, but a careful programmer should be able to create pauseless playback in this manner.

Battery

Running on mobile devices is important, and the API can easily handle different devices and usage scenarios, when a low-powered device needs audio, it simply provides larger buffers to the applications, increasing latency, but to compensate, it does not need to power up the processor as often.

In addition, a low-powered device could provide streams with lower sampling rates, which in some cases could reduce the amount of processing required.
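The trade-off between buffer size, latency and wake-ups is simple arithmetic, sketched below. The buffer sizes and the 44.1 kHz rate are only example values, and the helper names are made up.

```javascript
// Latency added by one buffer of `frames` sample frames.
function bufferLatencyMs(frames, sampleRate) {
  return 1000 * frames / sampleRate;
}

// How often the application has to be woken up to fill such buffers.
function wakeUpsPerSecond(frames, sampleRate) {
  return sampleRate / frames;
}

[256, 4096].forEach(function (frames) {
  console.log(
    frames + " frames: " +
    bufferLatencyMs(frames, 44100).toFixed(1) + " ms latency, " +
    wakeUpsPerSecond(frames, 44100).toFixed(1) + " wake-ups per second"
  );
});
```

Going from 256 to 4096 frames raises the latency from roughly 6 ms to roughly 93 ms, but the processor only has to wake up about 11 times per second instead of about 172.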

Security

I am not sure how you should request input device access from the user, and it is possibly out of scope of the API, but in a browser, the user needs to be asked for permission before any input device is activated or a massive privacy breach is bound to happen.

In addition, if additional information about audio streams were provided (like the audio hardware name) then it could be an information leak that when combined with other information, could uniquely identify a user.

Inside of something like Node.js, no additional permissions compared to a regular native application would be required, so all of the API could be accessed by default.

Advantages

Compared to the Web Audio proposal

  • It is a lot simpler to understand, and work with for basic Javascript audio playback.
  • Supports input.

Compared to the Media Stream proposal

  • It is a lot simpler to understand, and work with for basic Javascript audio playback.
  • Allows Javascript to generate samples outside of a worker.

Disadvantages

  • It doesn’t support a lot of built-in effects for example.
  • It does not provide a way to interact with media elements (but this could easily be fixed with some extra work).
  • It is a lower-level API that might require libraries to provide a higher-level API, for example for constructing processing graphs or an API for playing short effects for games.

Notes

If you have any good ideas for names or otherwise, throw them in my direction on Twitter (@jensnockert), jens@aventine.se or here.

All events should be implemented with something like the DOM events and addEventListener, to make the system work as well together with normal web applications as possible.

Also, the API is not restricted to browsers outside of the interaction with media elements, a Node.js implementation should be possible and would be useful for certain desktop and server applications.

TBD 2012, Malmö, Sweden

I spent the weekend at the TBD 2012 Hackathon at Djäkne Kaffebar in Malmö.

It was a great event full of great people. And even though it was the first event in the series, it was a really well organized event in almost every way, the food was great, the location was good, the wifi was not too bad etc.

I really hope that it can become a regular event. From my perspective, Malmö and Lund seem to be getting a lot hotter when it comes to software development, and these kinds of events are a really nice way to meet people and bring a lot of value to all developers in the region.

During the event I mostly worked on Youtify, a service for being able to consume music from the cloud in a more structured manner. I added an Official.fm backend for it, and fixed some small bugs.

Meeting the Youtify guys was a wonderful opportunity, both the service and the team are great, and it felt great to be able to help them a bit, and I am excited about being able to meet them again at Music Hack Day - Amsterdam.

During the time not spent coding I mostly talked to other people, and one of the most interesting conversations during the weekend was with filmmaker Simon Klose, who is currently finishing a documentary about the Pirate Bay called The Pirate Bay: Away From Keyboard, but at TBD he was instead trying to revolutionize the way documentaries are consumed.

The proof of concept that his group produced, called ‘Linkontrol’, was intended to improve the experience of the film by combining film and hypermedia in a really slick way, and it was quite awesome to say the least. It used popcorn.js, and you can read a bit about it at his blog.

A mere textual description does not really describe the concept, or demo, and if you meet any of them, do try to get a demo, it was quite exquisite, and took the audience award of course (they really deserved it.)

But there were two small things that were a bit odd, making the atmosphere a bit tense.

First, during the presentations, one team called ‘Tunafish’ forced us to take down the stream and ‘sign’ a verbal non-disclosure agreement since their idea was ‘secret’. The other thing that annoyed me was that the big prize was money.

I do not really mind people working on commercial projects at a hackathon, I do not even mind if they just move their work there. But I kind of expect people who work on secret stuff to keep any secret details secret by simply not telling anyone about them, instead of holding a presentation about it and then forcing people not to tell others.

About the money, I wish they had just converted it to some kind of token or luxury that you usually do not buy: a nice bottle of wine, chocolate, medals, t-shirts, or maybe some sort of experience for the winning team. When you do something for money, it feels a lot like work, and I am not at a hackathon to work, I am there to have a good time.

In the end, it was a great time, and I will most likely be there if it is repeated, but if those two small things got fixed, then it would be most awesome.

Environment and Feature Detection

A warning: most of these extensions may currently have extreme security issues; they are prototypes after all. Use a separate browser instance and profile for this series.

If you already have a working install of WebCL and River Trail, you can skip this part. Do not assume that you have them just because you have the latest version of your browser, because they are not available yet in any browser without a special build or extension.

Preparation

The first step is to install OpenCL drivers for all your devices.

If you are running OS X (Snow Leopard or Lion) then all OpenCL drivers are already installed and you’re good to go.

If you are running Linux or Windows, then you might need to install some OpenCL drivers. If your CPU is supported by both the AMD and Intel SDKs, then I recommend installing both.

If you have an nVidia or AMD graphics card, you probably already have OpenCL drivers installed for them (they are included in the graphics drivers), but you should make sure they are the latest version.

Now we’re on to a few semi-optional tools that you probably want, but can do without if you prefer,

Git is my version control system of choice, and you will probably want to check out the example repositories on Github.

Coffeescript is a thin wrapper around Javascript that I happen to like; it has a somewhat more Pythonesque syntax and is really nice. A lot of the support libraries, and a few of the examples, will be written in Coffeescript, and you might want to be able to recompile them.

Installing WebCL

Now we can start installing the WebCL prototypes.

If you are running Linux or Windows, then you need to first install a 32-bit version of Firefox 10 and make sure that you have installed Firebug into your new profile, then you can install the Nokia WebCL prototype.

If you are running OS X, then you need to install the Samsung WebCL prototype based on WebKit. It is a bit complicated since you need to compile it from scratch.

Just follow the included readme. A while into the build you might hit some compilation errors, but they are easily fixable.

Installing River Trail

To allow River Trail code to be accelerated via OpenCL you need to install the River Trail extension.

On Windows or Linux it really does need the Intel OpenCL driver; the AMD OpenCL driver or the driver for your graphics card is not enough. But if your computer does not support the Intel OpenCL driver, you can still execute River Trail code using a normal Javascript engine.

On OS X, the built in OpenCL drivers are fine.

Detecting WebCL and River Trail

So, after installing all that, we need a simple way to check that it is working. The easiest way is to check out https://github.com/JensNockert/tools-for-the-next-generation with git (or download an archive of the repository from Github).

Under “01 - Feature Detection” there are two html files, webcl.html and rivertrail.html that contain feature detection code. Try both to make sure that your setup works. If everything installed correctly, it will look something like this for River Trail,

And something like this for WebCL,

The code is not really that spectacular, but feel free to check out the source and see my horrible DOM manipulation code. (Hook me up with a pull request if you enjoy that kind of stuff)
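If you just want a feel for what such a check involves, it can be sketched roughly as below. This is not the code from the repository, and the global names it probes (`WebCL`, `WebCLComputeContext` and `ParallelArray`) are assumptions about what the prototypes expose, so they may differ between versions.

```javascript
// Sketch only: the property names below are assumptions about what the
// prototypes expose, not something taken from the repository.
function detectFeatures(globalObject) {
  const has = function (name) {
    return typeof globalObject[name] !== "undefined";
  };

  return {
    // Assumed entry points of the Nokia and Samsung WebCL prototypes.
    webcl: has("WebCL") || has("WebCLComputeContext"),
    // Assumed constructor exposed by the River Trail extension.
    riverTrail: has("ParallelArray")
  };
}

// In a browser you would pass `window`; any object works for a dry run.
const report = detectFeatures(typeof window !== "undefined" ? window : {});
console.log("WebCL: " + report.webcl + ", River Trail: " + report.riverTrail);
```

Passing the global object in as a parameter, instead of reading `window` directly, makes it easy to try the function outside a browser.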

About OpenCL

Make sure you have webcl.html open in a browser, and make a small note of the structure of the information.

The first level in the output (“Apple” in my screenshot) is the OpenCL platform name, and underneath it are all OpenCL devices belonging to that platform. Note that a single piece of hardware can show up as a device under multiple platforms.
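The platform and device hierarchy can be pictured as a small tree. The sketch below uses made-up platform and device names, and the helper `platformsByDevice` is hypothetical; it only illustrates how one piece of hardware can end up listed under more than one platform.

```javascript
// Hypothetical snapshot of what a detection page might print; the platform
// and device names are made up for illustration.
const platforms = [
  { name: "Platform A", devices: ["CPU 0", "GPU 0"] },
  { name: "Platform B", devices: ["CPU 0"] }
];

// Group platform names by device, to show that one piece of hardware
// ("CPU 0" here) can be listed under more than one platform.
function platformsByDevice(platformList) {
  const result = {};
  for (const platform of platformList) {
    for (const device of platform.devices) {
      if (!result[device]) {
        result[device] = [];
      }
      result[device].push(platform.name);
    }
  }
  return result;
}

console.log(platformsByDevice(platforms));
```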

In OpenCL there are two domains where code can execute: on the host (in WebCL this is the browser) or on a device connected to the host. The code we run on the host we call the ‘Application’, and the code we run on devices we call ‘Kernels’.

And as we will learn in future lessons, calling kernels is different from how we call normal functions from the host. Another important thing to note about kernels is that they are not written in Javascript but a high-performance variant of C.
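To make the split between application and kernel a bit more concrete, here is a rough sketch of a kernel kept as a string on the Javascript side. The kernel name `square` and its arguments are made up for illustration, and nothing here talks to an actual WebCL implementation.

```javascript
// Illustrative only: an OpenCL C kernel kept as a string on the host side.
// The kernel name `square` and its arguments are made up for this sketch.
const kernelSource = [
  "__kernel void square(__global const float* input,",
  "                     __global float* output) {",
  "  size_t i = get_global_id(0);",
  "  output[i] = input[i] * input[i];",
  "}"
].join("\n");

// The host never calls square() directly; it would hand this source to the
// OpenCL compiler and then enqueue the kernel over a range of indices.
console.log(kernelSource);
```

Note that the kernel has no loop: each invocation handles the single element selected by `get_global_id(0)`, and the device runs many invocations side by side.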

Summary

To summarize what you should install,

  • Browser capable of WebCL
  • Browser capable of accelerated River Trail

and make sure they work. The rest is mainly sugar that could help you reach that goal.

Notes

I will be using Firefox most of the time, but the example code that does not depend on a specific feature should be portable to most major platforms (Firefox, Chrome, Safari and Opera.)

Any specific feature dependencies will be noted in the corresponding article (and please point it out in the comments if it is not.)

Edits

  1. Updated for Firefox 10

Presenting Hydra

I just want to present Hydra, a small library I developed to enable applications to have a unified interface to the WebCL prototypes, even before the specification is ready.

I hope that within a year or so it will be useless, but for now it is pretty nifty, and it allows me to use the same example code for Windows, Linux and OS X in my “Tools for the next generation of Web Applications” series.

In addition to providing a unified interface, it also fixes some small bugs in the different prototypes.

It is available on Github at https://github.com/JensNockert/hydra under the Simplified BSD license, so you’re essentially allowed to use it for whatever you want.

If you find any bugs or have a feature request, add an issue on Github or send me a tweet (@jensnockert).

Tools for the next generation of Web Applications: Introduction

I do not know how the web will evolve in the future, and I don’t think that anybody knows what the web of 2020 will look like, or what applications will be popular then.

But regardless of the direction the web evolves, we will undoubtedly see more and more complex client-side applications being developed. And a lot of the applications that were traditionally native applications will probably migrate to the browser within this time frame.

The migrating applications might include everything from games to large simulations to image and video editing, and everything in between. Your imagination is hopefully the only limit to what you will be able to achieve.

Betting on the web is one of the safest bets you can make: it is simply the platform that is most accessible to people today, and the platform people care about the most.

The goal of this series of articles is to give you some insight into some techniques, frameworks and tools that might be useful to build this new generation of applications, or to allow you to improve your current applications.

The tools that I am most interested in are tools that enable a new class of applications that we earlier could not build in the browser without plugins, and we’ll primarily focus on the subset of these that are almost purely about increasing performance.

For example,

  • Faster Javascript engines
  • Typed Arrays
  • SIMD Intrinsics
  • Workers
  • River Trail
  • WebCL
  • WebGL (for computing, not graphics)

But to understand these new tools of the web, we need to understand the native libraries and features that power them.

The faster Javascript engines of the future, typed arrays and any SIMD intrinsics are designed to accelerate each thread of your applications. They do so by allowing us to utilize each processing core in a better way, and to program ‘closer to the metal’.

River Trail and WebCL utilize OpenCL to allow a piece of code to run on multiple processor cores, and in the case of WebCL, to run on graphics cards and even specialist OpenCL accelerators right from your browser.

If you haven’t heard of OpenCL, it is a framework for heterogeneous computing designed by Khronos (who also maintain the OpenGL standard). It allows you to execute kernels written in a high-performance variant of C on just about any processor around, and it supports everything from large clusters down to small embedded systems.

Single-threaded

Currently there are two engines that I enjoy coding for: the new Spidermonkey with type inference introduced in Firefox 9, which is really nice, and the V8 / Crankshaft engine used in Chrome. But the Javascript engines of the future have a lot more in store for us, and all of them are already picking up steam.

For example, Mozilla is currently working on Ionmonkey. Ionmonkey is a new whole-method JIT for Spidermonkey that hopefully brings some significant speedup for many types of code (and especially the type of code that we are interested in). It isn’t ready yet, but we can already see some benchmarks here and follow how it develops.

Internet Explorer 9 introduced the new Chakra engine, which has some interesting features that will probably migrate to other engines. For example, it compiles code on a separate thread, allowing it to load code faster and start executing it quicker. And I am convinced that Internet Explorer 10 will introduce features that will allow Internet Explorer to defend its position as the most widely used browser.

One of these features that will be included in the next version of Internet Explorer (but is already supported in all other major browsers) is one of the most significant API developments in high-performance Javascript during the last few years: typed arrays. Typed arrays behave in most respects like regular Javascript arrays, but they have a fixed type and length. This on one hand gives Javascript programmers a nice way to interact with binary data and on the other hand gives the Javascript engines a lot more opportunities for optimization.
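A small sketch of what this looks like in practice, using only the standard `Float32Array`, `Uint8Array` and `ArrayBuffer` types (the variable names are of course just for illustration):

```javascript
// A typed array has a fixed element type and a fixed length.
const samples = new Float32Array(4);
samples[0] = 1.5;
samples[1] = 0.1; // stored as the nearest 32-bit float, not exactly 0.1

// Values are converted to the element type on write.
const small = new Uint8Array(1);
small[0] = 257; // wraps around modulo 256

// Several views can share one ArrayBuffer, which is what makes typed
// arrays handy for working with binary data.
const bytes = new Uint8Array(samples.buffer);

console.log(samples.length); // 4
console.log(small[0]);       // 1
console.log(bytes.length);   // 16 (4 floats, 4 bytes each)
```

Because the engine knows that every element of `samples` is a 32-bit float, it does not have to check the type of each element at run time, which is where much of the optimization opportunity comes from.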

The Google Chrome team also introduced NaCl (Native Client) last year, which is a reasonably interesting proposition from a performance standpoint, since it allows you to replace some of your Javascript with native code. It seems like you should be able to implement an OpenCL to NaCl compiler, which could be very interesting. Unfortunately, since it ships binaries instead of source code, it is very hard to inspect what is running, unlike with Javascript.

The two other browser vendors, Webkit (Apple and friends) and Opera, recently shipped new browser engines that support all the engine-level features that we currently expect, and they are very likely to stay competitive in the future.

But there are a lot of other features that are in the planning stage. A pet feature of mine, for example, is SIMD intrinsics. SIMD (Single-Instruction Multiple-Data) is a method for improving computational throughput in modern processors by performing the same operation in parallel on multiple pieces of data.

These SIMD intrinsics are very simple functions that essentially map down to a few simple SIMD assembly instructions. The Javascript engine would be aware of how these functions work, and generate special optimizations for them.

This is mainly an optimization, but it would also allow us to write more easily readable code when manipulating ‘strange types’ in Javascript, for example, long (64-bit) integers.

While there are currently not even any proposals of how these SIMD intrinsics should behave, there is still a high probability that we will see something along those lines in a future revision of the Javascript language.
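Since no proposal exists, any example has to be invented. The sketch below shows what such an intrinsic might look like from the programmer's side; the name `add4` and its signature are purely hypothetical, and it is written as ordinary scalar Javascript.

```javascript
// Hypothetical intrinsic: adds four 32-bit floats lane by lane. Written here
// as plain scalar code; the name and signature are invented for this sketch.
function add4(a, b, out) {
  out[0] = a[0] + b[0];
  out[1] = a[1] + b[1];
  out[2] = a[2] + b[2];
  out[3] = a[3] + b[3];
  return out;
}

const x = new Float32Array([1, 2, 3, 4]);
const y = new Float32Array([10, 20, 30, 40]);
const sum = add4(x, y, new Float32Array(4));

console.log(Array.from(sum)); // [11, 22, 33, 44]
```

An engine that recognized a function like this could, in principle, replace the four scalar additions with a single SIMD add instruction.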

This leads us to more complex parallelization features that introduce more than one thread of execution.

Multi-threaded

Workers are currently the only way of executing Javascript in parallel that is widely supported in current browsers, but they are not really designed for the task. They are designed to allow for background tasks, but are not really suitable for computation on their own.

But make sure that you do not forget about them, because they are a good fallback and can be a force multiplier when combined with more advanced features.
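One common way to use workers for computation is to cut the input into contiguous chunks and hand one chunk to each worker. The partitioning step can be sketched as below; `splitRange` is a hypothetical helper and not part of the Worker API, and the worker script name in the comment is made up.

```javascript
// Split the index range [0, length) into at most `parts` contiguous chunks,
// one per worker. Hypothetical helper, not part of any Worker API.
function splitRange(length, parts) {
  const chunks = [];
  const size = Math.ceil(length / parts);
  for (let start = 0; start < length; start += size) {
    chunks.push({ start: start, end: Math.min(start + size, length) });
  }
  return chunks;
}

// In a browser, each chunk would then be sent to its own worker, e.g.
//   const worker = new Worker("sum-worker.js"); // hypothetical script
//   worker.postMessage(chunk);
// and the partial results combined in the worker's onmessage handler.
console.log(splitRange(10, 4));
```

The message passing between the page and each worker is exactly the overhead that makes workers a poor fit for very small tasks, so chunks should be large enough to be worth the round trip.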

River Trail utilizes OpenCL to execute Javascript kernels on a multi-core CPU using a friendly API. I am quite convinced that it will be a popular choice in the future.
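The programming style is easiest to show with a small sketch. Plain arrays are used here as a sequential stand-in, on the assumption that River Trail's `ParallelArray` offers similarly named `map` and `reduce` operations; the functions `square` and `add` are just examples.

```javascript
// Sketch of the programming style: small, side-effect-free functions applied
// to every element. Plain arrays are used here as a sequential stand-in; the
// assumption is that River Trail's ParallelArray offers similarly named
// map and reduce operations that it can run across CPU cores.
function square(value) {
  return value * value;
}

function add(a, b) {
  return a + b;
}

const input = [1, 2, 3, 4];
const sumOfSquares = input.map(square).reduce(add);

console.log(sumOfSquares); // 30
```

Because `square` and `add` do not touch any shared state, the individual calls can run in any order, which is what lets an implementation spread them over several cores.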

The most compelling feature of River Trail is that it is tightly integrated with the browser, and therefore allows for a lot more optimization than WebCL (or OpenCL) allows. Don’t be surprised if a future River Trail implementation outpaces WebCL significantly on short kernels where OpenCL imposes too high a communication overhead.

Another interesting thing is that River Trail can be combined with a lot of other performance increasing features in Javascript, for example SIMD intrinsics, which (like the SIMD features in OpenCL) could significantly increase performance and readability for certain kernels.

WebCL is essentially the big brother of River Trail: it exposes the full OpenCL API to the web programmer and allows you to use unmodified OpenCL kernels in your application. It is the more flexible of the two, and is designed to allow you to use any OpenCL accelerator in the system, including graphics processors.

WebCL is also the API that we will be using the most throughout this series, mainly since the compilers are mature, and it is also the language and framework that I am most familiar with.

But River Trail has some interesting opportunities that we won’t see in WebCL, since it could be more tightly integrated into the browser at a future point. For example, an implementation of River Trail could significantly reduce the communication overhead required to run kernels on the CPU, which is currently quite significant in OpenCL.

WebCL currently has the advantage that the infrastructure is a bit more mature on the kernel side; on the Javascript side both technologies are noticeably not ready.

WebGL on the other hand has the advantage that it is reasonably mature, and allows execution on just about every graphics card available.

But I generally wouldn’t recommend using it for computation unless you have very specific requirements, since even the simplest tasks can easily turn extremely complex unless you’re very good at GLSL and WebGL. It is simply not designed for computation, only graphics.

Conclusion

There are many tools and frameworks already available in pre-release form for us to play with, and the best way to get used to them is to actually use them. Just be aware of their changing and non-final nature, and use this time to your advantage: most developers won’t start using these tools until they are almost ready, and by then it is too late to influence their growth.

In addition to tools that ‘merely’ grant us faster performance, we have a lot of tools that simply allow us to do a lot of things that we could not do before, but those are interesting enough to get their own introductions when we meet them later in the series.

The next episode will be a shorter one, and contain instructions on how to set up our development environment on Windows or Linux. For example, installing different OpenCL drivers, WebCL plugins and River Trail.

Notes

There are currently at least three different implementations of WebCL,