Encoding multimedia into text!

Yesterday, I got to see a nice photograph of the (controversial) ball lightning. I immediately liked the photograph and felt like I should download it. Well, it’s not directly downloadable, especially when the author has asked Flickr to disable the download option for his/her photo.

On Flickr, in order to protect the image content, they put a wrapper around the image which shields it from your easy drag-to-desktop action. Right-click options won’t recognize the image because you’re not on the image! You can think of it like this: whenever you try to “interact” with the image, you’re always a few layers above the image itself, so your browser can’t detect it (in this way)!

But there’s always a workaround. Before that, let me tell you something. There’s simply no difference between browsing & downloading on the internet (technically). Whenever you view something (say, this photograph), you’re already seeing a downloaded copy of the file. Your browser does this for you, and that’s what its cache is for. If you can see something, then you can download it. No one can stop you from downloading it because, in order to view it, you have to download it!

Okay, now back to the topic…

The workaround is to search for the image in the page’s HTML tree. It’s time-consuming, yeah. But luckily, Firefox has this wonderful “Inspector” which can point out “what corresponds to what exactly” in a webpage. On hovering over the HTML source code, you can see Firefox highlighting what the chosen tag represents in the actual page. For instance, here’s how I got the photograph’s link… (Kid’s stuff)

You can see that the image is about 6 layers beneath the surface!

So, that’s how you get a file. While I was digging through all that stuff, I noticed something strange. In place of one of the image sources, I found something that went like

 ...

It was really long, as though it had about 100k characters or something. When I pasted that 100 kB “horror” into Firefox’s address bar, I got the image of that ball lightning. This really bugged me, but not for long! Soon, it reminded me of hex editors – the only tools that can go to the microscale, the scale of binary digits and probably the scale of information stored in a computer![1]

Hex editors display the contents of a file in hexadecimal format (digits 0, 1, 2, …, F), two digits per byte. Unlike other high-level applications (say, notepads, which are restricted to very few kinds of files), they allow the user to edit the raw bytes of a file, which means they can literally edit everything, regardless of whether it’s a text, an image, a video or an executable file!

But it’s up to the user to know exactly what they’re doing…
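To get a feel for what a hex editor shows, here’s a minimal sketch that prints the bytes of a short string as hex pairs. The helper name hexDump is made up for this example, and it assumes the text is stored as UTF-8.

```javascript
// Minimal hex-dump sketch: show every byte as two hexadecimal digits.
// hexDump is a hypothetical helper name, not part of any library.
function hexDump(bytes) {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, '0').toUpperCase()).join(' ');
}

// TextEncoder turns text into its UTF-8 bytes (available in browsers and Node).
const sample = new TextEncoder().encode('Hi!');
console.log(hexDump(sample)); // 48 69 21
```

The same function works on any byte array, which is the whole point: text, images and executables all look alike at this level.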

[1]: Now, I specifically mentioned “microscale” because that’s where you’ll find the magnetic domains (on a hard disk, at least) whose magnetization directions represent the “ones & zeros” of binary!

So, what’s going on?

A few minutes of googling turned up the “data URI scheme”, which has the syntax

data:[<mime type>][;charset=<charset>][;base64],<encoded data>
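To make the syntax concrete, here’s a rough sketch that splits a data URI into its parts. The function name parseDataUri and the sample URI are made up for illustration, and the sketch assumes a simple, well-formed input (it doesn’t handle every corner of the scheme).

```javascript
// Sketch of a data-URI splitter; parseDataUri is a hypothetical helper.
function parseDataUri(uri) {
  if (!uri.startsWith('data:')) throw new Error('not a data URI');
  const comma = uri.indexOf(',');
  if (comma === -1) throw new Error('missing comma');
  // Everything between "data:" and the first comma is the header.
  const params = uri.slice(5, comma).split(';');
  const isBase64 = params[params.length - 1] === 'base64';
  if (isBase64) params.pop();
  // The first parameter is the MIME type; fall back to text/plain when it is omitted.
  const mimeType = params[0] || 'text/plain';
  const charsetParam = params.find((p) => p.startsWith('charset='));
  return {
    mimeType,
    charset: charsetParam ? charsetParam.slice('charset='.length) : null,
    isBase64,
    data: uri.slice(comma + 1),
  };
}

// Sample URI: the base-64 text is the encoding of "<b>Hi</b>".
console.log(parseDataUri('data:text/html;charset=utf-8;base64,PGI+SGk8L2I+'));
```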

So, this data URI scheme is a way of embedding a file (text, images, graphics and so on, each identified by its MIME type) directly in a webpage. For binary files, the data is usually encoded using base-64. It’s just what you do for base conversion in a computer. The file’s bytes are written out in binary and the 8-bit blocks are joined together. Then, the bit stream is split into equal 6-bit blocks, each block is read as a decimal index (0 to 63), and each index is turned into a base-64 character using a look-up table.

This look-up table has 64 characters. Index 0 represents ‘A’, then 1-B, 2-C and so on. Then come the lowercase letters and finally the digits. All of this accounts for 62 characters. The last two indices, 62 & 63, use ‘+’ and ‘/’ just to make it up to 64. Because, it has to be 64! A byte is base-256, so 256 = 2^8 (256 values in 8-bit blocks), whereas base-64 is 64 = 2^6 (64 characters in 6-bit blocks). If the data doesn’t fill the last group of 6-bit blocks, the output is padded with ‘=’.
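The look-up-table idea can be sketched in a few lines. This is only an illustration of the steps described above (toBase64 and TABLE are names made up here), not how any particular browser implements it.

```javascript
// The 64-character look-up table: A-Z, a-z, 0-9, then '+' and '/'.
const TABLE = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Sketch of a base-64 encoder; toBase64 is a hypothetical helper name.
function toBase64(bytes) {
  let out = '';
  for (let i = 0; i < bytes.length; i += 3) {
    // Join up to three 8-bit bytes into one 24-bit number (missing bytes count as 0).
    const b0 = bytes[i];
    const b1 = i + 1 < bytes.length ? bytes[i + 1] : 0;
    const b2 = i + 2 < bytes.length ? bytes[i + 2] : 0;
    const chunk = (b0 << 16) | (b1 << 8) | b2;
    // Split the 24 bits into four 6-bit indices and look each one up.
    out += TABLE[(chunk >> 18) & 63];
    out += TABLE[(chunk >> 12) & 63];
    // '=' pads the output when the last group has fewer than three bytes.
    out += i + 1 < bytes.length ? TABLE[(chunk >> 6) & 63] : '=';
    out += i + 2 < bytes.length ? TABLE[chunk & 63] : '=';
  }
  return out;
}

console.log(toBase64(new TextEncoder().encode('Man'))); // TWFu
```

Three input bytes always come out as four characters, which is where the size increase comes from.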

This encoding & decoding is done when files are going to be transmitted between two nodes (say, server & user). If the server sends you the encoded data, then your browser at the receiving end is clever enough to decode it (how long it takes depends on your bandwidth and the size of the data). It should be noted that not all browsers support this scheme. As far as I know, Firefox, Chrome, Opera & Safari support it.
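The decoding at the receiving end is the same table used in reverse. Here’s a minimal sketch, assuming clean input with no whitespace; fromBase64 is a made-up name and this is not a claim about how browsers actually do it internally.

```javascript
const TABLE = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Sketch of a base-64 decoder; fromBase64 is a hypothetical helper name.
function fromBase64(text) {
  const clean = text.replace(/=+$/, ''); // drop the '=' padding
  const bytes = [];
  let buffer = 0; // bits collected so far
  let bits = 0;   // how many bits are waiting in the buffer
  for (const ch of clean) {
    const index = TABLE.indexOf(ch);
    if (index === -1) throw new Error('invalid base-64 character: ' + ch);
    // Append the 6-bit index, then emit a byte whenever 8 bits are available.
    buffer = (buffer << 6) | index;
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      bytes.push((buffer >> bits) & 255);
      buffer &= (1 << bits) - 1; // keep only the leftover bits
    }
  }
  return new Uint8Array(bytes);
}

console.log(new TextDecoder().decode(fromBase64('TWFu'))); // Man
```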

For example, try copying this code into your browser’s address bar… (double-click to select)


I’ve done nothing but encode an HTML file using base-64. Like I said, your browser does the decoding job. By the way, not all websites really use this scheme. But most websites seem to have either implemented it already, or they’re going to implement it soon. It’s fair to say that this thing is still underutilized.
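If you’d like to build such a string yourself, here’s a minimal sketch. The HTML below is a made-up page for illustration (not the 114-byte file mentioned in this post), and the sketch assumes the text contains only Latin-1 characters, since that’s all btoa accepts.

```javascript
// Hypothetical page used only for this example.
const html = '<h1>Hello from a data URI!</h1>';

// btoa exists in browsers and recent Node versions; Buffer is a Node-only fallback.
function encodeLatin1(text) {
  return typeof btoa === 'function'
    ? btoa(text)
    : Buffer.from(text, 'latin1').toString('base64');
}

// Header first, then the comma, then the encoded data.
const uri = 'data:text/html;base64,' + encodeLatin1(html);
console.log(uri);
```

Pasting the printed string into the address bar should render the heading in browsers that allow data URIs there.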

How is this even related to hex? It’s just a matter of the base used. Any base can be used to represent information. What I meant back there was that, just like hex (or binary), which can represent literally everything in a computer, this scheme can represent files on the internet!

Why do we need this?

Simply, to improve performance at the cost of memory!

Whenever you load a webpage, the HTML (the basic skeleton of the webpage) loads first. Then, your browser sends a request to the server whenever & wherever it encounters an external source (an image, a stylesheet and so on). Imagine the load on the server when this happens on a global scale. Requests coming from everywhere! This increases the traffic, and thereby the server’s response time, and hence slows down the loading of your webpage!

There are other methods to solve this problem, like packing the files into a single file so that a single request is sent for every page regardless of the number of files. But that too can be horrible on a global scale! The data URI scheme gets around this problem.

It’s completely text, and it’s bound to the page itself! So, this text data is loaded along with the page! No further requests are needed because the decoding is done by your browser and the data is displayed then & there. A familiar example would help here…

Imagine copying/moving a folder full of ten thousand 1 kB files from one drive to another. And imagine the same situation again, but this time, the files are packed into a single zip file (10 MB). The process is much faster in the latter case.

That’s because in the former situation, each file needs its own request to the drive, and that request ends once the file is copied/moved. Another request is then sent for the next file. A lot of requests & responses! In the latter case, one request is enough, because it’s a single file. Simply put, there’s a big difference between pushing something continuously and pushing it discretely.
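A back-of-the-envelope model shows why the request count matters so much. Every number here (the per-request overhead and the transfer speed) is an assumption picked for illustration, not a measurement, and transferTime is a made-up helper.

```javascript
// Rough model: total time = fixed cost per request + time to move the bytes.
// All figures are assumptions for illustration only.
function transferTime(requests, totalBytes, overheadSec, bytesPerSec) {
  return requests * overheadSec + totalBytes / bytesPerSec;
}

const totalBytes = 10000 * 1024;      // ten thousand 1 kB files
const overheadSec = 0.005;            // assumed fixed cost of one request
const bytesPerSec = 50 * 1024 * 1024; // assumed raw transfer speed

const manyFiles = transferTime(10000, totalBytes, overheadSec, bytesPerSec);
const oneFile = transferTime(1, totalBytes, overheadSec, bytesPerSec);
console.log('many small files:', manyFiles.toFixed(2), 's');
console.log('one big file:    ', oneFile.toFixed(2), 's');
```

The bytes moved are identical in both cases; only the per-request overhead is paid ten thousand times instead of once.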

Does this mean we can convert everything to texts?

Yeah, but at some cost…

It’s a common notion that text takes up less space than other formats. If that’s what you think, then it’s because you haven’t handled big data so far! The original size of that HTML above is 114 bytes whereas the encoded thing is about 170 bytes. The size of the data increases by about 33% in this 8-bit to 6-bit conversion, because every 3 bytes of input become 4 characters of output. So, improving performance at the cost of memory!
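The arithmetic is easy to check. In this sketch, base64Length is a made-up helper, and the header string is an assumption about what was in front of the encoded data (a short file pays for it too, which would explain a total near 170).

```javascript
// Base-64 output length: every 3 input bytes become 4 characters (padding included).
function base64Length(byteCount) {
  return 4 * Math.ceil(byteCount / 3);
}

const header = 'data:text/html;base64,'; // assumed header, 22 characters
const original = 114;                    // size quoted in the post
const encoded = base64Length(original);

console.log(encoded);                 // 152
console.log(header.length + encoded); // 174
console.log(((encoded / original - 1) * 100).toFixed(1) + '%'); // 33.3%
```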

Well, everything has its ups & downs! We all go through it…

Still, plenty of sites go for this scheme, because it cuts out those extra requests! And there’s always some solution to a problem (at least a workaround). This 33% can be reduced to 2-3% if gzip compression is used. And that’s what they do for large files.
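You can try the gzip effect yourself. This sketch assumes Node.js (it uses Node’s built-in zlib module, which isn’t available in a browser page), and it uses made-up pseudo-random bytes as a stand-in for an already-compressed image. The exact numbers will depend on the data.

```javascript
const zlib = require('zlib'); // Node.js standard library (assumption: run under Node)

// Deterministic pseudo-random bytes stand in for an already-compressed image.
function pseudoRandomBytes(n) {
  const out = Buffer.alloc(n);
  let state = 12345;
  for (let i = 0; i < n; i++) {
    // Simple linear congruential generator, good enough for a demo.
    state = (Math.imul(state, 1103515245) + 12345) >>> 0;
    out[i] = (state >>> 16) & 255;
  }
  return out;
}

const raw = pseudoRandomBytes(30000);
const base64Text = raw.toString('base64');
const gzipped = zlib.gzipSync(Buffer.from(base64Text, 'ascii'));

console.log('raw bytes:       ', raw.length);
console.log('base-64 chars:   ', base64Text.length);
console.log('gzipped base-64: ', gzipped.length);
```

Gzip wins back most of the overhead because each base-64 character only carries 6 bits of information in an 8-bit slot.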

When I read about this topic, I thought of creating a base-64 encoder in JavaScript for demonstration. Well, it took less time than I had first thought. At first, I planned to write the algorithm for this conversion myself (should be easy, just a look-up table). But I soon figured out that JavaScript already offers a reader that encodes data whenever it’s given a file. With MDN docs & Stack Overflow, I was able to finish that base-64 encoder in about 3 hours… JavaScript – it’s simply amazing!

Anyways, you can play with it now. But remember that it’s JavaScript, so it runs entirely in your browser! Throw in large files and you’ll end up crashing your browser. Try dragging a ~500 kB file into the drop space. Copy the encoded data and paste it into your notepad. See how long it takes for the notepad to get the text from the clipboard… Like I said, it’s BIG DATA!
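The built-in reader is presumably the browser’s FileReader, whose readAsDataURL method hands back a complete data URI. Here’s a minimal browser-only sketch; fileToDataUri is a made-up name, and this isn’t necessarily how the encoder in this post is written.

```javascript
// Browser-only sketch: FileReader is a Web API and is not available in plain Node.
// fileToDataUri is a hypothetical helper name.
function fileToDataUri(file) {
  return new Promise((resolve, reject) => {
    const reader = new FileReader();
    // Once reading finishes, reader.result holds a "data:...;base64,..." string.
    reader.onload = () => resolve(reader.result);
    reader.onerror = () => reject(reader.error);
    reader.readAsDataURL(file);
  });
}
```

In a drop handler you could call it as fileToDataUri(event.dataTransfer.files[0]).then(console.log), assuming the drop event carries at least one file.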



One thought on “Encoding multimedia into text!”

  1. Aishwarya Sankar January 15, 2015 at 1:57 PM Reply

    Good One 🙂 🙂
