Flow chart of how to choose what media storage format is best for you.

The Real Movie Magic (is Data Storage)

Optical, solid-state, and magnetic, oh my! It’s easy to take for granted the technology that brings us our pictures, and it’s supposed to be that way. The home entertainment industry has, over decades, engineered the act of watching to be as effortless and engaging as possible. Gone are the days of carefully positioning your TV antenna and connecting to the internet through your phone line. But beneath the shiny screens and recommendation algorithms lies a thankless foundation: good old, cold, hard data storage.

Why care about data storage? Especially old, deprecated mediums that no one uses anymore?

No matter how mundane and obsolescent they seem to us now, behind every historical innovation in media storage was an unsuspecting team of nerdy engineers that inched forth the vanguard of modern technology. Just as today’s scientists at Microsoft are (allegedly) crafting Majorana 1 to harness the power of quantum computing, their R&D forefathers toiled away in labs to develop the best, most affordable consumer electronics. What’s now sitting dormant in library archives and dusty home consoles used to be a hot commodity, an innovation, a pump of a piston in an economic engine that makes and breaks stock portfolios.

A collage of New York Times headlines from the 1970s regarding magnetic tape.
1970s NYT article headlines from its Archives. Sign up for free newspaper accounts through the library!

Although the VHS and DVD have been usurped by “superior” mediums, their engineering is still a marvel. Just as we study the materials science of millennia-old Roman concrete and admire the artistry of prehistoric artifacts made by cultures we know nothing about, to understand the technology of yore is to connect with the history of human innovation.

So, all of this vapid philosophizing begs the question: how does an iridescent plastic disc contain 90 minutes of images and audio? What’s so special about the black tape in a VHS? What is an .mp4 file, really, and how does Netflix “stream” movies?

The DVD

The Digital Versatile Disc is a type of optical disc, a category also including CDs and Blu-ray discs. Optical discs all encode their data in basically the same way, and are similar to their analog forefather, the vinyl record. Vinyl records “store” audio in a spiral groove which has in it microscopic deviations that represent sound waves. When a needle is placed into the groove and the record spins at a constant speed, the needle’s vibrations are amplified into sound.

Rather than storing the entire sound wave, optical discs encode audio as a binary signal, or ones and zeroes. This reduces the precision required of the physical medium and allows the storage of virtually any type of digital media. Those ones and zeroes take form as tiny pits and lands (low spots and high spots, like inverted braille) that spiral around the disc and are read by a laser.
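To make "turning a sound wave into ones and zeroes" a little more concrete, here is a minimal sketch in Python. It assumes a toy 4-bit depth and an 8-sample sine wave, both chosen only for illustration; the function names are hypothetical, and real discs use far more samples and bits, plus error correction that is left out here.

```python
import math

def quantize(samples, bits):
    """Map samples in [-1.0, 1.0] to unsigned integer codes of the given bit depth."""
    levels = 2 ** bits
    codes = []
    for s in samples:
        clamped = max(-1.0, min(1.0, s))
        # Scale [-1, 1] onto [0, levels - 1] and round to the nearest level.
        codes.append(round((clamped + 1.0) / 2.0 * (levels - 1)))
    return codes

def to_bitstring(codes, bits):
    """Write each code as a fixed-width binary number and join them together."""
    return "".join(format(c, f"0{bits}b") for c in codes)

# One cycle of a sine wave, sampled at 8 evenly spaced points (illustrative values).
wave = [math.sin(2 * math.pi * n / 8) for n in range(8)]
codes = quantize(wave, bits=4)
print(codes)
print(to_bitstring(codes, bits=4))
```

The second line of output is the kind of bit sequence that ends up on the disc. Raising `bits` makes each sample more precise at the cost of more ones and zeroes to store.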

As the disc spins, the laser reflects off of the pits and lands towards a prism, which redirects the beam to a sensor. The depth of the pits is about a quarter of the laser’s wavelength, so light reflected from a pit travels roughly half a wavelength farther than light reflected from the surrounding land and partially cancels it. When the beam moves between a pit and a land, the sensor therefore detects a change in the intensity of the reflected light. The disc reader interprets a change between pits and lands as a one, and no change of light within a certain time increment as a zero.
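The "change means one, no change means zero" rule can be sketched in a few lines of Python. This is a simplified illustration, not a drive's actual firmware: the track string is made up, the function name is hypothetical, and the modulation and error-correction layers that real discs add on top are omitted.

```python
def decode_transitions(surface):
    """Turn a per-clock-tick surface reading into bits.

    surface: string of 'P' (pit) and 'L' (land), one character per clock tick.
    A change from the previous tick reads as 1; no change reads as 0.
    """
    bits = []
    for previous, current in zip(surface, surface[1:]):
        bits.append(1 if current != previous else 0)
    return bits

# A made-up stretch of track: three ticks of land, four of pit, and so on.
track = "LLLPPPPLLPPPLLLL"
print(decode_transitions(track))
```

Notice that the bits come from the edges of the pits rather than the pits themselves, so a long pit and a long land both read as a run of zeroes.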

DVDs have a reading speed of about 11 Mb/s, meaning that eleven million ones and zeroes are read from the disc every second. With a max capacity of 4.7 GB on a standard single-layer disc, DVDs can theoretically have up to 18.5 billion pits on them.

Numerous advancements have increased the max capacity of optical discs. Blu-ray systems use a shorter wavelength of light to read discs, allowing for smaller and denser pits and lands. Data can also be dual-layered, meaning that two sets of pits and lands are stacked on top of each other. To read the disc, the laser slightly shifts its focal point to the depth of the target layer.

While two layers provide almost all the data fidelity an average moviegoer wants, recent experiments developing high-capacity optical discs have stacked hundreds of layers atop one another to reach a storage capacity of 1.6 petabytes in a single disc. Just one could store AU Library’s entire DVD collection (numbering roughly 17,000) with room to spare.

The VHS

The Video Home System uses magnetic tape to store data. Introduced in the mid-1970s, the VHS was one of many magnetic storage formats, preceded by popular consumer products like Betamax and a swath of audio-only cassette tapes. VHS is an analog medium, meaning that rather than encoding data in binary, it stores waves with varying frequencies and amplitudes that represent audio and video by making use of standards previously developed to transmit television signals.

In North America, NTSC was the most commonly used broadcasting system in the latter half of the 20th century, and it’s what most American VHS tapes are encoded in. Broadcasting systems like NTSC and its overseas competitor PAL can be thought of as different languages. ¿Dónde está la biblioteca? and where is the library? convey the same meaning, but their languages use different syntax and protocols to represent that meaning.

Almost any screen you look at today uses pixels to digitally display images. When using analog signals, it’s easier to represent images with scan lines. Each scan line can be thought of as an entire row of pixels which varies in color and luminosity horizontally. Though these scan lines are one continuous data stream, when displayed on contemporaneous televisions like the CRT, scan lines are quantized into pixels due to technical limitations.
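As a rough illustration of a scan line being one continuous signal that only becomes "pixels" when it is sampled, here is a small Python sketch. The brightness function is an invented stand-in for a real video signal, and the character ramp and function names are assumptions made for the example.

```python
import math

RAMP = " .:-=+*#%@"  # darkest to brightest, used to draw the line as text

def brightness(x):
    """Hypothetical continuous scan-line signal: brightness in [0, 1] for x in [0, 1]."""
    return 0.5 + 0.5 * math.sin(2 * math.pi * x)

def sample_line(signal, pixel_count):
    """Read the continuous signal at the center of each of pixel_count pixels."""
    pixels = []
    for i in range(pixel_count):
        x = (i + 0.5) / pixel_count
        pixels.append(signal(x))
    return pixels

def render(pixels):
    """Draw each sampled brightness as one character from RAMP."""
    return "".join(RAMP[min(int(p * len(RAMP)), len(RAMP) - 1)] for p in pixels)

print(render(sample_line(brightness, 16)))
print(render(sample_line(brightness, 48)))
```

The same signal is drawn twice at different widths: the line itself never changes, only how finely it gets chopped up for display.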

Below is a diagram of the signal that transmits one scan line. The wavy bit in the middle represents how bright each part of the line should be from left to right.

A CRT television translates this signal by shooting a beam of electrons towards a phosphorescent screen at a varying intensity, which looks like this in slow motion:

Slow-motion video of a CRT's scan lines drawing Mario.
Yes, that’s Mario.

Now, back to the VHS.

The magnetic tape in a VHS contains a thin layer of iron oxide particles, each of which can be polarized in a certain direction by exposure to a magnetic field.

At first, each of these particles has a magnetic field pointing in a random direction. When a current modulated by the video signal passes through a tape head, the portion of the tape under the head is polarized in accordance with the signal.

This process isn’t dissimilar to a Zen garden, where a rake draws lines into the sand which remain there until drawn over.

Diagram of the tracks on magnetic tape.
Via IASA

Video signals are high-bandwidth, meaning they contain a lot of data. In order to fit all those data onto that tiny tape, the signal is broken up into diagonal tracks which are stacked next to each other. It’s sort of like why we print books onto different pages rather than one long page (looking at you, Kerouac).

Diagram of a magnetic tape head.
Via PCMag

This format is great for efficiency, but it means that multiple signals are stacked on top of one another. In order to read them as one continuous signal, VHS systems use helical scanning, where the head is tilted to the angle of the tracks and spins incredibly fast (about 30 rotations per second!) to read them.

Magnetic tape is susceptible to degradation, both over time and from frequent use. As a result, a VHS stored in perfect archival conditions has a shelf life of about 25 to 30 years, and the more it’s played, the fuzzier its image gets. The medium’s mere mortality is something of an existential threat to archivists; digitization efforts of many collections are currently underway.

Just as optical discs have recently been pushed to 1.6 petabytes, modern science has also favored the humble magnetic tape. Scientists at IBM have been chasing the answer to a simple yet ineluctable question: how can we cram as many ones and zeroes onto a reel of magnetic tape as permitted by the laws of physics? Two years ago, those Big Blue brainiacs managed to fit 317 gigabits onto one square inch of specially-engineered magnetic tape. A single cartridge of the stuff, IBM claims, could store up to 580 TB of data (about a third of 1.6 PB). They propose the new format could be useful in archiving (or at least delaying the death of) an estimated 345,000 exabytes of data that currently exist on magnetic tape.

MP4

Video files come in all sorts of flavors: .mp4, .mov, .mkv, .avi, .m4v, and many more. While these might seem comparable to picture formats like .jpg and .png, which determine how an image is encoded, video file types are usually just containers that, well, contain the actual encoded video.

An image file is relatively simple: in addition to the image itself, the file might also contain metadata and a color profile. A video, however, is more complex, with a single .mp4 file possibly containing image data, audio, embedded captions, chapters, thumbnails, and more. For this reason, video file formats are called containers.

The more important determinant of video quality and file size is the codec, which compresses the data into a manageable size. Video compression works much like image compression, wherein rather than storing information for each pixel, similar pixels might be clustered and represented in a more efficient manner.

Codecs can be either lossless or lossy. Lossless codecs reduce “statistical redundancy,” meaning they make the file size smaller without removing any information. This makes lossless compression a reversible process, like how IKEA packages their furniture disassembled to save space. Lossy compression averages similar data: it might cluster pixels of similar color into one blob of the same color, or check whether an area of video changes from frame to frame, only encoding that area again if it has changed by a large enough margin. By only rewriting parts of the image that significantly change, and only storing unchanged portions once, the video file size can be significantly reduced.
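To show what "reversible" means for lossless compression, here is a minimal sketch using run-length encoding. This is a deliberately simple stand-in chosen for illustration, not the method any particular video codec uses; the function names and the sample row of pixels are made up.

```python
def rle_encode(pixels):
    """Collapse runs of identical values into (value, count) pairs."""
    runs = []
    for value in pixels:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [(value, count) for value, count in runs]

def rle_decode(runs):
    """Expand (value, count) pairs back into the original sequence."""
    pixels = []
    for value, count in runs:
        pixels.extend([value] * count)
    return pixels

# A made-up row of 14 pixels: sky, a bit of cloud, more sky.
row = ["blue"] * 6 + ["white"] * 3 + ["blue"] * 5
encoded = rle_encode(row)
print(encoded)
print(rle_decode(encoded) == row)
```

Fourteen pixels shrink to three pairs, and decoding gives back exactly the row that went in. That round trip is the IKEA flat-pack: smaller in the box, identical once reassembled.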

Animated GIF depicting how video compression algorithms assign motion vectors to points in the frame.
A visualization of motion compression. Here, each frame is broken down into a grid, and similar pixel clusters that move across the grid (e.g. the man’s face) are interpreted as moving objects. Such objects are assigned motion vectors (directions) so that their path can be stored rather than redundantly storing the same data in each frame. The garage door handle in the center is essentially a still image until the man walks in front of it, meaning that it can be encoded as a single image rather than as a sequence of redundant images. Via Tywen Kelly.
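The "only store what changed" half of this idea can be sketched in Python. This toy version treats a frame as a short list of block brightness values and skips motion vectors entirely; the threshold, the frames, and the function names are all assumptions for the sake of the example.

```python
def encode_delta(previous, current, threshold):
    """Return (index, value) pairs for blocks that changed by more than threshold."""
    return [
        (i, c)
        for i, (p, c) in enumerate(zip(previous, current))
        if abs(c - p) > threshold
    ]

def apply_delta(previous, delta):
    """Rebuild a frame by copying the previous one and overwriting changed blocks."""
    frame = list(previous)
    for index, value in delta:
        frame[index] = value
    return frame

# Two made-up frames of six blocks each; a bright object moves one block right.
frame1 = [10, 10, 200, 200, 10, 10]
frame2 = [11, 10, 10, 200, 200, 9]

delta = encode_delta(frame1, frame2, threshold=5)
reconstructed = apply_delta(frame1, delta)
print(delta)
print(reconstructed)
```

Only two of the six blocks get stored for the second frame. The tiny flickers in the first and last blocks fall under the threshold and are thrown away, which is why the reconstruction is close to `frame2` but not identical: that small difference is the "loss" in lossy.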

Lossy codecs employ a variety of techniques to minimize file size while preserving quality, like chroma subsampling, which compresses the color of pixels more than their luminance. Discrete cosine transform (DCT), a compression method widely used for images and audio, breaks down image blocks into a sum of basic cosine wave patterns of different ‘speeds’ or frequencies. To grossly oversimplify, rather than storing a checkerboard pattern as [white, black, white, black…], DCT effectively identifies that the pattern is mostly made of a high-frequency back-and-forth cosine wave, plus very little of any ‘slower’ wave patterns. It then stores how much of each ‘wave type’ contributes to the original image.
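The checkerboard example can be tried out directly. The sketch below applies a plain, unnormalized one-dimensional DCT (the DCT-II form) to a single row of eight alternating pixels; real codecs work on two-dimensional blocks and add scaling and rounding steps that are left out here, and the function name is hypothetical.

```python
import math

def dct_ii(signal):
    """Unnormalized 1-D DCT-II: how much of each cosine 'wave type' is in the signal."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * k) for i, x in enumerate(signal))
        for k in range(n)
    ]

# One row of a checkerboard: white = 1, black = -1.
stripes = [1, -1, 1, -1, 1, -1, 1, -1]
coefficients = dct_ii(stripes)

for k, c in enumerate(coefficients):
    print(k, round(c, 3))
```

The fastest wave (index 7) comes out several times larger than any other coefficient, and every even-numbered coefficient is effectively zero. A lossy codec can keep the few large coefficients, round the small ones away, and still rebuild something very close to the original stripes.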

Streaming

Until now, we’ve been talking about media that is possessed in its entirety. When you download an .mp4 or buy a DVD, you have the whole thing. But if you only plan on watching a YouTube video or a movie once, why bother with an enormous video file?

When you feed your dog, you can’t just plop the 40 pound bag of Kibbles ‘n Bits on the ground and let him ration it out. You know he can’t be trusted not to eat himself to death. Instead, you feed him one meal at a time, day after day, Kibble by Bit.

This is precisely how streaming platforms work. Rather than sending your computer the entire video file at once, a streaming platform’s server sends you a sequence of small segments, each containing a few seconds of video, that you watch one by one. Once you’ve watched them, those segments vanish from your RAM just as quickly as they arrived.
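Here is a minimal Python sketch of that meal-at-a-time delivery. It only models the timing of the pieces, not any real network protocol; the 6-second piece length and the function names are assumptions picked for illustration.

```python
def stream_segments(total_seconds, segment_seconds):
    """Yield (start, end) times for each piece of the video, one at a time."""
    start = 0
    while start < total_seconds:
        end = min(start + segment_seconds, total_seconds)
        yield (start, end)
        start = end

def watch(total_seconds, segment_seconds):
    """Play every piece in order, holding only the current one at any moment."""
    played = 0
    for start, end in stream_segments(total_seconds, segment_seconds):
        # The previous piece is no longer referenced here, so it can be discarded.
        played += end - start
    return played

print(list(stream_segments(25, 10)))
print(watch(90 * 60, 6))
```

Because `stream_segments` is a generator, the whole movie never exists in memory at once. Each piece is produced, watched, and dropped before the next one is requested.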

Diagram explaining the architecture of HTTP live streaming protocol.
The architecture of HLS streaming via Yang et al. (2014).

Streaming is convenient in that, with modern download speeds, you can watch a movie without waiting for the whole thing to download first. If your connection slows down, protocols like HTTP Live Streaming (commonly HLS) accommodate your watching with an adaptive bitrate.

A bitrate is just what it sounds like: how many bits (ones and zeroes) are being transferred to your computer over a unit of time. A 1080p YouTube video clocks in at about 4.5 megabits per second, and a Blu-ray disc can encode up to 40 Mbps.
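To put those two bitrates in more familiar units, this short Python sketch converts megabits per second into megabytes transferred per minute. It uses the two figures quoted above and assumes decimal units (one megabyte is one million bytes); the function name is hypothetical.

```python
def megabytes_transferred(bitrate_mbps, seconds):
    """Convert a bitrate in megabits per second into megabytes moved over a duration."""
    bits = bitrate_mbps * 1_000_000 * seconds
    return bits / 8 / 1_000_000  # 8 bits per byte, 1,000,000 bytes per megabyte

youtube_1080p = megabytes_transferred(4.5, 60)
blu_ray_max = megabytes_transferred(40, 60)

print(f"1080p YouTube: {youtube_1080p} MB per minute")
print(f"Blu-ray at its peak: {blu_ray_max} MB per minute")
```

That works out to 33.75 MB per minute for the YouTube video against 300 MB per minute for the Blu-ray at its ceiling, which is the gap that compression and streaming have to bridge.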

The streaming platform adapts the bitrate to your limited bandwidth by switching you to a more heavily compressed version of the media, thus reducing the quality while avoiding buffer time. Popular streamers like Netflix further reduce buffer time with dedicated content delivery networks, or CDNs.
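A simplified picture of adaptive bitrate, sketched in Python: the video is typically prepared ahead of time at several quality levels, and the player picks the best one your connection can sustain. The quality ladder below is hypothetical (only the 4.5 Mbps figure for 1080p comes from this article), and the 80% headroom rule and function name are assumptions, not any platform's actual logic.

```python
# Hypothetical quality ladder, in kilobits per second.
RENDITIONS_KBPS = {"240p": 400, "480p": 1200, "720p": 2500, "1080p": 4500}

def pick_rendition(measured_kbps, renditions=RENDITIONS_KBPS, headroom=0.8):
    """Pick the highest-bitrate rendition that fits within a share of the bandwidth."""
    budget = measured_kbps * headroom  # leave some slack for network hiccups
    affordable = {name: rate for name, rate in renditions.items() if rate <= budget}
    if not affordable:
        # Nothing fits: fall back to the lowest quality rather than stalling.
        return min(renditions, key=renditions.get)
    return max(affordable, key=affordable.get)

for speed in (10_000, 3_000, 100):
    print(speed, "kbps ->", pick_rendition(speed))
```

Re-running this choice every few seconds is what lets the picture get blurrier instead of freezing when your connection dips.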

Most people have a vague idea that when they stream something online, it’s coming from a distant server in a data center somewhere. A CDN is an entire network of these servers, located in spots that best suit their user base. Cloudflare, for example, is a CDN service.

Illustrative map of a content delivery network.
An example of a CDN via CloudFlare.

As the world’s largest streaming platform, Netflix accounts for a ton of internet traffic. If you’re a paying customer and decide to watch Llamageddon while on vacation in, say, Attu Station, Alaska, Netflix is going to have to pay for the traffic required to get you your Llamageddon. Oftentimes, it’s cheaper for them to host their own servers in areas of high demand.

Netflix’s proprietary CDN is called Open Connect. They partner with internet companies to shove modular storage units filled with their most popular programs into any old server rack or electrical box that they can fit in. As a result, when you’re sitting at home watching Devil in Ohio, you might be streaming it from a server a block away from your house.

Photograph of a Netflix Open Connect Appliance server.
A decommissioned Netflix Open Connect Appliance circa 2013. Via VICE.

Just for fun, I used an Open Connect Appliance locator tool to see which Netflix servers are available in the AU area. I have access to five, two of which are in the DC area, and the other three ping to Princeton, NJ, New York City, and Cambridge, MA.

While streaming platforms are great tools for discovering and watching new media, some data hoarding consumers see disadvantages to the subscription model.

When you buy a physical copy, or a download of a movie, you own it forever. It’s a single monetary exchange. Platforms like Netflix are more like movie theaters: you’re paying for a temporary experience.

Even if you “purchase” access to a movie or show on a pay-per-view platform like Amazon Prime Video, your indefinite access to what you think you own isn’t guaranteed. In fact, Amazon’s terms of service state that access to your purchased digital content can be revoked at any time. And in the (unlikely) case of bankruptcy or server failure, your purchase disappears.

If you decide to rage against the virtual machine and try to screen capture or otherwise download the streamable content on a platform, they will often go to lengths to prevent you from doing that with a series of obstacles collectively referred to as DRM (digital rights management).

DRM is sort of like those reflective anti-paparazzi jackets that ruin a photograph taken with flash. If DRM software detects a screen-recorder or unusual traffic activity coming from your computer, it can prevent the content from being streamed to you at all.

As media delivery systems evolve, consumers change how they spend their money, and the entertainment industry is forced to adapt. Today, streaming platforms are pouring funding into their own studios, often at a remarkable loss, in the hope that they’ll produce a library of exclusive content attractive enough to establish themselves in the market. Sometimes that’s a good thing, like when it gives us Severance, but other times, it means paying indefinitely for a service that you only watch one show on.

Concluding Remarks

As it stands, the world is obsessed with solid state storage, the apotheotic semiconductor sorcery that enables the modern digital age in all of its media-streaming glory. But who’s to say that, like magnetic tape replaced punch cards and the CD obsolesced the 8-track, we won’t one day be watching a movie encoded into the germ line of a bioengineered shrub that’s growing in the backyard?

When popular tech improves and old mediums are deprecated, it’s easy for the information they store to fall by the wayside. An estimated 70% of American-made silent films, which were produced on volatile nitrate film, are considered lost media by the Library of Congress. Modern digital information faces the same challenge: what’s uploaded on the internet is not, in fact, there forever. If humanity couldn’t even manage to keep track of a few stone tablets or golden plates or the original moon landing footage or a Breaking Bad Flash game from 2009, then I find perpetuity of digital information to be a dubious claim when it’s composed of nothing but a few electrons in an atomic box.

Video streaming protocols and the compression algorithms therein are strange and mathy products of engineering, but they are also the fundament of how we communicate. Data storage, dry and dull as it may be, is the business of anyone who cares about the preservation of human culture.
