Evolution of containers
Kiran Koduru • Dec 5, 2017 • 5 minutes to read
I know, I am a little late to chime in on the discussion about the evolution of containers, but lately I have been fascinated by how we came to use Docker for software deployments. Surely there must have been other influences on the creation of virtual environments. So I started reading, and went down a rabbit hole to find some answers.
In the beginning we ran batch processes, then we moved to timesharing, and finally, years later, we found ways to share resources in a distributed computing world. Ever wonder how we even got here?
This post is a collection of my notes on the evolution of container technology and what led us to what is known today as Docker.
Building impenetrable fortresses around compute resources is something software developers have yearned for for ages. In a hardware-assisted virtualization environment, you provision a guest virtual machine on top of a host OS or hypervisor. The host allots resources to the guest OS programmatically.
One of the pioneers in the world of hardware-assisted virtualization is Amazon Web Services (AWS). AWS offers two types of virtualization for EC2 instances: paravirtual (PV) and hardware virtual machine (HVM). PV guests require a modified kernel, which allows them to access the host's hardware at near-native speeds. HVM guests, by contrast, require a CPU that supports virtualization. Both run on top of a hypervisor that creates and manages the guest virtual machines. With the rise of HVM technology in recent years, AWS recommends HVM instances over PV when picking EC2 instances.
What I would like to talk about today is software virtualization. It is the use of underlying libraries on the host OS to provide isolated environments. I won't highlight every step in the history of containerization, but I would love to walk through some important stages that have led to containers as we see them today.
The year was 1979. Version 7 of Unix was being developed, and along with it came chroot. Initially, it was used to create isolated copies of software to make building and testing easy, and it was later included in the kernel, where it remains today. The chroot system call changes the root directory of the calling process and its children. It is not a security boundary against the superuser, though: a process running as root can break out of a chroot jail quite easily.
An early use of the term “jail” as applied to chroot comes from Bill Cheswick creating a honeypot to monitor a cracker in 1991.
Cheswick, Bill (http://csrc.nist.gov/publications/secpubs/berferd.pdf)
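To make the idea of a changed root directory concrete, here is a minimal Python sketch. The helper names `host_view` and `enter_jail` and the `/srv/jail` path are my own, made up for illustration. `host_view` only models how paths get re-anchored under the new root (it ignores symlinks), while `enter_jail` wraps the real `os.chroot` call, which is Unix-only and needs root privileges.

```python
import os
import posixpath


def host_view(new_root: str, jailed_path: str) -> str:
    """Return the host path that `jailed_path` maps to inside a chroot at `new_root`.

    Hypothetical helper: it ignores symlinks and only models how "/" and ".."
    are re-anchored under the new root.
    """
    # Inside the jail every path is resolved from the new root, and ".." at
    # the root stays at the root, so normalising an absolute path is enough.
    inside = posixpath.normpath("/" + jailed_path).lstrip("/")
    return posixpath.join(new_root, inside) if inside else new_root


def enter_jail(new_root: str) -> None:
    """Confine the calling process to `new_root` (Unix only, requires root)."""
    os.chroot(new_root)  # change this process's idea of "/"
    os.chdir("/")        # make sure the working directory is inside the jail
```

With these definitions, `host_view("/srv/jail", "/../../etc/passwd")` still points inside `/srv/jail`, which is the whole point of the jail for an unprivileged process.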
Introduced in the year 2001, Linux VServer provided resource isolation for the file system, CPU time, network addresses and memory for Linux instances running on the same physical machine. Each of these resource partitions is called a security context. It prevented processes from accessing anything outside their partition, and it was initially shipped as a set of kernel patches.
Open Virtuozzo (Open VZ)
In 1999 Alexander Tormasov proposed an idea to Sergey Beloussov: containers as a set of processes with namespace isolation, a shared file system to share code and RAM, and resource isolation. Though the term containers wasn't coined until 2005, they started to develop something container-like for the Linux kernel around the year 2000. It was not until 2002 that they released a version for the Linux operating system.
OpenVZ used a single patched Linux kernel, so the host and all guests shared the same kernel version. That could be considered a drawback if you wanted to work with different versions of the kernel.
Sun Microsystems (now Oracle) introduced Solaris Zones in the year 2005 in Solaris 10. It provided a completely isolated virtual server within a single OS instance. Each zone, as they were known, was an isolated execution environment. Even though the applications shared the same operating system, each seemed to be running on its own machine. These containers were lightweight since they separated logical server management from the physical hardware. In later years it was also called chroot on steroids.
While working for Google, Paul Menage and Rohit Seth developed what was then called process containers in the year 2006. They renamed it to cgroups (control groups) in late 2007 to avoid confusion with the term container. In their implementation, processes were grouped together, and each group shared an allotment of memory, CPU and disk I/O. Soon after, cgroups was merged into the Linux kernel in January 2008.
If you would like to implement your own cgroups then this video by Justin Weissig is a great starting point.
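If you just want to see which control groups a process belongs to, the kernel exposes that in `/proc/<pid>/cgroup`, one `hierarchy-ID:controller-list:path` entry per line. The Python sketch below parses that format. The names `CgroupEntry`, `parse_cgroup_line` and `current_cgroups` are my own, and the file only exists on Linux, so treat this as an illustration rather than a complete tool.

```python
import os
from typing import List, NamedTuple


class CgroupEntry(NamedTuple):
    hierarchy_id: int       # 0 on the unified (cgroup v2) hierarchy
    controllers: List[str]  # e.g. ["cpu", "cpuacct"]; empty on cgroup v2
    path: str               # the group's path relative to the hierarchy root


def parse_cgroup_line(line: str) -> CgroupEntry:
    """Parse one 'hierarchy-ID:controller-list:path' line."""
    # Split on the first two colons only, since the path may contain colons.
    hierarchy, controllers, path = line.rstrip("\n").split(":", 2)
    names = [name for name in controllers.split(",") if name]
    return CgroupEntry(int(hierarchy), names, path)


def current_cgroups(proc_file: str = "/proc/self/cgroup") -> List[CgroupEntry]:
    """Return the cgroup memberships of the current process (Linux only)."""
    with open(proc_file) as fh:
        return [parse_cgroup_line(line) for line in fh if line.strip()]


if __name__ == "__main__" and os.path.exists("/proc/self/cgroup"):
    for entry in current_cgroups():
        print(entry)
```

On a machine using cgroup v2 you will typically see a single entry with hierarchy ID 0 and no controllers listed, while older setups show one entry per mounted hierarchy.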
Linux containers - LXC
LXC was created by a group of engineers at IBM around 2008. It works as a combination of cgroups and isolated namespaces. It leverages the kernel's APIs for namespaces, chroot, cgroups and so on to create container processes. The goal of Linux containers has been to provide an environment as close as possible to a standard Linux installation without the need for a separate kernel.
If you would like to start building your own Linux containers then here’s a link to the getting started guide.
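To get a feel for the namespaces that containers are built on, you can inspect the ones your own process lives in. On Linux, each entry under `/proc/<pid>/ns` is a symlink whose target looks like `pid:[4026531836]`, and two processes share a namespace exactly when those numbers match. The following Python sketch reads them. The function names `parse_ns_link` and `namespaces_of` are mine, and the code assumes a Linux `/proc` file system.

```python
import os
import re
from typing import Dict, Tuple

# Link targets under /proc/<pid>/ns look like "net:[4026531992]".
_NS_LINK = re.compile(r"^(\w+):\[(\d+)\]$")


def parse_ns_link(target: str) -> Tuple[str, int]:
    """Split a namespace link target into (namespace type, inode number)."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError(f"unexpected namespace link: {target!r}")
    return match.group(1), int(match.group(2))


def namespaces_of(pid: str = "self") -> Dict[str, int]:
    """Map each entry in /proc/<pid>/ns to its namespace inode (Linux only)."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    for name in sorted(os.listdir(ns_dir)):
        _kind, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, name)))
        result[name] = inode
    return result


if __name__ == "__main__" and os.path.isdir("/proc/self/ns"):
    try:
        for name, inode in namespaces_of().items():
            print(f"{name}: {inode}")
    except (OSError, ValueError) as exc:
        print(f"could not read namespaces: {exc}")
```

Running this once on the host and once inside a container should print different numbers for the namespaces the container runtime has unshared.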
In the year 2013 Docker made its debut at my favorite conference, PyCon. Docker's founder Solomon Hykes gave a lightning talk at PyCon while working for dotCloud. Docker, again, makes use of isolated namespaces to provision processes, networking and filesystem resources. In the past Docker used LXC as the execution driver to make calls to the Linux kernel's internal API, but LXC was later replaced by libcontainer, which frees Docker from that dependency in its newer releases.
That's it for now, but I'd like to add that there has been progress in the container world past Docker. I hope this has been as useful to you as it has been to me. I have also added reference links below; do check them out to learn more. Containerization could be the next step in helping you build the next hosting company, and I hope I have helped push you in that direction.
 Hardware Virtual Machine (HVM) and Paravirtualization (PV)
 AWS Linux AMI Virtualization Types
 Linux VServer Changelogs Version 0.0
 Open VZ History
 High Availability and Disaster Recovery: Concepts, Design, Implementation - Klaus Schmidt
 Wikipedia - LXC
 Introducing execution drivers and libcontainer
 LXC Features
Illustration courtesy Vecteezy