Wednesday, July 14, 2010

Ubuntu Enterprise Cloud: Explaining the "Cloud"

Ubuntu Enterprise Cloud (UEC) depends heavily on KVM as the hypervisor and Eucalyptus as the elastic cloud solution.

This post gives a brief explanation of the Eucalyptus solution.
[Disclosure: I only read a conference paper from Eucalyptus and a user guide to write this post, so some info may lack detail or contain mistakes. If you spot a mistake, please point it out directly. I will set up a private cloud for testing soon.]

Here is the architecture of Eucalyptus (linked directly from the user guide).

There are a few components in the architecture:
  1. Cloud Controller, CLC (Interfaces with the user)
  2. Cluster Controller, CC (Sits in between CLC and NC, governing a cluster of nodes)
  3. Node Controller, NC (Lives on a node)
  4. Walrus Storage Controller, WSC (Keeps the VM's kernel, root filesystem, and ramdisk)
  5. Storage Controller, SC (The datastore)
Indeed, the very basic setup of UEC requires two machines. One of them MUST have an Intel-VT / AMD-V enabled CPU for hardware virtualization acceleration (a requirement of KVM). So, let's say the first machine, without an Intel-VT / AMD-V CPU, is named "uec-master", while the other machine, with such a CPU, is named "uec-node".
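
A quick way to see whether a Linux machine could play the uec-node role is to look at the CPU flags in /proc/cpuinfo: "vmx" marks Intel-VT and "svm" marks AMD-V. Here is a minimal sketch of that check. The function name and the sample text are my own for illustration, and note that a present flag does not guarantee the feature is switched on in the BIOS.

```python
def has_hw_virtualization(cpuinfo_text):
    """Return True if any 'flags' line lists vmx (Intel-VT) or svm (AMD-V)."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags = set(value.split())
            if "vmx" in flags or "svm" in flags:
                return True
    return False


# Made-up sample in the /proc/cpuinfo layout, just to exercise the function.
sample = "processor\t: 0\nflags\t\t: fpu vme de pse vmx sse sse2\n"
print(has_hw_virtualization(sample))
```

On a real machine you would pass in the contents of /proc/cpuinfo instead of the sample string.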

The Node Controller is going to be installed on the machine uec-node. NC is a software package that communicates with the KVM installed on uec-node. The communication is carried out via libvirt. The "elastic" VM instances are going to be deployed onto uec-node, running on top of KVM.

The other four controllers (CLC, CC, WSC, SC) can be installed on the other machine, uec-master. CLC is the software package that interfaces with the user. CC is the package that masters a set of nodes (talking to NC directly for operations). WSC is the package that simulates Amazon S3 and maintains the VM instance kernel, root filesystem, and ramdisk. SC is the package that manages the actual datastore (volume or file space to be mounted) used by VM instances.
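
To keep the two-machine layout straight, it can be written down as a small lookup table. This is only a sketch of the placement described above; the names LAYOUT, ROLES and host_of are made up for illustration and are not part of Eucalyptus.

```python
# Which component runs on which machine in the basic two-machine setup.
LAYOUT = {
    "uec-master": ["CLC", "CC", "WSC", "SC"],
    "uec-node": ["NC"],
}

# One-line role for each component, as described in this post.
ROLES = {
    "CLC": "interfaces with the user",
    "CC": "masters a set of nodes",
    "NC": "talks to KVM via libvirt",
    "WSC": "keeps kernel, root filesystem and ramdisk",
    "SC": "manages the datastore volumes",
}


def host_of(component):
    """Return the machine that hosts the given component."""
    for host, components in LAYOUT.items():
        if component in components:
            return host
    raise KeyError(component)


for name in ROLES:
    print(name, "on", host_of(name), "-", ROLES[name])
```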

To set up VM instances, the user first has to prepare the VM kernel and root filesystem (there are tools to help with this). This preparation is done via KVM. That is to say, the client machine used to prepare the VM image would probably need an Intel-VT/AMD-V CPU. After packaging the kernel and root fs, the user can "upload" the package via CLC to WSC.

When the user wants to allocate resources for the VM instances, the user has to assign a datastore to the instances. The datastore will be kept in SC. Once prepared, the user issues instance-start to CLC, and the CLC forwards the request to CC. CC picks an NC to serve the request; the NC finally loads the VM image from WSC and mounts the volume from SC.
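
The request path above (user to CLC, CLC to CC, CC to NC, NC pulling from WSC and SC) can be modelled as a toy simulation. This is a sketch of my understanding only, not Eucalyptus code: all class and method names are hypothetical, and the "first node with free capacity" rule is an assumption, since I do not know the real scheduling policy.

```python
class Walrus:
    """Stands in for WSC: stores uploaded image bundles by id."""
    def __init__(self):
        self.images = {}

    def upload(self, image_id, bundle):
        self.images[image_id] = bundle

    def fetch(self, image_id):
        return self.images[image_id]


class StorageController:
    """Stands in for SC: keeps the volumes (the datastore)."""
    def __init__(self):
        self.volumes = {}

    def create_volume(self, volume_id, size_gb):
        self.volumes[volume_id] = size_gb

    def export(self, volume_id):
        if volume_id not in self.volumes:
            raise KeyError(volume_id)
        return volume_id


class NodeController:
    """Stands in for NC: loads the image and mounts the volume."""
    def __init__(self, name, capacity, walrus, storage):
        self.name = name
        self.capacity = capacity
        self.walrus = walrus
        self.storage = storage
        self.instances = []

    def has_room(self):
        return len(self.instances) < self.capacity

    def run(self, image_id, volume_id):
        image = self.walrus.fetch(image_id)
        volume = self.storage.export(volume_id)
        instance = {"node": self.name, "image": image, "volume": volume}
        self.instances.append(instance)
        return instance


class ClusterController:
    """Stands in for CC: picks a node (assumed policy: first with room)."""
    def __init__(self, nodes):
        self.nodes = nodes

    def run(self, image_id, volume_id):
        for node in self.nodes:
            if node.has_room():
                return node.run(image_id, volume_id)
        raise RuntimeError("no node has free capacity")


class CloudController:
    """Stands in for CLC: the user-facing entry point."""
    def __init__(self, cluster):
        self.cluster = cluster

    def start_instance(self, image_id, volume_id):
        return self.cluster.run(image_id, volume_id)


walrus = Walrus()
walrus.upload("emi-demo", "kernel+ramdisk+rootfs bundle")
storage = StorageController()
storage.create_volume("vol-demo", size_gb=5)
node = NodeController("uec-node", capacity=2, walrus=walrus, storage=storage)
clc = CloudController(ClusterController([node]))
instance = clc.start_instance("emi-demo", "vol-demo")
print(instance)
```

The ids "emi-demo" and "vol-demo" are placeholders. The point is only that the user never talks to the node directly when starting an instance.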

Thus, there will be 1 or more instances sharing the same volume from SC. Data persistence uses the AoE or iSCSI protocol (which I have no idea about yet :P).

So where does "elastic" come from? VM instances (CPU and memory resource) can be added to/removed from the cloud dynamically. SO elastic, man~ Apps running on VM instances have no idea of the CPU, memory, and the actual datastore. SO virtual, man~

Note that ... "any" Amazon S3 and EC2 client application would work with Eucalyptus as they share the same SOAP interface (REST interface for datastore).
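
In practice, "working with Eucalyptus" mostly means pointing the client at a different endpoint URL. A small sketch of how those URLs could be put together is below. The port 8773 and the paths /services/Eucalyptus and /services/Walrus are assumptions based on what I recall from the credentials file handed out by the cloud, so check your own installation; the function name is made up too.

```python
def euca_endpoints(front_end_host, port=8773):
    """Build the EC2-style and S3-style endpoint URLs for a front end.

    Assumption: default port and service paths; verify against the
    credentials file of your own cloud.
    """
    base = "http://{0}:{1}/services".format(front_end_host, port)
    return {
        "EC2_URL": base + "/Eucalyptus",  # compute API, served by CLC
        "S3_URL": base + "/Walrus",       # storage API, served by WSC
    }


for name, url in sorted(euca_endpoints("uec-master").items()):
    print(name + "=" + url)
```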


*** EDIT 2010-07-17 *** When a volume is attached to 1 vm instance, it cannot be attached to other vm instances at the same moment.
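
The rule in this edit boils down to a volume remembering at most one instance at a time. A tiny sketch of that bookkeeping follows; the Volume class and the ids are hypothetical, only meant to illustrate the rule.

```python
class Volume:
    """A volume that can be attached to at most one instance at a time."""
    def __init__(self, volume_id):
        self.volume_id = volume_id
        self.attached_to = None

    def attach(self, instance_id):
        if self.attached_to is not None:
            raise RuntimeError(
                "%s is already attached to %s" % (self.volume_id, self.attached_to)
            )
        self.attached_to = instance_id

    def detach(self):
        self.attached_to = None


vol = Volume("vol-demo")
vol.attach("i-first")
try:
    vol.attach("i-second")      # refused while i-first still holds it
except RuntimeError as err:
    print(err)
vol.detach()
vol.attach("i-second")          # fine after detaching
```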

8 Responses to “Ubuntu Enterprise Cloud: Explaining the "Cloud"”

Ben Lau said...

Does it mean that all users will first connect to the Cluster Controller, and then the access will be delivered to the other nodes?

mr.kschan said...

What do you mean by user "access"?

If you mean accessing the VM instances, users have to access them by RDP, VNC, SSH, etc.

But, if you mean registering / starting VM instances, users do it via the Cloud Controller (CLC).

Cluster Controller (CC) manages a set of nodes.

Ben Lau said...

For example, if you run a web service and a user is going to visit the web site, the data flow will be..?

mr.kschan said...

Ben, you pointed out the missing paragraph explaining the networking config in this post (which I intentionally left out :)

Indeed, Eucalyptus allows 3 kinds of network settings. But all of these network settings still allow the VM instances to acquire a network interface.

The interfaces are "bridged" by the Node Controller (NC) on the node machine. So, the data will flow directly to those VM instances...

One particular network setting is a managed "private" network. The Cluster Controller (CC) will "own" a set of "real" IP addresses and lease them to the VM instances on the NC (by DHCP).

Ben Lau said...

I see. I understand the architecture now. thx~

A few more questions. Does it provide anything like BigTable in App Engine for a distributed database? So that it may have two instances running that connect to the same database?

And does it have anything that works like a load balancer, so that it can take advantage of having multiple instances at the same time?

mr.kschan said...

I think Amazon EC2 and Eucalyptus are infrastructure-as-a-service clouds.

They indeed rent out VM instances and charge for the VMs... So, BigTable or the like has to be hosted within the VM ... controlled by the user, not provided by the service provider.

A service provider like App Engine is a platform-as-a-service cloud. The user can only control the application (code level) but not the VM instances running on the cloud.

For load balancing, if you're talking about the load of an availability zone... I believe there will be balancing when deploying VM instances to different nodes.

If you noticed, I mentioned AoE and iSCSI for storage... they are network protocols for remote HDD access. There's no database provided for the VMs, only a filesystem.

jayasathya said...

I want to use the Xen hypervisor on Ubuntu 10.04 UEC since my node controller system is a non-VT machine. How do I configure Xen instead of the default KVM?

mr.kschan said...

@jayasathya, I don't have experience with the setup you mentioned. I think you may seek suggestions from
