Saturday, November 20, 2010

Getting back to XO-ing

Just finished a contest for the secondary school students ... now, i'm available for the XO :)

I first met the XO 2-3 years ago, when I was involved in a project for it. The project is for kids to practice oral English.

We give the kids several vocabulary words, ask them to pronounce each one, and grade the pronunciation :) The project involved porting an aged library from Windows to Linux and making a GUI for kids. No matter how good the grading is ... here is the current interface :)



So, kids can type in a word and speak it to the XO... and the XO tells them ... how well they did.

Here's a lovely video telling how they use it :)



OK. Kids are interested... So why not make them more interested ^^? A few things come to my mind ... can we give them a pretty face that shows feedback about the grade? can we speak the feedback to them? Those ideas got me excited about the Speak activity. And ... i just wanna integrate the whole thing with Speak...

Here's a prototype interface for it :)



Anyway, why do i do XO? several things come to my mind.
- it's less than USD$200
- its battery is long-lasting
- it can be used in direct sunlight
- it can connect to a WiFi access point 1 km away
- it can run Flash
- it gives kids High DPI screen (their eyes will be fine in using it)
- it gives kids Email
- it gives kids Game (not yet an AngryBird out there)
- it gives kids Wiki
- it gives kids Internet
- it gives kids Calculator
- it gives kids IM, WebCam, VOIP
- it gives kids Music, jamming
- it gives kids E-book
- it gives kids Scratch for learning programming (or just story-boarding)
- it gives kids Painting, Drawing
- it gives kids a Maze ... (many kids love playing it ... both boys and girls :)
- it gives kids ...
:
:
:
- it runs F/OSS
- And, it's less than USD$200

Friday, October 15, 2010

Blog Action Day 2010 - Clean Water

When u brush your teeth, do u ever think of the amount of clean water u used?

Let me introduce my "philosophy" :P

Here is a little green cup that i use to hold the water when i brush my teeth. I have been using it since my childhood. It's around 2x years old :)



When u compare it to a normal sized container ... the red one in the photo ... u will see how little my green cup is.



Someone asked me why i keep using such a little cup. Don't u need to fill the cup more times than a bigger one?

My philosophy is that ... when u use a bigger cup, u may waste MUCH MORE clean water. It is the same with a plate or bowl that holds food. If u get a big plate, u may put much more food on it. This is because our habit is to fill something "full". So, the smaller the container, the less u can fill it with. Most of the time when u fill something ... u fill it with more than u need. If u agree ... try looking for a smaller container :)

support Blog Action Day!

Sunday, October 10, 2010

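Learning and Work

It's been a long time since I wrote anything from my "heart" (my language skills are poor, so I can't describe what that "anything" is ... as I write along, maybe it is an inner monologue) ... Since today is 10.10.10, and I happened to chat with a friend about some serious topics ... I felt like writing ... Although I'm very sleepy, I still have to write ... because ... I once read a blog post on a friend's blog (if I remember correctly it was my friend's "森路歷程"; don't flame me if I'm wrong :P), and its opening was roughly: "I once heard a friend say: you get home, you are tired. If there is a blog post you "have to" post, then however tired you are, post first and rest afterwards. Otherwise, once the mood has passed, you will never be able to post it" ... Probably for fear of losing the inspiration.

This time, I'd like to talk about a sentence that I hear very, very, very often: "You often learn new things at work." I want to ask: what is work? What is learning?

If you have ever signed an employment contract, it probably contains something like "You are expected to put your best effort on accomplishing the tasks assigned to you". So ... how do I see work? If I were an interviewer, and a candidate told me, "I am joining this company and taking this job because I hope to learn new things ...", I would definitely ask back, quite cheekily: "You are here to learn? Learn what? I have no teaching materials here."

In my mind, learning and work are two completely different things ...

At work, your boss certainly wants you to deliver your best. If you think something built from what you have only just learned is good enough to hand in, is that really your best? Is that "best effort"? Actually, there is no answer to this, because everyone pursues a different "best". What I want to express here is ... very often, work has a deadline. You have to learn while doing, and do while learning ... so how deeply can you learn? When you come across something at work that you have never met before, you gain a bit of experience with it; you get to "know" one more thing. You "know" something new. Is that real "learning"?

So what is learning, in my mind??

I don't know whether it was good luck or a nightmare ... my final year project (fyp) ran into e-Learning. Back then I read a book ... roughly about adult learning ... What is the best learning?? What is the BEST LEARNING? The best learning comes from a longing in your heart (put less nicely, a desire) ... self-motivated, interesting to you, something you long to gain. (If you look back at what I wrote ... about the Consciousness session I was lucky to attend at barcamp 2010 ... to know what you are doing, you have to tell apart whether it is your desire, need, fear, or joy.) The so-called learning at work??? Is it desire? Is it fear?? Or is it a NEED??? Perhaps, it may be joyful :P (someone LOVES working all day, right?) If what you "learn" at work is what you "long for", Congratulations! But how often is the new thing you meet at work the very thing you long to know?? Work often has a framework that bounds you ... Please don't settle for second best and say: "That thing can't be used at work? Never mind, I want to learn this thing too" ... Then what exactly is it that you desire?? Or maybe you really, badly want to learn this "2nd Priority" ... hehe ... anyway ... don't be too greedy ... "吾生也有涯,而知也無涯,以有涯隨無涯,殆矣" (roughly: "My life has a limit, but knowledge has none; to chase the limitless with the limited is perilous"). (Am I showing off my book learning!? Remember.. my blog's name is "Emptiness Blogging"... somehow Taoism... lol)

To learn what you desire, you will pay a lot, and not necessarily get a matching return. Yet you will still foolishly go and finish it ... (That is how I foolishly spent year 4 plus a one-year cs master degree. Looking back, it was damn silly -.- So much time and energy wasted, and extra effort put in ... but, so what? i did that. XD )

Hope you get what Learning is.

Actually, I think the Internet is a greater invention than the condom, because hyperlinks (pretty "hyper" indeed ..) let you "link up" related things. That is a huge convenience for learning. (Thinking back ... when writing a paper and looking up the references within references, doing that without hyperlinks could kill you.)

Finally, I want to say ... when I wrote this, it was actually already the small hours of October 11 :P I just changed the post date. The above is only my personal view; if I have offended anyone, please bear with me.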

Sunday, September 19, 2010

barcamp: Consciousness

Simply speaking, Consciousness is the study of the relation between our mind and our body. Sometimes, we are physically situated somewhere but our minds go somewhere else... A PolyU professor shared about Consciousness in barcamphk 2010!

One key point to understand why we do something is to differentiate the following four:
- fear
- desire
- need
- joy

For example, we went to the student canteen of PolyU for lunch yesterday... Why?
From my point of view, we need to have lunch and that's our need, but we could have lunch anywhere instead of the student canteen. So, the choice is not caused by need. Is it joyful? Yup, pretty joyful when chatting with other barcampers... but we can chat anywhere. Is it desire? or is it fear? Desire to have a free lunch? or fear of paying for a lunch? em... 50/50 in this case :P but I think it should be my desire for a free lunch :P

Monday, September 13, 2010

tour to planet pandora

actually ... not going to planet Pandora but 张家界 and 凤凰城 :P

just wanna post some shots here :)






indeed, you can find out more shots in my facebook album ... (yep, i dont pay flickr as i'm poor :P)

http://www.facebook.com/album.php?aid=247441&id=508149973&l=93f5e6b411
http://www.facebook.com/album.php?aid=247438&id=508149973&l=839a5bea37
http://www.facebook.com/album.php?aid=247428&id=508149973&l=83e7bf4e09

Plus ... wanna post 2 pictures of the 军声砂石画 (introduced during the tour)... that's pretty new to me.



This kind of art gives EXTREMELY HIGH "dpi". Yes, dots per inch.... Since it uses raw materials like wood, sand and rocks as the painting materials, the texture of the objects in the painting is pretty close to the texture of real objects. The following is a close-up of the bottom right corner of the painting above, where you can see many grains of sand.

Thursday, August 26, 2010

GoF State Pattern implemented in Javascript UI component

In recent weeks, i have been writing a javascript UI component that can shrink/expand and hide/show. I call it the "browser" in later paragraphs.

It's a pretty easy-to-write component, as it only has 3 different states, as shown below.

The first state is a hidden state in which the "browser" hides itself somewhere in the page.


When clicking on those boxes shown, the "browser" shows itself and presents an abstract view of data.


When selecting a little box in the list on the right, the "browser" enters a state that displays a detail view of the data.


In either the abstract or the detail state, once the greyed-out area or the little close button [x] is clicked, the browser goes back to the hidden state. And, the browser can switch between the abstract state and the detail state by clicking on some buttons.

So, it sounds straightforward, and thus my very first naive attempt was to implement it with just the event-driven approach, in which i explicitly controlled the hide/show and shrink/expand line-by-line with the help of some if-else statements.

But after doing that for 2-3 days... the mess became not-that-easy to manage and little UI bugs came out of those event handler blocks. I had to trace those handler blocks to find out where the UI effect went wrong. So, i decided to rewrite the entire mess with the GoF State Pattern (spending half an hour or so on implementing that) and post it here :P

As i wrote, the "browser" has some states {hidden, abstract, detail} and each state transits to another when triggered by some events. It is pretty nice that the situation aligns with the state pattern perfectly. So, what i need to do in javascript is to implement a browser object that holds a state reference, where the reference points to either a hidden, abstract, or detail state object.



The above class diagram shows some of the details of my implementation. The enter() function of HiddenState hides the browser, while its paint() function shows the browser and transits the browser state from hidden to abstract. When AbstractState is set, its enter() function displays the abstract content. Upon some events like a button click, the browser's roll() function is called and delegated to AbstractState's roll() function, where DetailState is going to be entered.
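To make the delegation concrete, here is a minimal sketch of the same three-state design. It is written in Python rather than the javascript of the real component, and it is not the code on github: only the names enter(), paint() and roll() and the three states come from this post, while the `Browser` class layout, the `log` list and the `hide()` event (standing in for the close button) are assumptions for illustration.

```python
class State:
    """Base state: every event is a no-op unless a subclass overrides it."""
    name = "state"

    def enter(self, browser):
        pass

    def paint(self, browser):
        pass

    def roll(self, browser):
        pass

    def hide(self, browser):
        pass


class HiddenState(State):
    name = "hidden"

    def enter(self, browser):
        browser.log.append("hide browser")

    def paint(self, browser):
        # Showing the browser moves it from hidden to abstract.
        browser.set_state(AbstractState())


class AbstractState(State):
    name = "abstract"

    def enter(self, browser):
        browser.log.append("show abstract view")

    def roll(self, browser):
        browser.set_state(DetailState())

    def hide(self, browser):
        browser.set_state(HiddenState())


class DetailState(State):
    name = "detail"

    def enter(self, browser):
        browser.log.append("show detail view")

    def roll(self, browser):
        browser.set_state(AbstractState())

    def hide(self, browser):
        browser.set_state(HiddenState())


class Browser:
    """Context object: holds the current state and delegates events to it."""

    def __init__(self):
        self.log = []  # records UI effects instead of touching a real page
        self.state = None
        self.set_state(HiddenState())

    def set_state(self, state):
        self.state = state
        state.enter(self)

    def paint(self):
        self.state.paint(self)

    def roll(self):
        self.state.roll(self)

    def hide(self):
        self.state.hide(self)
```

The event handlers then shrink to one-liners such as `browser.roll()`, and an event that makes no sense in the current state (say, roll() while hidden) simply does nothing instead of needing another if-else branch.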

The full implementation of this "browser" is available at github (line 323-559). Although it's pretty long (i already skipped implementing the inheritance :P), the "browser" now simply mixes the State pattern and UI event handlers for ui transition, effects, and content displaying. If you trace further down the source, you will find my naive straightforward implementation of the "browser", which i believe is pretty much a mess to manage :P

Tuesday, August 24, 2010

hacker...

these past few days, i was packing up my desktop in school as i'm leaving it very soon. On the desktop, i found something GREAT that caught my attention.

flying back to autumn 2009, there was a zeuux summit held in CityU. It was my pleasure to see Richard Stallman and Akira Urushibata there.

And, what i found on my desktop is Akira's excellent presentation "slide". Enjoy it :D


Tuesday, August 03, 2010

python decorator for input validation

I was reading how decorators in python can be used to implement a state machine... And i wondered: can this also apply to input validation for a web framework like web.py?

I tried the following:
from functools import wraps

def getInput():
    ''' simulate web.input() for web.py framework '''
    return {
        'x': 'banananaa',
        'y': None,
        'z': 'zz',
    }

def validate_required(rules):
    ''' validation decorator '''
    def wrapper(method):
        ''' wrapping the actual handler '''
        @wraps(method)  # keep the handler's name and docstring
        def validate(*args, **kwargs):
            ''' validation takes place according to rules '''
            inputs = getInput()
            for k, value in inputs.items():
                if k not in rules:
                    continue  # no rule for this field
                if not rules[k](value):
                    out = 'Invalid input %s - %s' % (k, value)
                    print(out)  # or raise Exception here to stop execution
            return method(*args, **kwargs)
        return validate
    return wrapper

class Handler:
    # one predicate per field; each returns True when the value is valid
    rules = {
        'x': lambda x: x is not None and len(x) > 0,
        'y': lambda y: y is not None,
        'z': lambda z: z is not None and len(z) > 3,
    }

    @validate_required(rules)
    def POST(self):
        print('do something')

# simulate request handling
h = Handler()
h.POST()

Saturday, July 17, 2010

Ubuntu Enterprise Cloud: Experiencing the "Cloud" #2

Continuing from the last post: after solving the booting problem of the vm instance (the cause was just my silly mistake of asking the vm to boot from a kernel image instead of a vm image), i made several observations.


Observation-1
---
If you write to the root filesys of a vm instance, the written data will not be saved to WSC when the instance is terminated. But, euca2ools provides the "euca-bundle-vol" utility to "upload" a local filesys of an instance to WSC. That's to say ... u have to create another root filesys copy on WSC to save your writes.


Observation-2
---
If a volume is attached to 1 vm instance, it cannot be attached to another at the same moment unless it is detached first. So, if you wanna host a shared data pool on eucalyptus, you have to use several vm instances to host a nosql db like mongodb or cassandra, each with a dedicated volume attached. Save your data via NoSQL :)


Observation-3
---
With the eucalyptus managed network setting, network access to vm instances is controlled by security groups. A security group maintains a set of in-bound rules like the ones below:


PERMISSION admin default ALLOWS tcp 22 22 FROM CIDR 144.214.0.0/16
PERMISSION admin default ALLOWS tcp 22 22 FROM CIDR 10.2.0.0/16
PERMISSION admin default ALLOWS icmp 0 0 FROM CIDR 10.2.0.0/16
PERMISSION admin default ALLOWS icmp 0 0 FROM CIDR 144.214.0.0/16


For out-bound rules, set up a firewall within the vm instances. Eucalyptus does not manage that.
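To show how such in-bound rules are read, here is a small sketch that checks a connection against the four PERMISSION lines above. It is only an illustration of the matching logic, not how Eucalyptus implements it: the tuple layout and the `is_allowed` helper are made up, and treating the two numbers as a port range for every protocol (icmp included) is a simplifying assumption.

```python
import ipaddress

# In-bound rules copied from the listing above:
# (protocol, from_port, to_port, source CIDR)
RULES = [
    ("tcp", 22, 22, "144.214.0.0/16"),
    ("tcp", 22, 22, "10.2.0.0/16"),
    ("icmp", 0, 0, "10.2.0.0/16"),
    ("icmp", 0, 0, "144.214.0.0/16"),
]


def is_allowed(protocol, port, source_ip, rules=RULES):
    """Return True if any rule admits this protocol/port from source_ip."""
    addr = ipaddress.ip_address(source_ip)
    for rule_proto, from_port, to_port, cidr in rules:
        if (rule_proto == protocol
                and from_port <= port <= to_port
                and addr in ipaddress.ip_network(cidr)):
            return True
    return False  # in-bound traffic is denied unless a rule matches
```

With these rules, ssh from 144.214.x.x or 10.2.x.x gets in, while anything else (port 80, or ssh from another network) is refused.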


Observation-4
---
With the eucalyptus managed private network, vm instances may use private IP addresses. To access them, you have to connect to one instance first and then use the private IP address as the locator. That's to say, you need at least 1 public IP address through which the outside world can connect to an instance.

If you configure the vlan enabled managed network, vm instances of different security groups will be assigned different subnets. The virtual network isolation is done by this feature. To allow two subnets to communicate, add in-bound rules to the security groups.

(But i'm still trying out the network config, and may have an update on this later)



Personal opinion in managing the cloud with client tool
---
Hybridfox is great! But euca2ools, with just a cli, is simple and even greater. I personally prefer euca2ools.


What's next??
---
Go ahead and host a private AppEngine - AppScale. But, i don't know whether i can get it running with such a limited "availability" zone

Friday, July 16, 2010

Ubuntu Enterprise Cloud: Experiencing the "Cloud" #1

Continuing from the last post, this post documents my experience in setting up the ubuntu cloud.

Here are the resources i used to conduct the experiment:

* machine1 [Physical] - PentiumD 3GHz (core x2, VT-enabled), 2GB RAM (512MB x4), 200GB HDD, NIC x1
* machine2 [vSphere VM] - Xeon X5560 2.8GHz (core x4), 2GB RAM, 100GB HDD, NIC x2
* machine3 [vSphere VM] - Xeon X5560 2.8GHz (core x4), 2GB RAM, 200GB HDD, NIC x1
* USB Thumb 1GB x1
* CD-RW x2

To mostly align with the architecture, the roles of the machines are as follows.
* machine1 - NC
* machine2 - CLC, WSC
* machine3 - CC, SC

And, these machines are connected by a single subnet (private network) 10.1.0.x, while machine2 has another NIC connected to a "public" network.

The setup of the machines "strictly" follows the user guide except for the setup of the NTP server. A public IP address is provided to CC as the "elastic" address of VM instances. The resulting availability zone is as follows.



Each row of the table describes a particular type of VM that can be created. The availability of a particular type can be found in the "free/max" column. By default, the "max" is computed according to the number of CPUs in the NC machines (in this case, exactly 2). The number of CPUs in NC can be shrunk or grown via config (see edit 2010-07-19). The cpu, ram, and disk of a particular type can be configured via the web interface of CLC.

After the installation of the controllers, it's time to prepare the VM image. The VM image has to be prepared by the user and uploaded to WSC. The preparation requires KVM in UEC. If you don't wanna prepare your own image, just download one from ubuntu as shown.



After uploading the kernel, initrd ramdisk, and root file system (the vm image) to WSC, it's time to use Hybridfox to start VM instances. Select the uploaded image and launch VM instances from that image.



When NC receives the request to launch VM instances, it retrieves the VM image from WSC. This is the "pending" phase of a VM instance. Once the image is loaded, the VM instance gets booted and enters the "running" phase. When the VM instance receives a shutdown request, it enters the "shutting down" phase and finally goes to the "terminated" phase. Phase change follows.
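The same phase sequence can also be written down as a tiny lookup table. This is only a sketch of the happy path described in this post; the names `TRANSITIONS`, `next_phase` and `lifecycle` are made up, and a real Eucalyptus install may well have more phases and failure paths than this.

```python
# Phases named in this post, and the only forward transitions it describes.
TRANSITIONS = {
    "pending": "running",             # image retrieved from WSC, instance boots
    "running": "shutting down",       # shutdown request received
    "shutting down": "terminated",
}


def next_phase(phase):
    """Return the phase that follows `phase`, or raise if there is none."""
    if phase not in TRANSITIONS:
        raise ValueError("no transition out of %r" % phase)
    return TRANSITIONS[phase]


def lifecycle(start="pending"):
    """List every phase an instance walks through, beginning at `start`."""
    phases = [start]
    while phases[-1] in TRANSITIONS:
        phases.append(TRANSITIONS[phases[-1]])
    return phases
```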



In this experiment, i tried to launch 2 instances so that there would not be enough "elastic" IP addresses for the instances. My observation is that the first instance acquires the "elastic" IP address when launched. When launching the second instance, a complaint is shown, asking to launch the instance with a "private" address. Indeed, Eucalyptus CC contains a DHCP server and manages the VLAN for "private" addresses. The "elastic" IP address can be detached from and attached to any instance at any time.

Let's get back to the last figure showing Hybridfox.



As you may have noticed, the console output of the VM instance in the figure shows an error message during boot time. That's the next problem i need to solve.

***EDIT 2010-07-17*** The boot problem is solved... the cause is that i made a mistake to ask a VM to boot from a kernel image, instead of a vm image :P Now, i can ssh to the vm instance ^^"

***EDIT 2010-07-19*** The max of "Free/Max" of a node can be configured by NC's config MAX_CORES, MAX_MEM, and SWAP_SIZE, according to this post.

***EDIT 2010-07-20*** For MAX_MEM configured more than actual physical memory, sorry. See this post.

---

Something happened during the setup...

I was trying to burn ubuntu-1004-amd64 to one of the CD-RWs. When using the CD-RW to install ubuntu onto the physical machine, it complained that the disc was corrupted. Then, i tried burning another CD-RW... the same corruption happened. Lastly, i used the USB thumb as a live-usb and got no file corruption.... As a result, i spent about 2 hours on this.

This is the first time i managed a machine with 2 legs (NIC x2) ... Some mis-configurations crept in and slowed down the entire experiment.

Wednesday, July 14, 2010

Ubuntu Enterprise Cloud: Explaining the "Cloud"

Ubuntu Enterprise Cloud (UEC) depends heavily on KVM as the hypervisor and Eucalyptus as the elastic cloud solution.

In this post, a brief explanation of the Eucalyptus solution will be given.
[Disclosure: I just read a conference paper from Eucalyptus and a user guide to write this post... Some info. may lack detail or be mistaken. If there's any mistake, please point it out directly. I will set up a private cloud for testing soon.]

Here is the architecture of Eucalyptus (direct linking from user guide).


There are a few components in the architecture:
  1. Cloud Controller, CLC (Interface with user)
  2. Cluster Controller, CC (Sits in between CLC and NC, governing a cluster of nodes)
  3. Node Controller, NC (Lives in a node)
  4. Walrus Storage Controller, WSC (Keeping VM's kernel, root filesystem, and ramdisk)
  5. Storage Controller, SC (The datastore)
Indeed, the very basic setup of UEC requires two machines. One of them MUST have an Intel-VT / AMD-V enabled CPU for hardware virtualization acceleration (a requirement of KVM indeed). So, let's say the first machine without an Intel-VT / AMD-V CPU is named "uec-master" while the other machine with such a CPU is named "uec-node".

The Node Controller is going to be installed on the machine uec-node. NC is a software package that communicates with the KVM installed on uec-node. The communication is carried out via libvirt. The "elastic" VM instances are going to be deployed onto uec-node, running on top of KVM.

The other four controllers (CLC, CC, WSC, SC) can be installed on the other machine, uec-master. CLC is the software package that interfaces with the user. CC is the package that masters a set of nodes (talking to NC directly for operations). WSC is the package that simulates Amazon S3 and maintains the VM instance kernel, root filesys, and ramdisk. SC is the package that manages the actual datastore (volume or file space to be mounted) used by VM instances.

To set up VM instances, the user has to first prepare the VM kernel and root filesystem (there are tools to aid you). This preparation is done via KVM. That's to say, the client machine used to prepare the VM image would probably need an Intel-VT/AMD-V CPU. After packaging the kernel and root fs, the user can "upload" the package via CLC to WSC.

When the user wants to allocate resources for the VM instances, the user has to assign a datastore to the instances. The datastore will be kept in SC. Once prepared, the user issues instance-start to CLC and the CLC forwards the request to CC. CC picks an NC to serve the request; NC finally loads the VM image from WSC and mounts the volume from SC.

Thus, there will be 1 or more instances sharing the same volume from SC (but see the EDIT at the end of this post). The data persistence uses the AoE or iSCSI protocol (which i have no idea about at all yet :P).

So where does "elastic" come from? VM instances (CPU and memory resources) can be added to/removed from the cloud dynamically. SO elastic, man~ Apps running on VM instances have no idea of the CPU, memory, and the actual datastore. SO virtual, man~

Note that ... "any" Amazon S3 and EC2 client application would work with Eucalyptus as they share the same SOAP interface (REST interface for datastore).

Questions?

*** EDIT 2010-07-17 *** When a volume is attached to 1 vm instance, it cannot be attached to other vm instances at the same moment.

Tuesday, June 08, 2010

Quicksilver String Ranking Java Port

I searched for the phrase "Quicksilver String Ranking Java Port" on google... and couldn't find any interesting result...

so i just wrote one - http://github.com/mrkschan/qs-score-java/blob/master/QSString.java

One point to note is that ... in Java (v6)... String is a final class that cannot be extended... so i have to wrap a string, which makes this class less convenient.

Rationale for this: i'm writing a string filter for an eclipse plug-in, in which i wanna use a fast and excellent string ranking algorithm. Quicksilver is the way to go :)

Saturday, May 08, 2010

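A letter from the Hong Kong Red Cross

Content: please keep supporting blood donation to save lives :) Below is your new donor card :P



Actually, looking at the old card, I realised I only donated five times in two years ... that's quite few =.=... (men can donate four times a year)



Friends who use FB can add this App to follow the latest news from the Red Cross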

Tuesday, April 13, 2010

Given a sequence of increasing values, find ...

The rationale of this post is the hope that someone can give me hints on a problem: Given a sequence of increasing values, find the first value which has a significant change in the sequence.

To visualize the problem: consider this sequence [1, 1, 2, 4, 6, 14, 24]
the "value" that has a significant change may be 14.

This can be plotted in a chart.

The slope between values 6 and 14 shows a significant change, obviously.

Indeed, this is a problem that i encountered when implementing the DBScan algorithm.

The DBScan algorithm is a "density based" clustering algorithm. It takes two parameters - the radius of a circle (eps), and the number of points the circle must contain to be considered dense (minPts).

You may think of DBScan as simply... Taking any point, a "sphere" with the radius (eps) is formed, centered at that point. The "sphere" may cover other points in an N-dimensional space. If the number of points covered by the "sphere" is at least minPts, it is dense enough to form a cluster. And that cluster further expands with the points covered by that "sphere".

Finally ... points that are "close" to each other will form a cluster, and all points in the N-dimensional space will form "many" clusters.
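The description above can be sketched in a few lines of Python. This is a bare-bones illustration under stated assumptions, not my github implementation: the function names are made up, distances are plain Euclidean via `math.dist` (Python 3.8+), and a "sphere" counts as dense when it covers at least `min_pts` points including its own center, which is the usual convention.

```python
import math


def region_query(points, index, eps):
    """Indices of all points within eps of points[index], itself included."""
    return [j for j, p in enumerate(points)
            if math.dist(points[index], p) <= eps]


def dbscan(points, eps, min_pts):
    """Return one cluster id per point; -1 marks noise."""
    labels = {}
    cluster = 0
    for i in range(len(points)):
        if i in labels:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = -1  # not dense enough (may become a border point later)
            continue
        labels[i] = cluster
        queue = list(neighbours)
        while queue:  # expand the cluster through every dense "sphere"
            j = queue.pop()
            if labels.get(j) == -1:
                labels[j] = cluster  # former noise turns into a border point
            if j in labels:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)
        cluster += 1
    return [labels[i] for i in range(len(points))]
```

For example, two tight triangles far apart plus one lonely point give two clusters and one noise point with `eps=1.5, min_pts=3`.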

An important parameter in the DBScan algorithm is the radius. According to wikipedia, the radius can be estimated from a k-distance graph. First, let's define what k-distance is. Given a point, its k-distance is the distance from that point to its k-th closest point. A k-distance graph is plotted by getting the k-distance of all points in the N-dimensional space, sorting them, and plotting them as shown above. From the graph, it is simple to take the k-distance at the sharp change as the eps and use k as the minPts.
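As a minimal sketch of that definition (brute force, Euclidean distance via `math.dist`, and made-up function names), the values of a k-distance graph can be computed like this:

```python
import math


def k_distance(points, index, k):
    """Distance from points[index] to its k-th closest other point."""
    distances = sorted(math.dist(points[index], p)
                       for i, p in enumerate(points) if i != index)
    return distances[k - 1]


def k_distance_graph(points, k):
    """Sorted k-distances of all points: the y-values of the k-distance graph."""
    return sorted(k_distance(points, i, k) for i in range(len(points)))
```

This compares every pair of points, so it is only meant for small data sets.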

But, my concern is that it's not practical to plot 100 graphs and read them all if i wanna try out k from 1 to 100. As a consequence, i wanna have an algorithm to get the k-distance without looking at a visual graph.

One idea that came to my mind is to compare the slopes between neighbouring k-distances. If the k-distance of pt-b is much larger than that of pt-a (the point just before pt-b), then the k-distance of pt-b is probably the sharp change k-distance to select. However, what is "much larger than" in this method? A floating point parameter? Thus, i think this method is not that practical.
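A sketch of this first idea, assuming "much larger" means "the step from the previous value exceeds some `threshold`" (the function name and the threshold parameter are made up for illustration):

```python
def first_jump(values, threshold):
    """First value whose increase over its predecessor exceeds threshold."""
    for previous, current in zip(values, values[1:]):
        if current - previous > threshold:
            return current
    return None  # no step is steep enough
```

On the sequence [1, 1, 2, 4, 6, 14, 24] a threshold of 5 picks 14, but a threshold of 1.5 already picks 4. The answer swings with a hand-tuned floating point number, which is exactly the worry raised above.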

Another idea is to use statistics - mean and standard deviation. Mean is the average while standard deviation is the square root of the variance. Variance is the average squared difference of every point from the mean value. So, the standard deviation gives a fair unit for splitting the k-distance graph piece by piece. If this is the case, I can compute the mean and use the standard deviation to pick the k-distance that lies several splits away from the mean. With this method, the parameter is the number of splits away from the mean value. An integer that can loop nicely :]
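Here is one way to read that idea as code. It is a sketch, not the implementation on github: it assumes the population standard deviation (`statistics.pstdev`) and picks the first value that lies more than `n_std` standard deviations above the mean; the function name is made up.

```python
import statistics


def first_above_mean(values, n_std):
    """First value greater than mean + n_std * standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population standard deviation
    cutoff = mean + n_std * std
    for value in sorted(values):
        if value > cutoff:
            return value
    return None  # nothing lies that far above the mean
```

For [1, 1, 2, 4, 6, 14, 24] the mean is about 7.43 and the standard deviation about 7.96, so `n_std=0` selects 14 and `n_std=1` selects 24. The knob is now a small integer to loop over rather than an arbitrary float.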

My implementation of this method is hosted on github #line23. Anyway, i believe there must be a better method to obtain the k-distance "automatically"... please comment :]

Wednesday, March 31, 2010

Future Chromium bundles Flash ... Why.

There's breaking news this morning from google's chromium blog - Bringing improved support for Adobe Flash Player to Google Chrome

In the post ... it mentioned..
We believe this initiative will help our users in the following ways:


* When users download Chrome, they will also receive the latest version of Adobe Flash Player. There will be no need to install Flash Player separately.

* Users will automatically receive updates related to Flash Player using Google Chrome’s auto-update mechanism. This eliminates the need to manually download separate updates and reduces the security risk of using outdated versions.

* With Adobe's help, we plan to further protect users by extending Chrome's “sandbox” to web pages with Flash content.


The 1st two points are trivial, but the 3rd? That is the reason why i'm writing this post.

The 3rd got me interested in knowing more about the sandbox model of Chrome and thus led me to read their "technical report".

In the following paragraphs, i will briefly explain the architecture of Chromium based on my reading of their "technical report". (Note: I didn't read any source code of the beautiful Chromium project and their "report" was published quite a while ago. Thus, some info. below may be inaccurate or outdated, but it should be enough for u to understand their sandbox model.)

---
Here we go.
---

In the chromium architecture, the web browser is separated into two core components: the rendering engine and the browser kernel.

Here are the task assignments mentioned in the report.

Rendering Engine: HTML parsing, CSS parsing, Image decoding, Javascript interpreter, Regular expressions, Layout, Document object model, Rendering, SVG, XML parsing, XSLT

Browser Kernel: Cookie database, History database, Password database, Window management, Location bar, Safe browsing blacklist, Network stack, SSL/TLS, Data cache, Download manager, Clipboard

And, both of the components can carry out URL parsing and Unicode parsing.

In their design philosophy, the rendering engine is always subject to a bunch of known / unknown vulnerabilities. Thus, they decided to grant limited privileges to the rendering engine while letting the browser kernel have the user's privileges.

According to a "big" picture in the report, these two components communicate via IPC (Inter-Process Communication), and the browser kernel provides an API for the rendering engine to use its services.

Before i continue, here is a remark. Their report mentioned:
Chromium's security architecture mitigates approximately 70% of critical browser vulnerabilities that let an attacker execute arbitrary code.


Where is the remaining 30%? You will, perhaps, know later.

Whenever a user creates a new tab in Chrome, the browser kernel fork()s a new rendering engine with a sandbox. What is the sandbox? On windows, it's the Windows Security Manager. The rendering engine runs with a "restricted security token", which is used by the security manager, on a Windows Job Object. Whenever some action is performed by the rendering engine, the manager checks for privileges.

It sounds like it works ... but there are some known limitations in this M$ windows sandbox.
[1] FAT32 - FAT32 does not support access control lists, and thus a user's USB thumb drive in FAT32 format may be read or written by a compromised rendering engine.
[2] Misconfigured objects - Some applications create objects with NULL DACLs. This lets the security manager ignore the security token owned by the rendering engine. (But NTFS mitigates that)
[3] TCP/IP - Theoretically, the rendering engine can create any socket it wishes... (But the Win API requires a "handle", which the rendering engine will not hv, to open a socket ... a bug in the Win API?? or a feature??)

As a result of using the sandbox, even if a rendering engine is compromised by a buffer overflow in the HTML parser, an integer overflow in regex, or whatever attack against the rendering engine... [a] the attack can only do bad things within the restricted sandbox ... [b] or invoke the IPC with Chrome's browser kernel ... [c] or if the m$ security manager has a bug, the sandbox breaks too

You may have noticed that [a] is assumed to be secure and nothing can be done about [c] since m$ is "closed". The only problem left is [b] ... the attacker inserts code to invoke the browser kernel to do bad things.

Now, let's see how the kernel works.

The browser kernel API handles a few things: user interaction, persistent storage, and networking.

User interaction includes - rendering and user input.
Rendering is handled by the rendering engine. Unparsed HTML is sent to the engine and a bitmap is returned to the browser kernel. The kernel uses a trick like "double-buffering" for the bitmap and presents it on screen.
User input occurs either on Chrome's body (they call it browser chrome), like the location bar, or within the web page. In the former case, the kernel handles it. In the latter case, the kernel asks the rendering engine to render the input as a bitmap again. Even if the rendering engine is compromised, it cannot receive keyboard signals.

Persistent storage includes - upload and download.
Upload is handled by the kernel. When the user selects a certain file, the kernel asks the engine to render the selected path and keeps a record of which files the user has granted the web (engine) access to. Even if compromised, the engine cannot touch files that have not been granted.
Download is also handled by the kernel. The kernel API provides a method to invoke a download action, in which the kernel blacklists some filenames like desktop.ini (M$ vulnerable feature). Even if a rendering engine is compromised, the engine cannot ask the kernel to download a file to just anywhere on the user's disk.
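To picture the download rule, here is a toy sketch of what such a kernel-side check could look like. It is purely illustrative and not Chromium code: the `BLOCKED_NAMES` set only contains the one filename this post mentions, and `safe_download_name` is a made-up helper that drops any directory part, so the renderer can suggest a filename but never a location.

```python
import ntpath

# Only desktop.ini is named in this post; a real list would be longer.
BLOCKED_NAMES = {"desktop.ini"}


def safe_download_name(requested):
    """Return a bare filename the kernel may accept, or None to refuse."""
    # Treat "/" like "\" and keep only the last path component, so a
    # request cannot pick a directory on the user's disk.
    name = ntpath.basename(requested.replace("/", "\\"))
    if not name or name.lower() in BLOCKED_NAMES:
        return None
    return name
```

In this sketch, where the file finally lands is decided by the kernel side alone.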

Networking includes URL requests.
The rendering engine itself is allowed to fire requests for protocols like http, https, ftp, etc. But not file://
File opening is handled by the browser kernel when the user types file:// in the location bar. The file is rendered in a dedicated rendering engine.
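The networking rule boils down to a check on the URL scheme. A minimal sketch, assuming an allow-list that holds just the three protocols named above (the post says "etc.", so the real list is longer) and a made-up helper name:

```python
from urllib.parse import urlsplit

# Schemes this post says the rendering engine may request by itself.
RENDERER_ALLOWED_SCHEMES = {"http", "https", "ftp"}


def renderer_may_request(url):
    """True if the rendering engine may fetch this URL on its own."""
    return urlsplit(url).scheme.lower() in RENDERER_ALLOWED_SCHEMES
```

So `renderer_may_request("file:///etc/passwd")` is False: file:// has to go through the browser kernel, which only opens it when the user types it in the location bar.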

After the long boring things... you will notice that ... Chrome actually deals with 3 security threats: [1] Persistent Malware, [2] Transient Keylogger, and [3] File Theft. Other threats like phishing, origin isolation, firewall circumvention, and website vulnerabilities are not in the scope of Chromium as written in the "technical report".

Last but not least ... why bundle Flash (a "the most widely used web browser" plugin)? There is a small paragraph in the report that talks about plug-ins. For historical reasons, a plug-in runs in a separate host process which executes outside the browser.

In order to maintain compatibility with existing web sites, browser plug-ins cannot be hosted inside the rendering engine because plug-in vendors expect there to be at most one instance of a plugin for the entire web browser. If plug-ins were hosted inside the browser kernel, a plug-in crash would be sufficient to crash the entire browser... each plug-in runs outside of the sandbox and with the user's privileges... For example, the Flash Player plug-in can access the user's microphone and webcam, as well as write to the user's file system (to update itself and store Flash cookies).

...

Chromium also contains an option to run existing plug-ins inside the sandbox. To do so, run the browser with the --safe-plugins command line option. This setting is experimental and might cause instability or unexpected behavior.


So... now you know why they say "With Adobe's help, we plan to further protect users by extending Chrome's “sandbox” to web pages with Flash content." (Flash Player, like M$, is "closed".)

The Chromium team phoned Adobe and said... let's work together to secure Flash and silence those who are against Flash because of security concerns over unknown implementation flaws. The H.264 video codec will stay on our Utube. (This paragraph is just some "bs" that came to my mind.)

Thursday, February 11, 2010

My 1st jQuery plugin - jQuery.fn.timeSink

My first contact with jQuery was last summer when I was exploring Django. In fact, I only used the jQuery autocomplete plugin back then. This month, I have another project that aims to explore ESAPI and MongoDB. The project is at a very early stage and just accepts student submissions for a course assignment. As I don't want to use any framework that may distract from my experience with ESAPI and MongoDB, I opted for plain Servlets and JSP. (I prefer Python, but my "boss" prefers Java any time.)

This project has a JSP that lists all submissions from students. My first intuition was to provide a keyword filter (jquery-livesearch-plugin, which uses the quicksilver algorithm for filtering) to assist information lookup. Then my boss asked me.. in the long run, can it be filtered by time? Then the Django admin page came to my mind.



The Django admin page has a very nice time-based filter on the right for filtering entries. So my question was: can I have a time-based filter that works with the nice quicksilver filter? I did a brief search on Google and found no such time-based filter for jQuery. Thus... I had to create one for my purpose.

You may go to GitHub to obtain a copy of the so-called "time sink" plugin. I call it a "sink" because it uses a time sieve... (a sink always has a sieve to keep big particles from entering the pipe... right?) There is a demo HTML page that shows how the plugin can be used with two separate HTML lists.

You may also check out the JSP that actually combines the two different filters and applies them to the same set of items. They cooperate with each other using custom events dispatched from those filters.
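
For anyone curious how two filters can cooperate through events, here is a small language-neutral sketch in Python. It is not the plugin's actual jQuery code: the Filter and SubmissionList classes, their method names, and the sample data are all hypothetical. Each filter announces a change, and the list keeps only items that pass every filter.

```python
from datetime import datetime


class Filter:
    """A filter that notifies listeners whenever its rule changes."""

    def __init__(self):
        self._listeners = []
        self._predicate = lambda item: True  # accept everything by default

    def on_change(self, callback):
        self._listeners.append(callback)

    def set_predicate(self, predicate):
        self._predicate = predicate
        for callback in self._listeners:  # the "custom event"
            callback()

    def accepts(self, item):
        return self._predicate(item)


class SubmissionList:
    """Shows only the items accepted by all attached filters."""

    def __init__(self, items, filters):
        self.items = items
        self.filters = filters
        self.visible = list(items)
        for f in filters:
            f.on_change(self.refresh)

    def refresh(self):
        self.visible = [i for i in self.items
                        if all(f.accepts(i) for f in self.filters)]


# Usage: a keyword filter and a time filter on the same list.
items = [
    {"name": "alice-hw1", "submitted": datetime(2010, 2, 10)},
    {"name": "alice-hw0", "submitted": datetime(2010, 1, 1)},
    {"name": "bob-hw1", "submitted": datetime(2010, 2, 9)},
]
keyword_filter, time_filter = Filter(), Filter()
view = SubmissionList(items, [keyword_filter, time_filter])
keyword_filter.set_predicate(lambda i: "alice" in i["name"])
time_filter.set_predicate(lambda i: i["submitted"] >= datetime(2010, 2, 4))
```

Neither filter knows about the other; the list just re-checks both whenever either one fires.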


Friday, January 29, 2010

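Blind Man's Buff (耍盲雞) @ Dialogue-in-the-Dark

Blind man's buff is a childhood game. One player is blindfolded while the others guide them with shouts or other actions; the game ends as soon as the blindfolded player catches anyone.
(In TV dramas, the one who gets caught is usually a pretty girl.)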

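Last night I had the chance to visit the Dialogue-in-the-Dark experience hall (brochure here). They don't play blind man's buff with you there; instead, you go in groups into a completely dark place and try to get a feel for the life of a blind person.

Each group gets white canes and a blind person as its guide. Before entering the first room, we came to a spot where we met up with the guide. The guide said: "I'm over here, follow my voice and find me." That was probably our first "training". Here, apart from "navigating" by sound, I had no other way...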

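The first room was set up as a park.. some small trees had been placed there.. The guide said: "Try to find out what's here. Don't worry if you touch someone; you touch him now, he touches you next..." Really, how can you tell what kind of tree it is just by touching... Even now I don't know what tree it was... maybe a sago palm :p Everyone called out some tree names.. The guide said: "Actually, I don't know what it is either.." He has never had the chance to see a tree. Next, we had to cross a small bridge in that room... As everyone stepped onto the bridge, the guide "wickedly" shook it... well ... What's Happening!? 0rz... All I knew was... I felt very dizzy -.-

Between rooms there is a narrow passage... The guide said: "Keep walking with your hand on the wall to your left" ... well, on my left was a person lol... While I was walking along feeling the wall... my hand told me the wall had disappeared.... My only "navigation" tool was suddenly gone -.- Sweat! The guide said: "Feel a bit more carefully..." It turned out to be a left turn =]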

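The second room brought us to the side of a "road"... There was a grocery store there... The guide said: "Feel what's there"... Apart from some familiar fruits, I couldn't tell what the other things were ... Across from the store there was a bicycle; my groupmates said there was a bicycle, but it took me a long time to find it :p Silly me~ Then we walked to the "roadside" .... The tactile guide path for the blind was damn hard to feel -.- By the road there was a car, a shop selling dolls, and a telephone... The guide said: "Try making a call" =] 0rz

The third room was a concert hall... you just sit and listen to music... But actually... my nervousness eased only a little during that time, because.. while sitting and listening... I kept expecting something to happen...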

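The last room left the deepest impression... we had to buy something... Drinks were sold there... well, I had no coins; as I took out my wallet... I scratched my head and asked... do you take $10 or $20 notes -.-" In fact I didn't even know whether my note was a $10 or a $20 =.=..... Going by my habit of how I keep my money... I pulled out a "suspected" $10 note .. The shopkeeper told me with certainty that it was indeed a $10 note, and handed me a pack of 維x奶.. transaction complete!! Then I found a seat.. and chatted with the guide... The guide has had a visual impairment since birth and has never seen things clearly; things have to be held very close before he gets a faint image... Once again I let my idiot spirit shine.. and asked an idiotic question: "When you look at things, is it blurry or black?" Guide: "I don't even know what blurry is" ...

Afterwards... I was looking into it with a friend... what kind of token do banknotes carry for blind people ~.~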

Wednesday, January 27, 2010

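The Five-District Resignation (五區總辭): how much do you know?

People around me keep asking why the Five-District Resignation amounts to a de facto "referendum". Here is my account based on my limited knowledge; if anything is wrong, please call me out!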

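Background: the pan-democrats hold that as long as the Legislative Council still has functional constituency seats, there is no "genuine universal suffrage".
Reason: functional constituency members are representatives chosen by people in a particular sector. Because of that, the Legislative Council is not produced by one person, one vote; some people get to cast an extra vote in a functional constituency. (Whether functional constituencies are good or bad, please look into it yourselves; that is the very subject of the "referendum".)

The "political reform" package the government recently proposed increases the geographical constituency seats and, because the ratio has to stay unchanged, increases the functional constituency seats correspondingly. The pan-democrats think this package makes no progress at all. Hence the talk of... the Five-District Resignation as a de facto "referendum".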

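So how does the "referendum" work? The League of Social Democrats (社民) and the Civic Party (公民) set some "criteria": if, in the by-election, the total votes for their five candidates are fewer than the total votes for the five highest-polling candidates from the other parties, they will take it that the general public supports the "political reform" package, and their members who remain in the council will support it. Otherwise, they will oppose it. (For the "criteria" they set, see the wiki.)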

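The DAB (民建聯) crowd, the "propriety, righteousness, integrity" bunch (禮義廉, i.e. the four virtues minus the sense of shame), deny that this is a "referendum"... because the "referendum" result is just something the League of Social Democrats and the Civic Party wrote, directed and acted out themselves. The League and the Civic Party argue that this "referendum" is "more scientific" than an ordinary proportional opinion poll (in fact something is either scientific or not, there is no "more scientific"... what they mean is more convincing).. The 禮義廉 crowd consider it a waste of public money. In fact, this election only costs each of Hong Kong's 7 million citizens $20 ... (Actually $20 is a lot, enough for two $10 rice meals for me at CityU =D)

Really, why doesn't that 禮義廉 crowd do as the Liberal Party did.. and decide not to take part in the by-election... instead of making such a fuss?? Everyone understands that if nobody ran against the League and the Civic Party in the by-election, those two would naturally be the "culprits" who wasted public money ... Those who make all this fuss just to raise their own political stock really are 禮義廉!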

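My personal opinion: I agree that a decision on the "political reform" needs broad participation.. and an ordinary proportional opinion poll (i.e. random sampling by population ratio, in statistics terms) cannot represent broad opinion and participation. If there is no better way to participate than the de facto "referendum", I can only support the "referendum".

People need choices. When there is no better option, choosing the only option is better than silently accepting.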

---
For much more information on the Five-District Resignation... see the wiki

Sunday, January 17, 2010

Welcome to "China"

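Just half an hour ago, I was going back to school. I met a foreign couple. They asked me: "How to go to Wong Tai Sin Temple?" The first thing I said, with a smile, was: "Welcome to China". Yes! There is no typo. Welcome to "China".

On today's City Forum (城市論壇), Maria Tam (譚惠珠) once again dug up a fact that everyone may have overlooked: "Sovereignty over Hong Kong does not rest with the people; China is Hong Kong's sovereign state..."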
 
© 2009 Emptiness Blogging. All Rights Reserved | Powered by Blogger