Monday, December 29, 2014

The 5000mph car

If your car dealer offers, for the additional sum of $500, to upgrade your car to a top speed of 5,000mph, would you take it?

When it comes to broadband Internet access, people have been downing the Kool-Aid with exceptional enthusiasm.

In some cities with weak or non-existent consumer laws, ISPs are offering 100Mbps, 200Mbps, 500Mbps and 1Gbps service. Each bigger-number option (note I do not say faster option) costs a few dollars more than the smaller one. In some places, the ISPs have stopped offering 100Mbps altogether and start at 200Mbps. The reason couldn't be more obvious. The cost of providing 200Mbps is the same as 100Mbps. The small print in the T&Cs essentially obliges the ISPs to do nothing more than provide the specified speed between the customer and their first node. So, a higher priced product means more profits.

Buying 200Mbps or 500Mbps is exactly the same as buying a 2,000mph or 5,000mph car respectively. People can easily relate to the uselessness of buying a 5,000mph car, but alas most cannot understand the same for a 500Mbps connection. Come on, most people do not have a home LAN capable of 500Mbps. Even most computers don't have an effective IO bus speed anywhere near that.

Taking a spin at a speed-test site is the same as taking your 5,000mph car to a specially engineered race track to check its maximum speed. The moment you leave the race track, your car is as good as one that has a top speed of 80mph. Similarly, once you leave the speed-test site and go to your typical web site, you will likely experience a typical 2Mbps.

Why do you even need 100Mbps? A web page will render on your computer fully in 1 second if you have 1Mbps. A full HD video can stream without a single jitter at far less than 10Mbps. But today we can't even get interruption-free YouTube at a lowly 640x480, and who can afford more bandwidth than Google? The bottleneck may not be at the web server. It can be at any one of a hundred points between your device and the server. And the cause need not be network speed or bandwidth. It can be something as mundane as a slow disk.
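To put rough numbers on that, here is a back-of-the-envelope sketch. The 125,000-byte page size and the per-hop rates are assumptions picked for illustration, not measurements; the page size is simply the one for which the 1Mbps figure above works out to one second. It ignores latency and protocol overhead.

```python
def transfer_seconds(size_bytes, rate_mbps):
    """Seconds to move size_bytes at rate_mbps, ignoring latency and overhead."""
    return size_bytes * 8.0 / (rate_mbps * 1000000.0)


def end_to_end_mbps(hop_rates_mbps):
    """A path can go no faster than its slowest hop."""
    return min(hop_rates_mbps)


PAGE_BYTES = 125000  # assumed page size (not a figure from this post)

# Hypothetical path: home link, ISP core, a congested hop, the server's uplink.
path_mbps = [500, 10000, 2, 1000]

print(transfer_seconds(PAGE_BYTES, 1))                           # 1.0
print(end_to_end_mbps(path_mbps))                                # 2
print(transfer_seconds(PAGE_BYTES, end_to_end_mbps(path_mbps)))  # 0.5
```

Swap the first entry of path_mbps from 500 to 100 and nothing changes: the slowest hop decides, not the number on your bill.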

Recently, I managed to get fiber at 100Mbps. It was only possible through a patient battle. The telco won't sell any fiber service below 200Mbps. I only got my wish by refusing to let go of my 6Mbps ADSL line until they relented. I wanted fiber not for its x00Mbps, but for its symmetric bandwidth as I often connect to my servers and cameras at home from outside. As it turned out, the 100Mbps service was slightly cheaper than the ADSL.

Yeah, the speed test showed great numbers in excess of 90Mbps both ways. But when I uploaded a file to a service that has 300Gbps connectivity, the throughput was 0.4Mbps - something that even my ADSL's miserly 600kbps upstream could handle just fine.

Meaningless numbers

January 2015 Update: As if to prove my claim that all these are meaningless numbers, two months after I signed up for the 100Mbps service, the telco sent me this:

It took me a while to realize the false generosity. In two years when the minimum service period is up, my plan will continue as a 200Mbps one which costs more!

Friday, December 5, 2014

Serializing that convoluted cookielib.CookieJar

The Python cookielib.CookieJar object is a very convenient way to manage cookies automatically as you go back and forth through a series of HTTP requests. However, the data structure of the class is a convoluted nest of Python dicts.

cookielib.CookieJar has a _cookies attribute which is a dictionary (keyed by domain) of a dictionary (keyed by path) of a dictionary (keyed by name) of cookielib.Cookie.

To understand the data structure in the CookieJar object cj, try:

for domain in cj._cookies.keys():
    for path in cj._cookies[domain]:
        for name in cj._cookies[domain][path]:
            cookie = cj._cookies[domain][path][name]
            print domain, path, cookie.name, '=', cookie.value

However, the class-defined __iter__ method makes the above effort unnecessary if you just want to find the value of a cookie. The __iter__ method returns a cookielib.Cookie object for each iteration. You can simply go:

for cookie in cj:
    print cookie.domain, cookie.path, cookie.name, cookie.value # etc
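As a self-contained illustration of both views, the sketch below builds a small jar by hand and looks a cookie up by name. make_cookie and find_cookie_value are hypothetical helpers written for this example, and the domains and values are made up. The import fallback is only there so the same code also runs on Python 3, where cookielib is called http.cookiejar.

```python
try:
    import cookielib  # Python 2
except ImportError:
    import http.cookiejar as cookielib  # Python 3 name for the same module


def make_cookie(name, value, domain, path="/"):
    """Hypothetical helper: build a minimal session cookie for the demo."""
    return cookielib.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True, domain_initial_dot=False,
        path=path, path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


def find_cookie_value(cj, name, domain=None):
    """Hypothetical helper: first matching cookie value, or None."""
    for cookie in cj:  # CookieJar.__iter__ yields Cookie objects
        if cookie.name == name and (domain is None or cookie.domain == domain):
            return cookie.value
    return None


jar = cookielib.CookieJar()
jar.set_cookie(make_cookie("sid", "abc123", "example.com"))
jar.set_cookie(make_cookie("sid", "zzz999", "example.org"))

# The nested view: domain -> path -> name -> Cookie
print(jar._cookies["example.com"]["/"]["sid"].value)       # abc123
# The flat view, via iteration
print(find_cookie_value(jar, "sid", domain="example.org"))  # zzz999
print(find_cookie_value(jar, "missing"))                    # None
```

The flat loop has the added benefit of not touching _cookies, which is a private attribute and could in principle change between versions.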

If you want your CookieJar to persist in a file that can later be read back to create a CookieJar object, the following two functions should work. They require the cPickle and base64 modules.

import cPickle, base64, cookielib

def writeCookieJarFile(cj, cookieJarFile):
    # One line per domain: the base64-encoded pickle of cj._cookies[domain]
    with open(cookieJarFile, 'w') as f:
        for domain in cj._cookies.keys():
            serialized = cPickle.dumps(cj._cookies[domain])
            f.write(base64.b64encode(serialized) + '\n')

def readCookieJarFile(cookieJarFile):
    cj = cookielib.CookieJar()
    try:
        with open(cookieJarFile, 'r') as f:
            text = f.read()
    except Exception as exception:
        print "readCookieJarFile: %s" % exception
        return cj  # nothing to restore; hand back an empty jar
    lines = text.split('\n')
    for line in lines:
        if line == '': continue
        cookieObject = cPickle.loads(base64.b64decode(line))
        # cookieObject is {path: {name: Cookie}}; take the domain from any cookie in it
        firstPath = cookieObject.keys()[0]
        firstName = cookieObject[firstPath].keys()[0]
        domain = cookieObject[firstPath][firstName].domain
        cj._cookies[domain] = cookieObject
    return cj

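Note that cookieObject in the read function above is not a cookielib.Cookie object. It is one domain's entry from _cookies: a dictionary (keyed by path) of a dictionary (keyed by name) of Cookie. The domain key itself is not stored in the file, so it is recovered from the domain attribute of the first cookie found.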
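To check that the round trip really preserves the cookies, here is a minimal sketch of the same one-line-per-domain format, written so that it runs on both Python 2 and Python 3 (hence pickle instead of cPickle, binary file modes, and the import fallback). write_jar, read_jar and make_cookie are hypothetical names for this example, and the cookies are made up. As with any pickle, only load files you wrote yourself.

```python
import base64
import os
import pickle
import tempfile

try:
    import cookielib  # Python 2
except ImportError:
    import http.cookiejar as cookielib  # Python 3 name for the same module


def make_cookie(name, value, domain, path="/"):
    """Hypothetical helper: build a minimal session cookie for the demo."""
    return cookielib.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True, domain_initial_dot=False,
        path=path, path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


def write_jar(cj, filename):
    # One line per domain: base64 of the pickled {path: {name: Cookie}} dict.
    with open(filename, "wb") as f:
        for domain in cj._cookies:
            f.write(base64.b64encode(pickle.dumps(cj._cookies[domain])) + b"\n")


def read_jar(filename):
    cj = cookielib.CookieJar()
    with open(filename, "rb") as f:
        for line in f.read().splitlines():
            if not line:
                continue
            by_path = pickle.loads(base64.b64decode(line))
            # Recover the domain from any cookie stored under this entry.
            first_path = next(iter(by_path))
            first_name = next(iter(by_path[first_path]))
            cj._cookies[by_path[first_path][first_name].domain] = by_path
    return cj


jar = cookielib.CookieJar()
jar.set_cookie(make_cookie("sid", "abc123", "example.com"))
jar.set_cookie(make_cookie("lang", "en", "example.org", "/docs"))

fd, tmp = tempfile.mkstemp()
os.close(fd)
try:
    write_jar(jar, tmp)
    restored = read_jar(tmp)
finally:
    os.remove(tmp)

print(sorted((c.domain, c.path, c.name, c.value) for c in restored))
```

The base64 step is what makes the one-line-per-domain layout safe: a raw pickle can contain newlines, which would break the split on '\n' when reading back.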