Showing posts with the label Python

Line Endings in a Mixed Environment Application

If you have to operate on text strings and files in an application that can be used interchangeably on Windows and other environments, line endings can get confusing. Below is what I found (all on Python 3.4).

When reading a file into a Python string:

    File contents        Windows            Others
    'A' \x0D \x0A 'B'    'A\nB' (len=3)     'A\r\nB' (len=4)
    'A' \x0A 'B'         'A\nB' (len=3)     'A\nB' (len=3)

When writing a Python string to a file, this is the file content:

    String      Windows                    Others
    'A\nB'      'A' \x0D \x0A 'B'          'A' \x0A 'B'
    'A\r\nB'    'A' \x0D \x0D \x0A 'B'     'A' \x0D \x0A 'B'

If you copy a file from a non-Windows system to a Windows system, the file will not contain CR characters, but a Python app on Windows will still read it correctly. If you write it out again, however, the new file will have different line endings (CR LF) from the original (LF only). If you copy a file from Windows to a non-Windows system...
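One way to make the behavior the same on every platform is to pass the newline argument to open() explicitly instead of relying on the platform default. The following is only a minimal sketch, assuming Python 3's built-in open(); the helper names read_text and written_bytes are hypothetical and used purely for illustration. Note that with the default newline=None, Python 3 translates \r\n to \n on read on every platform, so the untranslated 'A\r\nB' read result corresponds to newline='' here (that mapping to the tables above is an assumption, since the post does not say how the files were opened).

```python
import os
import tempfile


def read_text(raw, newline):
    """Write raw bytes to a temp file, then read it back in text mode."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(raw)
        with open(path, "r", newline=newline, encoding="ascii") as f:
            return f.read()
    finally:
        os.remove(path)


def written_bytes(text, newline):
    """Write text in text mode, then return the raw bytes on disk."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "w", newline=newline, encoding="ascii") as f:
            f.write(text)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


# newline=None (the default): CR LF is translated to '\n' on read.
translated = read_text(b"A\r\nB", None)

# newline='': no translation on read, so the CR survives.
untranslated = read_text(b"A\r\nB", "")

# newline='\r\n' mimics the Windows write behavior on any platform:
# every '\n' becomes CR LF, so an existing '\r\n' turns into CR CR LF.
doubled_cr = written_bytes("A\r\nB", "\r\n")

# newline='' on write: the string is written out unchanged.
unchanged = written_bytes("A\r\nB", "")
```

Opening with newline='' for both reading and writing is one way to round-trip a file without changing its line endings, at the cost of having to handle '\r\n' yourself in the string.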

Serializing that convoluted cookielib.CookieJar

The Python cookielib.CookieJar object is a convenient way to manage cookies automatically as you go through a series of HTTP requests and responses. However, the data structure of the class is a convoluted nesting of Python dicts. cookielib.CookieJar has a _cookies attribute, which is a dictionary (keyed by domain) of dictionaries (keyed by path) of dictionaries (keyed by name) of cookielib.Cookie objects. To understand the data structure in the CookieJar object cj, try:

    for domain in cj._cookies.keys():
        for path in cj._cookies[domain]:
            for name in cj._cookies[domain][path]:
                cookie = cj._cookies[domain][path][name]
                print domain, path, cookie.name, '=', cookie.value

However, the class-defined __iter__ method makes the above effort unnecessary if you just want to find the value of a cookie. The iterator yields a cookielib.Cookie object on each iteration. You can simply go:

    for cookie in cj:
        print cookie.domain, cookie.path, cookie.name, cookie.value  # etc

I...
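Since iteration already flattens the nested dicts, it can also serve as a starting point for serialization. The sketch below is only one possible approach, not necessarily the one this post goes on to describe. It assumes Python 3, where cookielib is named http.cookiejar; the helpers make_cookie and jar_to_json, the example domain, and the cookie values are all hypothetical.

```python
import json
from http.cookiejar import Cookie, CookieJar


def make_cookie(name, value, domain, path="/"):
    """Build a session cookie; the attribute defaults here are illustrative."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True, domain_initial_dot=False,
        path=path, path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


def jar_to_json(jar):
    """Flatten a CookieJar into a JSON list using its iterator."""
    rows = [
        {"domain": c.domain, "path": c.path, "name": c.name, "value": c.value}
        for c in jar  # no need to touch the private _cookies dicts
    ]
    return json.dumps(rows, sort_keys=True)


cj = CookieJar()
cj.set_cookie(make_cookie("session", "abc123", "example.com"))
serialized = jar_to_json(cj)
```

This keeps only four fields per cookie; restoring a jar from the JSON would need the remaining Cookie attributes (expiry, secure flag, and so on) as well.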