Seven deadly sins – Joe Armstrong

1. Code even you cannot understand a week after you wrote it – no comments written in the code
2. Code with no specifications
3. Code that is shipped as soon as it runs and before it is beautiful
4. Code with added features
5. Code that is very very fast very very obscure and incorrect
6. Code that is not beautiful
7. Code that you wrote without understanding the problem

The NULL conundrum

So, you have a text field, named ‘Contact_Person’ defined in a table. This field is not required, and you allow null. Why not define a default value as empty string? It seems to me using an empty string is convenient. I don’t have to check for ‘null’ in the client application.

What is NULL? According to this Wikipedia page,

The original intent of NULL in SQL was to represent missing data in a database, i.e. the assumption that an actual value exists, but that the value is not currently recorded in the database

Now, NULL represents UNKNOWN.

SQL actually uses three-valued logic. This article gives some examples and states:

Accepts TRUE = Reject both FALSE and UNKNOWN
Rejects FALSE = Accepts both TRUE and UNKNOWN
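The two rules above can be tried out with Python's built-in sqlite3 module. This is only a minimal sketch: the `person` table, its columns and the sample rows are made up for illustration, and other databases may word their errors differently. A WHERE clause keeps a row only when the condition is TRUE, while a CHECK constraint refuses a row only when the condition is FALSE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CHECK rejects only FALSE, so a NULL age (condition is UNKNOWN) is allowed in.
conn.execute("CREATE TABLE person (name TEXT, age INTEGER CHECK (age >= 18))")
conn.execute("INSERT INTO person VALUES ('adult', 30)")
conn.execute("INSERT INTO person VALUES ('unknown', NULL)")

try:
    conn.execute("INSERT INTO person VALUES ('minor', 10)")
    minor_rejected = False
except sqlite3.IntegrityError:
    # age >= 18 is FALSE here, so the constraint refuses the row.
    minor_rejected = True

# WHERE accepts only TRUE, so the row with the NULL age is filtered out.
where_rows = [r[0] for r in conn.execute("SELECT name FROM person WHERE age >= 18")]
all_rows = [r[0] for r in conn.execute("SELECT name FROM person ORDER BY name")]

print(minor_rejected, where_rows, all_rows)
```

The same condition, `age >= 18`, lets the NULL row into the table but keeps it out of the query result.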

On Stack Exchange, one answer makes the logical argument that

I would say that NULL is the correct choice for "no email address". There are many "invalid" email addresses, and '' (empty string) is just one. For example "foo" is not a valid email address, "a@b@c" is not valid and so on. So just because '' is not a valid email address is no reason to use it as the "no email address" value

A few more interesting questions:

NULL = NULL = FALSE

NULL is a textual representation of an unknown value. If you have two unknown values, you can’t conclusively state anything about their equality

Since it’s December, let’s use a seasonal example. I have two presents under the tree. Now, you tell me if I got two of the same thing or not.

They can be different or they can be equal; you don’t know until you open both presents. Who knows? You invited two people who don’t know each other and both gave you the same gift – rare, but not impossible.

So the question: are these two UNKNOWN presents the same (equal, =)? The correct answer is: UNKNOWN (i.e. NULL).

SQL does not do any good here: it forces one to reinterpret the reflexive property of equality, which states that:

for any x, x = x (in plain English: whatever the universe of discourse, a "thing" is always equal to itself).

Why does column = NULL return no rows?
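A small sqlite3 session shows the effect. This is a sketch only: the `contact` table and its three rows are hypothetical, chosen to match the ‘Contact_Person’ field discussed earlier. Comparing with `= NULL` yields UNKNOWN for every row, so nothing is returned; `IS NULL` is the test that finds missing values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contact (id INTEGER, contact_person TEXT)")
conn.executemany(
    "INSERT INTO contact VALUES (?, ?)",
    [(1, "Asha"), (2, None), (3, "")],  # None is stored as NULL
)

# NULL = NULL is neither true nor false; sqlite3 hands it back as None.
null_eq_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

# '= NULL' is UNKNOWN for every row, so the WHERE clause keeps nothing.
eq_rows = conn.execute(
    "SELECT id FROM contact WHERE contact_person = NULL").fetchall()

# 'IS NULL' is the comparison that actually finds the missing value.
is_rows = conn.execute(
    "SELECT id FROM contact WHERE contact_person IS NULL").fetchall()

# An empty string is an ordinary value and compares normally.
empty_rows = conn.execute(
    "SELECT id FROM contact WHERE contact_person = ''").fetchall()

print(null_eq_null, eq_rows, is_rows, empty_rows)
```

Row 3 also shows the trade-off from the start of this post: an empty string behaves like any other value in comparisons, whereas NULL needs its own `IS NULL` check.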

Better CSS

CSS Architecture – In depth post on bad and good CSS. Some points -

  • Good – Predictable, Reusable, Maintainable, Scalable
  • Bad Practices – modifying components based on who their parents are, overly complicated selectors, overly generic class names, making a rule do too much

    All of the above bad practices share one similarity, they place far too much of the styling burden on the CSS.

  • Advice

    The best approach that I’ve found is for the CSS to include as little HTML structure as possible. The CSS should define how a set of visual elements look and (in order to minimize coupling with the HTML) those elements should look as they’re defined regardless of where they appear in the HTML

  • HTML Inspector
    CSS Lint – tool to help point out problems with your CSS code
    CSS Lint Source Code

    Parsing and storing date into Mongodb

    MongoDB stores a datetime as the number of milliseconds since the Unix epoch. To parse a date string entered in a form:

    from datetime import datetime, time
    # parse the date string submitted in the form, e.g. '2012-11-25'
    dt = datetime.strptime(request.form['date'], '%Y-%m-%d')
    # build the record to be stored
    record = {}
    # MongoDB needs a full datetime, so set the time part to midnight (0, 0)
    record['date'] = datetime.combine(dt, time())
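To make the "milliseconds since epoch" representation concrete, here is a minimal sketch of the conversion. The helper name `to_epoch_millis` is hypothetical, and it assumes the form date should be read as midnight UTC; as far as I know, PyMongo also treats naive datetimes as UTC.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_epoch_millis(date_string):
    """Convert a 'YYYY-MM-DD' string to milliseconds since the Unix epoch."""
    # Assumption: the date means midnight UTC on that day.
    dt = datetime.strptime(date_string, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    # Dividing two timedeltas with // gives an exact integer.
    return (dt - EPOCH) // timedelta(milliseconds=1)

print(to_epoch_millis("1970-01-02"))  # one day after the epoch
```

Integer division of timedeltas avoids the rounding that can creep in when going through floating-point seconds.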

    Useful tools to analyse a website

    SEO compatibility check – gives an overview of what’s right and what can be improved
    Google page speed insight – “PageSpeed Insights analyzes the content of a web page, then generates suggestions to make that page faster”
    Yslow – Can add as extension to Firefox or Chrome
    webpagetest.org
    Pingdom
    Optimize images
    Markup Validation Service

    Recently I used these tools (including Firebug) for a client’s WordPress-based blog. The areas for improvement were:

    1) There was no caching of static assets – added max-age with a conditional check (Last-Modified). Read this explanation
    2) Scaling images – do not scale images in HTML
    3) Optimizing images – this helped a lot. The images were much larger than they needed to be.
    4) Minimizing HTTP requests – there were too many images, so one possible solution is CSS sprites
    5) Minifying the JS and CSS files
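The caching change in point 1 can be sketched roughly as follows. This is not the configuration used on the client's server, only an illustration of the idea: the function name `conditional_response` and the one-day `max_age` default are assumptions. The server sends `Cache-Control: max-age` together with `Last-Modified`, and answers 304 when the browser's `If-Modified-Since` date is not older than the file.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

def conditional_response(last_modified, if_modified_since=None, max_age=86400):
    """Return (status, headers) for a static asset.

    last_modified must be a timezone-aware UTC datetime.
    """
    headers = {
        "Cache-Control": "max-age=%d" % max_age,
        "Last-Modified": format_datetime(last_modified, usegmt=True),
    }
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparsable header: fall back to a full response
        if since is not None:
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            # HTTP dates have one-second resolution, so drop microseconds.
            if last_modified.replace(microsecond=0) <= since:
                return 304, headers
    return 200, headers

modified = datetime(2012, 11, 25, 8, 1, 22, tzinfo=timezone.utc)
status, headers = conditional_response(modified)
revisit_status, _ = conditional_response(modified, headers["Last-Modified"])
print(status, revisit_status)
```

Within `max-age` the browser does not ask at all; after that it revalidates, and a 304 saves sending the file body again.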

    Exploring Counter and defaultdict in python collections

    I needed to count the frequency of items in a list. Given a list like the one below:

    some_data = ['a','b','v','c','l','c']

    The ugly way was to use a plain dictionary:

    frequency_count = dict()
    some_data = ['a','b','v','c','l','c']
    for item in some_data:
        if item in frequency_count:
            frequency_count[item] += 1
        else:
            frequency_count[item] = 1
    print(frequency_count)

    The end result is
    {'a': 1, 'c': 2, 'b': 1, 'l': 1, 'v': 1}

    But we can use Counter, which is much cleaner:

    from collections import Counter
    cnt = Counter()
    some_data = ['a','b','v','c','l','c']
    for item in some_data:
        cnt[item] += 1
    print(cnt)

    Result -

    Counter({'c': 2, 'a': 1, 'b': 1, 'l': 1, 'v': 1})

    Even more elegant, as Andy suggested in the comments:

    from collections import Counter
    some_data = ['a','b','v','c','l','c']
    result = Counter(some_data)
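Once the Counter is built, a few of its methods cover the usual follow-up questions. A short sketch using the same list (the variable names `top`, `total` and `missing` are just for illustration):

```python
from collections import Counter

some_data = ['a', 'b', 'v', 'c', 'l', 'c']
result = Counter(some_data)

# most_common(n) returns the n most frequent items as (item, count) pairs.
top = result.most_common(1)

# Summing the counts gives back the length of the original list.
total = sum(result.values())

# Looking up an item that never appeared returns 0 instead of raising KeyError.
missing = result['z']

print(top, total, missing)
```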

    Another way is to use defaultdict:

    from collections import defaultdict
    some_data = ['a','b','v','c','l','c']
    frequency_count = defaultdict(int)
    for item in some_data:
        frequency_count[item] += 1
    print(frequency_count)

    Result

    defaultdict(<type 'int'>, {'a': 1, 'c': 2, 'b': 1, 'l': 1, 'v': 1})
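As a quick sanity check, the three approaches can be put side by side. This sketch wraps the dictionary and defaultdict loops in two hypothetical helper functions and compares the results as plain dicts, since only the container type differs:

```python
from collections import Counter, defaultdict

def count_with_dict(items):
    """Plain dictionary with an explicit membership test."""
    counts = {}
    for item in items:
        if item in counts:
            counts[item] += 1
        else:
            counts[item] = 1
    return counts

def count_with_defaultdict(items):
    """defaultdict(int) supplies 0 for any key seen for the first time."""
    counts = defaultdict(int)
    for item in items:
        counts[item] += 1
    return counts

some_data = ['a', 'b', 'v', 'c', 'l', 'c']
plain = count_with_dict(some_data)
same = plain == dict(count_with_defaultdict(some_data)) == dict(Counter(some_data))
print(same)
```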

    You can read more here http://docs.python.org/2/library/collections.html

    Facebook, irctc, keep-alive, performance

    Many people are frustrated with irctc.co.in. The site has become excruciatingly slow, and booking a ticket is a test of patience. Not only that, sometimes you end up getting a server error during the booking process. One of my friends shared his frustration on Facebook, so I decided to do some research.

    First, I looked at the response headers and found -

    Connection:close
    Content-type:text/html;charset=Shift_Jis
    Date:Sun, 25 Nov 2012 08:01:22 GMT
    Server:Microsoft-IIS/6.0
    X-Powered-By:ASP.NET

    A surprise! The site runs on IIS and is developed in ASP.NET (though the headers can be faked). The next thing I noticed was ‘connection:close’. I thought this might have an adverse effect on performance since the site uses HTTPS. But apparently not. This article says

    The connection will stay open while both sides send and receive encrypted data until either side sends out a “closure alert” message and then closes the connection. If we reconnect shortly after disconnecting, we can re-use the negotiated keys (if the server still has them cached) without using public key operations, otherwise we do a completely new full handshake.

    Using ‘connection:close’ helps the server keep resources free. And since the requests are AJAX-based, only a small amount of data is exchanged between the client and the server. I found a wonderful article which traces the history of keep-alive and why it is considered harmful.
    Some more resources
    TLS
    TCP Connection establishment
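For reference, this is roughly how a client decides whether a connection may be reused after a response. It is a simplified sketch of the HTTP rules, not code from any browser: the function name `is_persistent` is made up, and the header dictionary repeats part of the response shown above.

```python
def is_persistent(http_version, headers):
    """Return True if the connection may be reused after this response."""
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    tokens = {t.strip().lower() for t in lowered.get("connection", "").split(",")}
    if "close" in tokens:
        return False  # the server asked for the connection to be closed
    if http_version == "HTTP/1.1":
        return True  # HTTP/1.1 connections are persistent by default
    return "keep-alive" in tokens  # HTTP/1.0 has to opt in explicitly

irctc_headers = {
    "Connection": "close",
    "Content-type": "text/html;charset=Shift_Jis",
    "Server": "Microsoft-IIS/6.0",
}
reusable = is_persistent("HTTP/1.1", irctc_headers)
print(reusable)
```

With ‘connection:close’ every request opens a new TCP connection, which is where the TLS session reuse described in the quote above helps.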

    Stumbled On Python Libraries

    I am making a list of libraries that I discover while reading blogs, mailing lists, etc. and that I feel could be useful.

    1. Requests – an inspiring library, already used in many small Python scripts.

    2. django-mediagenerator – an asset manager. django-compressor and django-pipeline are better.

    3. Wappalyzer – the blurb says “Wappalyzer is a browser extension that uncovers the technologies used on websites. It detects content management systems, web shops, web servers, JavaScript frameworks, analytics tools and many more.”

    Django migration timeout

    We had a table with more than 6 lakh (600,000) records. Every time we added a new field to the model and ran the migration, we got a timeout. So we followed this three-step procedure:

    1) Opened psql with the statement timeout disabled –

    PGOPTIONS="-c statement_timeout=0" psql -U name

    2) Extracted the SQL to be executed in psql from South

    python manage.py migrate --db-dry-run --verbosity=2

    3) Then ran the migration command with the fake option

    python manage.py migrate modelname --fake