Python, Django – The Learning Saga

Saga – a heroic narrative :)

I first coded in Python in October 2009. I had an idea for a chess website and decided to try out Django. I had coded a fair bit in Ruby and explored Rails too, so the dynamic nature of Python and the magical qualities of Django did not overwhelm me. It took me just 7 days to launch the website, where users could solve chess combinations. I consider it a small success, but I did not take the site any further. Now, at the beginning of 2011, I came up with the idea of adding a new feature – Solitaire Chess. This time around, I had to dig deeper into Python. These are some of my learnings -

  • TypeError: ‘module’ object is not callable

    I needed a class which would generate a FEN string based on the position on the chessboard. The file FENgenerator.py contained the class FENGenerator, so the module name is FENgenerator (the name of the file itself). I imported the module with “import FENgenerator”, tried to create an object by calling the module name, FEN = FENgenerator(), and got the above TypeError.

    In Python, everything is an object, including a module. The import statement above binds only the module name; it does not import the names defined inside the module. So, to create an object, I had to write FEN = FENgenerator.FENGenerator() (or import the class directly with “from FENgenerator import FENGenerator”).
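To see the error without creating files, here is a minimal sketch that builds a stand-in module object in memory. The use of types.ModuleType and the placeholder FEN string are assumptions for illustration only; the real class lived in FENgenerator.py.

```python
import types

# Stand-in for the file FENgenerator.py (built in memory for this sketch).
FENgenerator = types.ModuleType("FENgenerator")


class FENGenerator:
    """Placeholder class; the real FEN logic is not shown here."""

    def generate(self):
        # Hypothetical output: the FEN of an empty board.
        return "8/8/8/8/8/8/8/8 w - - 0 1"


# Attach the class to the module, as if it had been defined in the file.
FENgenerator.FENGenerator = FENGenerator
del FENGenerator  # the class is now reachable only through the module

try:
    FENgenerator()  # calling the module itself, not the class
except TypeError as exc:
    error_message = str(exc)

# Going through the module to reach the class works.
fen = FENgenerator.FENGenerator()
```

The failing call and the working call differ only in whether the name refers to the module or to the class inside it.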

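  • The scope of a variable – a “pleasant shock” :)

    The code –

        def printmatter(self, matter):
            if matter == "Urgent":
                givep = "yes"
            else:
                givep = "low"
            print(givep)

    I was able to access a variable assigned inside the “if” block from outside the “if” block. In Python, an “if” statement does not create a new scope; the variable belongs to the enclosing function.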
    More here,
    stackoverflow
    scope

  • __init__.py required for your module to be seen
  • Session variables can be used in templates (Django)
  • easy_install is the equivalent of RubyGems
    easy_install django-extensions -> downloads the package and its dependencies from an online repository
  • I used a TDD approach to develop the FEN generation logic. It was my first exposure to tests in Python. The name of each test method has to start with “test”, for example “test_everything_is_ok”.
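The naming rule can be checked with the standard unittest loader. This is a small sketch; empty_rank and the test class are hypothetical names, not the code from the chess site.

```python
import unittest


def empty_rank():
    """Hypothetical helper: FEN notation for a rank with no pieces."""
    return "8"


class FenGenerationTest(unittest.TestCase):
    def test_empty_rank_is_written_as_8(self):
        self.assertEqual(empty_rank(), "8")

    def check_this_is_never_collected(self):
        # The name does not start with "test", so the loader skips it.
        self.fail("should not run")


loader = unittest.TestLoader()
collected = loader.getTestCaseNames(FenGenerationTest)

# Run only what the loader collected and keep the result object.
result = unittest.TestResult()
loader.loadTestsFromTestCase(FenGenerationTest).run(result)
```

Only the method whose name starts with "test" is collected and run; the other one is treated as an ordinary helper method.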

    LINQ-SQL, Contains, Bug

    Recently, a client reported a strange bug which I had never encountered before. The client was editing the value of an int column in a grid view. Once the value was changed, the grid displayed this error -

    The incoming tabular data stream (TDS) remote procedure call (RPC) protocol stream is incorrect. Too many parameters were provided in this RPC request. The maximum is 2100.

    Now, we were not calling any stored procedure. My colleague found the cause of this problem. We were updating an entity on the cell-changed event. To update, we were first making a select query like this,

        // One SQL parameter is generated for every element of draUmList.
        var cs = (from cl in Dataclass.Consumers
                  where (cl.ID == ID)
                        && draUmList.Select(f => f.Order).Contains(cl.ID)
                  select cl).ToList();

    The above query was getting converted into


    exec sp_executesql N'SELECT xyx AS [t0]
    WHERE ([id] IN (@p1, @p2, @p3, ..., @p149))',
    N'@p0 uniqueidentifier, @p1 uniqueidentifier, @p2 uniqueidentifier, ...

    So, if there were too many instances of the entity, the number of parameters could easily exceed 2100. The solution, which replaced the above query, used a join and a where clause, which was also more efficient.
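The difference between the two query shapes can be sketched outside LINQ-to-SQL. The example below is only an analogy using Python's built-in sqlite3 module; the table names (consumers, selected_orders) and the row counts are assumptions, and SQLite has its own, different parameter limit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE consumers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE selected_orders (consumer_id INTEGER)")
conn.executemany(
    "INSERT INTO consumers VALUES (?, ?)",
    [(i, "consumer %d" % i) for i in range(5000)],
)
conn.executemany(
    "INSERT INTO selected_orders VALUES (?)",
    [(i,) for i in range(0, 5000, 2)],
)

# Contains-style query: one placeholder per id, so the statement grows
# with the list. It is only built here, not executed.
ids = [row[0] for row in conn.execute("SELECT consumer_id FROM selected_orders")]
in_query = "SELECT id FROM consumers WHERE id IN (%s)" % ",".join("?" * len(ids))
placeholder_count = in_query.count("?")

# Join-style query: the database matches the rows itself and the statement
# needs a single parameter, however long the list is.
join_rows = conn.execute(
    "SELECT c.id FROM consumers AS c "
    "JOIN selected_orders AS s ON s.consumer_id = c.id "
    "WHERE c.id = ?",
    (42,),
).fetchall()
```

With 2500 selected ids the first statement already carries 2500 parameters, which is past the 2100 limit quoted in the error message above, while the join needs one.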

    Django Lessons – #2

    I needed to record the IP address of the user as well as the admin for every action completed, and this IP had to be shown with each log entry. So, I used user_ip = request.META["REMOTE_ADDR"]. But this did not work on the server: the IP was getting recorded as 127.0.0.1. Why so? The information I got from WebFaction support was that

    REMOTE_ADDR is 127.0.0.1 because requests for your site are proxied through our front-end web server, which of course is on the local host. If you want the original IP of the request, then use
    request.META['HTTP_X_FORWARDED_FOR'].

    I did some research, and found out about the concept of a reverse proxy.
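A small helper along these lines can cover both cases. It is a sketch, not the code used on the site: client_ip is a hypothetical name, and it takes any dict-like object such as Django's request.META. It assumes the proxy puts the original client first when the header holds a comma-separated list, which is the usual convention.

```python
def client_ip(meta):
    """Return the caller's IP from a request.META-style mapping."""
    forwarded = meta.get("HTTP_X_FORWARDED_FOR", "")
    if forwarded:
        # The header may list several hops: "client, proxy1, proxy2".
        return forwarded.split(",")[0].strip()
    # No proxy header: fall back to the directly connected address.
    return meta.get("REMOTE_ADDR", "")


behind_proxy = client_ip({
    "HTTP_X_FORWARDED_FOR": "203.0.113.7, 10.0.0.1",
    "REMOTE_ADDR": "127.0.0.1",
})
direct = client_ip({"REMOTE_ADDR": "198.51.100.4"})
```

In a view this would be called as client_ip(request.META). Since a client can send this header itself, the value is only trustworthy when the site is reachable solely through the proxy.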

    Some links
    Forum
    Apache

    Experience Report – From Zero to 500

    Every experience teaches something. It is up to us to imbibe the lessons. Sometimes worrying about success and failure takes away the joy of just doing.

    I started working on the current project in September 2009. The technology stack is .NET 3.5, Windows Forms, SQL Server 2008 and LINQ-to-SQL. The Project Manager hired me. When I joined the team, the software had been in development for 4-5 months. All of us are from different regions (Middle East, Europe) and work remotely. We communicate using software like Skype, TeamViewer and, of course, email. We use a wiki for documentation, and a web-based bug tracker. Initially, I was given the task of understanding the software and writing the documentation. I found this a bit strange, but it turned out to be a good way to understand the software. Also, I was happy that a wiki was used for the documentation. My many doubts were clarified either by the PM or by the client (end user). Soon, I found there were many problems –

    1) The wiki had all the information related to the system to be developed, and tutorials for the users. But there were many ambiguities in the system documentation. Also, it was a bit outdated, and not in sync with the state of the software. Most of the domain knowledge was with the PM, and one of the reasons he had asked me to do the documentation was to capture that knowledge.

    2) The codebase had a huge amount of duplication. There were no tests.

    3) There was an architecture – 3 logical layers – Windows Forms, business layer and data access layer. But a lot of the logic was in the forms.

    4) All the features which had been implemented were incomplete and had bugs.

    5) One of the developers involved had this strategy – he would fix the bug assigned to him and close it. He did not check all the scenarios, so other bugs cropped up. The software was unstable. I felt very uneasy about this situation.

    6) We were using LINQ-to-SQL, but did not have a good understanding of this technology. This was creating problems like too many queries getting fired. We realized this only later, when we started facing performance issues.

    7) The database was not well designed. Some columns in some tables were not being used. Most of the columns had “allow nulls”. The database was not under version control. We had views which were doing a lot of calculations.

    The PM knew about these issues, and was eager to overcome them. One thing I appreciate about him is his persistence and eagerness to solve problems. We started talking about testing, and he suggested writing functional tests. But we found that the DevExpress controls did not support UI testing using the Microsoft framework. Then I told him we should start writing unit tests, and he was open to this idea. I also suggested that we use MVP (Model View Presenter) and take all the logic out of the forms. He agreed, and asked me to do it on a couple of forms. I started refactoring and adding tests.

    Initially, I was the only one running the tests. Sometimes I found compile errors in the test project, which surprised me! I realized that the test project had not been added to the solution. Initially, many of the tests were integration tests, so the database was involved. We created a script to create the database, and separate data files (SQL) for default data. Then we wrote a script which created the database, built the projects and ran the tests. We used NDbUnit to load input data for the integration tests. This required creating XML files. Slowly the test coverage increased, and the tests started catching some bugs. Then I started writing tests for bugs too. I realized some things were common to all forms. This was a eureka moment: seeing patterns among the mass of duplicate code, or the representation of similar ideas. This led to the creation of a base view and a base presenter, which eliminated a lot of duplication. Soon, we had 2 new members in the team. They were eager to learn and write tests, which helped speed up the writing of tests. Today, we have 500+ tests. Also, we started a QA page which listed points to be checked for each module. This page helped everyone, including the QA team.
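The base view and base presenter idea can be sketched in a few lines. The project itself was C# with Windows Forms; the Python below is only an illustration, and every class and method name in it (BaseView, BasePresenter, save, validate, persist, CustomerPresenter) is hypothetical.

```python
class BaseView:
    """Operations every form offers its presenter (hypothetical interface)."""

    def show_error(self, message):
        raise NotImplementedError

    def close(self):
        raise NotImplementedError


class BasePresenter:
    """Save workflow shared by all presenters; subclasses fill in details."""

    def __init__(self, view):
        self.view = view

    def validate(self):
        return []  # list of error messages; empty means valid

    def persist(self):
        raise NotImplementedError

    def save(self):
        errors = self.validate()
        if errors:
            self.view.show_error("; ".join(errors))
            return False
        self.persist()
        self.view.close()
        return True


class FakeView(BaseView):
    """Test double: records calls instead of touching any UI."""

    def __init__(self):
        self.errors = []
        self.closed = False

    def show_error(self, message):
        self.errors.append(message)

    def close(self):
        self.closed = True


class CustomerPresenter(BasePresenter):
    def __init__(self, view, name):
        super().__init__(view)
        self.name = name
        self.saved = []  # stands in for the data access layer

    def validate(self):
        return [] if self.name.strip() else ["name is required"]

    def persist(self):
        self.saved.append(self.name)
```

Because the presenter talks only to the view interface, a test can pass in FakeView and exercise the save logic without opening a form.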

    The state of the software today is –

    1) The software is in a more stable condition. Most of the business logic is now in good shape, with very few bugs.
    2) The user interface still has some issues.

    3) We have many tests, but some of them are unreadable and look very ugly. We have used a BDD style of language for some of the tests, in given/when/then form.

    4) LINQ-to-SQL entities – They had only get/set properties. We had violated the “Law of Demeter” throughout the codebase. Instead of asking customer.IsYourName(x), we had code like if (customer.Name == x). So, we did some refactoring and added more behavior to the entities: validation rules, and functions which acted upon “entity data” were moved into the entity.

    5) A lot of duplication still exists, which can be removed.

    6) We have a page listing acceptable performance metrics for different scenarios. We have tested a lot, and improved the LINQ-to-SQL queries and the stored procedures. But a lot of work still needs to be done.

    7) The software is currently in the testing stage. The users are testing the software with live data for some parts of the system.
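The entity refactoring described in point 4 can be sketched as follows. This is a Python illustration, not the project's C# entities; the method name is_your_name mirrors Customer.IsYourName from the text, while validation_errors and the rule inside it are assumptions.

```python
class Customer:
    def __init__(self, name):
        self._name = name

    def is_your_name(self, candidate):
        """Callers ask the entity instead of comparing customer.name themselves."""
        return self._name == candidate

    def validation_errors(self):
        """Validation rules live on the entity (this rule is a made-up example)."""
        errors = []
        if not self._name.strip():
            errors.append("name is required")
        return errors


customer = Customer("Anand")
matches = customer.is_your_name("Anand")
blank_errors = Customer("   ").validation_errors()
```

If the way a name is stored or compared changes later, only the entity has to change, not every caller.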

    Some concluding observations -

    1) The architecture of a system can evolve. And, an evolving architecture solves the problems faced, so everyone in the team is able to understand the reasoning behind the changes.

    2) Some of the decisions related to architecture could have been made early. For example, the application is for multiple users, and these users are in different locations (countries). So, it seems obvious that some thought should have been given to questions like: Where will the database be located? Will we need replication to handle some scenarios? Should we develop a desktop or a web application? Also, there seems to be a lack of literature for such scenarios. What are the best practices which have succeeded? Or is our approach, taking decisions such as replication much later, acceptable? In fact, once replication was implemented, we had to change all primary keys from int to guid.

    3) A remote set of developers can work well. But, I miss pair-programming. I tried it a couple of times, but due to slowness of broadband, it did not work out.

    4) The unit and integration tests gave confidence to the developers as well as the stakeholders. The stakeholders had become jittery, and were reluctant to request changes, since old problems cropped up after every change. But having regression tests helped to track such bugs and prevented them from reappearing.

    5) The documentation of various aspects of the project on the wiki has helped new developers and the QA process. But, the issue is of keeping it in sync with the changes being made to the software. And, I think we are not always in sync.

    6) The tests have really helped us. But I wonder if we are writing too many tests. The test coverage is around 60%. I feel that, since we have remote developers and also turnover in the team, these tests serve as a good safety net.

    7) We need a continuous integration server. The developers are not disciplined enough, and don’t run tests before every commit. This means, if tests are failing then sometimes it goes unnoticed. We deliver software to client and QA every few days. But, building a setup is a manual process, and done by a developer on his local machine. So, sometimes local changes creep into the build.

    Notes from a TDD session

    My friend wanted to do a TDD session with me. I have experience of using TDD on a couple of projects in my previous company. But for the past year I have been doing freelance work, and have not yet had a chance to do TDD on a live project. I have been working on a legacy project, to which we have added unit and integration tests. So, which example to do for the session? I decided to go for the bowling kata. We used C# with the MbUnit framework.

    It took just 5 minutes to go through the rules, as my pair was already familiar with the bowling game. So, how to go about writing the first test? My friend wanted to know how to go about writing tests. I told him to think of himself as the user of the API: how would he like to use the class library? Write the test for the simplest behavior you can think of.

    From Mike Bria,

    Keep yourself focused on writing tests for the desired behavior of your objects, rather than for the methods they contain.

    Our first test – ScoreShouldBeZeroWhenGameBegins. We did not have any design discussion before we started. Then we reached the point where we had to take a design decision to pass the test ScoreShouldBe24_WhenSpareIsFollowedByRegularThrow. How do we implement the “determination and calculation of spare”? My pair suggested having a Frame object. But I felt we could achieve the desired behavior using the BowlingKata class. We implemented it, and the test passed. But the code looked a bit ugly. Then the next test case was ScoreShouldBe39_When2SparesAreFollowedByARegularThrow. This led to a long design discussion, and we ended up with a Frame class.

    Good advice from James Shore,

    Don’t let design discussions turn into long, drawn-out disagreements. Follow the ten-minute rule: if you disagree on a design direction for ten minutes, try one and see how it works in practice. If you have a particularly strong disagreement, split up and try both as spike solutions. Nothing clarifies a design issue like working code.

    My friend wanted to know whether, having ended up with a Frame class, we had to write tests for this class. I did not feel we needed to. When we are writing tests, a class usually does not exist in isolation. A class achieves its purpose by collaborating with other classes. And here our existing tests were good enough.

    We ended our session in about 1 hour 30 minutes (the kata was not completed). I committed the code to GitHub. Then I spent some time looking at the design, and did some refactoring.

    Using the idea of Michael Feathers’ Class Splitting, below are diagrams of the 2 classes -

    image of Frame class with methods and variables

    image of bowlingkata class with methods and variables

    Interesting notes from the 2 diagrams (and some additional information)

    1) In the Frame class, all the methods just use the instance variables, whereas in the BowlingKata class, some methods use each other.
    2) Good Command Query Separation – all the methods of the Frame class are queries, except PinsFallen. All methods of BowlingKata are commands.
    3) All methods in the Frame class are public or internal. BowlingKata has some private methods too.
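The split between the two classes can be sketched in Python. This is not the C# code from the session: the method names other than pins_fallen (after PinsFallen) are assumptions, strikes are left out, and the throws in the usage lines are simply chosen so that the totals match the test names 24 and 39.

```python
class Frame:
    """One frame of up to two throws (strikes are omitted in this sketch)."""

    def __init__(self):
        self.throws = []

    def pins_fallen(self, pins):
        # The only command on Frame; everything else is a query.
        self.throws.append(pins)

    def is_complete(self):
        return len(self.throws) == 2

    def is_spare(self):
        return self.is_complete() and sum(self.throws) == 10

    def pins(self):
        return sum(self.throws)

    def first_throw(self):
        return self.throws[0] if self.throws else 0


class BowlingKata:
    def __init__(self):
        self.frames = [Frame()]

    def throw(self, pins):
        # Open a new frame once the current one has had its two throws.
        if self.frames[-1].is_complete():
            self.frames.append(Frame())
        self.frames[-1].pins_fallen(pins)

    def score(self):
        total = 0
        for index, frame in enumerate(self.frames):
            total += frame.pins()
            # A spare earns the first throw of the next frame as a bonus.
            if frame.is_spare() and index + 1 < len(self.frames):
                total += self.frames[index + 1].first_throw()
        return total


def play(throws):
    game = BowlingKata()
    for pins in throws:
        game.throw(pins)
    return game.score()


new_game_score = play([])
one_spare_score = play([5, 5, 7])
two_spares_score = play([5, 5, 5, 5, 7])
```

With the spare logic inside Frame, BowlingKata only has to decide which frame receives a throw and when a bonus applies.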

    Incremental Design

    Our session was very much incremental design, as described here by James Shore,

    This takes place in three parts:

  • start by creating the simplest design that could possibly work,
  • incrementally add to it as the needs of the software evolve,
  • and continuously improve the design by reflecting on its strengths and weaknesses.

    We followed the rhythm of TDD – fail the test, write just enough code to pass the test, and refactor. The code is here – http://github.com/vishalsodani/LearningTDD

    Note: A friend of mine told me that tests alone don’t lead to good design. You can have a class with many small, tested methods that still takes on too many responsibilities. The SOLID principles are a good guide during implementation. I agree with him. As soon as you feel that a class is doing too much, apply SRP.