Gokul K Murali

Thursday, August 16, 2012

Tail Recursion with Scheme

Linear Recursion and Iteration

Recursion is a method where the solution to a problem depends on the solution to the smaller instances of the same problem (or simply: when a procedure or function calls itself it's called a recursion).

Now, lets see what is a linear recursive process, let me demonstrate it with the following example(in Scheme)

;recFactorial.scm

(define (factorial n)
  (if (= n 1)
      1
      (* n (factorial (- n 1)))))

Using the above function if we try to calculate the factorial of a number say 4, then it proceeds as follows..

(factorial 5)
(* 5 (factorial 4))
(* 5 (* 4 (factorial 3)))
(* 5 (* 4 (* 3 (factorial 2))))
(* 5 (* 4 (* 3 (* 2 (factorial 1)))))
(* 5 (* 4 (* 3 (* 2 1))))
(* 5 (* 4 (* 3 2)))
(* 5 (* 4 6))
(* 5 24)
120

Here we can see that the amount of information needed to keep track of the operations increases linearly with n. Thus such a process is called linear recursive process.

An important thing to distinguish here is the recursive process and a recursive procedure.
Recursive procedure refers to the syntactic fact that the procedure definition refers to the procedure itself. But with a recursive process we refer to the evolution (at each step) of the process and not about the syntax of how a procedure is written.

An iterative counterpart for the above example can be given as ..

;iterFactorial.scm
(define (factorial n)
  (fact 1 1 n))

(define (fact product count max-count)
  (if (> count max-count)
      product
      (fact (* count product)
                 (+ count 1)
                 max-count)))

Now using the above definition the execution proceeds as follows..

(factorial 5)
(fact   1 1 5)
(fact   1 2 5)
(fact   2 3 5)
(fact   6 4 5)
(fact  24 5 5)
(fact 120 6 5)
120

We can see that a fixed number of variables is enough to store the information needed for the operations. Such a process along with a fixed rule to update these state variables is called an iterative process. Also the number of steps grows linearly with n, hence called a linear iterative process.

Tail Recursion

A tail call is a procedure call that happens inside another procedure as a final action.

During such a tail call, if the called procedure is same as the calling procedure then the procedure is said to be tail recursive.

This property is significant as the tail-recursive implementation can be done without adding a stack frame to the call stack (ie. an iterative process will be executed in constant space even if the tail-call is a recursive procedure).

In the above examples recFactorial.scm is a recursive procedure whereas the iterFactorial.scm is a tail-recursive one (since the variables count,product are updated before calling fact() again).

Here is another example, suppose we want to define a procedure sum(n) which returns the sum of n numbers (ie. sum(5) = 5+4+3+2+1=15)

The recursive procedure can be given as..

(define (recSum n)
   (if (= n 1)
     n
      (+ n (recSum (- n 1)))))

If we call (recSum 5), process proceeds as follows..

(recSum 5)
(+ 5 (recSum 4))
(+ 5 (+ 4 (recSum 3)))
(+ 5 (+ 4 (+ 3 (recSum 2))))
(+ 5 (+ 4 (+ 3 (+ 2 (recSum 1)))))
(+ 5 (+ 4 (+ 3 (+ 2 1))))
15

Now, the tail-recursive procedure can be given as..

(define (tailSum n total)
    (if (= n 0)
      total
        (tailSum (- n 1) (+ total n))))

(tailSum 5) proceeds as follows..

(tailSum 5 0)
(tailSum 4 5)
(tailSum 3 9)
(tailSum 2 12)
(tailSum 1 14)
(tailSum 0 15)
15

In tail-recursive example the variables n,total are updated before calling tailSum again. It is also evident that tail-recursive implementation uses only constant space.

Tail-recursion is an important property provided by certain language standards, like in functional programming languages (eg: Scheme). (more)

Friday, August 10, 2012

Unit Testing in Python

Unit testing is a method by which individual units within the code are tested to determine if they are fit for use and to see if they performs as expected. Each unit may be of different sizes and are the smallest testable part of a module.

Unit testing may be implemented simultaneously with the development of individual units. It can be done by creating an application to perform testing with all probable test cases. [more about unit tesing]

In Python Unit Testing can be done with the 'unittest' module.[see here]
Let me demonstrate it with a simple example.
The program that I'm testing is one which gives the numeric value corresponding to number given in words ie.. 'two hundred and twenty two' will become '222'.
Here's the code Wordnum.py
A simple testing module for the above code is given below.

import Wordnum
import unittest

class WordNumCases(unittest.TestCase):
    knownvalues = (('zero', 0 ),
                   ('one', 1),
                   ('two', 2),
                   ('ten', 10),
                   ('thirteen', 13),
                   ('twenty one', 21),
                   ('hundred', 100),
                   ('hundred and seven', 107),
                   ('hundred and sixty nine', 169),
                   ('seven hundred and seventy seven', 777),
                   ('thousand', 1000),
                   ('one thousand', 1000),
                   ('thousand and two', 1002),
                   ('thousand and hundred', 1100),
                   ('SIx thoUSand FoUr HuNdReD and FIfTy NinE', 6459),
                   ('lakh and SEVenty five thousand ninety seven', 175097),
                   ('million and fifteen', 1000015),
                   ('ten crore ten lakh ten thousand ten hundred and ten', 101011010),
                   ('eighty five billion ninety nine lakh two hundred and five', 85009900205)
                 )

    def testWordNum(self):
        obj = Wordnum.convert()
        length = len(self.knownvalues)
        for word, number in self.knownvalues:
            answer = obj.create(word)
            self.assertEqual(answer, number)

class WordBadInput(unittest.TestCase):
    obj = Wordnum.convert()
    def testInvalidWord(self):
        obj = Wordnum.convert()
        self.assertRaises(Wordnum.InvalidWordError, obj.create, 'hunded fifty')
   
    def testNoInputGiven(self):
        obj = Wordnum.convert()
        self.assertRaises(Wordnum.NoWordGivenError, obj.con, '')

    def testRepetition(self):
        obj = Wordnum.convert()
        self.assertRaises(Wordnum.RepetitionError, obj.create, 'two thousand and three thousand')


if __name__ == "__main__":
    unittest.main()

The result of the script when run as python Wordnumtest.py is

The result says that the 4 tests defined in the test module for testing the above given code ran successfully producing the expected result and is fit to go.

Working
In the test module some known values is given as the test case in a class that inherits an object of type unittest.TestCase. Now the tests are carried by the unittest module through these test cases. The obtained result is then compared with the expected one (given in the test module itself). If both are same result says 'OK' otherwise an error is displayed.

In our Wordnumtest module known values are given as knownvalues in the class WordNumCases. Certain test cases along with the expected result is given in it. These test cases are tested by using the function testWordNum(). The assertEqual() assertion in it checks if the obtained result and the expected one are the same.

Four tests are run using the module, first being the satisfactory result for the known values. Other three tests are used to determine if the code raises exception for the known errors.
Eg: testInvalidWord()- This function checks if the given words describing the number is proper. Suppose that 'hundred and fifty' is unexpectedly given as 'hunded fifty', it should raise an InvalidWordError as given in the code Wordnum.py. It is done by the assertion assertRaises().

Other two tests are used for determining whether the code raises NoWordGivenError and RepetitionError. The former is raised when no input (empty string or only 'and') is given. The latter checks if the same powers of ten are repeated eg: 'three thousand and two thousand' (This test is given just to demonstrate the unittest module and may not be needed in the actual program unless faulty inputs are expected to be given).

Till now we discussed when the test cases suffice. Now what if there is an error.
Let me deliberately introduce an error while checking the known values (returns '269' when 'two hundred and sixty five' is given). Here's what we get when the test is run.

We can that the assertion fails as AssertionError: 265 != 269 and hence the test fails showing that there is a bug in the code.

The above example is a simple one to demonstrate the unittest module in python. This can be extended to whatever code we develop by specifying the proper test cases. And it is a good practice to create test module before or along with development of each unit in the code rather than creating a unit test module at the time of integration testing.

Here's an example code that performs the opposite of the example given here..

Wednesday, August 8, 2012

Python Style Coding

Usually we start our programming through languages like C, C++ or Java. Now if we also want to use Python, it gets a bit tricky. It is a truth that a code is read more often than it is written. So it becomes important to follow certain conventions to maintain consistency and readability (at least in python).

Python being a dynamically typed language does not notify us about the programming errors unless it is run. So with the sheer size of the code, consistency becomes mandatory. (Of course unit testing can be done, but still there is a danger of flawed code running fine; and as always prevention is better than cure).

The two main things that should be kept in mind about the code layout are the use of indentation levels and white spaces. There is something called The Zen of Python in PEP, if you like to read (it's better to have some reading habits).

Indentation
Unlike most other languages white space indents are an important part of Python. It recognizes indents as a substitute for braces. But this may get a little messy while writing code of the size of an elephant.

The official Python style guidelines says that we must indent with 4 spaces but some others like Google dictates indenting by 2 spaces. The use of tabs must be strictly avoided although some editors provide provision to set 4-space or 2-space tabs instead of the conventional 8-space tabs.

Apart from that, the most import thing is never to mix tabs and spaces. It could cause a lot trouble while editing (endurance could become a problem).

def function():
    if condition:
        statement 1
        .......
        ...

def func():
    global i
    i += 1
    return

White Spaces
Although white spaces are not a necessity in Python, as they say the readability counts, the wise use of white spaces and blank lines are needed to ensure it.
The official Python style suggests the use of same amount of spaces before and after arithmetic operators (except within function arguments). It also suggests the use of a space after a coma, colon or a semicolon (like in IEEE format).

i = i + 1
def fun(a, b=0):
    ...........
    ....
    callanother(c=1, d=2)

The use of spaces before and inside parenthesis, spaces in array or dict indices (like array [1]) must be avoided.

Use of blank lines are encouraged before and after logical function bodies or parts. A single blank line is recommended for separating functions and a double blank line for separating classes.

Other conventions

maximum line length of 79 characters.
wise use of comments improve understanding.
use of version bookkeeping and documentation strings.
naming styles (see here)
import statements at the top of the code in the following order with a new line for each import

standard library imports
related third party imports
local application/library specific imports

There are many programming recommendations[1, 2 ] exclusive for Python (Python Enhancement Proposals, PEP) which helps us building more productive codes in less time.

Friday, August 3, 2012

Git : A DVCS

Git is a Distributed Version Control System (DVCS) that can be used to maintain different versions of files modified over time. It differs from almost all other VCS in that it stores the file(s) on the basis of generations. It does so by employing three states of a file -Commited, Modified and Staged .(click here to know more)
Before we can start using Git we need to have it installed and configured. Installing Git

How to use - Git repository
Repository is basically a store of files or directories that Git uses track the states and modifications to perform version control. There is a set of commands that can be used for initializing and maintaining a Git repository. Also we can get a Git project either by by using an existing project or by cloning from an existing repository.
Here are some commands

$ git init
$ git add <project file or directory>
$ git commit -m "message"

If we want to push the project to a remote repository , then

$ git remote add origin https://github.com/user/repo_name.git
$ git push -u origin master

We can create a python script for this if we do not want enter these commands each time we want to add files using git module as follows

import git
import os
import sys

project = sys.argv[1]
remote_origin = sys.argv[2]
g = git.cmd.Git(project)
g.init()
if os.path.isdir(project):
  files = os.listdir(project)
  for file in files:
    g.add(file)

else:
  g.add(project)

message = "First version"
result = os.system("git commit -m %s" %(message,))
if not result:
  print "Successful!"
else:
  print "Unsuccessful, Some error occurred, error code value: %d" %result
  sys.exit(1)
g.push(remote_origin)

Run this script as python script.py /home/user/project https://github.com/user/repo_name.git

The files in the working directory can be in two states : tracked and untracked. Tracked files are those which were present in the last snapshot or generation. Untracked files are everything else.
The command

$ git status

can be used to determine which files are in which state.

So the command

$ git add <filename>

in the above script is used start tracking a new file.

$ git log

The above command can be used to list the commits or changes from the most recent changes to the least.

There are many more things that can can be done using Git and it also provides efficient way of modifying or undoing the changes made to the project file compared to the other Version Control Systems. (see here)

Wednesday, August 1, 2012

Snake-game

Hello everyone, this is my "Hello World!!!" post for my blog..
Recently started using 'pygame' module in python. And thought what if I start with writing a program for the first game that I ever played - Snake (in some good old mobile phones)

So here is it, a functional snake using pygame module.. The aim is to gain maximum points by eating the food (extra points through bonus food) and make the snake grow. Biting itself or the rectangular boundary causes the snake to die..
This is analogous to our life where we learn the basics and grab the bigger ones as they arrive which allow us to grow bigger.
Source code