Friday, April 4, 2014

The intuition behind computing combinations recursively (N choose K)

We all deal with combinations on a regular basis and we're also familiar with the recursive formula

C(N,k) = C(N-1,k-1) + C(N-1,k)

How did this formula come into existence? What is the intuition behind this recursive formula?

Here's my shot at explaining this equation:

Suppose you have a BIG basket of size N and has N items in it and another smaller basket of size 'k' in which you are supposed to put k items from the big basket. C(N,k) now just means the number of ways you can choose 'k' items from the BIG basket and put it in your smaller basket.

Before we delve into approaching the problem let's look at the trivial case of choosing things.

How many ways can you choose one item if you have only one item? You guessed it, it's just one.

Similarly how many ways can you choose N items from N items? You guessed it again, there's just one way of choosing them.

So now we can say C(z,z) = 1, also implying C(1,1) = 1

The other smaller piece of the puzzle you need to think about is not choosing any items when I have 'm' items with me. So how many ways can I do this? The answer is obvious you can only do this one way (just sit idle!)

So we can say C(m,0) = 1, also implying C(1,0) = 1

So now we'll look at another piece of the puzzle: By looking at the two intuitions we developed above. What can you say about you smartly stopping to pick items a particular way?

You stop picking items in the case where there are Z items to pick and there are only Z items left. This includes leaving only one item and picking that one.

Now armed with this background we can go about solving the complete puzzle.

We humans are brilliant, we always try to break up problems into manageable pieces and try to solve them. This is how recursion works. We divide the problem into a readily solvable part (for which we know the answer) and a part which needs further computation

So what I do is approach the problem solving for one item at a time.

I pick an item from the N items in the BIG basket and then I ask myself what can I do with this item in my hand. We can do a lot of things like throw it, eat it.. etc. but I can do one of the two things which are of interest to me in this problem:
I can place it my smaller basket
I can not place it my smaller basket, just place it beside my BIG basket

In the first case as I have placed that item in the smaller basket, I just have to choose 'k-1' items of the 'N-1' items present. So now I should worry about computing the number of ways I can do that.

In the the second case as I have not placed that item in the smaller basket I still have to choose 'k' items from 'N-1' items in the BIG basket. I have to worry about the number of ways I can go about doing this.

So solving this big puzzle now becomes: Summing up these two ways. I note this down in a book. I have two problems to solve now. I go about solving them one after another and go deeper and deeper into trying to solve a smaller problem

I keep track of all the things I did till now and I keep going deep over and over again until I reach one of trivial cases:
Compute the number of ways of choosing Z items of Z items in the BIG basket
Compute the number of ways of choosing no items of the Z items in the BIG basket

Both these are one and I propagate this result to the previous equation which had a dependency on this.

Finally when I finish solving all these problems and finish adding up all the way from bottom to top I've solved the complete puzzle.

A trivial C code explaining N choose K using recursion: I love writing code and can't stop myself from not writing one.

int C(int n, int k)
    if(n == k || k == 0) return(1);
    else return(C(n-1,k-1) + C(n-1,k));

Hope this cleared up things for you. Leave comments if you've got any questions or doubts or if you spot any typos and errors.

Thursday, May 17, 2012

Google App Engine: Memcaching for dummies example

Here's a very simple code written in Python 2.7 and webapp2 on Google App Engine that does memcaching. It is possibly the easiest example I could think of, that I could implement memcaching on Google App Engine.

The main confusion which arises when one starts to look into memcaching is that Google fails to show that memecaching has to be done after instantiating a memcache client object.

The source code for this application is available here and is self explanatory.

Questions, comments, doubts are welcome.

Tuesday, May 15, 2012

Encoding Special Characters with URLLib Quote Function in Python

I was actually testing my final year project today and there was so much noise in the data, I was really frustrated by the end of it.

One of the most troublesome and difficult to figure out was urllib.quote(movie) function.

You should see movie titles people update, here are a few.

I ♥ Bollywood, Funniest movies ever

That really seemed a challenge to be sent over for an API call. We thought of stripping them off in the sentence, but there are a few French and Italian movies which always have some or the other odd character in them. I used quote() function and was getting KeyError exception.

Finally I figured it out, you have to encode it into UTF-8 so that they can be sent across.

so while calling a URL, if it has any special characters in it, better encode it and sent it accross.

Example: (Google App Engine)

import urllib
from google.appengine.api import urlfetch

data = u'♥+'
url = ''

response = urlfetch.fetch(url + urllib.quote(data.encode(encoding = 'UTF-8')))

if response.status_code == 200:
    output = response.content

I wasted almost an hour on figuring out what to do. If you ever get a KeyError when you are using URLLib.Quote, then this is the solution.

Thursday, May 10, 2012

Google App Engine: Geo location identification and reverse geo-coding example

Today, I developed an application that asks users for their location (be it PC or cell phones) and then it displays the latitude and longitude (Geo-Location identification), it even looks up the address (Reverse Geocoding) from Google Maps API. I also store HTTP headers so that this information can be used to identify the device used (I'm working on it right now!)

Okay! This might not involve a lot of App Engine Code, just some basic code to store user location (datastore) and Users API from Google.

These things which are stored can be viewed in an Administrator Area which shows the last 100 peoples' information.

But this has a lot of jQuery and JavaScript code in it! I know jQuery sounds fancy but just visit W3CSchools jQuery tutorial and finish it (it'll not take you more than 5-10 minutes), you'll understand (almost) every line of code.

What I've used:

  • Google App Engine (webapp2, Python 2.7)
  • jQuery
  • A bit of JavaScript as well
  • Google Maps V3 API
  • A bit of StyleSheets
The code can be found here. It's well commented and I expect you to know a bit of GAE (templates as well), jQuery is used a lot so a bit of that as well. Lookup Google Maps documentation if you don't understand the code written for Google Maps.

Wednesday, May 9, 2012

N-Gram generation in Python

I've written a very small code snippet that actually generates n-grams. I've also added a small tweak that gives us the number of times a n-gram has appeared in the document.

The example I've considered is a Shakespeare's play (All is Well that Ends Well). I'll be generating the most common 3,4,5 or 6 word phrases that were used by Shakespeare in this particular play. 

The first thing to do is cleaning up the document. Removing stuff like ACT1, SCENE 1, [To Derpina] etc. The next step is tokenizing the document (splitting the document into tokens by stripping punctuations and white spaces).

Now we get into action: (Core code to generate n-grams)

# word_list = ['some','continuous','clean','set','of',
# 'words','without','punctuation']
# n for n-gram 
# Change it to whatever the requirement is
n = 6

ngrams = dict()
# create an n-gram list 
for i in range(len(word_list) - n + 1):
    gram = tuple(word_list[i:i+n])
    if gram in ngrams:
        ngrams[gram] += 1
        ngrams[gram] = 1

# now the n-grams dictionary contains all the n-grams of the book
# you can sort it or do whatever you want with those n-grams
# and the number of times its appeared in the document.

Okay! this is the only working part of this program that needs to be explained. I believe the the code is self-explanatory if you know a bit of Python.

The source code can be found here.

Google App Engine: Facebook server side authentication

When you're working with devices without JavaScript or in cases where the user has disabled JavaScript you'll need to work out strategies to do your stuff on the server-side. A classic case is form validation using JavaScript, which fails when you don't have a server side verification system and allows users to post junk onto your site.

I've just converted a small script shown in this example on Facebook's site that does the same thing, but on PHP. This code closely emulates the one that leads to the link above.

Another case you might need this is because Facebook has scrapped its Python API support since OAuth 2.0 has been introduced.

There are some limitations to this application, one of which is as Facebook prevents POST headers, you can't integrate this method into a canvas.

It builds upon the sessions example I've covered in the previous post.

The build up of the program is as follows:

There are 3 files:

1. : A class that generate state variable -  a unique combination of 13/27 characters that Facebook generates when you make a request. the variable 'code' in the URL of a Facebook application.

2. This is a class that handles the sessions. It must be inherited by any class that uses sessions. Refer to the post (the link) which covers it for more information on it.

3. This is the main Python program that Google App Engine handles. The two step procedure to check whether the URL has the 'code' request variable in the request header if so continue with OAuth authentication, else redirect the user to the Facebook page on which he gives permissions/authenticates the application.

It really is a primitive code for sandboxing purposes, you'll need to refine the code with Exception Handling and other stuff to actually deploy it.

Any bugs/suggestions/pointers is always welcome. I really don't know if I'll be working on this in any near future. I just wanted this application in public, so that if anybody needs it they can build upon this code.

The download link to the code is here.

Saturday, April 21, 2012

Google App Engine: Sessions

With the advent of Google App Engine (Python 2.7) and WebApp2, there have been many changes in the way people code on Google App Engine. WebApp2 includes a Session Management script in the module 'webapp2_extras'

This is a simple sessions example with Google App Engine with Python 2.7 and WebApp2. The code is self explanatory. I haven't implemented exceptions and errors as I won't be using this snippet of code anymore. But figuring out something is the fun part isn't it?

Session Module:

#Import sessions for session handling
import webapp2
from webapp2_extras import sessions

#This is needed to configure the session secret key
#Runs first in the whole application
myconfig_dict = {}
myconfig_dict['webapp2_extras.sessions'] = {
    'secret_key': 'my-super-secret-key-somemorearbitarythingstosay',

#Session Handling class, gets the store, dispatches the request
class BaseSessionHandler(webapp2.RequestHandler):
    def dispatch(self):
        # Get a session store for this request.
        self.session_store = sessions.get_store(request=self.request)

            # Dispatch the request.
            # Save all sessions.

    def session(self):
        # Returns a session using the default cookie key.
        return self.session_store.get_session()
#End of BaseSessionHandler Class

Main Module:

import webapp2
from webapp2_extras import sessions
import session_module

#MainHandler class where we write code for ourselves
class MainHandler(session_module.BaseSessionHandler):
 def get(self):
  if self.session.get('counter'):
   self.response.out.write('Session is in place 
   counter = self.session.get('counter')
   self.session['counter'] = counter + 1
   self.response.out.write('Counter = ' + str(self.session.get('counter')))
   self.response.out.write('Fresh Session 
   self.session['counter'] = 1
   self.response.out.write('Counter = ' + str(self.session.get('counter')))
#End of MainHandler Class

#The application starts running after this is interpreted
app = webapp2.WSGIApplication([('/', MainHandler),], config = session_module.myconfig_dict)