Posts Tagged ‘python’

What do a philologist and a lollipop have in common?

Wednesday, February 24th, 2010

Question: What do a philologist and a lollipop have in common?

Answer: LOL (if you don’t get it, you will LOL when you see it below)

The generalized problem statement

Given a few strings:

  ewf3hardyharharoiew
  p90weuhardyharhar
  hardyharharoie78wjf
  ahardyharhar787834

Determine the longest substring that they all have in common:

 p90weuhardyharhar
       |||||||||||
      ahardyharhar787834
       |||||||||||
   ewf3hardyharharoiew
       |||||||||||
       hardyharharoie78wjf

In the example above, that is ‘hardyharhar’. The question is, how do you code this?

The solution

Today on Hug-An-Algorithm day, we’re giving the hat-tip to an algorithm called longest common substring (LCS).

The solution can be implemented using a generalized suffix tree, or by dynamic programming (Wikipedia).

Specifically, I’ll walk you through a Python implementation—but focusing on the approach behind it, so that you can implement this in any language you want. I found a few LCS implementations on Wikibooks.org, which is where I got this simple Python implementation from (and adapted it slightly for this demo).


Technical nitty-gritty details warning! (If you’re only interested in how this may be useful to you, skip to the real-world application section after it):


Code from the video, for plugging into your Python interpreter:

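def LCS(S, T):
    # Print the set of longest substrings that S and T have in common.
    m = len(S); n = len(T)
    # L[i+1][j+1] = length of the common run ending at S[i] and T[j]
    L = [[0] * (n+1) for i in range(m+1)]
    LCS_set = set()
    longest = 0
    for i in range(m):
        for j in range(n):
            if S[i] == T[j]:
                # extend the run that ended at the previous pair of characters
                v = L[i][j] + 1
                L[i+1][j+1] = v
                if v > longest:
                    # new record length: drop the shorter candidates
                    longest = v
                    LCS_set = set()
                if v == longest:
                    LCS_set.add(S[i-v+1:i+1])
    # range() and print() with one argument work on Python 2 and 3 alike
    print(LCS_set)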
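
The function above compares two strings, while the problem statement asks about several. One simple way to cover that case, sketched here as an assumption rather than as the approach from the video, is brute force: take the shortest string, try its substrings from longest to shortest, and return the first one that occurs in every input. The name longest_common_substring is made up for this sketch.

```python
def longest_common_substring(strings):
    """Return one longest substring found in every string ('' if none)."""
    if not strings:
        return ""
    # Any common substring must fit inside the shortest input.
    shortest = min(strings, key=len)
    # Try candidate lengths from longest to shortest, so the first hit wins.
    for length in range(len(shortest), 0, -1):
        for start in range(len(shortest) - length + 1):
            candidate = shortest[start:start + length]
            if all(candidate in s for s in strings):
                return candidate
    return ""


samples = [
    "ewf3hardyharharoiew",
    "p90weuhardyharhar",
    "hardyharharoie78wjf",
    "ahardyharhar787834",
]
print(longest_common_substring(samples))  # hardyharhar
```

This is fine for a handful of short strings. For large inputs, the generalized suffix tree mentioned earlier is the better fit.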

Real world application

As researchers, we often find ourselves swimming in a sea of seemingly random data, and we’re always looking for ways to make sense of it: how to obtain insights and then act accordingly.

Here’s how I use the LCS algorithm to find patterns in malicious links. Obviously, it can be applied to any kind of data outside the boundaries of security research. Hope you’ll find this tool handy in your problem-solving toolkit!
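
As a rough illustration of the idea, here is a minimal sketch that looks for a recurring fragment in a few links. It is not the tool from the post: it swaps in the standard library's difflib.SequenceMatcher for the pairwise comparison, and the links are hypothetical, made up purely for the example.

```python
from collections import Counter
from difflib import SequenceMatcher
from itertools import combinations


def longest_common_substring(a, b):
    """Return one longest substring shared by a and b."""
    # autojunk=False turns off difflib's heuristic that ignores very
    # frequent characters in long sequences.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[match.a:match.a + match.size]


# Hypothetical links, invented for illustration only.
links = [
    "http://a.example.com/promo/gate.php?id=17",
    "http://b.example.net/files/gate.php?id=902",
    "http://c.example.org/gate.php?id=5&x=1",
]

# Count how often each shared fragment shows up across all pairs of links.
patterns = Counter(
    longest_common_substring(x, y) for x, y in combinations(links, 2)
)
print(patterns.most_common(1))
```

A fragment that keeps topping the count across many pairs is a candidate pattern worth a closer look.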

Security Researcher: Jay Liew

p.s. This blog post is dedicated to the folks in #python on Freenode.

This is a cross-post from my company’s blog.

Allow your visitors to sign in via Twitter, Facebook, FriendFeed, OpenID, and OAuth with django-socialregistration

Wednesday, October 28th, 2009

The title says it all. Allow your visitors to sign in via Twitter, Facebook, FriendFeed, OpenID, and OAuth with django-socialregistration. Hence the name, “social registration”. I’ve been tinkering with this module and I’ve just got the Twitter OAuth login/registration to work. It’s pretty neat!

You know how much you loathe having to register/sign-up for yet another new service, just to try it? Well, as a site owner, it’s really in your best interest to lower the bar for the public to figure out if they want to use your service. Make it as frictionless as possible.

The README provides all the instructions you need, except one thing: you must explicitly tell Django’s authentication process to also check the socialregistration module’s authentication backends.

In other words, in your settings.py:

AUTHENTICATION_BACKENDS = (

    # this is the default backend, don't forget to include it!
    'django.contrib.auth.backends.ModelBackend', 

    # this is what you're adding for using Twitter
    'socialregistration.auth.TwitterAuth', 

    'socialregistration.auth.FacebookAuth', # Facebook

    'socialregistration.auth.OpenIDAuth', # OpenID

    )

You can either add it in your settings.py or global_settings.py (mine’s in /usr/local/lib/python2.6/dist-packages/django/conf/ )

mod_wsgi compile error: missing Python development package?

Tuesday, September 22nd, 2009

If you’re seeing an error like this when trying to compile mod_wsgi, it’s likely that you don’t have the Python development package installed. Solution: apt-get install python-dev

Turns out, a lot of things can go wrong when you try to compile mod_wsgi, and I spent a good chunk of time before finding it out here. I hope that by posting this here, I can save others the headache. The last few lines of the error were quite misleading, imho. But like they all say, in hindsight, everything’s 20/20.

root@jaysern:/usr/local/mod_wsgi-2.5# make > error
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = (unset),
LC_ALL = (unset),
LANG = "en_US.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
/usr/local/apache2/build/libtool --silent --mode=compile gcc -prefer-pic -DLINUX=2 -D_REENTRANT -D_GNU_SOURCE -g -O2 -pthread -I/usr/local/apache2/include -I/usr/local/apache2/include -I/usr/local/apache2/include -I/usr/include/python2.6 -DNDEBUG -c -o mod_wsgi.lo mod_wsgi.c && touch mod_wsgi.slo
mod_wsgi.c:113:20: error: Python.h: No such file or directory
mod_wsgi.c:114:21: error: compile.h: No such file or directory
mod_wsgi.c:115:18: error: node.h: No such file or directory

*** snip ***

mod_wsgi.c:11684: error: expected expression before ')' token
mod_wsgi.c:11691: error: expected ';' before 'ap_log_rerror'
mod_wsgi.c:11696: error: expected ';' before '}' token
mod_wsgi.c:11710: error: expected expression before 'module'
apxs:Error: Command failed with rc=65536
.
make: *** [mod_wsgi.la] Error 1
root@jaysern:/usr/local/mod_wsgi-2.5#
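
The first real error in the log above is the missing Python.h; everything after it is fallout. Before compiling, a quick check like the sketch below can tell you whether the header is where the interpreter expects it. It assumes you run it with the same Python that mod_wsgi will be built against (sysconfig needs Python 2.7 or later), and the function names are made up for this example.

```python
import os
import sysconfig


def python_header_path():
    """Path where this interpreter expects its Python.h header."""
    return os.path.join(sysconfig.get_paths()["include"], "Python.h")


def has_python_headers():
    """True if the development headers appear to be installed."""
    return os.path.isfile(python_header_path())


if __name__ == "__main__":
    if has_python_headers():
        print("Found " + python_header_path())
    else:
        print("Missing " + python_header_path()
              + " (install the Python development package)")
```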

A change in direction: Python, Django, and Google App Engine

Saturday, June 13th, 2009

This is a cross-post from my other for-own-use developer blog. I’m posting it here because people often ask me what I’m so busy with.

It’s been a while since my last post; I have been real busy. Anyway, just to quickly say this, I’ve made a change in direction in my development efforts.

I’ve said earlier that I am determined to learn a new programming language this year because my brain is starting to rot, but I decided a few months ago that it will not be Cocoa Touch, for various reasons. It is too niche (the emerging global mobile apps market is highly fragmented among Nokia, iPhone, Android, Crackberry and possibly Palm as a viable contender), and the skills only attack a small piece of my larger effort. My gut tells me that piece could probably be farmed out and done cheaper and faster by an iPhone dev shop where Cocoa Touch is the core competency.

A mobile app that does not utilize any connectivity, nay, “intelligent” connectivity, is not much different from calc.exe on your WinXP desktop. It’s fine for a narrow and specific, uninteresting task. An interesting mobile app would tap the cloud for some form of intelligence. Why not leverage that mandatory data plan from AT&T for your iPhone?

When the time comes, if necessary (such as if the iPhone app becomes an important part of my competitive advantage), I’ll pick up Cocoa Touch myself. For now, I do not think that will be the case, so I’m going to spend more time laying the groundwork for the more important piece: the back-end, web 2.0 / cloud computing / SaaS piece. And as Microsoft knows, as Tim O’Reilly says – nobody in their right mind would bet against the Web! (Have you seen HTML5?)

The past month or two, I’ve tried real hard to squeeze time in to learn Python, Django and Google App Engine – all at the same time in parallel, not sequentially. Yes, I’m trying to rush – because I am impatient.

My App Engine “Wall”

Friday, June 13th, 2008

I was messing around with Google’s App Engine this week, and learning some Python (the programming language, not the snake! I had someone ask me that) at the same time.

App Engine (for scalability reasons) does not support SQL, but instead provides an API called Datastore for persistent storage. For the app I made, I queried for data ordered by date, but for some reason the result set still came out unordered. Odd. I’m probably not calling it right. Anyway, it was just an exercise to see what App Engine was all about.

Here’s a replica of the Facebook Wall I made, feel free to write your heart out. As you can see, I’ve had friends say some really nice things about me already. The full address is liew.appspot.com .. but there’s an iframe to it if you’re too lazy to click on that link :)

Update 6/15/2008: Tinkered around a little and realized why it’s unordered. I lost the timestamps on some of the posts, so for those that come out unordered, it’s because there is no record of which came first (the timestamp was NULL). It should be fine going forward. The un-stamped posts take precedence in ordering over the stamped ones, so you may have to scroll down a little to see your post (or just CTRL-F it).
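
To picture the ordering described in the update, here is a small sketch in plain Python. It does not use the Datastore API; it only mimics the described behaviour, assuming an oldest-first sort in which posts without a timestamp come before stamped ones. The posts and the wall_order name are hypothetical.

```python
from datetime import datetime

# Hypothetical wall posts; two of them lost their timestamps (None).
posts = [
    {"text": "third", "date": datetime(2008, 6, 14, 9, 0)},
    {"text": "no stamp A", "date": None},
    {"text": "first", "date": datetime(2008, 6, 13, 18, 30)},
    {"text": "no stamp B", "date": None},
]


def wall_order(posts):
    """Un-stamped posts first, then stamped posts from oldest to newest."""
    # The first key element is False for missing dates, so those sort first.
    # sorted() is stable, so un-stamped posts keep their incoming order,
    # which is exactly why their relative order carries no meaning.
    return sorted(
        posts,
        key=lambda p: (p["date"] is not None, p["date"] or datetime.min),
    )


for post in wall_order(posts):
    print(post["text"])
```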