We have a new home!

Written on July 1, 2015

Over the last year or so I have reduced the number of blog posts I write. This is partly due to the change in my work and schedules. Since then I have found that I am not able to keep up with the moderation of the comments on WordPress, and I am getting a lot of spam.

Read More

A Python API for flight data!

Written on November 2, 2014

Of late I have become a fanatic of learning about aircraft, following aircraft on flightradar24, keeping track of tail numbers I flew on, asking questions and all that. I even track the on-time performance of any flight I book or take before I plan my journey!

Read More

IPython kernel for Redis

Written on November 2, 2014

I have been working on understanding how IPython works, the kernels, the client etc. I have managed to figure out how the zmq client and server mechanism works and makes it so simple to add so many types of clients. It’s really awesome.

Read More

The illusion of high productivity

Written on June 8, 2014

Nowadays the emphasis seems to be on developer productivity - you have the tools and libraries that have been built by people who know how to do it right, these are open source, and there is a community to help you out. Or there is a large corpus of previous projects in the company that you can simply reuse and save a lot of work. All you need to do is get the libraries, follow the examples and you should be good to go! You should be able to build castles out of thin air!

Read More

PyCon 2014

Written on April 15, 2014

I was at PyCon this year in Montreal, and gave a talk and conducted a tutorial. Here are the links to them on YouTube.

Read More

Building a simple journey planner

Written on March 4, 2014

In my last blog post I mentioned using a Scrapy crawler to build information about the bus routes in Hyderabad. Once I had the data in place, what to do with it?

Read More

Locking issues with Java blocking queues

Written on February 23, 2014

A while ago I had posted about issues with poll() vs poll(timeout) on the Java concurrent blocking queues, and issues with the bucket sizes of concurrent hash maps. Those solutions worked well - but the issues with locks seem to love me, so they are back. And this time it is on the other end of the operation - the offer() and offer(timeout) calls.

Read More

The DEV estimate from hell - why some things seem to take a lot longer to get done

Written on January 4, 2014

We all have given estimates for development work - and we all have either missed them at some point or have been asked why it takes so much time to get something done. How and what we respond to these questions varies - but in all cases the common theme is that there were either scope changes, or unexpected issues and defects, or integration problems - and we always promise to do better next time.

Read More

Do you remember your monthly budgets?

Written on November 22, 2013

As part of our daily lives we have at some point come across situations where we felt we should better remember and better stick to our monthly budgets - this is by far easier than tracking individual expenses - but this still requires that we remember numbers off the top of our heads.

Read More

Writing less code for a given deliverable - is that possible?

Written on February 18, 2013

I have been writing code on the job for over 8 years now, and have written many thousands of lines of code in Java - partly because Java is verbose, and partly because of the size of the problem being solved and the size of my deliverables. I have also tried my hand at Python where I can get more done in fewer lines of code, and Python frameworks like Django where I can get a basic website up in fewer lines of code than anywhere else.

Read More

Using Big Data to solve real world problems

Written on January 13, 2013

Big Data - large data sets on the order of petabytes - is a buzzword that you must have heard each time you hear about the next big internet website or social network or e-commerce website. Amazon, Google, Facebook, Netflix and many more large internet corporations have proved time and again that Big Data and analysis on these data sets can provide you the critical business intelligence that will take you to the next level and make you the absolute leader. Great!

Read More

Managing code deployments in large distributed applications

Written on December 10, 2012

Most of us have worked on applications that are small enough that they can be deployed to the user’s desktop, and also on applications that are deployed to servers. These could be applications like web applications that are deployed to a web server - in this case the code is the same across all servers. There can be cases where there are many components that use the same code base but are distinct enough that you can put a different version of the code on each component. But generally, the way the code is deployed depends on the nature of the application. And depending on the deployment, there can be some issues with how you manage the code.

Read More

Passing entities between two systems that do not allow the same set of operations

Written on November 9, 2012

If you have ever worked on any sort of adapter - a piece of code that takes entities from a particular system and then makes them available to a different system - then you definitely must have come across a situation where the two systems do not support the same set of operations, or mutations on the entity. There will be strong reasons for this, and since these are two different systems there will be another thousand reasons why each was built the way it was built.

Read More

Open development processes - how feasible are they?

Written on October 20, 2012

Open development process - the Utopian dream where you are building on a software platform or using a product and you find a bug; you check out the source code, figure out what to fix and how; make a fix, add tests, create a patch and submit it. It gets approved, pushed out and you are a happy developer! You fixed a problem, you have a contribution to show and you earn brownie points! This is developer heaven.

Read More

Understanding transient variables in Java and how they are practically used in HashMap

Written on October 2, 2012

What is the significance of the transient keyword in Java? If you know the answer, good! You are either a person who uses this a lot or a person who has read about it very recently. If this seems like a word from a half remembered dream, well don’t worry, you have company. I was, and will again be, confused if you asked me about this in an hour. It is one of those things that I learnt but never had to use - mainly because I never worked on code that required me to worry about how my objects were serialized. I could delegate that to the libraries.

Read More

Temporal data - what was the value of something at a given time?

Written on September 20, 2012

We all know data, we all know consistency of data when dealing with transactions. There is another aspect of data - temporality, meaning data at a given point of time. What is the value of something now? And what was the value of it as of yesterday morning? I haven’t worked too much with temporal data, but have used a few applications that provided this - in their own ways. I was reading through the new Google research paper on Spanner - their global time aware database - and came across the TrueTime API. This forced me to think about the temporality of data and how important that is.

Read More

Things to consider when building connection retry and automatic failover

Written on September 11, 2012

Have you ever been in a situation where a connection to a resource was lost and your application either did not tell you about it, did not try to reconnect, or ended up in a mess trying to reconnect? How does that feel? You might have thought that it would be great to have something that can automatically recover so that you don’t need to intervene, or even avoid any manual recovery work that comes after a restart to recover from these issues.

Read More

Multi-layered applications and long lived objects - issues faced

Written on September 8, 2012

On my day job I build applications that need to get data in and out fast to the other services on our distributed architecture - speed is the key here and so is the ability to reuse and build new services with existing code. We are more than happy to be able to build something that is configuration driven and can be easily re-purposed and deployed for another requirement by just changing the XML files. All this leads to a design that is flexible, separates out responsibilities, has a distinction between interface and implementation and, more importantly, is layered. The layer that consumes does not enrich and the layer that enriches does not publish back again.

Read More

Taking a problem from simple to a massively parallel execution

Written on September 1, 2012

Distributed computing and parallel computing used to be something I considered very very high tech stuff that I was not working on. But over the years I figured out that I was working on some of these - without knowing it. Early on I realized that I was building distributed systems when I got to work on a redesign of a messaging layer at an investment bank - the number of components that touched the messages and the whole way in which we distributed the logic and load across a set of services made me realize that what I had worked on previously was also something similar - only that I had called it Service Oriented Architecture (SOA).

Read More

Casting nulls in Java

Written on August 2, 2012

I admit this was something that I never noticed, even though I have used it day in and out. And it was a surprise when I realized it.

Read More

How solutions/platforms grow, evolve and become extinct

Written on July 25, 2012

Every so often, I hear someone saying or writing about how a particular solution seems wrong and how given a chance they would design it properly and show how it should be done. My reaction - bravo! If you do get a chance to do the thing cleanly again, definitely do it. Until, of course, your solution has evolved enough and there is a new system or a new guy that claims the same thing yet again.

Read More

Building a rudimentary cache in Python using Twisted - part 2

Written on July 17, 2012

I decided to go ahead and try using the cache that I built for some scenario that I might expect to see in real life. Part of the work I do daily demands that I start one thread to create some data that goes into a cache or a messaging layer and then another thread in the same application consumes it - pretty basic multithreading in Java and nothing really special.

Read More

Building a rudimentary cache in Python using Twisted

Written on July 16, 2012

Every now and then I try and see if I can take something that I built in Java and build it in Python. I had long thought of doing a socket application in Python using the Twisted framework, something other than the basic echo, and I finally did it today by building a simple rudimentary cache with just PUT and GET commands, not even a delete. I admit it is lazy of me, but this illustrates the simplicity and it can easily be built upon.
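To give a feel for how small such a cache can be, here is a minimal sketch of just the command handling. The wire format (`PUT key value`, `GET key`) and the reply strings are my assumptions, not taken from the post, and the network layer is left out entirely; in Twisted a handler like this would typically be called from a line-based protocol's `lineReceived` method.

```python
class SimpleCache:
    """In-memory key/value store driven by text commands (hypothetical format)."""

    def __init__(self):
        self._store = {}

    def handle(self, line):
        """Process one command line and return the reply to send back."""
        parts = line.strip().split(" ", 2)
        command = parts[0].upper()
        if command == "PUT" and len(parts) == 3:
            self._store[parts[1]] = parts[2]
            return "OK"
        if command == "GET" and len(parts) == 2:
            return self._store.get(parts[1], "NOT_FOUND")
        # Anything else, including DELETE, is not supported.
        return "ERROR"


cache = SimpleCache()
commands = ("PUT city Hyderabad", "GET city", "GET missing", "DELETE city")
replies = [cache.handle(line) for line in commands]
```

Keeping the command logic free of any socket code means it can be exercised without starting a server.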

Read More

The myth of the perfectly generic and reusable solution

Written on July 1, 2012

Oftentimes when I set out to do something new on the job, I hear a simple requirement over and above the actual business function - keep it generic enough that it can be reused. And I set about achieving the mythical generic solution that can be reused again and again for a variety of similar use cases. Do I succeed? Depends on how you define success. Given that my hands are tied because of an existing platform and the solution has to fit in this rather than be a rogue process, I make compromises and we end up with something that sort of looks much better than hard coding but also cannot be directly reused without making small changes. The closest I got is making a component that can take JavaScript functions in configuration, and I can use these scripts to tweak behavior as I see fit.

Read More

Proxies and Mocks in Python

Written on June 19, 2012

Yesterday I wrote a post about using the Proxy class in Java to create a trivial mock framework - creating the same thing in Python takes even fewer lines of code. The generic proxy can be implemented very easily as shown below.

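class Proxy(object):
    """Generic proxy that returns canned values for expected attributes."""

    def __init__(self, subject):
        # Assumed body: the excerpt omits it. Keep the subject and start
        # with no expectations (attribute name -> value to return).
        self.subject = subject
        self.expectations = {}

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name in self.expectations:
            def wrap(*args, **kwargs):
                return self.expectations[name]
            return wrap
        raise AttributeError("No expectations set on mock for this attribute")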
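For comparison, newer Python versions (3.3 onwards) ship `unittest.mock` in the standard library, which covers the same ground. This is only a sketch: the `gateway` object and the `checkout` function are hypothetical names made up for the illustration.

```python
from unittest.mock import Mock

# Hypothetical collaborator that we do not want to call for real in a test.
gateway = Mock()
gateway.charge.return_value = "approved"


def checkout(gateway, amount):
    """Code under test: delegates to whatever gateway it is handed."""
    return gateway.charge(amount)


result = checkout(gateway, 25)

# The mock records how it was used, so the call itself can be verified too.
gateway.charge.assert_called_once_with(25)
```

The hand-rolled proxy is still the better way to see what is going on underneath; the library version adds call recording and verification.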

Read More

Building your own Mock objects with Proxy pattern in Java

Written on June 18, 2012

Most of us, if not all, have written unit tests for our code using JUnit or comparable tools. In most cases we are in control of what is being tested and we can provide all the inputs that are needed to test the scenario. But then there are cases where there are external factors or classes that cannot be instantiated for tests and we need to find ways to simulate them - the suggested approach is to use mock objects.

Read More

How much memory does your application need?

Written on June 11, 2012

I am a core Java developer, and have been doing that for about 8 years now, and when people come to me and ask me how much more memory my application will need or complain that Java applications eat up memory on the boxes - I just give them a dry smile, not wishing to comment or explain to them why my applications are no different from their applications. Even C++ or Python applications can and will use memory, and that is why even they have memory leaks. Java is not the only memory hog out there - yes we demand our pound of flesh up front in the form of heap space, but we live within that.

Read More

Making sure your application can scale, with less effort

Written on June 7, 2012

These days application scalability is an implicit requirement in whatever you build - you might be doing a few hundred users on day 1 but as time goes by everyone is expecting your code to deliver for thousands of users, or maybe for a few users with near real time performance. Even a second of delay is not tolerated. This applies not only to web applications like Twitter or Facebook but also to enterprise applications that are used for boring back office processing.

Read More

AMQP, ZeroMQ, JMS, TIBCO RV and so on - which messaging system to use?

Written on May 20, 2012

I have been working with middleware systems for the last 8 years, building and using them in many ways. And even after all this I must say that there is a lot of confusion in my head as to what a messaging platform must offer and what it should do. Confusion not because I don’t know my stuff, but more because you can solve the problem in more ways than one.

Read More

Anatomy of building a low latency messaging platform

Written on May 11, 2012

Disclaimer - I am not going to reveal here that messaging systems are built from a secret alien technology that none of us knew about. We all know that messaging applications are built using simple sockets, the TCP or UDP protocol, some sort of mechanism to represent queues or topics, some storage on the back end to persist messages between restarts, and a means to subscribe to data and receive callbacks. These are generally all that you have in a messaging system.

Read More

Gathering data for building a personalized user experience in applications

Written on May 6, 2012

If I told you that a person is visiting you, gave you no information about that guy, and asked you to prepare a list of things for him to do while he was visiting, would you be able to do a good job of it? You might actually be able to get a so-so result, but the outcome of whatever you do will be far from good.

Read More

Thread safe usage of third party API

Written on April 22, 2012

Multithreaded programming is not rocket science but is still difficult to get right. Anyone who has done a moderately complex solution with more than two threads knows how things can quickly get out of hand. In fact it is possible to get things in a mess even with two threads.

Read More

Breaking down a problem as a team

Written on April 14, 2012

If you walked past most programmers these days and asked what it is that they were coding for, then the likely answers would be - business requirement, use case, user story, bug fix - maybe a few more, but very few of them will say that they are coding to solve a problem. You may ask if it really matters; I think it does impact how we think about the solution.

Read More

Moving backwards from distributed to smaller monolithic systems

Written on March 24, 2012

Before you decide to shoo-shoo this post and call me someone who has no idea how great distributed systems are - forget that idea. I am not talking about all the social networks or the high volume websites out there, I am not talking about the oh-so-agile and oh-so-great enterprise software that you built last night. I have worked all my life in building messaging layers that facilitate distributed systems and I still swear by them. I am just going to talk about a different thing here.

Read More

Making your code suitable to automatic unit testing

Written on March 13, 2012

If you tell me about unit tests and the importance of having those to build a good application, my emotions swing between two extremes - one where I fantasize about having the perfectly unit tested software, and two where I want to vent my frustration on those tests being broken - everything in between rarely crosses my mind.

Read More

Timers and Timed events - can you be certain they will be triggered on time?

Written on March 2, 2012

We all need to measure time, whether we are late for an appointment or if we want to decide how much more we can laze before we need to get going. In our daily life, thankfully, we only deal with minutes and hours, rarely with seconds. This makes it easy. But it is a different story when we deal with computers and software - we need to measure in terms of milliseconds and microseconds. Some applications like low latency market data for financial institutions demand nanosecond precision in this fast paced world of electronic trading.

Read More

Should I use a file to store data? Or a database? Or some other funky technology?

Written on February 20, 2012

No matter what technology you use, you always have to work with files - source code, compiled binaries, configurations etc. Files are ubiquitous when you work with computers. But then have you ever suggested to someone that you should probably store certain data in a file instead of that funky-new-storage-application because it will be simpler? And having done that have you ever heard them saying - ‘who uses files anymore?’.

Read More

What is a NoSQL database? And why would you use it?

Written on February 10, 2012

I first heard about NoSQL databases in an interview when an i-am-the-dude developer asked me about my experiences with NoSQL. I had no idea, and later looked them up on the internet and found that they are key-value stores - like a hash map, or like Berkeley DB - used to store objects. In fact I had worked on building a huge messaging platform where we used Berkeley DB JE to do just that - store Java beans representing messages so that we can easily reconstruct them.

Read More

Using LinkedBlockingQueue for high throughput Java applications

Written on February 4, 2012

Java provides a LinkedBlockingQueue as part of the standard library from Java 5. This is a very easy to use blocking queue to share data between two threads and not run into any concurrency issues. I have used this as part of many use cases where we had to deal with a producer creating a large number of objects to be consumed in a very short period of time and a set of consumer threads running to process these. There have also been cases where the producer is free to create objects at will, but the consumer controls how they are delivered to the upstream and needs a queue to hold things in between.
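The producer and consumer pattern described above can be sketched as follows. This is not Java: it uses Python's `queue.Queue` as a stand-in for LinkedBlockingQueue, with a single consumer to keep it short. The sentinel object, the queue bound of 10 and the doubling "work" are all assumptions made for the illustration.

```python
import queue
import threading

SENTINEL = object()  # Marks the end of the stream for the consumer.


def producer(q, count):
    """Create `count` items, then signal that there are no more."""
    for i in range(count):
        q.put(i)  # Blocks while the queue is full.
    q.put(SENTINEL)


def consumer(q, results):
    """Process items until the sentinel arrives."""
    while True:
        item = q.get()  # Blocks until an item is available.
        if item is SENTINEL:
            break
        results.append(item * 2)


q = queue.Queue(maxsize=10)  # Bounded, so a fast producer cannot run away.
results = []
threads = [
    threading.Thread(target=producer, args=(q, 100)),
    threading.Thread(target=consumer, args=(q, results)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The bound on the queue is what lets the consumer control the pace: once it is full, the producer simply waits in `put`.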

Read More

Using map and reduce for everyday business problems - part 2

Written on January 24, 2012

In my last post I described what kind of scenarios in our everyday business problems can be solved using map and reduce - we can do this even though we don’t have the kind of computing power that Google or Facebook have. In this post I will show how we can implement a map and reduce approach to solve a pseudo-real business problem in Python. Python provides built-in functions for map and reduce which we will use in this example.
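As a rough illustration of the idea, here is a minimal sketch that totals order values per customer. The order data and the function names are hypothetical, and it assumes Python 3, where `map` is still a built-in but `reduce` has moved to the `functools` module.

```python
from functools import reduce

# Hypothetical order lines: (customer, quantity, unit_price).
orders = [
    ("acme", 2, 10.0),
    ("acme", 1, 5.0),
    ("globex", 4, 2.5),
]


def line_total(order):
    """Map step: turn one order line into a (customer, amount) pair."""
    customer, quantity, unit_price = order
    return customer, quantity * unit_price


def add_to_totals(totals, pair):
    """Reduce step: fold one (customer, amount) pair into the running totals."""
    customer, amount = pair
    totals[customer] = totals.get(customer, 0.0) + amount
    return totals


totals = reduce(add_to_totals, map(line_total, orders), {})
```

The map step works on each order independently, which is the part that could be spread across machines; the reduce step is where the partial results are combined.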

Read More

Using map and reduce for everyday business problems - part 1

Written on January 21, 2012

Most of us have heard about MapReduce - the Google framework for solving problems in parallel across many machines - and about Hadoop, which is an open source implementation of MapReduce. These are tools that are used by Google and Yahoo and many other companies to build scalable websites and all that. All these need many nodes to be set up and the workload distributed across these nodes to get the results that we want.

Read More

Experiences with building websites in Django

Written on January 17, 2012

The first time that I heard about Django was actually when I heard about another equally good and well-known web development framework - Google App Engine. I had read about App Engine somewhere and was itching to try out a test site over the weekend. I am basically a middleware and messaging guy and feel most comfortable with the command line, so I was not fully sure I would be able to build the web pages for my experiment with App Engine. But somewhere in the tutorial, while showing the easy-to-use templates, the guy mentioned that App Engine supports Django templates. That’s how I got to know Django and fell in love with it, mostly for the ease with which you can build things and even more for the Admin website.

Read More

Do frameworks really speed up development and make us more productive?

Written on January 9, 2012

Any developer will know that it is not possible to code from scratch and re-invent the wheel each time we want to solve a problem. Since most problems fall into generic categories with some custom logic, standard libraries were developed which shipped with the programming languages and helped ease things. The Java standard library, MFC for C++ etc. are some examples. While some languages have required us to install the library separately, languages like Python have followed the batteries included approach where we get everything in the standard install.

Read More

Fear of regression tests and application instability

Written on December 29, 2011

We all have heard about or experienced first hand situations where an application owner comes back saying that a seemingly small change will take an obscenely long time because it would impact the other components and regression testing will take time. In these cases you will generally also see someone from your team or above you start a long-winded discussion on why you should have test packs and unit tests so that you can automatically regression test the whole thing and making a change takes only an unbelievably short time. I am sure that sounded familiar - whichever side you were on or not on.

Read More

Making sense of an existing code base

Written on December 23, 2011

At least once in our developer lives we would have come across a situation where we need to look at an existing code base, make sense of it, understand exactly what it is doing and fix something. If you are one of the many active contributors to the various open source projects out there then you might have done this again and again. You look at a project, find an issue, make a fix, send a patch, get it reviewed and get brownie points when they accept it. If you are a software developer in a large company then you probably got put in that place when a developer quit or someone had to revive the code of a legacy application.

Read More

How to build a replicated HashMap in Java

Written on November 7, 2011

We all have heard about clustered application servers and databases and how they can replicate data between instances. We have heard about distributed cache implementations like Ehcache, Hazelcast etc. and even NoSQL databases like MongoDB that replicate data between all the instances. They do a great job of replicating data efficiently and ensuring data integrity. They provide different guarantees on data replication and availability.

Read More

Workflows demystified

Written on October 8, 2011

If you have ever worked on a project that involves building an application for a business process, then you definitely have heard of or used a workflow. A workflow is nothing but a set of steps - with rules that define which step comes after the current step, until you complete all the steps and the result is achieved. So, if you said go to the shop, get a notepad, do your homework on it, and then turn it in for review - that is a simple workflow. This one stops at submitting for review, but if I said that once you get the review result, then go and do something else - then we have added a step in between which requires someone to come in and manually review your work. This is still a workflow, but one that involves a manual approval step.
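The homework example above can be sketched as a table of steps plus a rule for what comes next. This is only an illustration: the step names, the `MANUAL_STEPS` set and the `approve` callback (standing in for the human reviewer) are all hypothetical.

```python
# Each step names the step that follows it; None marks the end of the workflow.
WORKFLOW = {
    "go_to_shop": "get_notepad",
    "get_notepad": "do_homework",
    "do_homework": "submit_for_review",
    "submit_for_review": "manual_review",
    "manual_review": "do_something_else",
    "do_something_else": None,
}

# Steps that cannot complete until a person signs off.
MANUAL_STEPS = {"manual_review"}


def run(workflow, start, approve):
    """Walk the steps from `start` and return the ones that completed."""
    completed = []
    step = start
    while step is not None:
        if step in MANUAL_STEPS and not approve(step):
            break  # Parked at the manual step until someone approves.
        completed.append(step)
        step = workflow[step]
    return completed


approved_run = run(WORKFLOW, "go_to_shop", approve=lambda step: True)
pending_run = run(WORKFLOW, "go_to_shop", approve=lambda step: False)
```

With approval the run goes all the way to the last step; without it the run stops right after the homework is submitted, which is exactly the difference the manual approval step makes.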

Read More

Making Java applications more easy to re-use with embedded scripting

Written on September 27, 2011

Let’s admit it - writing an application in Java takes a lot more lines of code and configuration than in other languages like Python or Ruby. However, Java has been around for many years and it will stay around for many years simply because of the number of applications in various enterprises built around Java. Of late, with the new kids on the block like Rails, Django and all these rapid development paradigms that are out there, poor Java programmers do feel left out. Of course we have Groovy, which is born out of Java, and a couple of other such options, but we still have to stick to Java and the different Java frameworks like Spring, Hibernate etc. for our day jobs.

Read More

Open Source Software and Licences

Written on September 23, 2011

We all have heard about open source software, the philosophy behind it, how it came into being etc. The general opinion is that open source software is free, we can see the source code and change it to fit our needs - perfect! No more paying yearly licence fees! And then there are problems like getting support for it and who to sue when your million dollar system built on open source technologies fails - but that is a different story.

Read More

Godavari - A Python grid computing framework

Written on September 2, 2011

Previously I blogged about how to build a simple grid computing framework using the Python standard library. I have recently been working on converting it into a functional library that can be used by anyone very easily. That is now ready and available for you all. I have not enforced any licence at this point and it is generally free, but I would encourage people to consider donating some amount to help me build the resources required to maintain this as a long term project.

Read More

When coding standards and design patterns won’t help you

Written on July 19, 2011

We all have been through those phases in our lives as developers where we religiously followed coding standards and used the most appropriate design patterns. Before we got to that stage we would have been in a stage where we did not care what these meant or we did not even know these, and sometimes we will be in a situation where we cannot use them for some reason no matter how much we want to. In all these cases the job got done and we got paid. So, do we really need these? Do they really matter? Or are they just instruments of torture which more experienced and opinionated developers use on juniors? Is it possible at all to write a piece of code that is not coded as per standards, has no design patterns and absolutely no documentation and still works? Works without failing?

Read More

Building a simple messaging server

Written on July 16, 2011

Anyone working in a large enterprise IT department has used JMS, MQ or TIBCO RV for messaging. These are proven industry solutions that work well in demanding situations.

Read More

Building configurable enterprise services

Written on May 31, 2011

Some time back, when I was in India and talking to my brother, he mentioned that he could not get himself to work on building something abstract like software, which mostly takes shape inside the developer’s head. He works in accounts and finance and is more used to dealing with numbers and facts that exist on the books. To an extent he is correct - software starts as an idea in someone’s head and then manifests itself as a set of applications or services that process or provide data and perform tasks. In these days of big enterprises spread across the globe and using IT to drive business, it helps make money by providing efficiencies.

Read More