Procrastination post: Erdos Number

In waiting for my combinatorial explosion of a factorial design experiment to complete, I decided to find out what my Erdos Number looks like, which is basically defined as the collaborative distance between yourself and the famous mathematician, Paul Erdos. Turns out, I have an Erdos number of 4 via the following chain of co-authorship.

/me — Anja Feldmann — Edward G. Coffman, Jr — Joel H. Spencer — Paul Erdős.

You can calculate your Erdos number here.

Optimize your life for learning

Just stumbled upon a gem of an email [1] from Professor Alexander Coward at Berkeley, explaining why he isn’t going to cancel a class despite a strike.

The last couple of paragraphs ought to resonate well with anyone who’s obsessed with learning. To quote the email,

 …

In order for you to navigate the increasing complexity of the 21st century you need a world-class education, and thankfully you have an opportunity to get one. I don’t just mean the education you get in class, but I mean the education you get in everything you do, every book you read, every conversation you have, every thought you think.

You need to optimize your life for learning.

You need to live and breath your education.

You need to be *obsessed* with your education.

Do not fall into the trap of thinking that because you are surrounded by so many dazzlingly smart fellow students that means you’re no good. Nothing could be further from the truth.

And do not fall into the trap of thinking that you focusing on your education is a selfish thing. It’s not a selfish thing. It’s the most noble thing you could do.

Society is investing in you so that you can help solve the many challenges we are going to face in the coming decades, from profound technological challenges to helping people with the age old search for human happiness and meaning.

That is why I am not canceling class tomorrow. Your education is really really important, not just to you, but in a far broader and wider reaching way than I think any of you have yet to fully appreciate.

[1] : http://alumni.berkeley.edu/california-magazine/just-in/2013-11-21/cal-lecturers-email-students-goes-viral-why-i-am-not

Questions to ask before conducting a performance evaluation

I’ve been evaluating some popular cloud systems lately and unsurprisingly enough, I’m finding it really hard to make a final call on a set of results. Have I tuned the systems I’m comparing appropriately for the workloads I’m subjecting them to? Are the workloads reasonable? Is the system itself the bottleneck or is it a set of external factors? There were so many questions piling up that it’s made me rather skeptical about system evaluations I find in papers today.

So I’ve picked up this book, and found this minimal checklist to go through to avoid common mistakes when conducting a performance evaluation.

  1. Is the system correctly defined and the goals clearly stated? 
  2. Are the goals stated in an unbiased manner? 
  3. Have all the steps of the analysis followed systematically? 
  4. Is the problem clearly understood before analyzing it? 
  5. Are the performance metrics relevant for this problem? 
  6. Is the workload correct for this problem? 
  7. Is the evaluation technique appropriate? 
  8. Is the list of parameters that affect performance complete? 
  9. Have all parameters that affect performance been chosen as factors to be varied? 
  10. Is the experimental design efficient in terms of time and results? 
  11. Is the level of detail proper? 
  12. Is the measured data presented with analysis and interpretation? 
  13. Is the analysis statistically correct? 
  14. Has the sensitivity analysis been done? 
  15. Would errors in the input cause an insignificant change in the results? 
  16. Have the outliers in the input or output been treated properly? 
  17. Have the future changes in the system and workload been modeled? 
  18. Has the variance of input been taken into account? 
  19. Has the variance of the results been analyzed? 
  20. Is the analysis easy to explain? 
  21. Is the presentation style suitable for its audience? 
  22. Have the results been presented graphically as much as possible? 
  23. Are the assumptions and limitations of the analysis clearly documented?

Afraid of code? Don’t be

A student approached me over email recently, wondering if I was proposing any master thesis projects he could work on. In his mail, he’d left a disclaimer that he wasn’t a good programmer. I assumed it was one of those cases where the student was indeed capable, but decided to be modest about his abilities.

When we met, I walked him through the problems I was working on and explained a handful of potential master thesis topics I’d sliced out of this space that he could give a shot.

Even though he had no prior experience in distributed systems or the cloud computing arena, I felt he was able to comprehend me quite well, asking rather intelligent questions along the way. Around 15 minutes into our discussion, he asked me what it would entail to go about these projects.

I answered with something along the lines of: “Profiling a bunch of systems, and of course, reading and modifying the source code of at least one of them.”

The student seemed rather discomforted by my response. He immediately reiterated  that he’s not a very good programmer, and that if the project involved a lot of coding, he won’t be able to deliver on time. At this point, I pulled out my Obi-Wan hat and explained to him that being comfortable with programming is inevitable, especially considering the fact that he will be awarded a masters degree in Computer Science once he completes his thesis. I could tell that he wanted to be good at this, but was afraid of not being able to match up to the challenge.

This got me thinking. Haven’t we all been here before? What did we do to overcome our fear of code?

So here’s my personal take on overcoming this fear.

You need to start somewhere

Cliché yes? Yet, it seems to be lost on a lot of students.

See your first major programming task, be it a thesis project, a course project, your first job, or even a self-planned weekend project, as a rite of passage.

Writing software is an art. And as with any kind of art form, it takes sweat and discipline to be good at it.

There’s only so much you can learn through reading books, going through tutorials and solving exercises that will serve as a substitute for actual experience. More often than not in your career, you’ll end up having to read, modify, or maintain code that others have written. The sooner you’re exposed to the innards of dealing with such a situation, the better. So start somewhere!

The first time’s the hardest

Let’s say your project involves building atop an existing code-base (open-source or otherwise). How do you tame the beast?

It’s often as simple as:

  • Acquiring the pre-requisite domain knowledge
  • Identifying the exact components you need to deal with
  • Hacking away

Acquiring necessary domain knowledge

Of course, before you can hack on a piece of code, you need to have some sense of what the code should be doing. It’s practically impossible to dive into a large code-base without any context whatsoever and manage to understand what’s going on. For instance, attempting to hack a Linux WiFi driver without having any clue of what the WiFi MAC/PHY functions are would be an exercise in masochism. If, however, you knew the concerned protocols, things will make more sense to you.

And guess what? You already have or will have to build up this domain knowledge anyway, like when you’re about to embark on a masters thesis.

Identifying the components you need to deal with

A mature open-source project is usually accompanied with really good developer documentation, which will have important information about the code’s architecture and some hints about the the key code paths.

Make a pass through the documentation and have a look at the source tree of the project. The directory hierarchy is usually quite revealing of the code structure. Also, grep through the source files for  keywords pertaining to the components you’re looking for.

Soon enough, you’ll be able to identify the subset of source files you’ll need to tamper with.

Hacking away

I usually begin by instrumenting the code with simple logging statements to get an idea of what’s happening under the hood. Now start introducing your modifications little by little. Re-run the code to see if your modifications work. If possible, use a debugger and stack traces to understand what code paths are being taken internally. Rinse and repeat till you’re done.

With enough time and effort, you should eventually tame the code.

Stand on the shoulders of giants

Key to any form of learning is to make diligent use of the experience of those who’ve traveled the same roads before you. In the programming world, reading code that others have written is an easy way to accomplish this. Better yet, try to contribute patches and have the experts themselves review your code for you. Bear in mind that you will make mistakes, but that’s part of the journey.

In my case, my first non-trivial code patch to an open-source project was this one. Notice how many mistakes I’m making along the way, and how many iterations it took before the patch was finally committed (and how patient the maintainers were!).

Wrapping up

The path to become a better developer is exciting, deeply empowering, and incredibly humbling.

Of course, to experience all of that, it’s imperative that you gulp away all that self-doubt and embark on the path first. So don’t fear code.

In Soviet Russia, code fears you!