Monday, March 22, 2010

PhDs make many smart programmers become software engineering n00bs

This is true. Spend a couple of years in a PhD program reading papers and writing throw-away code in Matlab, and it is easy to become a throw-away programmer, a sort of liability in the real world. It is no surprise that many companies look down on hiring PhDs. I've seen kids enter the PhD program with real programming talent and exit as real software engineering n00bs. In graduate school, you might code for 6 years without anybody grading your code. If you get sloppy, you will be worse off than when you started.

The problem is that many advisors don't care whether their students write good code. Writing good papers and giving good presentations -- you will be told that this is what makes a good PhD student. Who cares about writing good code? We'll just have some 'engineering' people re-write it once you become famous. This is what students across the globe are being fed. It is no surprise, because your advisor won't get tenure by turning you into a mean mathematically-inclined super hacker. Then again, your advisor won't care if you go bald, are malnourished, and have no life outside research. There are many things that you have to take care of yourself, and software development skills aren't any different.

Note to the real world looking to hire talent: You should grill, I mean really grill, fresh PhDs regarding their software development skills. Don't become mesmerized by their 4.0s, their long publication lists, and all their 'achievements.' If you want to hire a fresh PhD to write code, whether in a research or an engineering setting, then give them one hell of an interview. I agree with Google's interview process. I studied for it, I am proud of my own software engineering skills, and I was proud to have been an intern at Google (twice). But I know of companies that were sorry they hired PhDs, only to learn that these recent graduates could only dabble on the board and would utterly fail at the terminal.

Note to PhDs looking to one day take our skill set and impact the real world: Never stop learning and never stop writing good code. Never stop taking care of yourself. You were the brightest of the brightest before you started your PhD, and now you have 5-6 years to exit as a real superman. With all the mathematics and presentation skills you will acquire during a PhD, on top of good software engineering skills, you will become invaluable to the real world. It's a real shame to become less valuable to the outside world after 6 years of a strenuous PhD program. But nobody will give you the recipe for success. Nobody will tell you to exercise, but if you want to pound your brain with mental challenges for decades to come, you will need physical exercise in your daily regimen. Your advisors won't tell you that keeping up to date on the tools of the trade, and being a real hacker, is very valuable in the real world. You will be told that fast results = many papers and that it's not worth writing good code.

After obtaining a PhD we should be role-models for the entire world. Seriously, why not? If a PhD is the highest degree that an institution can grant, then we should feel proud about getting one. But we are human, and one is only as strong as their weakest link. We should become super hackers, fear no quantum mechanics, fear no presentation in front of a crowd, and be all that one can be.

This is part of a series of posts aimed at finding flaws in the academic/PhD process and how it pertains to building strong/intelligent/confident individuals.

24 comments:

  1. I'm testing out comments here. I'd like them to be readable with Google Reader.

  2. Anonymous, 8:08 PM

    Do you mind elaborating on Google's interview process?

  3. I don't think there is anything I can add that you can't find online when searching for "Google interview". The important bit -- and quite relevant for somebody like me -- is that even with a PhD you are expected to be a good software engineer. That means that everybody must pass a software engineering interview if they will be writing code. Some of the best hackers I know, the ones who would kick butt at Google, are not my fellow Robotics PhD classmates at CMU. Of course, I know some really talented kids in my department at CMU, but when it comes down to surviving in a C++-heavy development cycle, if you didn't learn it as an undergraduate (and bother to maintain your skills) then you probably didn't learn it in graduate school while becoming a master of a narrow field such as Computer Vision.

  4. Anonymous, 3:24 PM

    PhDs make professionals; if you have a Bachelor's in CS, you are not a programmer. University teaches theory and software writing within a certain base of knowledge, not how to write production software. In a real environment no one cares about how good your code looks; what matters is the ratio of efficiency to the time or money spent on it, basically money gained vs. money (time) spent on development.

    - My opinion

  5. Anonymous, 6:39 AM

    I can't agree more. Actually, part of the problem is that software engineering is an incredibly fast-paced area: continuous integration and cloud computing were non-existent 10 years ago, and now they are the standard. Academics like stability in knowledge, or at least "stackability" of knowledge. But in software, each revolution tends to invalidate the previous ones (think super-computer vs cloud).

  6. I was really very interesting articles, is it information about software development's students I mean really grill fresh PhDs regarding the software development skills.

  7. Mixed feelings. Yes, PhD students do write a lot of throw-away code; however, I personally take software engineering fairly seriously. My throw-away code is always stored in an svn repo, and my non-throw-away projects are stored in separate repos with proper build systems, regression tests, manual pages, portability routines, etc. This is the first time I have been so thorough with my engineering practice because I *know* I have to present this work at conferences.

    Sweeping generalisation?

  8. Hey Edd,

    First I want to reinforce my observation that merely *many* PhD students become worse programmers after many years of writing throw-away code, definitely not *most*. I use the expression "many" because I think it is a surprisingly high number -- much higher than I expected when starting my PhD. I think as an undergraduate I always thought a PhD in Computer Science would make anyone a bad-ass programmer.

    Unfortunately, I cannot speak for all departments nor even all branches of Computer Science. Working in a very mathematical field such as Computer Vision there is an emphasis on theories and formulations instead of building systems. To make things worse, due to the egocentric predicament, I can only share my own experiences related to my own adventure through life.

    So, if fresh PhDs aren't better programmers than their undergraduate counterparts, are PhD degrees worthless? Definitely not! From my experience, many PhD students become powerful orators and better writers, and have the necessary skills to be at the forefront of scientific research.

    I think there are many ways for one to become a bad-ass programmer, and with open source projects being ubiquitous, opportunities are everywhere. I think a PhD is still necessary to develop a critical mind and mathematical skills necessary for working on open-ended research topics.

  9. Oh yes, it is really a very nice blog, and so profitable for software engineering. Thanks a lot for sharing.

  10. Anonymous, 1:46 AM

    I graduated with a CS PhD from a top-20 university not too long ago and have worked for some of the major search engine companies. My advice to you: avoid software engineering if you can. There will always be some Russian or Indian software engineer who is willing to put in more effort than you and work for less money. Leverage your PhD and move up the food chain as fast as possible. Find a research-oriented job where your output is measured by patents and productized ideas rather than by LOC or features implemented. If you insist on being a software engineer, you will find that you will burn out far, far faster.

  11. Unfortunately, I think that in order to advance Computer Vision one cannot fear large-scale experiments. I find too many smart PhDs who do not possess such skills. For example, to convince me that your object recognition algorithm is *good enough*, you'll have to execute your algorithms on thousands (potentially millions) of images. This isn't always as easy as handing off poorly executed "prototype" code to some software engineers.

    I think being a superb software engineer is just a part of the game -- the game of advancing Computer Vision. I don't think being a software engineer is enough, but it is a dimension that cannot be overlooked. I don't think writing C++ all day is going to advance Computer Vision, but it's not something I'm willing to be afraid of.

    Galileo spent countless days building telescopes for the sake of science; therefore, I'm willing to spend countless days engineering large-scale computation frameworks (if that's what it takes).

  12. Anonymous, 9:18 AM

    I totally love this article.
    Kirk

  13. Anonymous, 6:58 PM

    Discipline in writing code is good. But most good schools will measure the quality of your Ph.D by your research output, and that will invariably include dealing with obscure math that nobody will understand other than folks specializing in a certain kind of problem (and the associated mathematical techniques for solving it). Post-Ph.D, you are expected to be hired because of the effort you put into understanding these problems and thinking of creative ways to solve them. Writing and maintaining code is essential, but that should be ingrained in you and after a certain point should be effortless. Great software engineering is good, but it is seldom the focus of your Ph.D. If it were, you wouldn't need the Ph.D or the 5-6 yrs devoted to a field. You can pick up good software practices in <1 yr, but the 6 yrs are better spent understanding and contributing to the field... in most cases... without having a social life :)

  14. I understand that to get a PhD, writing good code is not enough. But to turn an idea into a million dollar product, or to have thousands of researchers use your software and eventually become a household name in academic circles, good code is essential. If you want to make an impact beyond satisfying your PhD committee and change the world, I highly recommend not only writing good code, but sharing it with the rest of the world.

    Good software practices in < 1yr? You can hire those guys if you want -- I prefer to be around the PhDs that mastered both machine learning and software engineering (perhaps by being the only ones in their academic departments who gave a damn).

    Norvig thinks 10 years is what it takes to become a jedi.
    http://norvig.com/21-days.html

    I recommend a PhD for everybody who likes a good challenge. I recommend learning to write good code to everybody who is a programmer.

  15. Anonymous, 11:13 PM

    Wrong! Wrong! Wrong! I am sorry Tomasz, but you got it all wrong in this article. You make a good argument but it misses the point.

    I have personally known many famous computer scientists and have worked with them on several projects. All of them unanimously agree on this:

    Software Engineering is the enemy of good research. You can spend all your time doing that. As a researcher, you have to read, think, write, design experiments, and put up with 100s of things. Practicing good software engineering will kill your time. Time is everything and can lead to strategic advantage or disadvantage.

    Even guys from industry are not in favor of writing highly maintainable code:

    http://www.joelonsoftware.com/articles/fog0000000069.html

    PhD is indeed about writing good papers and presentations. There is nothing wrong about good software engineering practices except they slow you down significantly in some cases. If the project is risky, and the goals are uncertain, we may first want to test the waters by writing throwaway code. And research is full of risk.

  16. Hi Anonymous,

    Thanks for your interest in my post! I read the joelonsoftware post you mentioned, and I don't think I ever claimed that PhDs should re-write code from scratch (which the article indicates is a bad idea in most cases). Perhaps when I mentioned that PhDs should not let their software engineering skills get rusty, I wasn't being clear. Writing code that is good enough to share with others and establish credibility is of utmost importance. I'm not claiming that writing unit tests, writing tons of base classes, using C++, distributed compilers, etc. is necessary to establish yourself as a good researcher, but releasing code in a state that lets others believe your research findings is key. Too many papers get published, but where is the code? This is computer science; if the world isn't seeing your code then try harder!

    I think the most important thing that one can do during their PhD is to establish scientific credibility! "Why should the world trust your scientific intuitions and ability to test your hypotheses?" I don't trust many scientific findings -- if I can't re-create your results then there's always reason to doubt. Research code should be good enough to be usable, and definitely needs to be sharable, but it doesn't always have to be "highly maintainable" or "overly engineered to perfection."

    Nobody has to agree with me; I'm just saying that your pretty curve won't really convince me if the code is nowhere to be seen. And if your code is pathetically structured and so ugly that it can't be read, then it had better run.

  17. Anonymous, 4:07 AM

    Hi Tomasz

    Why do you think there's reason to doubt the findings that you can't recreate? Do you think that the researchers intentionally mislead the scientific community? Do people lie?

  18. I think there is too much pressure to produce results in a timely and expedient manner -- too much pressure to have time to look over your work a second time. I think people should allow other researchers (perhaps the students who do have more time on their hands) to look over their "lab notebooks", where for computer scientists those "lab notebooks" are source code. People should focus on advancing science and not on prematurely celebrating "getting a paper in."

    I do not think that "not releasing code" reflects any sort of intention to mislead the community. That said, forging results does sometimes happen, for example when a student is told they must publish to graduate and has no intention of following up on their work or pursuing any academic track.

    There is dishonesty at all levels, and releasing code helps keep everyone honest. I think a good scientist should always be skeptical regarding non-reproducible research.

  19. Anonymous, 2:45 AM

    I have another take.
    Really good programmers, like experts in other fields, start early in life.
    So in an interview, ask how many years the candidate has already been programming.
    Here are my personal statistics:
    Started programming in BASIC at 14.
    At 15 I published my first BASIC programs.
    Then I learned Pascal in high school.
    Again sold some software, for the Atari ST.
    Of course I went into a CS program.
    And I also thought the best way was to go for a PhD in CS.
    For my PhD thesis I wrote probably 40,000 LOC of C++.
    Of course I published papers, and I have some patents filed.
    I think none of my education makes me a worse programmer.
    But it is certainly different if you start hiring physics, math or EE PhDs and assume all PhDs are the same.
    I think the main problem with programming is that everybody with a 6-month course thinks he can do it... and HR folks seem to agree.
    Gosh, a PhD in CS in combination with real programming experience is certainly more than any HR guy could possibly ask for.

  20. "mean mathematically-inclined super hacker"
    I'd rather have that as my epitaph than P-h-D.

  21. Your programming skills can only get worse during the PhD if you start with any!

  22. I see you are manually approving posts. Here's just a note you can feel free to dismiss/delete. http://quantombone.blogspot.com/2010/03/phds-make-many-smart-programmers-become.html?showComment=1273728460079#c7383183812031967037 and http://quantombone.blogspot.com/2010/03/phds-make-many-smart-programmers-become.html?showComment=1270707942353#c5582438559913801763 appear to be spam.

  23. I agree with your point that software engineering skills can be invaluable in the real world after your PhD. However, it would have been great to see "how" rather than why. If possible, you could have written about 1. how you picked up those great software engineering skills you claim you have, and 2. advice on managing both research and good software engineering. This would have benefited the several PhD students like me who read your blog. Personally, I would love to see examples of good C++ projects on github/elsewhere, written by yourself or by the 'kids' (or fellow PhD students, as I would put it) from your dept., that you consider great.

  24. Well, it's like with any other skill in life - use it or lose it!
