
What is so Hard About Parallel Programming?

03.31.2009

I've seen it again. One more claim that "Parallel programming is hard," followed by the claimant launching into some complex and convoluted solution to make parallel programming easier. (Matt Wolfe has some interesting observations in this regard.) How hard is it to do/learn parallel programming?

Disclaimer: I've been doing parallel/concurrent/multithreaded programming in one form or another for over 20 years. I want my readers to understand that my opinions in this area may be tainted by my experience. I watch chef Gordon Ramsay whip up something in the kitchen, making it look easy, but I know I'd never be able to do the same thing with the same ease. He's had years of practice, and I know that I might be able to get closer to his culinary skills with some instruction and lots of practice. It's the same with parallel programming, IMO.

I do know something that is hard: Theory of Computation. O-M-G! All that stuff about grammars, finite state machines, context-free languages, pumping lemmas, Turing Machines, recursive functions, uncomputability, Church, Post, and Kleene. It was enough to make my head swim when first confronted with all of this. (If your head is swimming with the memory of these topics, too, pause now and take a deep calming breath.) With mega-liters of water under the bridge since those harrowing days, I no longer wake up in the middle of the night screaming about being asked to solve the Halting Problem.

Recently, I started reading Feynman Lectures on Computation (Perseus Publishing, 1996), edited by Tony Hey and Robin W. Allen. These are transcriptions of lectures delivered by Richard P. Feynman during the mid-80s at the California Institute of Technology. Chapter 3 deals with the Theory of Computation. It is 40 pages long and covers a range of topics from finite state machines to Turing Machines to uncomputable functions and the Halting Problem.  Forty pages(!) on stuff that took me a whole semester to cover. Can it really be that hard? Prof. Feynman (a Physics professor) certainly makes it look easy.

Granted, there was a lot of stuff I covered in my semester course that was left out of Feynman's lecture. But, on further reflection, he didn't need to cover it to make his connections and to give his students (likely non-CS majors) enough background to allow them to be conversant with the topic and to get started on their own investigations. I believe that Parallel Programming can well take this path, too.

My sense about the proponents of the "Parallel Programming is Hard" line is that they are probably thinking about the totality of all the things one needs to have mastered to be an expert and effective parallel programmer in all situations. That is a lot of stuff and could reasonably be thought of as "hard." But do we really need to teach it all at once, in one sitting as it were? Couldn't we start by hitting the highlights, with some previews of the issues to come, and give students enough information to be practical parallel programmers?

I think that learning parallel programming is no harder than learning a second programming language. C++ and Fortran and Java have syntax rules and methods of coding that must be mastered in order to write programs, as do Perl, Lisp, COBOL, Python, Ruby, Scheme, APL, C#, and a thousand other languages. Any given parallel or multithreaded programming model/method/API also has syntax rules and methods of coding that must be mastered in order to write programs.
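As a sketch of the kind of "syntax rules and methods of coding" a threading model asks you to master (this example is mine, not from the article, and uses Java's java.util.concurrent as just one of many possible models), here is a sum split across a thread pool:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelSum {
    // Sum an array by splitting it into chunks across a fixed thread pool.
    static long sum(long[] data, int nThreads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        try {
            int chunk = (data.length + nThreads - 1) / nThreads;
            List<Future<Long>> parts = new ArrayList<>();
            for (int t = 0; t < nThreads; t++) {
                final int lo = t * chunk;
                final int hi = Math.min(data.length, lo + chunk);
                parts.add(pool.submit(() -> {          // each task sums its own slice
                    long s = 0;
                    for (int i = lo; i < hi; i++) s += data[i];
                    return s;
                }));
            }
            long total = 0;
            for (Future<Long> f : parts) total += f.get(); // blocks until each part finishes
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

Nothing here is conceptually deeper than a second language's idioms: tasks, futures, and a pool are vocabulary to be learned, like closures or pointers.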

Some may say that the methods and other pitfalls in parallel programming are so different from sequential programming, and this is what makes it "hard." Well, COBOL and Lisp are two very different programming languages, but I've been able to learn both of these during my career.  While I may not be proficient in either of these right now, I feel I could easily brush up on the details and be able to achieve at least a journeyman level of skill.

For any university degree program that involves computer programming, journeyman seems like a sufficient level of skill in parallel programming to ask a sophomore to have achieved. (Is there any university/college that wouldn't expect their rising juniors to have at least two programming languages under their belts?) If more specific parallel programming skills are needed, those niggly, more advanced details can get taught during the junior and senior years. To get to the journeyman level, we need to add parallel programming fundamentals and principles (not all the esoteric details) in the first programming course and continue to exercise those skills beyond that.

What about the creative side of parallel programming? I've often heard that programming is more art than science. I think that this is doubly true for parallel programming. We can easily teach the mechanics of programming. But, if there isn't an innate talent for logical thinking and breaking down a problem into a series of steps, a student isn't going to stick around in a CS program for very long. Once parallel programming becomes a part of the university curriculums, especially if it starts out on Day One, we'll still have people enroll in courses only to later discover that they can't grasp the art.

To summarize, I don't think parallel programming is any harder than learning a second programming language, and, taking a cue from Richard Feynman, we don't need to fill in all the details of the problems and pitfalls inherent in concurrent algorithms in order to start teaching parallel programming at the outset of a student's academic career. We don't need to breed a race of "Atomic Super-Programmers." Give them enough to get started with some chance of success in a controlled environment and fill in the blanks later, as needed. That doesn't sound all that "hard," does it?

References
Published at DZone with permission of its author, Clay Breshears. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Comments

Álvaro Martínez replied on Tue, 2009/03/31 - 2:29am

Check Scala!

Slim Ouertani replied on Tue, 2009/03/31 - 2:46am in response to: Álvaro Martínez

Yes, Scala may solve the problem with the actor model, and Erlang does too.

Artur Biesiadowski replied on Tue, 2009/03/31 - 4:12am

I don't think the concepts of parallel programming are so hard compared to learning another language[1]. What is a major showstopper, IMHO, is the difficulty of debugging and testing. In many cases you will learn parts of a new language by trial and error, be it at the compilation level, at runtime, or during debugging. This gives immediate feedback in your learning process. You can also be quite sure that after writing extensive unit tests your module will be correct; you can look at it as a box with inputs and outputs.

With parallel programming (the hard kind, not Erlang-like solutions), you will get something which works perfectly, except once a month under heavy load. Or it works fine on your dual-core test machine but fails every other time on the 16-core production machine. In some cases it will fail in a mysterious way that is not directly linked to the real problem. With multithreading issues you do not get immediate feedback, and the risk of undetected bugs making it through all layers of testing is a lot higher.
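The "works on two cores, fails on sixteen" failure mode usually comes down to a data race. A minimal sketch (my example, in Java, not from the comment): two counters incremented by several threads, one plain and one atomic. The plain one can silently lose updates under contention while every test still passes on a lightly loaded machine.

```java
import java.util.concurrent.atomic.AtomicLong;

public class RacyCounter {
    static long plain = 0;                              // unsynchronized: racy
    static final AtomicLong atomic = new AtomicLong();  // safe

    // Start `threads` threads, each incrementing both counters `perThread` times.
    static void run(int threads, int perThread) throws InterruptedException {
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    plain++;                  // read-modify-write: increments can be lost
                    atomic.incrementAndGet(); // atomic: never loses an increment
                }
            });
            ts[i].start();
        }
        for (Thread t : ts) t.join();
    }
}
```

After `run(8, 100_000)` the atomic counter is always 800,000; the plain one is usually less, by an amount that varies with core count and load, which is exactly why such bugs evade testing.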

This also means that the 'journeyman' level might not be enough. It is fine to have junior programmers coding some easy, unit-testable modules; bugs will be caught early and failures will stay local. Putting people with the same skill level on multithreaded programming can cause a major disaster.

[1] - I still think that designing a proper lock-free data structure in a language you know is harder than designing a single-threaded data structure in a new language.
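For the curious, the Treiber stack is the textbook case of the lock-free design the footnote alludes to (a sketch in Java, mine rather than the commenter's; the subtlety is that every operation must be written as a compare-and-set retry loop):

```java
import java.util.concurrent.atomic.AtomicReference;

// Treiber stack: a classic lock-free LIFO built on compare-and-set.
public class LockFreeStack<T> {
    private static final class Node<T> {
        final T value;
        final Node<T> next;
        Node(T value, Node<T> next) { this.value = value; this.next = next; }
    }

    private final AtomicReference<Node<T>> top = new AtomicReference<>();

    public void push(T value) {
        Node<T> head, node;
        do {
            head = top.get();
            node = new Node<>(value, head);
        } while (!top.compareAndSet(head, node)); // retry if another thread raced us
    }

    public T pop() {
        Node<T> head;
        do {
            head = top.get();
            if (head == null) return null;        // empty stack
        } while (!top.compareAndSet(head, head.next));
        return head.value;
    }
}
```

The correctness argument is not in any single line but in the retry loops: every interleaving either succeeds atomically or observes the conflict and tries again. That whole-program style of reasoning is what makes lock-free design harder than learning new syntax.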

Raw ThinkTank replied on Tue, 2009/03/31 - 10:34pm

The difficult part is in making decisions, and in the unavailability of programming constructs for doing the job with minimum code.

For example, is it common practice to build the GUI on one thread, the business logic on another, and the database on a third? It's not. And even if you do that, how will you keep it all running in parallel?

There is no practice of keeping program modules distinct, disconnected and yet dependent on each other.

Hence there are no tools available for doing so; it will take time for us to get there as such tools evolve.

Howard Lewis Ship replied on Wed, 2009/04/01 - 5:16pm

One could almost mistake this for an April Fools posting (!)

Writing parallel code is not that hard; writing parallel code that can be trusted is very hard. The Java memory model is more complex than you might think (review Brian Goetz's book if you don't believe me). I've found that people have a hard enough time reasoning about sequential behavior in a program; throw in the demands of picturing multiple threads, mutability, locking, retries, and so forth, and you're beyond most people's comfort zones.
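A minimal illustration of the memory-model subtlety Goetz covers (my sketch, not from the comment): without `volatile` on the flag below, the Java memory model permits the reader thread to never observe the write and spin forever, even though the code looks obviously correct.

```java
public class VisibilityDemo {
    static volatile boolean ready = false; // drop `volatile` and the reader may spin forever
    static int answer = 0;
    static int seen = -1;

    static void run() throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (!ready) { }  // each iteration performs a volatile read
            seen = answer;      // happens-before guarantees this sees 42, never 0
        });
        reader.start();
        answer = 42;   // an ordinary write...
        ready = true;  // ...published by the volatile write that follows it
        reader.join();
    }
}
```

The trap is that on many JVMs the non-volatile version appears to work anyway, which is precisely why "code that can be trusted" is the hard part.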

I know from hard experience that deadlocks do happen quite easily (Tapestry had a devil of a deadlock that was caused by the interaction of Javassist and ordinary ClassLoader operations). That should not be considered a dig at Tapestry or Javassist, just an observation that threading and deadlock issues are prime candidates for The Law of Unexpected Consequences.
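The usual prevention for the kind of deadlock described above is a single global lock-acquisition order. A sketch in Java (the names and scenario are illustrative, not Tapestry's actual code):

```java
import java.util.concurrent.locks.ReentrantLock;

public class LockOrdering {
    static final ReentrantLock A = new ReentrantLock();
    static final ReentrantLock B = new ReentrantLock();
    static int transfers = 0;

    // The classic deadlock recipe: one thread takes A then B while another
    // takes B then A. The standard fix, shown here, is one global lock order.
    static void transfer() {
        A.lock();                // every thread: always A before B
        try {
            B.lock();
            try { transfers++; } finally { B.unlock(); }
        } finally { A.unlock(); }
    }

    static void run() throws InterruptedException {
        Thread t1 = new Thread(() -> { for (int i = 0; i < 1000; i++) transfer(); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < 1000; i++) transfer(); });
        t1.start(); t2.start();
        t1.join(); t2.join();    // terminates: a single order admits no wait cycle
    }
}
```

The Law of Unexpected Consequences bites when the two acquisition sites live in different libraries, as in the Javassist/ClassLoader case, and nobody can see the global order being violated.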

 I'm a big fan of immutability, so I'm very thrilled with Clojure and its Software Transactional Memory approach.

Andy Leung replied on Wed, 2009/04/01 - 8:28pm

I think the world of programming has changed. In my opinion, we should not do parallel programming; instead we should model processes and build models (sequential + parallel) by using frameworks or BPM. Since processes and workflows are controlled by the BPM, programming will be reduced to pure I/O, DB access, or data transfer.

Well, this may be too idealistic, but it is the direction I am looking at.

Just my 2 cents.

Artur Biesiadowski replied on Thu, 2009/04/02 - 1:22am in response to: Andy Leung

"In my opinions, we should not do parallel programming, instead we should model processes and build models (sequential + parallel) by using frameworks or BPM."

Which is exactly what the original poster asked about: is parallel programming so hard that everybody should use some kind of framework to shield themselves from its complexities?

If you are into 'modelling processes' area, then sure, you are right. But:

1) Who will write the frameworks? It is not as if there is a hidden breed of programmers somewhere for whom multithreading is easy and who will write the frameworks for the rest of us to use.

2) Fortunately, there are still plenty of places that do not fit into the frames given by ready-made solutions (be it for performance reasons or because of requirements). As it happens, those also tend to be the places with the most interesting and best-paid jobs...

So, I fully agree with your statement, just replacing 'we' with a plural 'you': the more people focus only on ready-made solutions and integration, the less competition there is in the 'advanced' job market.
