Perpetually In Beta.


When is multi-threading not a good idea?


"When all you have is a hammer, everything looks like a nail."

I was recently working on an application that sent and received messages over Ethernet and serial. I was then tasked with adding monitoring of DIO (digital I/O) discretes. I thought,

"No reason to interrupt the main thread which is involved in message processing, I'll just create another thread that monitors DIO."

This decision, however, proved to be poor. Sometimes the main thread would be interrupted between a Send and a Receive Serial message. This interruption would disrupt the timing and alas, messages would be lost (forever).

So I found another way to monitor the DIO without using another thread, and Ethernet and Serial communication were restored to their correct functionality.

The whole fiasco, however, got me thinking. Are there any general guidelines about when not to use multiple threads?

I asked my friends at Stack Overflow and scoured the internet for information. The results may surprise you.

Think twice, no, three times, before even thinking about using threads

I heard this advice time and time again on blog after blog. Before you thread, think.

I think the old programmers' wives' tale puts it best: A programmer had a problem. He thought, 'I know, I'll use threads.' Now the programmer has two problems.

Many think that if multiple threads are used, old-man-trouble gonna come round the corner and knock the sweet bejesus outta you. There is good reason to fear old-man-trouble. Using multiple threads adds a whole new layer of complexity to your code. If something, somewhere goes wrong with your code down the line, that new layer of complexity makes your code just that much harder to debug and maintain. Not to mention that multi-threaded code is a good bit harder to scale.

In a recent project, I used two threads to implement a command processing routine. Thread One sent Thread Two a command. Thread Two executed that command. Thread Two said to Thread One, "Give me another command." Thread One sent Thread Two another command. And so on and so forth. It took me a long time and many lines of code to get the two threads playing nicely.

Several weeks later, the code broke. I stared at that code for — I kid you not — two hours trying to figure out exactly what I had originally done. Code should just not be that complex. Remember, keep it simple. I could have easily performed the same functionality with a single thread.
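
For what it's worth, a single-threaded version of that command routine can be very small. The following is a minimal Python sketch, not the code from that project; `run_commands` and `execute` are made-up names used only for illustration.

```python
from collections import deque


def run_commands(commands, execute):
    """Execute commands one at a time, in order, on the calling thread.

    `execute` is a hypothetical callable that carries out one command
    and returns its result.
    """
    pending = deque(commands)
    results = []
    while pending:
        command = pending.popleft()        # "Give me another command."
        results.append(execute(command))   # finish it before fetching the next
    return results
```

There is no handshake to get wrong here: the loop itself is the "give me another command" conversation.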

As Thomee on StackOverflow put it, "Don't use threads, unless there's a very compelling reason to use threads."

So, when it comes to threads, ignore Bob Dylan's sweet sweet crooning, "Don't think twice it's alright…."

When it comes to everything else, though, Dylan should prevail.

Absolutely Never

Fievel, in 'An American Tail', made his sentiments on the word 'never' known as his little mouse lungs wailed, "Never, say never, say never my friend." While I try to never adhere to absolutes, there are several cases where multiple threads should absolutely never be used.

When bound by a single resource

Multiple threads should never be used when each thread is just going to be battling it out for the same single resource. In his blog entry, "When to use Threads", Skeet expresses the pointlessness of using multiple threads when sharing a single resource.

"If your application is bound by a single resource (i.e. the disk, or the CPU) and all the tasks you would use multiple threads for will all by trying to use that same resource, you'll just be adding contention."

When I was in elementary school, I certainly knew about the contention stemming from vying for the same resource. Each day after P.E., we would all come pouring back into the building from being outside in the 99 degree heat for the last 30 minutes. We were hot, sweaty, and thirsty, but our elementary school only had a single drinking fountain. There was always a race to the fountain.

When we arrived, we would push, shove, grope, kick — anything to get a drink of that cool refreshing beverage. But no matter how hard we battled each other, ultimately only one person could drink at a time. We were bound by a single resource. And battling each other for water only slowed the process down. A teacher would arrive 30 seconds later and make us form a single-file line. Standing in line, we would tap our feet and clear our throats when we felt someone was taking just a little bit too long, drinking a little bit too much. But in a short period of time everyone got their fill.

A few years ago, I was in the process of moving a large amount of music from one hard drive to another. I selected all the folders at once (all 2000 or so of them) and dragged them over to the second drive. The Windows File Copy routine booted up and showed '45 minutes' as the estimated amount of time to complete the job.

Well, I was in a rush. Thinking that I was clever, I decided to cancel the operation and instead tried to parallelize the copying job. As fast as I could click, I copied and pasted 10 folders at a time into the new drive. When I got about 500 folders in, all the processes ground . to . a . halt. The hard drive head was seeking all over the place and never making any progress on the copy operation.

At this point, I realized my stupidity. The operation was bound by a single resource: the disk! The hard drive was my metaphorical elementary school water fountain. And each copy process was pushing, shoving, and groping for that single resource.

Understand how the resources you are using are bounded, and if those resources only have a single point of entry, there is absolutely no point in trying to use multiple-threads to complete the task at hand.
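
Had the music copy been scripted instead of clicked, the polite single-file-line version might look something like the sketch below. It uses Python's standard `shutil.copytree`; `copy_folders_in_order` is a hypothetical helper name, and the folder layout is an assumption.

```python
import shutil
from pathlib import Path


def copy_folders_in_order(folders, destination):
    """Copy each folder into `destination`, one after another.

    Only one copy touches the disk at a time, so the copies never
    fight each other for the drive.
    """
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    copied = []
    for folder in map(Path, folders):
        target = destination / folder.name
        shutil.copytree(folder, target)  # blocks until this folder is done
        copied.append(target)
    return copied
```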

Sequential Code

Multiple Threads should also never be used when each step being executed relies heavily on the results from previously executed steps. For example, the steps to drive a car are as follows:

  1. Turn On The Ignition
  2. Put the car into 'Drive'
  3. Hit the Gas

To drive a car, you must complete the above steps in order.
Each step must be completed before the next step can be executed.
If you try to complete any of these steps out of order, the results will not be what you desired.
You simply cannot put the car into 'drive' without first turning on the ignition.

I guess you could technically reverse steps two and three by hitting the gas, then put the car into 'Drive', but only if you were filming a remake of Smokey and the Bandit or something. Regardless, you would never try to complete all three steps at once. That would be foolish.

starting with a
b = functionX(a)
c = functionY(b)
d = functionZ(c)

Take a look at the code above. I think we can agree that most code is sequential in nature. You start with A. You plug A into functionX, it returns B. You plug B into functionY, it returns C. You plug C into functionZ, it returns D. Each step leads into the next. FunctionY cannot be executed before B is computed. FunctionZ cannot be executed before C is computed. Trying to implement the above code with multiple threads is like trying to drive a car without turning on the ignition. It is pointless.

If your code must be completed sequentially, using multiple-threads is worthless.
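
To make that concrete, here is a small Python sketch of the same chain, written once plainly and once with a thread per step. The bodies of `function_x`, `function_y`, and `function_z` are placeholders invented for illustration; only the dependency between the steps matters.

```python
import threading


def function_x(a):
    return a + 1   # placeholder work


def function_y(b):
    return b * 2   # placeholder work


def function_z(c):
    return c - 3   # placeholder work


def run_sequentially(a):
    b = function_x(a)
    c = function_y(b)
    return function_z(c)


def run_with_threads(a):
    """Same chain, one thread per step.

    Each step needs the previous result, so every thread has to be
    joined before the next one can start. Nothing overlaps.
    """
    box = {"value": a}
    for step in (function_x, function_y, function_z):
        def work(step=step):
            box["value"] = step(box["value"])
        worker = threading.Thread(target=work)
        worker.start()
        worker.join()  # the next step cannot begin without this result
    return box["value"]
```

Both functions return the same answer. The threaded one just takes more code, and more thread start-up overhead, to get there.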

When to (Maybe) Use Threads

GUI Programming - Keeping a Process Responsive

One place multi-threading does actually come in handy is when programming a Graphical User Interface (GUI). Multiple threads are sometimes used in GUIs to keep them responsive while the computer is chewing on user input.

For example, say you have a program that does nothing but compute the digits of pi. Next, you add a GUI with a single Big Button that toggles between "Compute Digits of Pi" and "Stop Computing Digits of Pi." After you click "Compute Digits of Pi" and start the program, if no second thread keeps the GUI active while the processor is busy computing, "Stop Computing Digits of Pi" can never be clicked. The program could never be stopped. The GUI would be frozen, indefinitely churning out digits of pi until the universe came to an end.

A Threading Rule Of Thumb: When programming a GUI, if handling the user's input takes longer than one to two seconds, consider using a second thread.

Keeping the GUI responsive is very important and contributes to usability. Skeet delves more into this sentiment on 'When To Use Threads' when he says that "the UI can easily become very unresponsive, which gives a horrible user experience. (GUI) threading isn't used to get the job done quickly - it's used to get the job done while keeping the user satisfied with responsiveness. You might be surprised just how quickly a user can notice an app becoming unresponsive. Even if the user can't actually do anything but close the program or move the window around while they wait it gives a much more professional feel to a program if you don't end up with a big white box when you pass another window over it."
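
Here is a rough Python sketch of the Big Button idea with the GUI itself left out. The worker thread churns out digits while the main thread stays free; a real "Stop" button handler would simply call `stop_requested.set()`. The names `start_computation` and `compute_pi_until_stopped` are hypothetical, and the digit generator is the well-known Gibbons spigot algorithm, swapped in here as a stand-in for whatever pi routine your program uses.

```python
import threading


def pi_digits():
    """Yield the decimal digits of pi one by one (Gibbons spigot)."""
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            yield n
            q, r, t, k, n, l = (10 * q, 10 * (r - n * t), t, k,
                                (10 * (3 * q + r)) // t - 10 * n, l)
        else:
            q, r, t, k, n, l = (q * k, (2 * q + r) * l, t * l, k + 1,
                                (q * (7 * k + 2) + r * l) // (t * l), l + 2)


def compute_pi_until_stopped(stop_requested, digits):
    """Worker: append digits until the stop event is set."""
    for digit in pi_digits():
        if stop_requested.is_set():
            break
        digits.append(digit)


def start_computation():
    """What the "Compute Digits of Pi" click would do."""
    stop_requested = threading.Event()
    digits = []
    worker = threading.Thread(
        target=compute_pi_until_stopped,
        args=(stop_requested, digits),
        daemon=True,  # do not keep the process alive just for pi
    )
    worker.start()
    return worker, stop_requested, digits
```

The event is checked once per digit, so the worker stops shortly after the button is pressed rather than instantly.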

But, again, let me emphasize that you should always think twice about using multiple threads, even in GUI applications. Earlier this year, I developed a Qt GUI application that interacted over Ethernet with a big, hulking piece of hardware. I would send a command to this hardware, the hardware would execute that command, and then the hardware would send back a response saying "Everything's OK!" or "FAIL!" The hardware could take up to five minutes to respond to certain commands.

If I didn't want my GUI to freeze up while waiting for the hardware's response, one option would have been to spin the command/response process onto a separate thread. But another, less complex option would have been to use non-blocking calls to the Ethernet API functions and a timer to periodically check the Ethernet ports to see whether a message had been received. Easy, right?

Whether or not you understand what non-blocking calls are, my point is this: There is almost always more than one way to skin a cat. Think about that cat. Wait, is the cat the thread? Never mind.
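
For the curious, the non-blocking option might look roughly like this in Python. It is a sketch under assumptions, not the Qt code from that project: `poll_for_response` is a hypothetical helper, and the idea is that a GUI timer (Qt's `QTimer`, for instance) would call it every so often from the main thread.

```python
import select
import socket


def poll_for_response(sock, max_bytes=1024):
    """Return received bytes if a message is waiting, else None.

    A timeout of 0 makes select() report the socket's state and
    return immediately, so the caller never blocks.
    """
    readable, _, _ = select.select([sock], [], [], 0)
    if not readable:
        return None  # nothing yet; try again on the next timer tick
    return sock.recv(max_bytes)
```

Each timer tick costs almost nothing when no message has arrived, and the GUI event loop keeps running in between.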

Data Analysis (Algorithm Processing) on Multi-Core Machines When Performance Matters

If you are working with multiple-processors, with data computation that can be parallelized, and are in the business of getting things done quickly, then guess what? Multiple-threads are something you should consider.

With more than one processor, multi-threading is truly multi-threading, meaning that two tasks can be (for-real) executed at the same time, one task on each core (processor). On a single-core machine, multi-threading is a scam (a farce!). The scheduler plays a game of red-light, green-light with the threads when more than one thread is used. It decides which task executes when. If you set breakpoints and watch the scheduler switch back and forth between threads, it seems totally random. You can give the scheduler general guidelines as to which thread is more important, but exactly when each thread executes is mostly out of your control. With multiple cores, the scheduler is still in charge, but at least the threads can genuinely run at the same time instead of taking turns.

Data computation that can be parallelized is the opposite of the sequential code processing that was talked about above. Parallel tasks can be divvied up and divided between multiple-cores. The ordering of the execution does not matter. If thread 4 finishes execution before thread 3, this is OK. Think parallel complex data processing algorithms. Think complex math stuff.

When coding up these parallel data processing algorithms, something that you have to be careful about is creating too many threads. Say you get excited about your multiple cores and parallel processes and start spinning threads off all over the place, unbounded. Then, instead of resource contention, you suffer from thread contention, and your CPUs spend more time switching between threads than they do processing them. Put an upper limit on the number of threads you create. Google 'Thread Pools.'

But remember, just because you have a multi-core machine and are working with data computation that can be parallelized doesn't mean you should use multiple threads. Try it single-threaded first. Not fast enough? Try two threads. Still not fast enough? Try three. Take it slow. Only add more complexity when absolutely necessary.
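
If you would rather not Google it, here is a minimal thread pool sketch using Python's standard `concurrent.futures`. The names `process_chunk` and `process_all` are hypothetical, and the sum of squares is placeholder work. One caveat specific to CPython: its global interpreter lock keeps threads from running Python bytecode in parallel, so for CPU-heavy work a `ProcessPoolExecutor` is the usual substitute. The bounded-pool idea is the same either way.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def process_chunk(chunk):
    """Placeholder for an independent piece of the computation."""
    return sum(x * x for x in chunk)


def process_all(chunks, max_workers=None):
    """Process every chunk using at most `max_workers` threads."""
    if max_workers is None:
        max_workers = os.cpu_count() or 1  # one worker per core as a cap
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() hands chunks to idle workers and returns results
        # in the original order, whichever chunk finishes first.
        return list(pool.map(process_chunk, chunks))
```

However many chunks you throw at it, the pool never runs more than `max_workers` threads at once, which is exactly the upper limit the paragraph above asks for.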

So, what have we learned?

1. Try not to use threads.
2. Think.
3. Keep it simple.
4. Sharing a single resource? Don't use threads.
5. Sequential Code? Don't use threads.
6. GUI programming? Try not to use threads, but keep the GUI responsive.
7. Multiple-Cores? Parallel Data Computation? Don't use threads.
8. Multiple-Cores? Parallel Data Computation? The need for speed? Use Threads.

Written by codingwithoutcomments

September 21st, 2008 at 11:23 pm