Fog Creek Software
g
Discussion Board




scheduling combinatorics

I should probably know this since I took a ton of combinatorics in college, but is there a tractable algorithm for this problem:

Say you have m machines and n tasks, where m > n (well I guess it is trivial for m<=n).  Each of the n tasks takes t_1, t_2, ... t_n minutes.  How do you distribute the the tasks so that the total completion time is minimal?  Or calculate a mapping from machines to subsets of tasks.

What if I make it a little harder, say the machines have a speed factor s_1, ... , s_m.  Say the total time to completion for a task i on machine j is t_i * s_j.  Same question, find a mapping so that the total completion time is minimized.

I assure you this is not a homework problem, but it is a little embarrassing because I remember doing tons of homework problems like this!

Roose
Thursday, January 29, 2004

AFAIK there is no known way to guarantee the most efficient schedule, like the graph search problem.

Matthew Lock
Thursday, January 29, 2004

Maybe a copy of Sedgewick works would be useful ?

"Algorithms in C++" (he did a Java version lately).

Philippe Back
Thursday, January 29, 2004

In the literature this is known as a "multiple knapsack" problem. As Matthew implied, finding an optimal solution is NP-hard. But you should be able to find good-enough solutions in a reasonable time. Google may help.

MugsGame
Thursday, January 29, 2004

Did you state your problem as intended. Isn't it easy and obvious if you have *more* machines than tasks? Fastest machine gets longest atsk, and then in order down to  slowest machine gets shortest task. Slowest m-n machines are unused. Total time is the largest speed * time product sinc e all are in parallel.

Or am I missing something obvious???

sgf
Thursday, January 29, 2004

According to "Complexity and Approximation" the 'Minimum Processor Scheduling with Speed Factors' problem has no known polynomial time algorithm. However, an approximate solution can be computed in polynomial time where the time complexity is proportional not only to the size of the input, but also to the quality of the desired solution.

A reference is provided to:
Horowitz, E. and Sahni S.K. (1976) "Exact and approximate algorithms for scheduling nonidentical processors", J. ACM 23, 317-327.

Devil's Advocate
Thursday, January 29, 2004

Not an expert in the field, but it sounds like a typical "job-shop" problem, rather than a "knapsack" problem.

Nigel
Thursday, January 29, 2004

Sorry you're right, it should be m < n, trivial for m >= n.  And yeah I pretty much guessed that it was NP-hard, but just wanted to make sure.  I do remember the knapsack problem phrase vaguely.

What's the "try all possibilities" algorithm?  I can't really think of that off hand.  I mean the constraint is that the subsets of tasks you map each machine to must be a partition of the set of all tasks, but I can't think of a way to iterate over all of those.

Roose
Thursday, January 29, 2004

It's straightforward to iterate over all partitions.  If you have m machines and n tasks, there are n^m partitionings.  Let x enumerate the partitionings, x = 0 to (n^m)-1.  In each partitioning x, task y goes on machine (x / y**m) mod m.

In other words, treat x as a base-m number with one digit for each task.  The digit for task y identifies the machine to put the task on.

Of course, n and m don't have to get very big before n^m becomes unmanageable.

rob mayoff
Thursday, January 29, 2004

try this pseudocode on for size:

solve(depth, assignment) {
  if (depth == num_tasks) {
    if (f(assignment) < best) {
      best_assignment = assignment
      best = f(assignment)
    }
  }
  else {
    for each machine m {
      assignment[depth] = m
      solve(depth + 1, assignment)
    }
  }
}

This will search the entire space of possible assignments and store the best one. This is not optimally efficient by any means.

Devil's Advocate
Thursday, January 29, 2004

*  Recent Topics

*  Fog Creek Home