
Scheduling in the Random-Order Model

Formal Metadata

Title: Scheduling in the Random-Order Model
Subtitle: Q/A Session F - Paper F1.A
Number of Parts: 123
License: CC Attribution 3.0 Germany. You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Transcript: English (auto-generated)
Welcome to my presentation on scheduling in the random-order model. My name is Maximilian Janke and I am presenting joint work with Professor Susanne Albers, my supervisor.
In this talk we are going to study the most basic scheduling problem of makespan minimization. A set of jobs is given and has to be assigned to m identical parallel machines. Each job runs on precisely one machine; in particular, preemption is not allowed.
In the following we see such an assignment of jobs, or schedule. We will use it to introduce some basic notation. The load of a machine is the time this particular machine takes to process all the jobs it is assigned; in particular, this is just the sum of processing times of all the jobs which run on this machine. The makespan, which is the quantity we want to minimize, is then the time it takes to process all jobs; formally, the makespan is just the maximum load of a machine. This sample schedule already shows that finding an optimal schedule is far from trivial. Has anybody already spotted a way to improve it? In fact the optimal makespan is 4.5, which already shows that even the offline version of this problem is highly interesting and non-trivial, and it has in fact been studied for quite a while in the literature. In this talk, though, we are going to focus on the online version. To an online algorithm, jobs are revealed one by one, and each one has to be assigned permanently and irrevocably before the next one is revealed.
Hence online algorithms make choices without knowing the future. They only know the jobs, in particular the processing times of the jobs, they have assigned so far, and the current job, which they are about to assign. This already prevents online algorithms from performing optimally on arbitrary input sequences.
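Before the example, let me restate the two definitions in symbols. This is standard notation, not taken from the slides: p_j denotes the processing time of job j, and A_i is my shorthand for the set of jobs assigned to machine i.

    \[ \mathrm{load}(i) \;=\; \sum_{j \in A_i} p_j, \qquad \text{makespan} \;=\; \max_{1 \le i \le m} \mathrm{load}(i). \]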
Let us prove this by example. Say we are an online algorithm, and now we are presented with this one job of processing time 1.5. Since all machines are identical, we can just assign it to any machine. Now assume we want to maintain an optimal makespan; then the next job has to go on an empty machine. Which one, of course, doesn't matter either, and for the next it's the same.
Now our online algorithm would have to make a choice. It could schedule this one job on the rightmost machine, or on the one machine with load 1. Scheduling it on the rightmost machine seems somewhat risky: if a large job comes afterwards, then we won't have an optimal makespan anymore.
So we use this machine, and we see that this was the right choice. The next job could have spoiled the schedule. Now, assigning this job to the only machine on which we can still maintain an optimal makespan will force the next job to be scheduled sub-optimally. So even if we put it on a least loaded machine, we can see that the ratio between the online makespan and the optimal makespan is 1.5. So in particular we are far from optimal. Now while this may seem like a problem, it is in fact precisely what we want to study when we study online algorithms. The main question is: what is the price online algorithms pay for not knowing the future?
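To make this concrete, here is a tiny sketch of such a trap. The numbers are not the ones from my slides; this is the classic two-machine sequence 1, 1, 2 played against the least-loaded (greedy) rule.

    # Minimal sketch: greedy (least-loaded) online scheduling on m identical machines.
    # Illustrative instance only, not the one on the slides.

    def greedy_online(jobs, m):
        loads = [0.0] * m
        for p in jobs:
            i = loads.index(min(loads))  # assign the arriving job to a least loaded machine
            loads[i] += p
        return max(loads)                # online makespan

    online = greedy_online([1.0, 1.0, 2.0], m=2)  # greedy spreads the 1s, then the 2 lands on top: 3.0
    opt = 2.0                                     # offline optimum: both 1s together, the 2 alone
    print(online / opt)                           # 1.5

Any deterministic online algorithm is caught by this little family: co-locating the two unit jobs already costs a ratio of 2 if the adversary stops there, while spreading them costs 1.5 once the job of size 2 arrives.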
We commonly answer these questions in terms of competitive analysis. Here we compare the ratio between our makespan and the optimal one. In this example it's 1.5. But of course we don't just look at any ratio; we look at the worst-case ratio which can occur. In other words, we often think about an adversary who designs an input sequence of jobs, maximizing the ratio between the online makespan and the optimal makespan. This maximum ratio, or to be precise the supremum over all the ratios the adversary can achieve, is then called the competitive ratio. The goal of designing good online algorithms is then of course finding online algorithms
for which one can prove low competitive ratios. Since the definition of a competitive ratio incorporates this very powerful adversary, we also call this model the adversarial model. Now let me briefly review the literature. And since the literature is quite vast, encompassing many areas such as randomization, resource augmentation, semi-online settings and many more, let us focus on the results most
relevant to our work: upper and lower bounds on the competitive ratios achievable by deterministic online algorithms. The first result is already from the 1960s, due to Graham, who analyzed the most intuitive greedy strategy, which always chooses a least loaded machine and performs quite well.
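Graham's bound, in one line (the standard textbook argument, restated here in my own words):

    \[ \mathrm{ALG} \;\le\; \frac{1}{m}\sum_{j} p_j \;+\; \Bigl(1 - \frac{1}{m}\Bigr) p_{\max} \;\le\; \Bigl(2 - \frac{1}{m}\Bigr)\,\mathrm{OPT}, \]

since the optimal makespan is at least the average load and at least the largest processing time.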
2-competitive in fact, he shows; more precisely, (2 - 1/m)-competitive. It took nearly 30 years until Galambos and Woeginger improved upon this result for a small number of machines, an improvement which would be negligible by modern standards, which commonly just look at the limit as the number of machines goes to infinity, but which laid the foundation and the groundwork for the long line of research following up on their result. This led finally to the currently best competitive ratio, the upper bound by Fleischer and Wahl, which is slightly below 1.9201. As an honorable mention, it should be noted that the optimal online algorithm for this model is, in a sense, known as well, but nothing more is known about its competitive ratio. Lower bounds follow a similar progression. The first non-trivial one, due to Faigle et al., is 1 plus the square root of 2 divided by 2, about 1.707. The currently best lower bound is due to Rudin III, who in his PhD thesis showed that no deterministic online algorithm can be better than 1.885-competitive on a general number of machines. As you see, this famous old problem has a small but distinct gap left to be filled, which is interesting, but probably also quite a hard problem, since no progress has been made during the last 20 years. So instead of closing this gap, we want to ask a question which has been asked quite
often in theoretical computer science: whether worst-case analysis is too pessimistic. I would like to introduce this using the lower bound of Albers, who in 1999 showed that no online algorithm on a multiple of 40 machines can be better than 1.885-competitive.
So here you see a lower bound, and in fact no deterministic online algorithm can be better than 1.885-competitive either on this sequence or on a prefix of it. But the question is: how fragile is this sequence? Does it need to be precisely the way it is? And the answer, obviously, given that I ask you a question like that, is yes.
Albers' arguments would not hold if we made some small perturbations: if we deleted any job (even the tiniest one would mostly spoil her arguments, and deleting a larger one would be even worse), if we swapped non-identical jobs, or if we just changed the number of machines by one. Now if we consider a lower bound for 40 machines, changing the number of machines by one might seem like a big deal, but of course if we consider a lower bound for 8 million machines and feed it into an online algorithm running on 8,000,001 machines, then this may seem like far less of a deal.
Still, her arguments wouldn't hold. Now of course this fragility could in fact be by design, and in some way it also is. Certainly Albers tried to provide the simplest sequence she could come up with. For example, if she had wanted to make her sequence more robust, she could have added some negligible jobs of weight zero or epsilon, which could be deleted as one pleases and swapped around without causing any problems, since they are so small that they don't matter. But this leads to the central question we want to ask in this talk: how rare are worst-case sequences such as this one? And of course we're going to consider this question with regard to job order.
So how pessimistic is this: how many ways are there to order these sequences so that they perform badly? Or how probable is it for such a bad order to arise if we assume that the sequence is not presented in the worst possible order? For example, here we see that all jobs are presented in order of increasing job size, and they have to be, as we argued, because if we swapped jobs, Albers' arguments wouldn't work. So what is the general situation? We of course need to make such a question formal. And luckily there is a model which should immediately come to mind, since it has been successfully employed in a wide area of online research, and it does precisely that: it breaks the worst-order assumption. Of course we're talking about the random-order model. Recall that in the adversarial model, the adversary chooses both the job order and the job set. In particular, after it has chosen the job set, it rearranges it in a particularly devious way, a worst-case order, to make our online algorithm look bad.
In the random-order model, this second capability is weakened. The adversary still chooses the job set, but the job order is picked uniformly at random. Now the optimal makespan does not depend on this order, but the online makespan most likely does. As is common in randomized settings, we thus consider the expected makespan. That is, we randomly reorder the input sequence and then consider the expected makespan of the online algorithm. Of course, comparing this expected makespan to the optimal makespan again leads to a notion of competitive ratio in the random-order model, which is just the worst ratio an adversary may achieve by choosing a job set. And again, of course, the goal of algorithm design is to find algorithms for which one can show low competitive ratios, now in the random-order model. Now since we asked on the previous slide, somewhat provocatively, whether worst-case
analysis was too pessimistic, there is one question we also should not forget to ask, just for fairness. You see, if our machines are servers which run processes, i.e. jobs, from multiple parties, which is the most common application, albeit there are many, we might not expect worst-case orders. Worst-case orders would require the parties to all collude, which might not be sensible.
But we also might not expect random orders, because each individual party might want to schedule a few jobs, and these may well come in a particular order. The question we should ask, at the very least, is: is random-order analysis too optimistic? Now I wouldn't dare answer this question either, but the answer might as well be yes.
So just to make sure, I would like to propose a new model, which we would like to regard as a smoothed-analysis version of worst-case adversarial analysis: the notion of nearly competitiveness. Intuitively, prior adversarial work already analyzes the competitive ratio as the number of machines goes to infinity. Thus they call an algorithm c-competitive if it is in fact (c + f(m))-competitive for a small function f(m) which vanishes as the number of machines goes to infinity. Now, to get away from worst-case orders, in our model we also want to exclude a very tiny fraction of job orders. To make the term "very tiny" formal, we have to say what we mean by that: we just mean that, after a random permutation, the probability of obtaining such a rare bad order approaches zero as the number of machines goes to infinity. Finally, in our new model we even want a performance guarantee on these worst-case sequences, so that even there we don't perform arbitrarily badly. Formally, we call an algorithm nearly c-competitive if it is (c + f(m))-competitive on a (1 - g(m))-fraction of the permutations of the input sequence, where f and g are functions which vanish as the number of machines goes to infinity, and if it is moreover K-competitive for some constant K on every order. Now this may sound like we are hiding some large cost here, but in fact we don't. For example, K in our case, which could of course be really large, is just 2. So even on worst-case sequences our new algorithm will be no worse than Graham's greedy strategy. It just seems sensible for a general measure of competitiveness not to fix the constant K to 2, for example. Now we think that this measure of nearly competitiveness is a really good measure for random-order makespan minimization and that future analyses using it will be less prone to having to use lots of case-distinction arguments. But of course, if you don't really care about this measure, which sits very close to the adversarial model but is still better, we can show that an algorithm which is nearly c-competitive is in particular c-competitive in the random-order model.
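In symbols, following the phrasing just given (my notation, not verbatim from the paper): write J^sigma for the job set J presented in order sigma. An algorithm A is nearly c-competitive if there are functions f(m), g(m) that vanish as m grows and a constant K such that, for every job set J,

    \[ \Pr_{\sigma}\bigl[\, A(J^{\sigma}) \le (c + f(m)) \cdot \mathrm{OPT}(J) \,\bigr] \;\ge\; 1 - g(m), \qquad \text{and} \qquad A(J^{\sigma}) \;\le\; K \cdot \mathrm{OPT}(J) \ \text{for every order } \sigma. \]

The random-order competitive ratio instead compares the expectation E_sigma[A(J^sigma)] to OPT(J); the two conditions above bound this expectation by (c + f(m) + K g(m)) OPT(J), which tends to c OPT(J) as m grows, and that is essentially the argument.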
And this is not a very difficult argument. So even if you feel like our new measure is not very interesting, which would be unfortunate, our results still carry over to the classical random order model. Now before introducing these new results of ours, let me briefly review the literature
on random-order scheduling. Interestingly, although random-order analysis has been successfully employed in a wide area of online algorithms, such as matching, bin packing and of course the classical secretary problem with its many generalizations, but also in many other areas, for scheduling not much is known about random-order analysis.
So the first result which we are aware of, and we are only aware of three, is due to Osborn and Torng, who show that Graham's greedy strategy is still no better than 2-competitive. Surprisingly, this matches what is possible in the adversarial model, at least as the number of machines goes to infinity. For small numbers of machines there might be a slight advantage; in fact we conjecture that for two machines Graham's greedy strategy is 1.25-competitive, but this advantage quickly vanishes as the number of machines goes to infinity, according to Osborn and Torng. In 2017, Molinaro analyzed a very general scheduling problem, using both random-order analysis and worst-case analysis. Finally, I recently became aware of the work of Göbel, Kesselheim and Tönnis, who studied the problem of minimizing the sum of completion times of all jobs, where each job also has a release time. Now, for that problem, adversarial analysis does not allow for non-trivial competitive ratios, thus they also use the random-order model.
Coming to our new results, we devised a new deterministic online algorithm which is 1.8478-competitive in the random-order model, not only in expectation, as was analyzed before, but in fact nearly 1.8478-competitive, meaning that sequences on which this ratio is not achieved are highly improbable and rare.
Recall that the best current deterministic algorithm is 1.92-competitive in the adversarial model, and that the best that is theoretically possible for such an algorithm lies above 1.88. Since we improve well below both of these bounds, this shows that the adversarial worst-case sequences are highly order-dependent and might even be considered pathological. We also, as a minor result, provide first lower bounds. We show that in the random-order model, no algorithm can be better than 4/3-competitive, and no algorithm can be better than nearly 1.5-competitive. But of course, our main result is the nearly 1.8478-competitive algorithm, which we provide.
Since our algorithm relies deeply on the many great discoveries and achievements made in prior work on adversarial scheduling as intuition, we decided against discussing it and its analysis in every detail.
Instead, we decided to just show the three key ideas which set us apart from the adversarial model, the three key steps to profit from random-order arrival. The first of these steps is the load lemma. In the random-order model, two time measures coexist: one important for probabilistic considerations, one essential for analysis purposes.
The load lemma compares them. Without it, analysis of this problem would be impossible. We deem it fundamental for future analyses. Second, there is what we call large job magic. Of course, this technique is not magic in the sense that it fails to be mathematically sound and rigorous, but in the sense of how it should strike anybody familiar with adversarial analysis: it allows us to skip their most complicated arguments, commonly found in the appendices, while competitiveness improves tremendously. Finally, there are oracle-like properties, which give the model a taste of semi-online scheduling, a taste so subtle that it requires deep insights into the analysis not to be missed. Combining these three core techniques in a challenging analysis, which I invite you to delve into by perusing our preprint or the ICALP paper, yields the desired nearly 1.8478-competitive algorithm. So let us start with the load lemma, which I like to introduce by a rather philosophical question, one which sheds great intuition on how one should think of time in online makespan minimization. Where would you consider the middle of the sequence to be? After half the jobs are scheduled? This is what we call the naive measure. But to a makespan analysis, this is uninteresting.
Nothing of importance has happened so far. You see that the great, large and important jobs are yet to arrive; they are all to the right of the center. So far, at this middle, only a few tiny jobs have arrived. And of course, an adversary who knew that we wanted to make an argument about this half of the sequence might as well fill it with negligible jobs, or jobs of weight zero. So no argument could be made. From an analysis perspective, the whole sequence looks different. What matters is not the number of jobs scheduled, but the amount of total processing volume that has passed. The bottom sequence depicts this: here, processing time is not encoded in the height of a job, but in its width. We see that in this load measure, the naive half is actually quite early; not much has happened. While at the analysis half of the sequence, 80% of the sequence has in fact passed. Now why am I telling you this? The naive time measure of course comes in handy when we do probability theory. When have we seen the largest job with probability one half? Well, at the naive halftime. Now we have two time measures, and they don't agree. How could they? The jobs are ordered by increasing processing times. But wait, we actually don't look at ordered sequences.
So what happens if we randomly permute this sequence? Do the measures still disagree? It looks better. Here the measures seem much more similar, and we may tentatively ask: are these measures similar in the random-order model? Now the short answer is no. There are certain simple sequences we need to exclude, like this one. Here it may look like there is one job, but in fact there are very many tiny jobs and one job that matters. On such a sequence these measures cannot agree: the load measure would just look at the one big job, while the naive measure would see all the tiny jobs. We need to exclude simple sequences where a few jobs hog all the processing volume, for all kinds of arguments in fact, not only to prove the load lemma. Luckily it is simple for us to show c-competitiveness on such sequences. After excluding these sequences, we may in fact say that the load measure and the naive measure agree. Of course there is some legalese involved, so we have to read the fine print; for example, we wouldn't expect them to agree 100%, but only up to epsilon.
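To get a feeling for the load lemma, here is a small simulation; the job sizes and the degenerate example are my own illustrative choices, not taken from the paper.

    import random

    # Compare the naive time measure (fraction of jobs seen) with the load measure
    # (fraction of total processing volume seen) at the naive halftime of a random order.

    def load_at_naive_half(jobs):
        order = jobs[:]
        random.shuffle(order)                    # random-order arrival
        half = len(order) // 2
        return sum(order[:half]) / sum(order)    # processing volume seen at the naive halftime

    random.seed(0)
    benign = [random.uniform(0.5, 1.5) for _ in range(1000)]   # no job hogs the volume
    degenerate = [0.001] * 999 + [1000.0]                      # one job carries almost everything

    print(sum(load_at_naive_half(benign) for _ in range(100)) / 100)      # concentrates near 0.5
    print(sum(load_at_naive_half(degenerate) for _ in range(100)) / 100)  # about 0.5 on average,
    # but each individual run is close to 0 or close to 1, never close to 0.5 itself.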
Next we're going to discuss large job magic: getting large jobs for free. Let me start with yet another philosophical question. Which of the following sequences is not random? If you guessed the first one, you're right.
Here a suspicious concentration of large jobs occurs close to the end of the sequence. Now this easy principle, namely that in a random sequence the concentration of large jobs should be roughly constant throughout, can already be applied to get large jobs for free. How do we do that? If we can prove a high concentration of large jobs toward the end of the sequence, we can expect this concentration to be global; thus there will be many large jobs in the sequence. Now this does not sound like magic at all. Well, it should for those who are familiar with the most complex arguments in adversarial analysis.
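A back-of-the-envelope version of this principle (my own phrasing of the intuition, not the paper's lemma): if k of the n jobs are large, say of size at least some threshold, then under a uniformly random order, by linearity of expectation,

    \[ \mathbb{E}\bigl[\#\{\text{large jobs among the last } \delta n \text{ positions}\}\bigr] \;=\; \delta \, k , \]

so observing many large jobs near the end of a random order is only plausible if the whole sequence contains many large jobs.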
Look in the appendices of prior work where the goal is to prove an upper bound on a certain average load. Here you see a depiction of the argument one needs. In our case we can just skip the whole argument. If the bound we need does not hold, we are already done by large job magic. This of course not only circumvents a very complicated argument,
it actually vastly improves the competitive ratio of our algorithm. There is another subtle use of large job magic which I'm not going to get into here. The final way we profit from random order arrival is what I want to term oracle-like properties. If you look at adversarial algorithms, they maintain a lower bound for OPT at time t,
an estimate of OPT. Now improving the quality of this lower bound will unsurprisingly also improve the competitive ratio. And if they can estimate OPT substantially better, then they should also perform better, all other things being equal. Interestingly, contemporary work in the adversarial model only uses one value for the lower bound,
the famous average load. Not that they did not know any better values, they just could not prove them to be useful. Consider in particular the largest job seen so far. Including the size of the largest job seen so far into the local lower bound on the optimal makespan is useless,
if worst-case sequences tend to order their jobs by increasing processing times anyway. This, of course, is where we come in. In fact, just including the size of the largest job seen so far in the lower bound can immediately improve classical adversarial algorithms. They should get a competitive ratio slightly below 1.9, but still not beating the general lower bound for adversarial algorithms. And of course, this ratio should only hold in expectation, not with respect to our much stronger measure of nearly competitiveness. Now, lifting this result and improving competitiveness further does require all we have got.
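For reference, the standard lower bounds on the optimal makespan from which such estimates are built (textbook facts, not specific to our paper):

    \[ \mathrm{OPT} \;\ge\; \frac{1}{m}\sum_{j} p_j, \qquad \mathrm{OPT} \;\ge\; p_{\max}, \qquad \mathrm{OPT} \;\ge\; 2\,p_{m+1}, \]

where p_{m+1} denotes the (m+1)-st largest processing time (read as 0 if there are at most m jobs); the last bound holds because, by pigeonhole, two of the m+1 largest jobs must share a machine.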
We have to first of all include a third lower bound. Secondly, we need to sample a third machine. The most recent adversarial algorithms tend to sample two machines, as is depicted on the bottom. And we actually need to sample a third one. And we need a very interesting use of large job magic.
Now, the details go beyond this talk, but it leads to the following tricky proof cycle. The value p_{m+1}, the size of the (m+1)-st largest job, is a value very common in analyses. Commonly, the challenge of every adversarial analysis is to prove that this value is sufficiently large.
Now, the value p_h works as a sort of oracle, in the sense that it is available to the online algorithm at the right time with very high probability. By classical analysis, a large value of p_h will then immediately lead to a large value of p_{m+1}. By an interesting use of large job magic, this can be reversed.
In fact, this large bound on p_{m+1} will ensure an even larger bound on p_h. We can run through this cycle until p_h is huge, giving a constant approximation of OPT and acting as some sort of weak bin-stretching type of oracle. Now, of course, the details of this are beyond this talk.
But this is what we mean by the oracle-like properties, and they help a lot. So, this concludes the talk. Let me briefly summarize how random order arrival helps. First of all, there is the load lemma, relating algorithmic and probabilistic properties.
Second, large job magic allows us to get large jobs for free in ways which would be unthinkable in adversarial settings. Finally, there are oracle-like properties, which give the setting a very subtle taste of semi-online scheduling, albeit the taste is really subtle.
With this being said, I hope you enjoyed my talk, and thank you for your attention.