ACCURACY OF A HEURISTIC FOR TOTAL WEIGHTED COMPLETION TIME MINIMIZATION IN PREEMPTIVE SINGLE MACHINE SCHEDULING PROBLEM BY NO IDLE TIME INTERVALS

Background. A special case of job scheduling is the one in which jobs are processed on a single machine, preemptions are allowed, and there are no idle time intervals. Although exact solution models are always much slower than heuristics, the laws governing a heuristic's rapidness advantage and the closeness of its solution to the exact one are unknown. Such laws would be useful for estimating the real benefits of solution approximation. Objective. Given the lack of knowledge about the relationship between heuristics and exact solution models, the goal is to study the statistical difference between them for the preemptive single machine scheduling problem with no idle time intervals. Methods. Two well-known approaches are used: the rule of weighted shortest remaining processing period as a heuristic and the Boolean linear programming model as an exact model. The relative error of the heuristic is defined, and its variation with increasing complexity of job scheduling problems is studied. The heuristic's rapidness gain is shown as well. Results. The main issue with the heuristic's accuracy arises when only a few jobs are to be scheduled. In addition, the heuristic's accuracy is the lowest when the jobs are divided into the fewest parts. The exception is the shortest sequences, when only two jobs are to be scheduled. As the number of jobs increases from 6 upward, the relative error expectedly decreases while the heuristic's rapidness advantage grows dramatically. Therefore, scheduling a long sequence of jobs is preferable. The maximal relative error of the heuristic can exceed 6 % for three to five jobs when they are divided into the fewest parts. Conclusions. Starting from six jobs, the heuristic's accuracy increases on average as the complexity of job scheduling problems increases, given a fixed rate of randomness in processing periods, priority weights, and release dates. The rate of randomness has an inverse effect on the error: the more randomly scattered the processing periods, priority weights, and release dates are, the more accurately the heuristic schedules. The exact approach is truly applicable when three to five jobs are to be scheduled (in particular cases, when the number of job parts is constant and equal to 2, the upper number of jobs can be increased up to 10). For such cases, the real loss of an approximate solution (given by the heuristic) is the average relative error not exceeding 1.2 % for job scheduling problems with a low rate of randomness. If such a loss is not admissible, the exact approach will work instead.


Introduction
Job scheduling is the process of assigning tasks to be processed on machines. A special case is that in which jobs are processed on a single machine [1,2]. The minimal number of jobs is 2, and each job has an arbitrary processing period, release date, and priority weight. More generally, preemptions are allowed, which means that the processing of any job can be interrupted at any time and any number of times in favor of other jobs [3]. Another special requirement is that there are no idle time intervals [1,3,4].
The preemptive single machine scheduling problem of minimizing total weighted completion time with arbitrary processing periods and release dates is an important NP-hard problem in scheduling theory [1,2]. Obtaining an exact solution to this problem becomes very resource-consuming (in terms of processor clock speed, memory space, and time) even for a few jobs [1,5]. For a few tens of jobs, the problem becomes intractable even for the fastest and most powerful processors [1,4,6,7].
Processing periods and release dates are often given as integers. This opens a way to solve the respective integer linear programming problem. Models based on the branch-and-bound approach are commonly used for that [6,8]. Along with models for obtaining an exact solution, there are many heuristics that find an approximate solution [1,2,4,7]. Computational studies claim that those heuristics are extremely rapid compared to the exact solution models. Besides, a heuristic's approximate solution appears to be very close to the exact one [1,4].
Although exact solution models are always much slower than heuristics, the laws governing a heuristic's rapidness advantage and the closeness of its solution to the exact one are unknown. Such laws would be useful for estimating the real benefits of solution approximation. What is the actual difference between an exact approach and a heuristic? Are there any artifacts in the difference? Can the exact model be applied at all? If not (at least for a definite set of input parameters), then there is no sense in considering and developing such exact models for the mentioned job scheduling problem. These questions are open and waiting to be addressed.

Problem statement
Given the lack of knowledge about the relationship between heuristics and exact solution models, the goal is to study the statistical difference between them for the preemptive single machine scheduling problem with no idle time intervals. Two well-known approaches will be used for this: the rule of weighted shortest remaining processing period as a heuristic and the Boolean linear programming model as an exact model [1,6,8]. The relative error of the heuristic will be defined, and its variation with increasing complexity of job scheduling problems will be studied. The heuristic's rapidness gain will be shown as well. The research result should answer the questions of when the exact approach is truly applicable and what the real loss of an approximate solution (given by the heuristic) is.

Boolean linear programming model
[...] is a vector of processing periods, [...] is a vector of priority weights, and [...] is a vector of release dates. However, whereas r_n is the time moment at which job n becomes available for processing, the condition of "the proper start", which is [...] is the exact total weighted completion time for those N jobs.
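For very small instances, the exact minimum of the total weighted completion time can be cross-checked without a linear programming solver. The sketch below is not the Boolean linear programming model of this section: it swaps in a plain dynamic program over unit time slots. It assumes that processing periods and release dates are positive integers, that a job may occupy time slot t only if its release date is at most t, and that the completion time of a job is the index of the slot holding its last part. The function name `exact_min_cost` and the sample instance are hypothetical.

```python
from functools import lru_cache
from typing import Sequence


def exact_min_cost(periods: Sequence[int],
                   weights: Sequence[float],
                   releases: Sequence[int]) -> float:
    """Minimal total weighted completion time with no idle slots.

    Assumption: time is split into unit slots 1..T, where T is the sum of
    all processing periods, and exactly one job part fills every slot.
    Returns float("inf") if no schedule without idle slots exists.
    """
    n_jobs = len(periods)
    total = sum(periods)

    @lru_cache(maxsize=None)
    def best(remaining: tuple) -> float:
        left = sum(remaining)
        if left == 0:
            return 0
        t = total - left + 1  # index of the slot being filled now
        result = float("inf")
        for n in range(n_jobs):
            if remaining[n] > 0 and releases[n] <= t:
                nxt = remaining[:n] + (remaining[n] - 1,) + remaining[n + 1:]
                # A job contributes weight * slot index once it is finished.
                step = weights[n] * t if nxt[n] == 0 else 0
                result = min(result, step + best(nxt))
        return result

    return best(tuple(periods))


if __name__ == "__main__":
    # Hypothetical instance: three jobs of two parts each.
    print(exact_min_cost([2, 2, 2], [2, 5, 11], [1, 2, 3]))  # 78
```

The state space grows exponentially with the number of jobs, so this check is practical only for a handful of short jobs, which is also the range where the exact approach is discussed in this paper.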

A heuristic approach
The heuristic is an online scheduling algorithm that applies the rule of weighted shortest remaining processing period [1]. Let [...] gives a set of ratios [...] whence the maximal ratio is achieved at subset [...] is an approximately minimal total weighted completion time that corresponds to the nearly optimal job schedule [...]. This schedule often coincides with the job schedule produced by exact solution (16): [...] Therefore, the relative error and the computation time ratio are worth considering only for cases when inequality (31) is strict.
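The rule of weighted shortest remaining processing period can be sketched as follows. This is a minimal illustration under stated assumptions, not a transcription of (23)-(29): time is split into unit slots, a job is available in slot t if its release date is at most t, the ratio is taken as the priority weight divided by the remaining processing period, and ties are broken in favor of the lowest job index. The function name `wsrpt_schedule` and the sample instance are hypothetical.

```python
from typing import List, Sequence, Tuple


def wsrpt_schedule(periods: Sequence[int],
                   weights: Sequence[float],
                   releases: Sequence[int]) -> Tuple[List[int], float]:
    """Greedy schedule by the largest weight-to-remaining-period ratio.

    Returns the slot-by-slot schedule (1-based job numbers) and the total
    weighted completion time. Tie-breaking by the lowest job index is an
    assumption of this sketch.
    """
    remaining = list(periods)
    schedule: List[int] = []
    cost = 0
    for t in range(1, sum(periods) + 1):
        available = [n for n in range(len(periods))
                     if remaining[n] > 0 and releases[n] <= t]
        if not available:
            raise ValueError(f"idle time slot {t}: no job is available")
        # Largest ratio wins; on a tie, -n prefers the lowest index.
        best = max(available, key=lambda n: (weights[n] / remaining[n], -n))
        remaining[best] -= 1
        schedule.append(best + 1)
        if remaining[best] == 0:
            cost += weights[best] * t  # job completes in slot t
    return schedule, cost


if __name__ == "__main__":
    # Hypothetical instance: three jobs of two parts each.
    print(wsrpt_schedule([2, 2, 2], [2, 5, 11], [1, 2, 3]))
    # ([1, 2, 3, 3, 2, 1], 81)
```

On this hypothetical instance the rule interrupts the first job in favor of the second one and then abandons the second one for the third, which yields a total of 81, whereas the schedule that finishes the first job in slot 2, the third in slot 4, and the second in slot 6 gives 2·2 + 5·6 + 11·4 = 78.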

Counterexamples of the heuristic job scheduling
Consider a problem of scheduling five jobs by the following parameters: [...] is pretty noticeable, although computation time ratio (34) is quite great. Nonetheless, when it is not critical to spend up to 10 milliseconds on an exact solution, the heuristic (23)-(29) for this example would be worse than the exact solution by Boolean linear programming model (1)-(22).
If we modify the scheduling problem just in the third job processing period to [...] Relative error (38), i. e. an error of 7.5 %, may be critical in some cases.
Another counterexample with five jobs is peculiar in that all the job processing periods are equal: [...] Just like in the previous case, we have an error of 6.3 % once again. Although relative error (40) for seven jobs equals value (39) for five jobs, a greater number of jobs does not necessarily make the respective relative error less significant. Those four counterexamples might serve as the basis for a few classes of the job scheduling problem wherein the heuristic works poorly. Hence, it is reasonable to study some statistics of the heuristic's accuracy. This will help in revealing "weak places" of the heuristic.
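The two quantities used to judge such counterexamples can be computed as sketched below. The exact forms of the relative error and of computation time ratio (34) are assumed here: the error is taken as the percentage excess of the heuristic's total weighted completion time over the exact one, and the ratio as the exact model's running time divided by the heuristic's. The function names and the sample totals 81 and 78 are hypothetical.

```python
import time
from typing import Callable


def relative_error_percent(heuristic_cost: float, exact_cost: float) -> float:
    """Percentage by which the heuristic's total exceeds the exact one.

    Assumed form: 100 * (heuristic - exact) / exact.
    """
    if exact_cost <= 0:
        raise ValueError("exact cost must be positive")
    return 100.0 * (heuristic_cost - exact_cost) / exact_cost


def computation_time_ratio(exact_solver: Callable[[], object],
                           heuristic_solver: Callable[[], object],
                           repeats: int = 100) -> float:
    """Assumed form: running time of the exact solver over the heuristic's."""
    def elapsed(solver: Callable[[], object]) -> float:
        start = time.perf_counter()
        for _ in range(repeats):
            solver()
        return time.perf_counter() - start

    exact_time = elapsed(exact_solver)
    heuristic_time = elapsed(heuristic_solver)
    # Guard against a timer resolution too coarse to see the heuristic.
    return float("inf") if heuristic_time == 0 else exact_time / heuristic_time


if __name__ == "__main__":
    # Hypothetical totals: heuristic 81 against exact 78.
    print(round(relative_error_percent(81, 78), 2))  # 3.85
```

Both solvers are passed as zero-argument callables so that the same instance can be bound to each of them beforehand; averaging over `repeats` runs damps the timer noise that dominates for very short schedules.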

An analysis of heuristic's advantage
First of all, if every job consists of a single part, total weighted completion times (30) and (29) are the same, although the schedules of the heuristic and of the exact model may differ (in particular, jobs with identical priority weights may be permuted). This is easy to check for 100 jobs and even more, because finding solution (16) by the Boolean linear programming model is relatively fast when the number of jobs equals the grand total of processing periods T in (7). Hence, studying statistics of the heuristic's accuracy should start from the cases wherein, along with (1), [...]
To obtain credible results of a statistical analysis, a few series of job scheduling problems should be generated with randomizers. The generators for processing periods (1), priority weights (2), and release dates (3) [...]
As is clearly seen, job scheduling problems randomly generated by (42)-(44) are solved more accurately by the heuristic. The average relative error does not exceed 0.2 %. On the contrary, job scheduling problems with much lower randomness, where only priority weights are generated randomly by (43), are not always solved accurately enough. For such cases, the average relative error exceeds 1.2 %, which may be a significant loss. In particular cases, the relative error easily jumps beyond 6 %. Indeed, just for a few instances with three jobs to be scheduled, the following results are very "discreditable" for the heuristic: [...] Statistically, scheduling three and five jobs is the most "vulnerable" when using the heuristic. Although scheduling five jobs by the heuristic rarely fails to be accurate, its "vulnerability" is still impressive by 25 [...], e. g. [...] Nevertheless, scheduling a greater number of jobs (7, 8, 9, 10, and so on) appears more accurate (the maximal relative error then does not exceed 5 %).
Fig. 5 reveals that the heuristic's rapidness gain by computation time ratio (34) grows dramatically: from N = 5 onward the heuristic is hardly comparable to the solution procedure by the Boolean linear programming model. On average, the heuristic is at least 100 times faster. Three jobs are scheduled on average 580 times faster, whereas five jobs are scheduled more than 10000 times faster. Besides, higher randomness in processing periods (42), priority weights (43), and release dates (44) facilitates faster solving. However, the heuristic's rapidness gain for three jobs is not always so high: there are cases when it drops to between about 100 and 10. Fig. 4 hints that the constant processing period (46) and unique monotonically increasing release dates (48) are the "weak places" of the heuristic. When the constant processing period is increased to [...], the "weakness" gradually disappears (Fig. 6). Such an outcome does depend on (49), but the dependence itself is weak. It is worth noting that the "weakest" job scheduling problem, which is that of three jobs, also has the least rapidness gain (see Fig. 5) by computation time ratio (34). At higher randomness in processing periods (42), priority weights (43), and release dates (44), however, the slowest heuristic is expected for job scheduling problems with two jobs.

Discussion
The graphical results in Figs. 4-6 are quite credible owing to sufficiently good averaging (100 repetition cycles of generating a job scheduling problem were used). If the generations were repeated all over again, similar graphs would be obtained, although the peaks in Fig. 4 and the polylines in Fig. 6 might then be slightly displaced. The displacement would likely be perceptible, but it would not break the general tendency.
Obviously, the statistical analysis, carried out by generating both random job scheduling problems and problems wherein the later job has a greater weight at a constant processing period, might have been continued: the class of randomizers could be widened, and the constant processing period could be increased up to a few tens. However, even the results actually obtained and visualized in Figs. 4 and 6 allow us to claim confidently that the main issue with the heuristic's accuracy arises when only a few jobs are to be scheduled. In addition, the heuristic's accuracy is the lowest when the jobs are divided into the fewest parts. The exception is the shortest sequences, when only two jobs are to be scheduled. As the number of jobs increases from 6 upward, the relative error expectedly decreases while computation time ratio (34) grows dramatically. Therefore, scheduling a long sequence of jobs is preferable, although this does not mean that we can just "glue" together a few short scheduling problems into a longer one.

Conclusions
In approximately solving the preemptive single machine scheduling problem with no idle time intervals, the maximal relative error of the heuristic can exceed 6 % for three to five jobs when they are divided into the fewest parts. As a triviality, the heuristic schedules two jobs exactly, with 100 % accuracy. Another triviality is that, whatever the number of single-part jobs is, the heuristic schedules them exactly as well, excluding only case (41).
Starting from six jobs, the heuristic's accuracy increases on average as the complexity of job scheduling problems increases, given a fixed rate of randomness in processing periods, priority weights, and release dates. The term "complexity" here can be roughly treated as computational complexity [12]. Therefore, the heuristic's advantage grows as the job scheduling problem becomes more complicated. The rate of randomness has an inverse effect on the error: the more randomly scattered the processing periods, priority weights, and release dates are, the more accurately the heuristic schedules.
The heuristic's rapidness is hardly comparable to that of the exact model even for a few jobs. It has been revealed that the heuristic's rapidness gain by computation time ratio (34) grows dramatically as the number of jobs increases. Meanwhile, the complexity of job scheduling problems has little influence on the heuristic's rapidness gain.
Hence, the exact approach is truly applicable when three to five jobs are to be scheduled (in particular cases, when the number of job parts is constant and equal to 2, the upper number of jobs can be increased up to 10). For such cases, the real loss of an approximate solution (given by the heuristic) is the average relative error not exceeding 1.2 % for job scheduling problems with a low rate of randomness. If such a loss is not admissible, the exact approach will work instead. Thus, further research may focus on possibilities to speed up the computations by the exact approach.