For the case of <i>load balancing</i>
as in the OP, for a statement of the
problem, over time, requests for work
arrive at an Internet computer
server farm. The requests differ in
what computing resources
at the server farm are needed
and for each resource how much work.
E.g., a request may need 10 GB of main
memory and 40 GB of virtual memory
paged memory, 3 TB of I/O to
rotating disk, and four processor
cores, all for 50 minutes.<p>So, at each arrival, have to
assign the work to a server.
When the assignment is made,
we do know about about the, say,
<i>state</i> of each of the servers,
say, what work has been assigned
and how busy the server is.<p>The goal of the work is in some
sense to get the best performance
from the server farm for the loads.
There are various candidate
definitions/measures of performance,
e.g., minimize the average
elapsed time
for the work of the requests to
be done, that is, minimize the
average <i>response</i> time
as seen by the users/customers.<p>So, the work of the load
balancing is just the
assignments, one
work request at a time,
to the servers. This
assignment is under
<i>uncertainty</i> maybe about
the details of each request
and, so that can
<i>plan ahead</i>, also about what
the next requests will be.<p>E.g., if we had some idea
that soon we would get a
request that would need
all of one server, then
maybe we should think ahead
and have reserved an empty
server for that request. If
we don't reserve such a
server, then to assign
this request we might have to
keep waiting until have
an empty server and, unless
make an effort to have
such an empty server,
the request could
go unassigned for a long
time thus hurting the goal.<p>So, we have to make decisions
over time under uncertainty
to achieve a goal that is
an average of the results.<p>Now likely we have a problem in
the field of applied math called
<i>stochastic optimal control</i>.
In our case the <i>control</i> is the freedom
we have when we make the
assignments. <i>Stochastic</i>
means varying with uncertainty
over time. The <i>stochastic</i> part
is the uncertainty of the
next requests that arrive.
The <i>optimal</i> part is that
we want to do best as possible
likely on some average measure,
only <i>average</i> because we
are working with uncertainty.
Sure, if we don't like average,
we could use median, etc.<p>IIRC there was some work on
this problem by some especially
diligent researchers at the IBM
Watson lab and based in part on
work of H. Kushner at the
Division of Applied Mathematics
at Brown University.<p>IIRC the good news is that those researchers made good
progress. IIRC the bad news was
that the computations for
their solution were a bit
too large to be practical!
Also, from the references I
saw them using, the math
prerequisites for their
work were a bit much!<p>So, maybe good work for
a solution is in the literature
but too difficult to use.
Maybe.<p>My suggestion: Be practical.
First cut, assign the next
request for work to the least
busy server. If the server farm
is so busy that often all the
servers are too busy for more
work, then have to leave requests
in a FIFO queue waiting until
some server is not too busy
for the first request in the queue.
Collect empirical, production data on this <i>control</i>, the FIFO queue,
the response times, any
problems etc. If see no significant
problems, then declare the problem
solved for now. Else focus on where
the problems are and look
for a control that also handles
those problems.<p>This material about stochastic optimal control was heavily from R. Bellman. Other authors include W. Fleming,
D. Bertsekas, S. Shreve, E. Dynkin, R.
Rockafellar, R. Wets. The field has
been of interest by parts of
applied math (e.g., Brown), operations
research, and systems and
electronic engineering.<p>This load balancing at a server
farm contains as a special
case some of the topics
in queuing theory.<p>Computer science has long
been interested in the
problem of <i>job scheduling</i>
on batch computer systems
which, of course, is a
relatively small special
case of load balancing
as in the OP. IIRC the usual
result of the job scheduling
work was that
the problem was in NP-complete
and otherwise in practice
more work to do than
the work being scheduled!
Maybe now computer science
will also be interested in
something stochastic
optimal control for
load balancing.