Combinatorial Optimization Functions
JavaTools provides several network flow optimization functions from the realm of combinatorial optimization. At a glance:
◆ JDijkstra[] to compute the set of all shortest paths from a source node to all other nodes.
◆ JFloyd[] to compute the optimal all-pairs shortest path solution (arbitrary arc costs, including negative)
◆ JMooreBellman[] to compute all shortest paths from a source node to all other nodes (arbitrary arc costs, including negative)
◆ JKruskal[] to compute the minimum spanning tree of a connected undirected network
◆ JKnapsack[] to compute an optimal selection of items for the knapsack problem.
◆ JTravellingSalesman[] to compute a solution of the Travelling Salesman Problem (four methods).
◆ JTravellingSalesmanDelaunay[] to compute a solution of the Travelling Salesman Problem using the Delaunay Triangulation to reduce the number of connectable points.
◆ JTravellingSalesmanMAOS[] to compute a solution of the Travelling Salesman Problem using the MAOS multi-agent optimization system on the JavaTools server.
◆ JPostOpt[] to compute an improved solution of the Travelling Salesman Problem using an existing solution as input (post-optimization).
All of these methods by far outperform their Mathematica equivalents, to the point of being so much faster that they allow for the solution of problem sizes that the built-in functions cannot cope with. For example, thousands of nodes in the case of the Travelling Salesman Problem can be solved, whereas a practical limit of the built-in function FindShortestTour[] is a few hundred nodes. They also feature some of the best algorithms known today to solve these problems. For example, MAOS is one of the world-best TSP solvers and has produced the best solutions of several benchmarking problems of the TSPLib, some of them to optimality. Besides, JTravellingSalesmanMAOS[] runs on the JavaTools server, a high-performance AMD Opteron system powered by Fedora 16. The solutions of the server-based computations are also encrypted with 2048 bit strength RSA encryption, protecting the proprietary results of the user against intercepting traffic sniffers on the internet. (Neither input data nor output results are stored or logged on the JavaTools server.)
JDijkstra[]
The function JDijkstra[] computes the set of all shortest paths from a source node s to all other nodes in a cycle-free directed network with non-negative arc costs. JDijkstra[] uses the collections from the Java collections framework as well as the Google Collections (now part of Google Guava) internally as sets of nodes and arcs and their dependencies can be easily represented through these data structures.
Although the Dijkstra algorithm is not the fastest algorithm to determine all shortest paths from a source node to all other nodes in the network, the JavaTools implementation with the above-mentioned collections is MUCH faster than the Dijkstra algorithm implementation in the Mathematica Combinatorica package, as the latter is entirely written with symbolic Mathematica code and doesn't execute in a compiled environment like the Java virtual machine. In particular on large problems with hundreds of nodes and arcs the performance difference is substantial. In fact, the large examples at the end of this page couldn't be solved with Mathematica's Dijkstra implementation at all.
The inputs to the function JDijkstra[] are a sparse array defining the arc costs for every node pair having an arc, and an integer indicating which node is the source node. Spare arrays are particularly memory-efficient on large graphs where most node pairs do not have an arc.
The following example uses the Dijkstra algorithm on a very simple problem with 6 nodes and 6 arcs:
This computes all shortest paths from source node 1:
This is to be interpreted as follows: The first list is a list of the predecessor nodes. The second list is a list of the shortest path distances from the source node to every other node. The predecessor lists tells us that in the optimal solution the predecessor of node 1 (the source node) is Null (meaning, it has none), the predecssor of node 2 is node 1 (the source node), the predecssor of node 3 is node 1, the predecssor of node 4 is node 2, asf. The second list tells us that in the optimal solution the shortest path distance for node 1 is 0, for node 2 is 1, for node 3 is 2, for node 4 is 4, asf.
Note how fast the optimal solution is computed with JDijkstra[] from the JavaTools:
The arc costs matrix, in which the cost for the arc (i,j) is shown in the element in row i and column j, looks as follows (0 means the nodes have no connecting arc):
We can see that the arc between nodes 1 and 2 is included in the shortest paths for nodes 2 and 4. Therefore, if we increase its value (=costs), that will be reflected in the costs of the optimal solution, and we can also see that due to its high cost, arc (1,2) is NOT included in the optimal solution for node 6. While the cost for the latter was 7.4 previously, going "over" arc (1,2), it is now 9.3, going "over" arcs (1,3 ), (3,5), and (5,6).
Here is a slightly more complicated example:
JDijkstra[] is very fast. It finishes in less than 2 milliseconds.
The following example reads a file from the MatrixMart example data, uses Abs to ensure that all arc costs are positive, and subtracts the main diagonal to ensure that no node has an arc to itself. Then we compute all shortest paths with node 1 taken to be the source node. JDijkstra[] finiahes in 6 seconds. Dijkstra{} in the Mathematica Combinatorica package can not compute this problem at all due to its size.
The runtime complexity of the Dijkstra algorithm is O(
).
For sparse networks with only non-negative arc costs the D'Esopo Pape variant of the Moore Bellmann algorithm is more efficient (O(min(nmC,m
))), thus possibily linear) than the Dijkstra algorithm, see JMooreBellmann[] below. While JDijkstra[] computed the above result in 6 seconds, JMooreBellmann[] computes it in 78 milliseconds due to the extremely efficient double-edged queue used internally.
The author is inviting competing implementations in Mathematica or Java or both for large problems at info@lauschkeconsulting.com to challenge the presented performance claim.
JFloyd[]
The JavaTools function JFloyd[] computes the optimal all-pairs shortest path solution, permitting negative arc costs as well. "all-pairs" means that unlike the Dijkstra algorithm the Floyd algorithm computes the shortest paths to all nodes for ALL nodes as source nodes. The output is the so-called "shortest paths" matrix, which contains the optimal nodes for every row (where row index n represents node n as source node), where Infinity indicates that there is no path from that source node to the target node (column index), as well as the matrix with the corresponding distances. Using the example data from above we get:
JFloyd[] finishes in less than 2 milliseconds:
Note that the last example for JDijkstra[] is just a special case of the output of the Floyd algorithm (first row).
Using the MatrixMart example from above, JFloyd[] computes the all-pairs optimal solution, and we can compare the first row with the output from the Dijkstra algorithm, which must be identical:
The values attained by the two solutions have to be the same (not necessarily the solutions themselves). Three cases contain Infinity (which means no directed path is possible between the two nodes), and where one solution has Infinity, so does the other. In all other cases the difference between the two values is 0:
The runtime is less than 9 seconds:
The runtime complexity of the Floyd algorithm is O(
), and its data storage requirements are O(
). It is therefore much more efficient to use the MooreBellmann or Dijkstra algorithms over the Floyd algorithm if only the optimal paths for one source node have to be computed.
The author is inviting competing implementations in Mathematica or Java or both for large problems at info@lauschkeconsulting.com to challenge the presented performance claim.
JMooreBellmann[]
The JavaTools function JMooreBellmann[] finds the shortest paths from a source node to all other nodes for arbitrary arc costs, using the D'Esopo Pape variant, which is the most efficient method known today for practical problems (its theoretical worse-case complexity is exponential). The D'Esopo Pape variant is also sometimes known as the "Deque" variant ("double-edged queue"), as a deque is used internally for updating nodes. Unlike the Dijkstra algorithm, the Moore-Bellmann algorithm works correctly even for arbitrary (negative) arc costs. However, for the Moore-Bellmann to work correctly, it is imperative that the network does not contain any *circles* of negative arc length (as it would be possible to "walk" in the circle an infinite number of times and always decrease the total arc length).
Using the example from above:
This is the same output as was computed with the Dijkstra algorithm, and it matches the first line of the output of the Floyd algorithm.
Its running time is very fast: less than 2 millisconds for this example:
In the following example we use a few negative arc costs, not introducing a negative cycle:
This is the same output as was computed with the Dijkstra algorithm, and it matches the first line of the output of the Floyd algorithm.
For dense networks with only non-negative arc costs the Dijkstra algorithm is more efficient (O(
)), for sparse networks with only non-negative arc costs the D'Esopo Pape variant of the Moore Bellmann algorithm (implemented in JMooreBellmann[]) is more efficient (O(min(nmC,m
)), thus possibily linear).
Using the MatrixMart example from above, JMooreBellmann[] computes the result in 78 milliseconds.
Here as well we get different solutions, but identical values of the solution:
However, the solutions themselves returned by the two algorithms are NOT necessarily the same (the solutions of the shortest path problem, and of the minimum spanning tree, see below, are not unique). In this example, out of the 1074 paths, 1038 are the same, 36 are different:
This means that in 36 cases different paths were chosen, however, they both attain the same optimum value.
The author is inviting competing implementations in Mathematica or Java or both for large problems at info@lauschkeconsulting.com to challenge the presented performance claim.
JKruskal[]
JavaTools also provides a very efficient implementation of the Kruskal algorithm to compute the minimum spanning tree of a connected undirected network. The "modified Kruskal algorithm" is invoked with JKruskal[] in the JavaTools, and it operates on sparse arrays or lists with the nodes and costs. Here we use the sparse array version:
This means the arcs (1,2), (2,4), (3,4), and (3,5) form the minimum cost spanning tree.
Note the speed: JKruskal[] finishes in roughly a millisecond:
The author is inviting competing implementations in Mathematica or Java or both for large problems at info@lauschkeconsulting.com to challenge the presented performance claim.
Knapsack Problem
JavaTools provides a very efficient method for solving the knapsack problem exactly, and one that is extremely fast but only a heuristic. The exact method is guaranteed to produce an optimal solution, not just a heuristic solution (the solution is not necessarily unique, there can in theory be multiple solutions). Note that the general knapsack problem is NP-complete. The heuristic method will come “close” to the optimal solution, but for larger datasets will usually miss it by a small amount. However, it is much faster.
Applicability
A hiker has a backpack with limited capacity. He/she wishes to include the items that provide maximum utility/value that “fit” within the capacity constraint of the backpack. In other words, the objective is to max out the capacity with items that provide maximum utility value, but the “weights” of the items cannot total more than what he/she can carry on that trip. The knapsack problem is one of the most basic, fundamental problems in combinatorial optimization, opening many avenues for problem solving by means of problem reduction. In the science of combinatorial optimization the knapsack problem is considered BASIC and FUNDAMENTAL. For the practitioner, the knapsack problem allows the modelling of many applied tasks, ranging from capital budgeting over cargo loading to cutting stock. Simply consider a freight airplane or an ocean tanker that can carry only a fixed amount of weight or volume, but the carrier wants to maximize the “bang for the buck”, and can pick-and-choose which items to carry and which items to leave at home. Or consider an investor who has a fixed amount of capital and wants to allocate all or part of his/her capital to various investment alternatives, where every investment alternative requires a certain amount of capital
to be deposited, and promises a return or profit of
. It is obvious that the knapsack solution of that problem is the optimal investment decision.
Example
For a simple introductory example we assume the following data:
Utility values: item 1, 40, item 2, 15, item 3, 20, item 4, 10
Costs/weights: item 1, 4, item 2, 2, item 3, 3, item 4, 1
Capacity: 6
This means items 1 and 2 should be included in the hike, yielding a utility value of 55 (40+15), and all other items should be excluded. We can see clearly that this solution is optimal: Items 1 and 2 have costs of 4 and 2, which total 6, which is the capacity restriction. If we tried to “max out” the 6 with other combinations of items, for example picking items 2, 3, and 4 (with costs 2+3+1=6), we’d only get 15+20+10, or 45, as utility value. Picking items 1 and 2 is better, they provide a greater utility value.
In the following example we generate random data to show the performance for large problems.
In[42]:=
In[81]:=
Out[81]//Short=
In[82]:=
Out[82]//Short=
In[83]:=
Out[83]=
In[84]:=
Out[84]=
Note the capacity utilization:
In[85]:=
Out[85]=
which maxes out, but does not violate, the capacity restriction. The slack is capacity minus total costs, or
In[86]:=
Out[86]=
or 0 slack.
The total = maximum utility possibly attainable with that capacity restriction is
In[87]:=
Out[87]=
That means a) in some 11 seconds we determined that items 2, 74, 98, 110, 117, 149, 163, 173, 185, 240, 247, 269, 275, 281, 309, 335, 372, 429 should be included in the trip, which b) provides maximum utility value possibly attainable (which is 6253), and c) maxes out the capacity restriction, but d) doesn’t violate it, and e) is guaranteed to be optimal (among possibly other optimal solutions), and f) is not just a heuristic solution, but guaranteed optimal, and g) no other solution yields a higher utility total.
Note that for efficiency/speed reasons the JKnapsack[] method in JavaTools only accepts integer data. If you have fractional data, you have to rationalize your data (Mathematica’s function Rationalize[]) and multiply your data with the largest denominator. This is in line with the fact that most high-performance optimization algorithms in combinatorial optimization operate MUCH faster on integer data, and the cost of converting fractional data to integer data is by far inferior to the speed gains obtained by operating only on integer data.
The JKnapsack[] method in JavaTools is very efficient because internally it uses SparseArray[]s for the Mathematica parts and an extremely advanced dynamic programming method for the Java parts to obtain its guaranteed optimal solution.
The heuristic method is called by simply setting the last optional argument to False. JKnapsack[] contains a fourth argument exact of type boolean, which is True by default. Thus, if JKnapsack[] is called with only three arguments, the exact method is used. With JKnapsack[...,...,...,False] the heuristic method is called.
It is much faster:
In[88]:=
Out[88]=
But it didn’t exhaust the capacity to its full extent:
In[89]:=
Out[89]=
In[90]:=
Out[90]=
The value of the solution is less than that of the optimal solution:
In[91]:=
Out[91]=
The heuristic method is provided for extremely large problems:
In[92]:=
In[93]:=
Out[93]//Short=
In[94]:=
Out[94]//Short=
In[98]:=
In[99]:=
Out[99]//Short=
In[100]:=
Out[100]=
In[101]:=
Out[101]=
In[102]:=
Out[102]=
The author is inviting competing implementations in Mathematica or Java or both for large problems at info@lauschkeconsulting.com to challenge the presented performance claim.
Travelling Salesman Problem
Initial/Creational Methods
JavaTools provides the FarthestInsert, NearestInsert, NearestNeighbor, and CheapestInsert methods as initial/creational algorithms, as well as a random method in the multi-threaded TSP functions that use post-optimization and have less than 300 cities. JavaTools also includes the extremely powerful MAOS multi-agent optimization system (http://www.adaptivebox.net/doi/MAOS-TSP, used with permission by the license owners), which, although operating only as a heuristic system as well, provides some of the best solutions to TSP problems that can be obtained. In fact, for almost all TSP problems with only a few thousand nodes the solutions ARE optimal. Two of the NETLib TSP benchmarking problems were first solved to optimality with MAOS.
The function JTravellingSalesman[] accepts an n x 2 array for the edge list together with an n x 1 array for the cost vector and outputs a list with the edges (as pairs of nodes) of the solution. Only one edge weight has to be specified in the input, as the weight of the opposite direction is taken to be the same value. This is sometimes called the "Symmetric Travelilng Salesman Problem". It is, however, not necessary to specify data for all edges that could be placed between the nodes ("complete graph"). In other words, the input graph does not have to be complete, but it is assumed to be symmetric.
The initial methods can be called individually:
or
as FarthestInsert is the default method, as well as
Usually the FarthestInsert method returns the best initial method solution, which is why it was made the default method. However, to obtain really good solutions, post-opt methods have to be applied, even for solutions computed by FarthestInsert, see next subsection.
JTravellingSalesman[] has three options that make use of multi-threading:
When the option Concurrent is used, all four initial method solutions are returned in a list. This can be useful for comparing the solutions. With the option ConcurrentPostOpt only the best solution of the following four initial/post-opt method combinations is returned (four combinations, three threads):
1. FarthestInsert with 6NodeSwap, CheapestInsert with 6NodeSwap (in one thread, as 6NodeSwap is very fast)
2. NearestNeighbor with 6NodeSwapSingle
3. NearestNeighbor with 6NodeSwapDouble
The solutions returned with the ConcurrentPostOpt method are regularly on the order of 1 - 3 % above optimal, as verified with CPLEX. On a modern 8 GB machine with at least four cores a practical limitation is about 3000 cities. If more cities are used, more than 8 GB memory would be needed, and runtime would exceed one hour.
However, even better solutions can be obtained by calling other post-opt methods in sequence, which may require a bit of experimentation, see next subsection. If the user wants very good solutions
should be used, which can easily be done with one function call.
This will use six independent threads. Thus, if the user's hardware has at least six cores, they all run in parallel. If less than six cores are available, the threads will be queued accordingly. However, two of the post-opt sequences are only applied if the problem is less than 300 cities. Therefore, if the problem size is at least 300 cities large and only four cores are available, all four post-opt sequnces will still run concurrently.
The function JTravellingSalesmanMAOS[] accepts the following parameters:
1. the matrix of node coordinates
2. the number of agents used
3. the size of the maximum learning cycle as termination condition
4. the number of cycles for the best cycle that no longer changes
5. the number of trials
6. the distance function used. The valid distance functions according to the TSPLib specification are: EUC_2D for 2-dimensional distances, rounded to the next integer, GEO for great-circle geo distances, ATT for the AT&T distance function, and CEIL_2D for the 2-dimensional Euclidean metric rounded up.
The computations for JTravellingSalesmanMAOS[] are performed on the JavaTools server. You have to be connected to the internet when running this function. Neither inputs nor output solutions are stored on the JavaTools server.
Post-Optimization Methods
JavaTools provides several very efficient proprietary algorithms for the Travelling Salesman Problem that meet industrial-strength needs.
As a rough guideline, CheapestInsert or NearestInsert or FarthestInsert together with 6NodeSwapSingle, ChainInsertion, and 6NodeSwapDouble produced the best solutions of all post-opt method combinations for smaller problems (no more than 500 cities),
and NearestNeighbor together with 6NodeSwap, 6NodeSwapSingle, ChainInsertion, 6NodeSwapDouble or 6NodeSwapDouble, ChainInsertion, 6NodeSwap or ChainInsertion, 6NodeSwapDouble
produced the best solutions of all post-opt method combinations for larger problems (500 - 1500 cities). The best solutions obtained with these post-opt methods (in sequence) regularly produced solutions of the order of 1% above optimal, or better. However, some of these running times can be very long. In particular, the methods ChainInsertion and 6NodeSwapDouble can take 20 minutes and consume 4 GB of data for 1000 cities.
Therefore, JavaTools provides two functions with parallelized post-opt methods for two different purposes. The post-opt method ConcurrentPostOpt in each of its three threads (four method combinations) uses only one post-opt method after an initial method. This provides for very good solutions in very reasonable time (“80/20 rule”). These solutions oftentimes cannot be improved any further by the post-opt methods in JavaTools, including the application of several post-opt methods in sequence. In extensive tests the worst cases were found to be around 5% above optimal, usually 1 - 3% above optimal. But in many cases, in particular in large cases (3000 and more cities) ConcurrentPostOptSequential usually *does* produce solutions that are better than the ones obtained with ConcurrentPostOpt. The post-opt method ConcurrentPostOptSequential implements sequential post-opt methods in parallel and returns the tour with the shortest length (ignoring the other solutions). However, given that they contain ChainInsertion the running times can be *very* long, reaching two hours on sufficiently large problems.
In the following we will frequently compare the JavaTools results with the results obtained with Concorde/CPLEX. As Concorde/CPLEX uses an exact method, we can rely on the fact that its output is guaranteed to be optimal (there is no guarantee when using a heuristic), and we will use the value of the Concorde/CPLEX solution as the optimal solution value. With that optimal solution value we can compute how much above the optimal value our heuristic solutions’ values are. This difference is called “gap” and is usually expressed as “1.1% above optimal” or similar.
As a simple introductory example we show a comparison against the solutions of the built-in function FindShortestTour[] from the help browser, but for more thorough examples see below in the section "More Large Examples". The help browser only uses 10 points (63 coordinate pairs) for the grid construction. To make it more interesting, and to show JavaTool's superiority, we will use 50 here (1547 coordinate points). Already with 50 points Mathematica's function FindShortestTour[] is quite unusable on a problem that was picked for the help browser. As is shown in the table below, one method (GreedyCycle) could not even produce an answer for the relatively small problem with 50 points. For that reason, the Mathematica method FindShortestTour[] was not even considered for the really big examples further down on this page (thousands of nodes, and millions of edges).
This finds all points on an n x n grid with coordinates that are coprime:
We index the points and store the coordinates:
We compute and store the Euclidean distances for all pairs of points:
Here we call the Travelling Salesman function of JavaTools, using the NearestNeighbor method as initial method and then the 6NodeSwap method as post-opt:
JavaTools needed 21 seconds to compute a solution that is 2.3% above optimal. The solution is better than any produced by the built-in methods in FindShortestTour[] and was computed in a fraction of the FindShortestTour[] running time.
Here is the solution:
The optimal solution has a length of 1773, so the result computed with JavaTools is only 2.3% above optimal.
MAOS finds the optimal (!!!) solution in 5 seconds:
The following table compares the solutions and the running times for this problem:
In the following example we find a Travelling Salesman Tour through all points in the [-300,300] x [-300,300] interval that have integer distance from the origin (excluding all points on the two axes).
The optimal solution has a length of 19080, so JavaTools’ ConcurrentPostOpt is only 3.5% above optimal in a fraction of the running time CPLEX needed to find the optimal solution, and it beats any of the methods built into Mathematica’s FindShortestTour[].
JavaTools provides a single-threaded post-opt method that reduces running time substantially by using the Delaunay Triangulation to include only “nearby” edges of every node in the optimization process, and ignoring the “more distant” edges. Only the edges up to the fourth neighbor of every node are included in the optimization process. This is particularly effective when the set of nodes contains “clusters” of nodes, such as the cities in California, New York, Texas, Illinois, etc. and is less effective when there are no discernible “clusters”, such as in Wisconsin, North Dakota, South Dakota, etc. (see more below in the “Large Examples” section) The solution will likely be not as good as the corresponding solutions returned by the “full” methods (occasionally they are even better!), but the decrease in running time is oftentimes so significant that a slight decrease in the quality of the solution can be acceptable.
This solution is 4% above optimal, but required only 129 seconds running time.
MAOS finds the optimal solution in half a minute:
The following table compares the solutions and the running times for this problem:
More Examples
The following example uses a slight modification of an example from the Mathematica help browser. We compute a minimum spanning tree (assuming arcs are just undirected edges and all edge=arc costs are their -- 2-dimensional -- Euclidean lenghts) using the modified Kruskal algorithm, the shortest paths from the source node 1 to all other nodes using the Dijkstra algorithm as well as the Moore-Bellmann algorithms, the all-pairs shortest path solutions using the Floyd algorithm, and show related timing results.
Note that the Kruskal algorithm and all its variants compute EXACT optimal solutions, that means they are not heuristics to compute a solution that is not necessarily optimal (only "good"). The solution is not necessarily unique, but it is guaranteed to be an optimal solution. In other words, other solutions are possbile, but no better ones. Other solutions are only just as good. However, for "sufficiently random" inputs of type Real, such as the coordinates of cities (see below), the solutions usually are unique, hence, the optimal solution is the only optimal one. It is mathematically next to impossible that random inputs would result in problems for which the optimal solutions are not unique. For the next few graph theoretic examples other optimal solutions are readily apparent, but for the city data examples below it is impossible that other optimal solutions exist.
To compute the minimum spanning tree solution using edge lengths as edge costs we have to extract the edge lengths from the graph:
Here we compute the minimum spanning tree solution (in practically 0 time):
This is what the minimum spanning tree solutions looks like, overlaid on the whole graph:
The shortest paths from the source node 1 to all others, using the Dijkstra algorithm:
The shortest paths from the source node 1 to all others, using the D'Esopo Pape variant of the Moore-Bellmann algorithm:
The all-pairs shortest paths solution using the Floyd algorithm (again, note that the first line is identical to the Dijkstra and MooreBellman solutions):
The following example uses the Petersen graph as a starting point, but with double-edges removed:
Another example (a modification of an example from the Mathematica help browser):
More Large Examples
In the following section we show more large-scale examples. None of the problems could be solved with the functions that Mathematica has built in.
The following is also an example from the Mathematica help browser. As in the previous examples we compute the minimum spanning tree with the modified Kruskal algorithm (again assuming all edge costs to be their 2-dimensional Euclidean lengths), and the shortest paths from the source node 1 to all other nodes with the Dijkstra and Moore-Bellmann algorithms, as well as the all-pairs shortest path solutions using the Floyd algorithm, and show related timing results and solution overlays.
JavaTools computes the minimum spanning tree solution in a mere 188 milliseconds:
The following is also an example from the Mathematica help browser. The matrix is related to a structure from NASA's Langley Research Center. As in the previous examples we compute the minimum spanning tree with the modified Kruskal algorithm (again assuming all edge costs to be their 2-dimensional Euclidean lengths), and the shortest paths from the source node 1 to all other nodes with the Dijkstra and Moore-Bellmann algorithms, as well as the all-pairs shortest path solutions using the Floyd algorithm, and show related timing results and overlays.
JavaTools computes the minimum spanning tree solution in less than a second:
In the following examples we use the coordinates of several cities, this data is built in Mathematica. We will be using the ConcurrentPostOpt option throughout, as it provides the best run time/quality ratio ("cost/benefit ratio"). In some cases we will also use the ConcurrentPostOptSequential method if further improvement of the solution appears possible. We will try to pick interesting geographies/shapes to illustrate the solutions.
First we define a distance function that computes the great circle distance in statute miles, given the latitude and longitude coordinates of a given city.
The following are a minimum spanning tree and a Travelling Salesman Tour of all large cities in the United States (excluding Alaska and Hawaii).
This is the minimum spanning tree:
This is the travelling salesman solution.
The length of the optimal solution is 14347, so JavaTools' ConcurrentPostOpt method produced a solution in 2 seconds that is only 1.6% above optimal.
By increasing the problem size the runtimes also get larger very quickly. The following are a minimum spanning tree and a Travelling Salesman Tour of all cities in the United States with city populations larger than 35,000 (excluding Alaska and Hawaii). This means 1,156 cities. These could not be solved with Mathematica's FindShortestTour[] function at all due to problem size.
This is the minimum spanning tree:
This is the travelling salesman solution:
The optimal tour has a solution of length 23790. The solution produced by JavaTools' ConcurrentPostOpt is only 3.5% above optimal.
However, the solution can be improved further with sequential application of post-opt methods, at the expense of longer running times:
This solution is 2.1% above optimal, but it required 35 minutes computation time.
MAOS finds 23794.4 (that is 0.018% above optimal) in 135 seconds ...
... and 23797.6 (0.032% above optimal) in 28 seconds ...
Here we increase the problem size further. Here we compute a minimum spanning tree and a Travelling Salesman Tour of all cities in the United States with city populations larger than 15,000 (excluding Alaska and Hawaii). This means 2,827 cities, and JavaTools solves the minimum spanning tree problem in less than 2 minutes and the Travelling Salesman Tour (using MAOS) in 2 minutes.
This is the minimum spanning tree:
The optimal solution was not determined, but it is assumed that the MAOS solution is less than 1% above optimal.
The following example shows the minimum spanning tree and the Travelling Salesman Tour solution of all cities in Japan.
The minimum spanning tree solution :
The TSP solution:
The length of the optimal solution is 8596, so JavaTools' ConcurrentPostOpt is only 2.43% above optimal.
With the option ConcurrentPostOptSequential the solution can be improved even further, however the running time increases substantially due to the ChainInsertion method used internally:
This evaluation took about 70 minutes. The solution is only 1.01% above optimal.
MAOS finds the optimal solution in 19 seconds. Note that there are slight inaccuracies due to the fact that the function was called with the EUC_2D 2-dimensional Euclidean norm as distance function, although for geomatric latitude/longitude coordinates the correct norm to use would be the great circle distance (see above).
The following example shows the minimum spanning tree and the Travelling Salesman Tour solution of all cities in Chile.
The minimum spanning tree solution :
The TSP solution:
The length of the optimal solution is 9057, so JavaTools' ConcurrentPostOpt is only 0.6% above optimal.
MAOS finds the optimal solution 2 seconds:
The following example shows the minimum spanning tree and the Travelling Salesman Tour solution of all cities in Ecuador.
The minimum spanning tree solution :
The TSP solution:
The length of the optimal solution is 2908, so JavaTools' ConcurrentPostOpt is 2.8% above optimal.
We try the sequential method:
This solution is 2.1% above optimal.
Occasionally for small problems such as this (113 cities) a random initial tour provides the best starting conditions for the post-opt methods. For example, with the random initial tour
we can obtain:
which is 0.7% above optimal with a single method call and even
with sequential post-opt method calls, which provides a solution that is 0.4% above optimal.
Of course, there is never a guarantee that random initial tours result in very good solutions, and for larger problems random initial tours are always inferior.
MAOS finds the optimal solution in half a second:
For the following examples we have to modify the graphics function a bit, as Mathematica has no maps for the US states:
The following are the minimum spanning tree and the Travelling Salesman Tour of all cities in Illinois.
The optimal solution has a length of 5895. JavaTools' ConcurrentPostOpt method produced a solution that is 3% above optimal.
We try the sequential method:
This solution is 2.2% above optimal, but it needed 38 minutes to compute.
MAOS needs less than half a minute to compute the optimal solution. Again note that there are slight inaccuracies due to the fact that the Euclidean norm and rounding the next integer produces slight different solutions than the great circle distance norm.
The following are a minimum spanning tree and the Travelling Salesman Tour of all cities in Maine.
The optimal solution has a length of 2873. JavaTool's ConcurrentPostOpt method is only 0.4% above optimal.
MAOS computes 2879 (that is 0.2% above optimal) in 7 seconds.
The following are a minimum spanning tree and the Travelling Salesman Tour of all cities in California.
The optimal solution has a length of 6265. JavaTools' ConcurrentPostOpt method is 3.3% above optimal.
The solution produced by JavaTools' sequential method is 1.5% above optimal, but it needed 91 minutes to compute.
MAOS computes the optimal solution half a minute:
The following are a minimum spanning tree and the Travelling Salesman Tour of all cities in New York.
The optimal solution has a length of 6035. JavaTools' ConcurrentPostOpt method is 5.4% above optimal.
This solution is 1.2% above optimal, but it took 79 minutes to compute.
MAOS produces the optimal solution in 34 seconds:
The following are a minimum spanning tree and the Travelling Salesman Tour of all cities in Texas.
The optimal solution has a length of 11444. JavaTools' ConcurrentPostOpt produces a solution that is 3.5% above optimal.
We try the sequential method:
This solution is 3.1% above optimal, however, it took about an hour to compute.
MAOS computes 11453 (that is 0.079% above optimal) in less than a minute:
The following are a minimum spanning tree and the Travelling Salesman Tour of all cities in Hawaii.
The optimal solution has a length of 825. JavaTools' ConcurrentPostOpt produces a solution that is 0.05% above optimal.
MAOS computes the optimal solution in less than half a second:
The following are a minimum spanning tree and the Travelling Salesman Tour of all cities in Oklahoma.
The optimal solution has a length of 4572. JavaTools' ConcurrentPostOpt produces a solution that is 3.1% above optimal.
We try the sequential method:
This solution is 2.4% above optimal. It took less than 2 minutes to compute.
MAOS computes the optimal solution less than 5 seconds. Again note slight inaccuracies due to the fact that Euclidean norm is slightly off when great circle distances should be used.
The following are a minimum spanning tree and the Travelling Salesman Tour of all cities in Maryland.
The optimal solution has a length of 1289. JavaTools' ConcurrentPostOpt produces a solution that is 3.2% above optimal.
We try the sequential method:
This solution is 2% above optimal. It took less than a minute to compute.
MAOS computes the optimal solution less than 5 seconds.
The following are a minimum spanning tree and the Travelling Salesman Tour of all cities in West Virginia.
The optimal solution has a length of 1706. JavaTools' ConcurrentPostOpt produces a solution that is 2.9% above optimal.
We try the sequential method:
This solution is 0.4% above optimal.
MAOS computes the optimal solution less than a second. Again note slight inaccuracies due to the fact that Euclidean norm is slightly off when great circle distances should be used.
The author is inviting competing implementations in Mathematica or Java or both for large problems at info@lauschkeconsulting.com to challenge the presented performance claims.