TSP DIY

Traveling Salesman Problem .DIY

or, A Young Optimizer's Illustrated Primer

The TSP is a rock star of the optimization world. It's easy to state, has a cool name, it captures the essence of complexity, and it drives the discovery of general-purpose methods. What's not to like?

The problem asks for a shortest circular route to visit a set of locations. You must visit each of the locations and then return to the start. In TSP lingo, the route is called a tour or Hamiltonian circuit.

We all like simple problems. But this one sounds crazy simple. Anyone equipped with a map and a pencil can plot a short tour through a few points. Yes, but the TSP asks not for a reasonably good tour, but the best possible tour. Squeezing every last inch (or penny, or drop of fuel) out of the route.

Okay, but even demanding perfection is simple enough if you have a small number of points to visit, say 6. Just go ahead and check every possible route. There are only 5! = 5 × 4 × 3 × 2 × 1 = 120 candidates. Easy peasy.

The challenge comes when we let the number of points n get large. You then can't simply list all of the candiate tours and check them one by one. You need more sophisticated ideas. The aim of this app is to show you these, including general methods that have been used to find optimal tours through many thousands of points.

DYI App

The version of the TSP we consider in the app asks for a tour through randomly chosen points in a rectangle, matching the size of your browser window. We use the straight-line Euclidean distance between points, rounded to a fixed precision (to avoid working with the possibly irrational square root values Euclid gives us).

The focus of the app is Do-It-Yourself, to learn fundamental tools adopted in general computational discrete optimization. But we also include several TSP algorithms, to allow you to get a feel for automated solution techniques and to see how your own work stacks up against the computer. You can find these in the drop-down menus on the top bar.

Go ahead and click the various choices to see what happens: if the status of your solution is not ready for the given algorithm, a polite alert message will pop up asking you to behave better in the future. A word of warning though: to save memory, the app only stores a single linear programming (LP) model. So if you have painstakingly added cutting planes by hand, be aware that these will all be tossed out if you select a new initial LP relaxation in the LP menu. Similarly, running branch-and-bound will delete your LP, to make room for the relaxations needed in the newly created subproblems. Just follow the natural progression of the tasks, moving from left to right across the top bar, and your work will be safe.

Credits

The graphics for the TSP app are created with Mike Bostock's great D3.js library.

The app layout is generated with the Bootstrap 4.5 framework, using jQuery to manipulate page elements.

Linear programming models are solved with the QSopt.js library, created using Emscripten to port the QSopt LP Solver to run in a browser.

The app was written by William Cook, University of Waterloo.

Notes

All computations in the app are run in your browser. If the app becomes non-responsive (for more than 10 seconds or so), the simple move is to reload the page or to delete the browser tab. It will destroy your current work, so maybe take a screenshot first if you have something pretty.

The most common reason for the app freezing is an OOM (out of memory) error produced in the Emscripten-generated library. You can check this by opening a Javascript Console in your browser. With the problem-size limits built into the app, you should only very rarely encounter this problem.

If you happen to see a different type of error message in the console, and you have a free minute, it would be great if you could send an email to me at bico@uwaterloo.ca so that I check it out. Thanks!

The (somewhat simple) codes included in the app can handle small instances, but if you are looking for a full-blown implementation of the branch-and-cut ideas, check out the Concorde TSP Solver.

Reading

After solving a few TSP instances in well less than 1,000 years, you might be interested in learning more details of the solution method, along with the history of the problem. For this, I can point you to the short book In Pursuit of the Traveling Salesman, published by Princeton University Press. The list price is $16.95, but you can typically find it for few dollars less at various outlets.

For geometric instances of the TSP, Jünger and Pulleyblank (1993) introduced a nice interpretation of linear programming (LP) duality, providing a clear, visual lower bound on the length of any tour through the point set.

To start off, we draw a disk around each node, such that the disks do not overlap. The disks are called control zones and the collection of non-overlapping disks is called a packing. Since a tour must select two edges meeting each node, the edges in the tour have total length at least as large as the sum of twice the radii of the disks. This sum is thus a lower bound on the length of any TSP tour.

The goal here is to make the sum as large as possible, to obtain a strong quality guarantee for the best tour you have found. Indeed, if you can produce a lower bound equal to the length of the tour, then the TSP is solved for this point set.

To try your hand at this, select the "Build a Bound" option in the menu. Now click or tap a node. The node will turn green and a slider labeled "Radius/Width" will appear in the top left corner. Drag the slider to set the radius of the zone around the selected node. When you are done, click the node again or hit the Clear button to unselect the node.

Your packing gives a first lower bound for the point set, but typically the bound will be much lower than the length of your best tour. But don't fear, Jünger and Pulleyblank have another trick up their sleeves. If you select more than one node (but not all of them) then a tour must include at least two edges having one end in the selected set and the other end not in the selected set. So if you draw a band around the selected set, then you can add twice the width of the band to the lower bound, as long as the band does not overlap any control zone or other band. The bands are called moats. Think of a salesman swimming across the width of the moat to reach the set of nodes it encloses.

To add a moat to your zone packing, select more than one node and move the slider to adjust the width of the moat enclosing the selected set. In doing this, you should aim to put moats only around sets of nodes having control zones that form a connected region (since otherwise you would be better off increasing one of the zones or growing a smaller moat).

Again, the aim is to get the resulting lower bound as large as possible. Its value is reported on the message bar at the bottom of the page, allowing you to track your progress. The bounds achievable with zones and moats are typically very good, particularly on instances with up to 25 points or so. You might even be able to prove your tour is optimal.

After having a go at building your own zones and moats, if you want to see packings that give best possible bounds, select "Control Zones" or "Zones and Moats" in the drop-down menu. These packings are created by solving an LP model, where the variables give the radius of each zone and (when also looking for moats) the width of each moat.

In the drawings, we also display edges between some pairs of nodes. These edges represent the solution of the dual LP model corresponding to the packing problem. In the zone-packing case, the dual is the degree LP relaxation (see the LP section of the app) where the values are not constrained to be at most 1. In the drawings, thick, dashed edges carry the value 2, while standard black edges carry the value 1. Every node will meet one dashed edge or exactly two standard edges. Thus the values sum to 2, as required in the degree LP.

In the case of zone+moat packings, the dual LP model is the full subtour LP relaxation for the TSP. The values assigned to the edges vary between 0 and 1. We indicate these values with colors. White (invisible) edges carry value 0, red edges value 0.5, black edges value 1, and a spectrum of shades for inbetween values.

Very strong lower bounds on the length of TSP tours can be computed with linear-programming (LP) methods. Linear programming was developed by George Dantzig in the 1940s, and first applied to the TSP by Julia Robinson in 1949.

The variables in TSP LP relaxations are assigned to edges, each taking on a (possibly fractional) value between 0 and 1. In this model, a TSP tour is represented by an LP solution where each edge in the tour has value 1 and edges not in the tour have value 0.

The LP objective is to minimize the sum of each edge's length times the value of its corresponding variable. In other words, letting x_e be the variable associated with edge e and denoting its length by c_e, the objective is to minimize ∑(c_ex_e: e is an edge joining two nodes in our TSP instance). So the objective value of a tour is precisely its total length. For LP constraints, we choose linear equations and inequalities that are satisfied by all tour solutions. Following this rule, the LP objective value is thus a lower bound on the length of any tour (since every tour is a candidate solution to the LP model).

The initial relaxation consists of the degree equations, stating that the edge variables meeting any point must sum to exactly 2. You can create and solve this relaxation by selecting "Initial LP (degree equations)." This is the starting point for working with cutting planes in the Cuts module. The LP solution is displayed as a graph, using colors to indicate the values assigned to the edge variables. White edges carry value 0 (you won't see these in the drawing), red edges value 0.5, black edges value 1, and a spectrum of shades for inbetween values. Hovering a pointer over an edge will report its value.

Except for small instances, the solution graph (not including the 0 edges) for the degree LP will be disconnected. On the other hand, a tour is always connected: we can't have the salesman traveling around small subtours. The answer here is to add further constraints to the LP model.

If we let S denote a subset of points (but not the whole set), then every tour must contain at least two edges that have one of their ends in S and the other end not in S. So every tour solution satisfies the subtour constaint stating that the sum of the variables corresponding to these edges must be at least 2.

An easy first source of subtour constraints to add to the LP relaxation are the islands in a disconnected solution graph. To get a headstart for the Cuts module, select "Initial LP + Connected Components" to repeatedly add subtour inequalities for all of these islands, stopping when the graph is connected.

Going even further, select "Initial LP + Connected + 2-Connected" to add subtour constraints until the graph has no cut nodes.

Finally, select "Full Subtour LP" to solve the relaxation consisting of all subtour constraints.

A great thing about the LP approach to the TSP is that you can always increase the lower bound, by adding to the relaxation further linear inequalities satisfied by all tours.

A crucial point is that we must be careful to add only inequalities that might lead to an improvement. For example, there is a subtour constraint for each subset of points (and there are lots of those), so you can't possibly throw them all into an LP model in one go. Instead, we employ the cutting-plane method, adding only inequalities that are violated by the current LP solution. This is the technique invented by Dantzig, Fulkerson, and Johnson, who solved, by hand, a 49-city instance of the TSP in 1954.

To get started, select "Do-It-Yourself Subtours" from the menu and click the "Build a Cut" button that appears. Now click/tap nodes to include in the subset S defining your subtour constraint. The bottom message bar reports a running sum of the edge variables having one end in S and the other other not in S. If this value drops below 2, an "Add Cut" button will appear. Clicking the button will add the subtour constraint, re-solve the LP model, and display the new solution.

In the beginning, if you started with the "Initial LP (degree equations)" option in the LP menu, it's easy to build violated subtour constraints from the islands in the LP graph. If you would like the code to do this initial work for you, select "Connected Components" in the Cuts menu. This will take you step-by-step through rounds of cutting planes, until the graph is connected.

Taking this up a notch, you can select "2-Connected Cuts" to also add subtours identified by finding cutnodes in the graph.

Once you have the graph 2-connected, you can try "Min (s,t)-Cut" to search for general violated subtour constraints. This routine allows you to select two nodes, s and t, and run a max-flow min-cut algorithm to determine if there is any violated subtour constraint where exactly one of s or t is contained in the set S. The flow from s to t will be indicated by arrows drawn on the graph's edges, where the color of each arrow corresponds to the value of its flow. The set S will be indicated by green nodes.

Going fully automatic, you can instead select "Global Min Cut" to find the most violated subtour constraint. The set S will again be indicated by green nodes.

That's the end of the line for subtour constraints, but the LP solution may still have edges with fractional values. To continue the cutting-plane method, there are many known additional families of linear inequalities that may possibly allow you to further improve your LP bound. The simplest of these (but still complicated) are called comb inequalities. You might want to do a Web search to learn more about combs, but the short description here will allow you to bash on regardless.

A comb consists of a set of nodes H, called the handle, and an odd number k of non-intersecting sets T_i called teeth. Each tooth T_i must contain at least one point in the handle and one point not in the handle. The comb inequality states that the sum of all edge variables having exactly one end in the handle H plus for each tooth T_i the sum of edge variables having exactly one end in T_i must be at least 3k + 1. It takes some work to see, but every tour solution satisfies all comb inequalities, so combs are candidates to be added to your LP relaxation.

To search for these, select "Do-It-Yourself Combs" in the menu and click the "Build a Comb" button that appears. A row of buttons for the handle and teeth will appear on the top left of the window. Clicking these will allow you to select nodes to be included in each of the sets. If you need more teeth, click the "+" button to add another pair, up to a total of nine. The message at the bottom of the screen will let you see your progress towards satisfying the comb conditions. Once you have a comb, its value will be displayed in the message. If the comb inequality is violated, then an "Add Comb" button will appear to allow you to add the inequality to your model.

Comb-finding is research-level work. If you discover a fast way to find violated combs, it could be a significant boost for TSP solvers. But there are known, fast algorithms for finding combs where each tooth has just two nodes. These are called blossom inequalities. Some of these are particularly easy to spot by deleting the LP edges of value 1 and examining the remaining components in the graph. If a component meets an odd number of these 1-edges, then it gives a violated blossom, where the nodes in the component form the handle and the nodes in each of the odd number of 1-edges form the teeth. To find these automatically, select "Simple Blossoms" in the menu.

The cutting-plane method plays very nicely with branch-and-bound search. The combination of the two, called branch-and-cut, is the Katie Ledecky of the TSP world, setting every record for the exact solution of large-scale instances.

Here is a high-level description. In solving the TSP (or other discrete optimization problem) with cutting planes, our enemies are the fractional-valued solutions. Those are the ones we seek to cut out of our LP model with additional constraints. And we can always do that ("there's always money in the cutting-plane method") , but the right inequalities can be very difficult to find, as we are chugging along with our solution algorithm.

But there is more than one way to skin a fraction. Suppose, for some edge e, its variable x_e takes on a dreaded fractional value. Well, look. In a tour solution to the LP, we know x_e must be 0 or 1. So let's force the issue. Split the TSP into two child subproblems, considering separately the x_e = 0 and x_e = 1 cases. This is called a branching step. On each side of the branch we add to the LP model the single constraint fixing the value of x_e. The smaller of the two LP objective values we obtain is a lower bound on the length of any TSP tour (since any tour is a candidate solution for one of the two LPs). With a good choice of the branching edge e, this new bound will likely improve what we had originally from the single model. And we might be able to get another improvement, by running the cutting-plane method on the two child problems.

What we have described is just the first step. In branch-and-bound search, we can choose any subproblem, whose LP solution a not already a tour, and split it again with another branching step, creating a search tree of subproblems. The killer idea, due to Ailsa Land and Alison Doig (1960), is that if the LP bound for a subproblem is at least as large as the length of a tour we already know, then we need look no further at the particular subproblem: it cannot contain a better tour in its set of candidate solutions. In this case we say the subproblem has been pruned from our search tree. The app will give you a bird's eye view of the process.

Choose "DIY Branching" to run your own branch-and-bound search, selecting with a click the active subproblem (that is, unprocessed, not pruned, and does not have a tour solution) to process in the next branching step and then selecting the edge variable to set to 0 and 1, creating two child subproblems. In the drawing of the search tree, active subproblem nodes are colored red, tour nodes are blue, processed nodes are black, and pruned nodes are white (with a black outline). You can see the LP solution for any subproblem by clicking its node, letting you select a branching edge.

You can also sit back and watch the search tree grow, letting the computer do the work. Select either "Best-Bound Search" or "Depth-First Search". In both cases the algorithm uses the simple rule of branching on the fractional variable that's closest in value to 0.5. They differ in their choice of the next subproblem to process. In best-bound search, the algorithm selects the active subproblem having the lowest LP bound. This choice gives us a chance to improve the overall lower bound, computed as the smallest of the LP bounds of the active subproblems. In depth-first search, we process the most recently created active subproblem. This search strategy does not lead to an immediate increase in the overall lower bound, but it allows the algorithm to dive deep into the tree and perhaps spot a new best tour (after fixing the value of many variables).

If an LP model has not yet been created, the two algorithms will start with the full subtour LP, combined with cutting planes obtained with the simple blossom procedure. In depth-first search branching, it's particularly important to have a good starting tour, to avoid spending large amounts of time in part of the search tree that could be pruned if a tour were available. For this reason, if depth-first search branching is called without a tour, the app will make a call to Lin-Kernighan before it begins to build the search tree.

For both types of searches, we limit the tree to a total of 1,000 subproblems. If you find such a limit-hitting instance, you might want to take up the challenge of solving it with DIY Branching (and hopefully a much smaller tree).

Note: The app limits branch-and-bound tests to instances having at most 50 points, to keep things snappy in the browser and to avoid out-of-memory errors in the LP solver.

Traveling Salesman Problem DIY

or, A Young Optimizer's Illustrated Primer

Parent LP Bound: 123456.99

Child 0: 123456.99

Child 1: 123456.99