I had been wanting to learn Go for some time, and the homework for the Parallel Programming course I took this semester was a great opportunity to do so. Go offers utilities and control structures very similar to those of the languages covered in the course (Ada, occam, etc.). Since Go is a language with a more up-to-date design, I thought that using it for the semester project would be a learning experience, not a waste of time.
The aim of the project was to implement a solution to the famous Travelling Salesman Problem (TSP). Both a sequential and a parallel algorithm had to be implemented, and the goal was to compare the two in terms of running time.
I decided to reuse the approach from an Ant Colony System implementation for TSP that I had previously written in Python. That implementation was for another course and got good results. Ant Colony System is very suitable for problems with obvious components, such as the cities in TSP, and is a good fit for demonstrating the performance gain from running the ants in parallel.
There are two main solver types in the implementation: SeqSolver (the sequential solver) and ParSolver (the parallel solver). These live in their respective Go files, "seqsolver.go" and "parsolver.go". Apart from these, there is a Tsp type which carries the general fields the solvers operate on, such as the city list, the number of ants, etc. Finally, main.go contains the main function, which parses the command line arguments and starts the program.
The Ant Colony System implemented is very simple. If we represent the cities as a fully connected graph, each ant selects the next node from its current location using heuristic and edge pheromone values. The selection is done using the following formula:
\(P\left( e_{ij} \mid s^p \right) = \frac{\tau_{ij}^{\alpha} \, \eta_{ij}^{\beta}}{\sum\limits_{e_{ix} \in N\left( s^p \right)} \tau_{ix}^{\alpha} \, \eta_{ix}^{\beta}}\)
The pheromone values evaporate at a constant rate after each iteration. At the end of an iteration, the current best path is designated as the global best if it is better than the current global best. The iteration count, ant count, and the alpha, beta, and rho constants are all given as command line arguments to the program.
The only difference between the sequential solver and the parallel solver is that the parallel solver runs the ant steps in parallel. The ants construct their paths in separate goroutines, which communicate their results to the main goroutine through Go's channels. Ant goroutines are created by invoking their StartStepsPar(...) function with the go statement. The main goroutine then listens on the solverChan channel for ant paths. When a path arrives, it compares it to the global best:
```go
func (solver *ParSolver) Solve() {
	...
	for j := range solver.Ants {
		go solver.Ants[j].StartStepsPar(solverChan)
	}
	for {
		// Wait for an ant to send its path to us
		antPath := <-solverChan
		... // Set as global best if necessary
		solver.Tsp.DoneAntCount++
		if solver.Tsp.DoneAntCount == solver.Tsp.AntCount {
			break
		}
	}
	...
}
```
When an ant is done, it sends its path to the solverChan channel:
```go
func (ant *Ant) StartStepsPar(solverChan chan []Node) {
	for {
		...
		// Check if the ant is done
		if len(ant.Unvisited) == 0 {
			ant.Done = true
			ant.PathLength = ant.Tsp.Length(ant.Path)
			solverChan <- ant.Path
			break
		}
	}
}
```
By default, channels in Go have blocking reads and writes, which is fine for us: it does not matter if an ant blocks after sending its path, since we do not need the ant anymore.
Running the ants in parallel significantly reduces the running time, as can be seen from the following tables (seq stands for the sequential, par for the parallel algorithm).
| Filename | City Count | Ant Count | Iteration Count | Running Time |
|---|---|---|---|---|
| tsp_51_1.txt | 51 | 51 | 100 | Seq: 452.2219ms, Par: 115.2868ms |
| tsp_200_1.txt | 200 | 200 | 100 | Seq: 24.9450421s, Par: 5.4535014s |
| tsp_400_1.txt | 400 | 400 | 50 | Seq: 1m38.3396262s, Par: 21.3277079s |
| tsp_1060_1.txt | 1060 | 500 | 50 | Seq: 12m2.2252983s, Par: 2m49.2779642s |
| Filename | City Count | Ant Count | Iteration Count | Running Time | Tour Length |
|---|---|---|---|---|---|
| tsp_200_1.txt | 200 | 5 | 100 | Seq: 629.6975ms, Par: 224.5737ms | 39277.63 |
| tsp_200_1.txt | 200 | 50 | 100 | Seq: 6.214545s, Par: 1.4438176s | 35840.48 |
| tsp_200_1.txt | 200 | 100 | 100 | Seq: 12.4389998s, Par: 2.8836675s | 34838.08 |
| tsp_200_1.txt | 200 | 200 | 100 | Seq: 24.8174813s, Par: 5.4384603s | 34432.38 |
An interesting observation is that increasing the ant count while keeping the other parameters constant also increases the running time of the parallel algorithm. This may seem strange, since one would expect that increasing the number of ants running in parallel should not affect the run time this much. Based on my observations, there are two likely reasons for this.
Firstly, I noticed that as the pheromone levels on the paths converge over time, the time it takes the ants to complete their paths increases. The culprit is the code responsible for the ants' node selection:
```go
...
// Select the next node by calculating the p value
n := ant.Rand.Float64()
movingSum := float64(0)
for i := range ant.Unvisited {
	p := ant.Tsp.P(&ant.CurrentNode, &ant.Unvisited[i]) / pDenom
	movingSum += p
	if movingSum >= n {
		ant.Path = append(ant.Path, ant.Unvisited[i])
		ant.CurrentNode = ant.Unvisited[i]
		ant.Unvisited = Remove(&ant.Unvisited[i], ant.Unvisited)
		break
	}
}
...
```
Clearly, as time goes on, a few of the edges collect most of the pheromone, so their p probability values become high while the others remain very close to zero. This makes the for loop above run longer, since fewer edges can push movingSum over our threshold probability value n. Collectively, this causes the algorithm to take more time as it converges toward the solution.
Secondly, the increase in running time with the ant count could simply be caused by the way goroutines work. As the Go team explains, concurrency is not equivalent to parallelism: running multiple goroutines does not mean they will always execute on parallel threads. Depending on the hardware resources and operating system scheduling, the goroutines may even run one by one. In other words, if we create 50 goroutines but only 25 can run in parallel at a time, our program will take roughly twice as long as a run with 25 goroutines.
You can look at the code in my GitHub repository here. The project has been a great learning experience and good Go practice for me. Of course, I most probably did not use all of Go's capabilities in a project like this. I tried to follow Go coding standards such as naming conventions, but the code is not the cleanest I have written. If you can point out some of my mistakes, I would be happy to hear about them.