Parallelisation of ant colony algorithms.

Intro :

  • Since ant colony algorithms take a long time to solve complex problems, a parallel ant colony algorithm is devised.
  • Parallelism is traditionally achieved with multi-core CPUs, but such models need extensive CPU support.

  • Here, the MMAS (Max-Min Ant System) algorithm is parallelised on Apache Spark's computing platform.
  • MMAS is combined with Apache Spark's MapReduce-style operations. My previous blog post demonstrates the usage of Apache Spark.

What happens in MMAS?

  • Initialise the ants, cities and tabu lists.
  • The pheromone value (an abstraction for the chemical substance deposited by ants) is initialised to its maximum value.
  • m ants are placed randomly at n cities.
  • Each ant updates its tabu list with the cities it has already visited.
  • Ant paths are then constructed: at each step, ant k moves from city i to city j with the probability
    p_{ij}^{k}(t) = \frac{[\tau_{ij}(t)]^{\alpha}\,[\eta_{ij}(t)]^{\beta}}{\sum_{l \in \mathrm{allowed}_k} [\tau_{il}(t)]^{\alpha}\,[\eta_{il}(t)]^{\beta}}
  • where τij(t) is the amount of pheromone deposited for the transition from city i to city j at time t, α is a parameter controlling the influence of τij(t), ηij(t) is the desirability of the transition ij (prior knowledge, typically 1/dij, where dij is the distance) at time t, β is a parameter controlling the influence of ηij(t), and allowed_k is the set of cities not yet on ant k's tabu list.
  • After all ants complete their tours, only the best ant deposits pheromone; the update evaporates old pheromone at rate ρ and keeps the value within the MMAS bounds:

    \tau_{ij}(t+1) = \max\left(\tau_{\min},\, \min\left(\tau_{\max},\, (1-\rho)\,\tau_{ij}(t) + \Delta\tau_{ij}^{\mathrm{best}}\right)\right)
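
To make the two formulas concrete, here is a minimal, self-contained Python sketch of one construction step and the bounded pheromone update. It only illustrates the rules above and is not the original code; all names (next_city, update_pheromone and the constants) are my own.

    import random

    ALPHA, BETA = 1.0, 2.0        # influence of pheromone / heuristic
    RHO = 0.02                    # evaporation rate
    TAU_MIN, TAU_MAX = 0.01, 1.0  # MMAS pheromone bounds

    def next_city(i, allowed, tau, dist):
        """Pick the next city j with probability proportional to
        tau[i][j]^ALPHA * (1/dist[i][j])^BETA (the transition rule above)."""
        weights = [(tau[i][j] ** ALPHA) * ((1.0 / dist[i][j]) ** BETA)
                   for j in allowed]
        r, acc = random.uniform(0, sum(weights)), 0.0
        for j, w in zip(allowed, weights):
            acc += w
            if acc >= r:
                return j
        return allowed[-1]

    def update_pheromone(tau, best_path, best_len):
        """MMAS update: evaporate everywhere, deposit 1/best_len only on
        the best tour's edges, then clamp into [TAU_MIN, TAU_MAX]."""
        n = len(tau)
        best_edges = set(zip(best_path, best_path[1:] + best_path[:1]))
        for i in range(n):
            for j in range(n):
                tau[i][j] *= (1.0 - RHO)
                if (i, j) in best_edges:
                    tau[i][j] += 1.0 / best_len
                tau[i][j] = min(TAU_MAX, max(TAU_MIN, tau[i][j]))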

MapReduce and Apache Spark overview.

  • A basic understanding of Hadoop’s MapReduce and Apache Spark is needed before we dive into the implementation.
  • You can refer to the MapReduce tutorial for more info.
  • My previous blog post has a clear explanation of Apache Spark and its parallelism features.

Implementing MMAS using Spark’s MapReduce framework.

The code :

  • The code for the implementation can be accessed here.
  • Spark’s data-level parallelism starts by initialising the m ants and using parallelize() to convert them into an RDD.
  • The map stage iterates over every city at every step:
    • map() applies the original MMAS construction algorithm to each ant.
    • It produces key-value pairs where the tour distance is the key and the path travelled is the value.
    • Distances are then sorted using sortByKey().
  • The reduce stage compares the walking distances of every pair of tours, yielding the shortest path.
  • take() retrieves the shortest distance and its corresponding path.
  • The MapReduce tasks are iterated to obtain the final shortest-path value (see the PySpark sketch after this list).
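
Below is a minimal PySpark sketch of that iterate-map-sort-take loop, assuming a simple distance-matrix TSP. The names construct_tour, DIST, N_ANTS and N_ITER are illustrative and the construction rule is simplified; this is not the linked implementation.

    import random
    from pyspark import SparkContext

    N_ANTS, N_ITER = 50, 100
    DIST = [[0, 2, 9, 10],
            [2, 0, 6, 4],
            [9, 6, 0, 8],
            [10, 4, 8, 0]]   # toy distance matrix
    N = len(DIST)

    def construct_tour(seed):
        """Randomised tour construction standing in for the full MMAS
        rule; returns (tour_length, tour) so the length is the key."""
        rng = random.Random(seed)
        tour = [rng.randrange(N)]
        while len(tour) < N:
            allowed = [j for j in range(N) if j not in tour]
            tour.append(min(allowed,
                            key=lambda j: DIST[tour[-1]][j] * rng.uniform(0.5, 1.5)))
        length = sum(DIST[tour[k]][tour[(k + 1) % N]] for k in range(N))
        return (length, tour)

    sc = SparkContext(appName="mmas-sketch")
    best = None
    for it in range(N_ITER):
        ants = sc.parallelize(range(N_ANTS))                         # one element per ant
        tours = ants.map(lambda a: construct_tour(it * N_ANTS + a))  # map stage
        shortest = tours.sortByKey().take(1)[0]                      # sort by distance, take best
        if best is None or shortest[0] < best[0]:
            best = shortest
        # the pheromone update for the next iteration would go here
    print(best)
    sc.stop()

Equivalently, the reduce stage could use tours.reduce(lambda a, b: a if a[0] < b[0] else b) instead of sorting, since only the minimum is needed.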

Experimental Setup and Results :

  • Make sure you have Hadoop’s streaming MapReduce framework and Apache Spark installed.
  • If not, you can follow the instructions from here.
  • If your system is ready with Apache Spark and Hadoop, download the MMAS MapReduce code and run the following command:
    • bin/hadoop jar contrib/streaming/hadoop-streaming.jar
      -file /home/sb/mmas_mapper.py -mapper /home/sb/mmas_mapper.py
      -file /home/sb/mmas_reducer.py -reducer /home/sb/mmas_reducer.py
      -input /user/sb/cs336/input.txt -output /user/sb/cs336/output
    • Change the paths according to your HDFS directories; the output directory must not already exist.
  • The input and output files can be accessed in the HDFS directories given above:
  • Mapper input : <Edge, Pheromone>
    Reducer output : <BestEdge, Pheromone>
    Code output : BestEdge
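
For reference, a Hadoop Streaming pair matching that <Edge, Pheromone> interface could look like the sketch below. This is a guess at the structure the interface implies, not the actual mmas_mapper.py / mmas_reducer.py.

    #!/usr/bin/env python
    # mmas_mapper.py (illustrative): emit each "edge<TAB>pheromone" line
    # as-is, so the streaming framework can sort/group it for the reducer.
    import sys

    for line in sys.stdin:
        parts = line.strip().split("\t")
        if len(parts) == 2:
            print("%s\t%s" % (parts[0], parts[1]))

    #!/usr/bin/env python
    # mmas_reducer.py (illustrative): keep the edge with the highest pheromone.
    import sys

    best_edge, best_pheromone = None, float("-inf")
    for line in sys.stdin:
        edge, pheromone = line.strip().split("\t")
        if float(pheromone) > best_pheromone:
            best_edge, best_pheromone = edge, float(pheromone)
    print("%s\t%s" % (best_edge, best_pheromone))   # <BestEdge, Pheromone>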

Conclusion :

The MapReduce approach to parallelising MMAS shows significant performance gains when run on large clusters.

 
