ex0ns

about
A Computer Science Student's blog, about programming, algorithms, hack and a lot more.

[Ruby] Graph drawing and Minimum Spanning Tree

10 Feb 2015

Hi, Here is an other article about graphs, in the first time we'll see how to actually draw a graph, it'll help us to get a visual representation of our work. However, We'll not see how the graphs drawing algorithms work (not yet at least), we'll use GraphViz !

Graph Drawing

GraphViz has a ruby interface available on github, a simple:

gem install ruby-graphviz

should do the work ! From here, our work is really simple, we only need to add a draw function to our Graph class (see last article here, here is the code:

require 'graphviz'
def draw(edge_label = true, filename = "graph.png", format = :png)
  g = GraphViz.new(:G, :type => :graph)
  nodes = vertices.map { |v| [v, g.add_nodes(v.id.to_s)] }.to_h
  edges.each do |edge|
    _edge = g.add_edges(nodes[edge.either], nodes[edge.other(edge.either)])
    _edge[:label] = _edge.weight if(edge_label)
  end
  g.output(format => filename)
end

GraphViz uses its own Edge and Node class, thus, we need to convert our vertices and edges. To do so, we create an hash with the old vertices as keys and the GraphViz's vertices as the values. Here we are, we can now easily draw our graphs !
For the second part of the article, we'll study a Minimum Spanning Tree algorithm called Prim's algorithm !

Minimum spanning tree

Introduction

A spanning tree is a tree that connects all the vertices of a graph together, it may not use all the edges to do so. A minimum spanning tree is a spanning tree which total weight is the lowest. For a given graph, there might exists more than one minimum spanning tree. Minimum spanning tree can be used in many application, such as network design (minimum cost for electrical network) or some NP problems approximation (Traveling Salesman for example). There are many MST (Minimum spanning tree) algorithm, but we'll study Prim's one. With this algorithm we start from any vertex of the graph and we add to the MST the minimum-weighted edge connected to this vertex, we also add the other vertex of the edge to our tree, we then repeat this step without forgetting to check that not both vertices of the minimum-weighted edge already are in the tree, we do that until we reached all the vertices. Here is an illustration of the execution of Prim's algorithm on a graph: prim's algorithm With the work we have done before, it should not be too hard to implement this algoritm, here is the pseudo code, G is the Graph, w is the weight function, where w(u, v) is the weight of the edge from u to v, and r is the root (starting vertex)

Prim(G, w, r)
    Make a priority queue Q containing all the vertices of G
    Set the priority of the root to 0 in Q
    Set the priority to infinity for all other members of Q 
    Set the parent of the root to be null
    while Q is not empty
        Let's U be the minimum of the priority queue
        For each adjacent vertex V of U
            if V is in Q dans w(U,V) < priority of V
                Set the parent of V to U (we are coming from U to V)
                Set the priority of V to w(U,V)

As you can see, the trick here is to use a priority queue, with that we can get the vertex with minimum cost really fast, it's the core of the algorithm (once again, we see that knowing some data structure is really mandatory when it comes to algorithm). Unfortunately ruby does not provide any PriorityQueue implementation, but, as always, there is a gem which will save our lives ! (We'll implement our own priority queue soon, but it would be off topic to explain it here). As always

gem install PriorityQueue

Here is the documentation we are now ready to create our Minimum Spanning Tree class !

Implementation

Our MST class initializer is going to take a Graph (UndirectedGraph) as parameter:

class MST
  require 'priority_queue'

  def initialize(graph)
    @mst = Queue.new # The priority queue containing the vertices of the MST
    @graph = graph 
    @marked = Hash.new # We will add the visited vertices to this hash (Hash#has_key? is on O(1) whereas Array#include? is O(n)) 
    @minPQ = PriorityQueue.new # Our priority queue
    @weight = 0 # The MST weight
    @graph.vertices.each do |vertex|
      prim(vertex) if !marked?(vertex) # We run prim from vertex if not already visited
    end
  end

end

As you can see we need more than only one priority queue, we have to remember the visited vertex and keep track of edges in the MST. Before we actually write the "prim" function, let's code the marked? one:

def marked?(vertex)
  @marked.has_key?(vertex)
end

Pretty simple right, here is the prim's one

def prim(vertex)
  scan(vertex)
  until @minPQ.empty?
    edge = @minPQ.delete_min[0]
    from = edge.either
    to = edge.other(from)
    next if(marked?(from) && marked?(to))
    @mst.push(edge)
    @weight += edge.weight
    scan(from) if !marked?(from)
    scan(to) if !marked?(to)
  end
end

As you can see our implementation is not following the pseudo code, indeed, we use the priority queue to store the edge and not the vertices, and we are using the marked? function instead of setting the cost of the vertex to infinity. This code seems simple, but what is purpose of the scan function ? The scan function only checks if we already visited this node and add it to the @marked array if not, and then, for each connected edge, add it to the priority queue, here is the code

def scan(vertex)
  raise 'Already marked' if marked?(vertex)
  @marked[vertex] = true
  @graph.adj(vertex).each do |edge| # look in the adjacency list of the graph
    @minPQ[edge] = edge.weight if(!marked?(edge.other(vertex)))
  end
end

And we now have a working Prim's algorithm to generate a Minimum Spanning tree of a Graph ! But we are not done yet, it could be useful to get a graph from this MST and not only a Queue with all edges, let's create a fonction which generate an UndirectedGraph from this queue.

def generate_graph
  graph = UndirectedGraph.new(@graph.vertices)
  until @mst.empty?
    edge = @mst.pop
    graph.add_edge(edge)
  end
  graph
end

Now we are done ! Here is the final class, along with an example to test (this is the same graph as in the previous article)

class MST
  require 'priority_queue'

  attr_reader :weight

  def initialize(graph)
    @mst = Queue.new # The priority queue containing the vertices of the MST
    @graph = graph 
    @marked = Hash.new # We will add the visited vertices to this hash (Hash#has_key? is on O(1) whereas Array#include? is O(n)) 
    @minPQ = PriorityQueue.new # Our priority queue
    @weight = 0 # The MST weight
    @graph.vertices.each do |vertex|
      prim(vertex) if !marked?(vertex) # We run prim from vertex if not already visited
    end
  end

  def generate_graph
    graph = UndirectedGraph.new(@graph.vertices)
    until @mst.empty?
      edge = @mst.pop
      graph.add_edge(edge)
    end
    graph
  end

  private

  def prim(vertex)
    scan(vertex)
    until @minPQ.empty?
      edge = @minPQ.delete_min[0]
      from = edge.either
      to = edge.other(from)
      next if(marked?(from) && marked?(to))
      @mst.push(edge)
      @weight += edge.weight
      scan(from) if !marked?(from)
      scan(to) if !marked?(to)
    end
  end

  def scan(vertex)
    raise 'Already marked' if marked?(vertex)
    @marked[vertex] = true
    @graph.adj(vertex).each do |edge| # look in the adjacency list of the graph
      @minPQ[edge] = edge.weight if(!marked?(edge.other(vertex)))
    end
  end

  def marked?(vertex)
    @marked.has_key?(vertex)
  end

end

And the example (graph.txt contains the same values as in the previous article)

require './MST.rb'
require './UndirectedGraph.rb'
require './Edge.rb'
require './Vertex.rb'

response = open('graph.txt').read.split("\n")
graph = UndirectedGraph.new()
response.each do |stredge|
    l = stredge.split(" ")
    e = Edge.new(Vertex.new(l[0]), Vertex.new(l[1]), l[2].to_f)
    graph.add_edge(e)
end

mst = MST.new(graph)
p mst.weight
mst.generate_graph.draw

The resulting graph is: Graph Mst

A nice feature we could add is generate a gif file from this MST class, showing all the steps that leaded to the minimum spanning tree, I may add this in a near future ! Once again, thank you for reading, and don't forget to give me feedback !

[Ruby] Graph representation

30 Jan 2015

Hi, It's been a while since the last time I wrote something, I was quite busy, and I'm currently working on an C++ Game Engine, which takes almost all my spare time. Today we are talking about graphs, graphs are powerful and can be used to solve many problems, it's a pretty big topic so I'm going to split it in severals articles, this one is about graph representation because when it comes to graphs the main question is : "How do I represent a graph in a program ?", hence we will not see any algorithm on graph today, but we'll create the data structure we'll need later ! I'll use ruby for this article, as it's quite versatile, and will save us time.

Vertices

The vertices are the component of the graph we want to connect, they have a name (or any kind of identifier), may have a weight, and... and it's all, vertices are really simple objects, here is the class for vertex:

class Vertex
  include Comparable 

  attr_accessor :weight
  attr_reader :id

  def initialize(id, weight = 0)
    @id = id
    @weight = weight
  end

  def <=>(other)
    @id<=>other.id
  end

  def hash
    @id.hash
  end

  def eql?(other)
    hash == other.hash
  end

end

The code is pretty self explanatory, but We need to include Comparable in our class so that we can compare and distinguish vertices, we also need to add the hash and eql? methods as we'll use vertices as key in a Hash! The identifier can be anything (Fixnum, String, ...) but mush also be a Comparable. Not that we have our vertices, let's see how to connect them !

Edges

An edge is a connection between two vertices, it can be directed or undirected, and have a weight, the power of the graph theory is that this 'weight' can represent almost anything, from a distance to a flux or even a price, here is the code

class Edge
  include Comparable

  attr_reader :weight, :from, :to

  def initialize(from, to, weight)
    @from = from
    @to = to
    @weight = weight
  end

  def either
    @from
  end

  def other(vertex)
    (vertex == @from) ? @to : @from
  end

  def <=>(edge)
    @weight<=>edge.weight
  end

end

Once again, the implementation is pretty clear, we need to include Comparable to compare edge (only the weight comparison is interesting here), the methods either and other are a nice way to find the two vertices of an undirected edge !

As you can see, the two previous class were pretty simple, and we'll now need a graph class to connect both of them !

Weighted Graph

In a graph we need to keep track of the edge between to vertices, there is two way of doing that, either the adjency list or adjacency matrix both have their pros and cons, but we'll use ans adjacency list for our implementation, first we'll need a constructor which can take an array of vertices (or nil):

def initialize(vertices = nil)
  @adj = Hash.new
  if(!vertices.nil?)
    vertices.each { |vertex| @adj[vertex] = Array.new }
  end
  @edge_number = 0
end

We use a Hash to keep a track of the vertices, each of the vertices is associated with an empty array, this array will store the edges connected to it, let's code a method to add an edge to the graph, if one of the vertices of the edge is not in the graph, we'll add it.

def add_edge(edge)
  v1 = get_or_create(edge.either)
  v2 = get_or_create(edge.other(v1))
  @adj[v1].push(edge)
  if(v1 != v2) then @adj[v2].push(edge) end # If start == from we don't want to add the edge twice
  @edge_number += 1
end

As our vertices are objects, we need to be sure that we use the same reference for all the edge, this is why we need to create a helper method "getorcreate" which will search in @adj.keys if there already exists such a vertex, and create it if not found, here is this helper

def get_or_create(vertex)
  array_id = vertices.find_index(vertex)
  return vertices[array_id] if array_id
  @adj[vertex] = Array.new
  vertex
end

We now have an almost fully functional graph, but we need to add some getter methods to retrieve the edges, the vertices, and the adjacendy vertices.

def edges # returns an Array containing all the graph vertices
  edges = Array.new
  @adj.each do |vertex, _edges|
    _edges.each do |edge|
      if(edge.other(vertex) > vertex) # as we are currently using un undirected graph, we don't want to add the same edge twice
        edges.push(edge)
      end
    end
  end
end

def vertices_number
  @adj.size
end

def vertices
  @adj.keys
end


def adj(vertex)
  validate_vertex(vertex)
  @adj[vertex].dup
end

def degree(vertex)
  validate_vertex(vertex)
  @adj[vertex].size
end

We don't need to do an edge_number method as edge_number property is a attr_reader (getter created by ruby), the edges method loop through the hash and add the edges if not already added, it's one of the only method you would have to change to go from an undirected graph to a directed one (along with add_edge). In the previous method, there is a method validate_vertex, which throw an exception if the vertex is not in the graph, here is the method:

private
def validate_vertex(vertex)
  raise 'Vertex is not part of the graph' if !@adj.has_key?(vertex)
end

The UndirectedGraph class is ready to be used, here is the complete class:

class UndirectedGraph
  require './Edge.rb'
  require './Vertex.rb'

  attr_reader :edge_number

  def initialize(vertices = nil)
    @adj = Hash.new
    if(!vertices.nil?)
      vertices.each { |vertex| @adj[vertex] = Array.new }
    end
    @edge_number = 0
  end

  def add_edge(edge)
    v1 = get_or_create(edge.either)
    v2 = get_or_create(edge.other(edge.either))
    edge = Edge.new(v1, v2, edge.weight)
    @adj[v1].push(edge)
    if(v1 != v2) then @adj[v2].push(edge) end # If start == from we don't want to add the edge twice
    @edge_number += 1
  end

  def edges # returns an Array containing all the graph vertices
    edges = Array.new
    @adj.each do |vertex, _edges|
      _edges.each do |edge|
        if(edge.other(vertex) > vertex) # as we are currently using un undirected graph, we don't want to add the same edge twice
          edges.push(edge)
        end
      end
    end
    edges
  end

  def vertices_number
    @adj.size
  end

  def vertices
    @adj.keys
  end

  def adj(vertex)
    validate_vertex(vertex)
    @adj[vertex].dup
  end

  def degree(vertex)
    validate_vertex(vertex)
    @adj[vertex].size
  end


  private
  def validate_vertex(vertex)
    raise 'Vertex is not part of the graph' if !@adj.has_key?(vertex)
  end

  def get_or_create(vertex)
    array_id = vertices.find_index(vertex)
    return vertices[array_id] if !array_id.nil?
    @adj[vertex] = Array.new
    vertex
  end

end

Let's see how we can use it to represent an actual graph, we will read the following graph: graph_1 For testing purposes, I recommend you to create an text file with the following structure:

from to weight

Here is the one for the previous graph:

a b 4
a h 8
b h 11
b c 8
c i 2
c d 7
c f 4
d f 14
d e 9
e f 10
f g 2
g i 6
g h 1
h i 7

Here is a ruby code to read it and create the graph:

require './UndirectedGraph.rb'
require './Vertex.rb'
require './Edge.rb'

data = open('graph.txt').read.split("\n")
graph = UndirectedGraph.new
data.each do |strdata|
  strdata = strdata.split(" ")
  v1 = Vertex.new(strdata[0], 0) # We don't want our vertex to have a weight
  v2 = Vertex.new(strdata[1], 0) # Same here
  edge = Edge.new(v1, v2, strdata[2].to_f)
  graph.add_edge(edge)
end

graph.edges.each do |edge|
  p edge
end

Here is the ouput:

#<Edge:0x00000001e57558 @from=#<Vertex:0x00000001e57698 @id="a", @weight=0>, @to=#<Vertex:0x00000001e57670 @id="b", @weight=0>, @weight=4.0>
#<Edge:0x00000001e57328 @from=#<Vertex:0x00000001e57698 @id="a", @weight=0>, @to=#<Vertex:0x00000001e57418 @id="h", @weight=0>, @weight=8.0>
#<Edge:0x00000001e57120 @from=#<Vertex:0x00000001e57670 @id="b", @weight=0>, @to=#<Vertex:0x00000001e57418 @id="h", @weight=0>, @weight=11.0>
#<Edge:0x00000001e56f18 @from=#<Vertex:0x00000001e57670 @id="b", @weight=0>, @to=#<Vertex:0x00000001e57008 @id="c", @weight=0>, @weight=8.0>
#<Edge:0x00000001e55ac8 @from=#<Vertex:0x00000001e57418 @id="h", @weight=0>, @to=#<Vertex:0x00000001e56e00 @id="i", @weight=0>, @weight=7.0>
#<Edge:0x00000001e56d10 @from=#<Vertex:0x00000001e57008 @id="c", @weight=0>, @to=#<Vertex:0x00000001e56e00 @id="i", @weight=0>, @weight=2.0>
#<Edge:0x00000001e56b08 @from=#<Vertex:0x00000001e57008 @id="c", @weight=0>, @to=#<Vertex:0x00000001e56bf8 @id="d", @weight=0>, @weight=7.0>
#<Edge:0x00000001e56900 @from=#<Vertex:0x00000001e57008 @id="c", @weight=0>, @to=#<Vertex:0x00000001e569f0 @id="f", @weight=0>, @weight=4.0>
#<Edge:0x00000001e566f8 @from=#<Vertex:0x00000001e56bf8 @id="d", @weight=0>, @to=#<Vertex:0x00000001e569f0 @id="f", @weight=0>, @weight=14.0>
#<Edge:0x00000001e564f0 @from=#<Vertex:0x00000001e56bf8 @id="d", @weight=0>, @to=#<Vertex:0x00000001e565e0 @id="e", @weight=0>, @weight=9.0>
#<Edge:0x00000001e560e0 @from=#<Vertex:0x00000001e569f0 @id="f", @weight=0>, @to=#<Vertex:0x00000001e561d0 @id="g", @weight=0>, @weight=2.0>
#<Edge:0x00000001e562e8 @from=#<Vertex:0x00000001e565e0 @id="e", @weight=0>, @to=#<Vertex:0x00000001e569f0 @id="f", @weight=0>, @weight=10.0>
#<Edge:0x00000001e55ed8 @from=#<Vertex:0x00000001e561d0 @id="g", @weight=0>, @to=#<Vertex:0x00000001e56e00 @id="i", @weight=0>, @weight=6.0>
#<Edge:0x00000001e55cd0 @from=#<Vertex:0x00000001e561d0 @id="g", @weight=0>, @to=#<Vertex:0x00000001e57418 @id="h", @weight=0>, @weight=1.0>

Ok, I admit it does not seems really powerful here, but from this, we can really do a lot of things quite easily, the next time we'll see how to find the Minimum Spanning Tree of a graph.

Thank you for reading, and don't forget to give me feedback !

Concept algorithmique : la récursivité

07 Jul 2012

Je vais tenter d'aborder le sujet le plus simplement possible, et tenterai de rester le plus clair possible dans mes explications, comme d’habitude, si vous souhaitez rajouter des informations, faire des corrections sur ces articles, vous êtes bienvenus ! Au sujet de la programmation récursive, Wikipédia dit :

Un algorithme est dit récursif s’il s’appelle lui-même

Ainsi, nous allons résoudre un problème donné en résolvant des problèmes similaires plus petits. Comme le dit le dicton : "Une image vaut mieux que mille mots", j’utiliserai un exemple concret tout au long de cet article: L’escalier ! Si j’utilise cet exemple, c’est que c’est grâce à lui que j’ai compris la récursivité et j’espère que ce sera également le cas pour vous. L’énoncé est le suivant: vous avez un escalier composé de X marches, et vous pouvez descendre soit d’une marche, soit de deux. Combien y'a-t-il de possibilités pour descendre cet escalier ? Pour un escalier d’une marche, nous n’avons qu’une possibilitée : une marche De même, pour un escalier de deux marches, il existe deux façons de descendre l’escalier: deux marches Enfin, Pour un escalier de 3 marches: trois marches Il y a donc trois chemins possibles pour un escalier de trois marches.

Revenons-en à la décomposition du problème en sous problèmes: descendre de trois marches reviens à descendre deux marches puis d'une marche (peu importe l’ordre). Comme il n’y a qu’une seule et unique façon de descendre une marche, et qu'il en existe deux pour descendre deux marches, on a 2+1 = 3, trois possibilités pour descendre trois marches, pour 4 marches, on peut soit descendre d’une marche, il ne reste donc qu’un escalier à trois marches, soit descendre de deux marches et il ne reste ainsi qu’un escalier à deux marches, on peut donc dire que le nombre de façons de descendre un escalier de quatre marches revient à additionner le nombre de façons de descendre d’un escalier de 3 marches et celle d’un escalier à deux marches il y a donc 5 possibilitées différentes pour descendre un escalier de 4 marches.

Un dernier exemple pour la route, considérons un escalier à cinq marches, si l’on descend d’une marche, il ne reste plus que quatre marches, tandis que si nous descendons de deux marches, alors il ne reste plus q'un escalier de 3 marches, ainsi il y a 5+3 , c'est-à-dire 8 façons de descendre un escalier composé de 5 marches. Car descendre un escalier à 5 marches revient à additionner le nombre de possibilité de descendre un escalier à 5-1 marches et 5-2 marches. Vous l’aurez donc compris, pour connaitre le nombre de possibilités qu’il y a pour descendre un escalier à X marches, il faut connaitre le nombre de possibilités qu’il y a pour descendre un escalier à X-1 et X-2 marches.

Pour revenir à nous moutons, nous venons de créer les sous problèmes identiques à celui que l'on veut résoudre, le principe même de la récursivité. Mais, on ne peut pas calculer indéfiniment X-1 et X-2, il nous faut bien un point d’arrêt, et ce point (ou ces points) s’appelle(nt) des solutions évidentes (comme les racines évidentes sur les équations), ou les cas de base.

On peut également noter qu'en réalité, il s'agit de la suite de Fibonnacci que l'on calcule dans ce problème, où nous faisons la somme des deux termes précédents (On a donc, d'un point de vu mathématique : Un+2 = Un+1 + Un).

Pour en revenir à notre escalier, les solutions sont 1 pour un escalier à une marche, et 2 pour un escalier à deux marches. Attention, la gestion des solutions évidentes est primordiale, sans elle, vous vous retrouverez avec des boucles infinies très rapidement. Voyons maintenant comment implementer une telle récursion, en C++ (très simplement adaptable sur d'autres langages)

#include <iostream>

using namespace std;

int marches(int nombre){
    if(nombre == 0 ||nombre == 1 || nombre == 2) // On gère les solutions évidente, on ajoute le 0 pour eviter des problèmes
        return nombre;
    return marches(nombre-1)+marches(nombre-2);// On appelle la fonction avec un escalier de taille inférieur de 1 et inférieur de deux.
}

int main(int argc, char* argv[]){
    int taille;
    cout << "Combien de marche y a-t-il dans l'escalier ?" << endl;
    cin >> taille;
    cout << marches(taille) << endl; // Appelle de la fonction
    return 0;
}

On remarque que la fonction marches se rappelle, avec des arguments à chaque fois plus petits, il s'agit de résoudre les sous problèmes les plus simples en premier. L’un des problèmes majeurs d’une fonction récursive est son temps d’exécution, en effet si l’on demande les solutions pour un escalier de 40marches, celui va mettre plusieurs secondes à s’exécuter (voir plus sur certaines machines), cela s’explique par le fait que les valeurs sont calculées de nombreuses fois, pour illustrer cela, voici un arbre qui montre l’exécution de ce programme pour un escalier à 5 marches. Fibonacci Tree

On peut donc voir que rien qur pour un escalier à cinq marches, on calcul deux fois les solutions pour un escalier à trois marches. Imaginez le nombre de répétitions sur un escalier à 15, à 20 ou encore 40 marches. Une solution serait de stocker les valeurs calculées dans un tableau et de vérifier leur existence avant de les calculer, c’est ce qu’on appelle la programmation dynamique.

Voici donc la fin de cette introduction à la récursivité, en espérant, une fois de plus avoir été le plus clair possible.