Information Theory #4 - Airline Route Map

I am writing a series of posts about Information Theory problems. You may find the previous one here.

The problem we will analyze is Airline Route Map from the Japanese Olympiad in Informatics Spring Camp (JOI SC 2018). You may find the problem statement and a place to submit here .

A synthesis of the problem statement is : Alice needs to send to Bob an undirected graph of N vertices (N <= 1000) and M edges (M <= N*(N-1)/2). However, the information Alice sends suffers a series of shuffles before Bob receives it

  • The number of the vertices are changed. So each of the N vertices will receive a new unique number from 0 to N-1.
  • The edge list is shuffled. The edges remain consistent to the information you sent (if you sent A->B, then it will mean A* -> B*, where A* and B* are the new numbers for A and B) , but their order on the edge list is not the same.
  • The direction of the edge is changed. If it was A->B , then you may receive A*->B* or B*->A*

You may add or remove vertices and edges when you are sending the graph as Alice. When Bob receives the shuffled information, he needs to recove the original graph Alice wanted to send. You must submit two files:

  • One that given the undirected graph we want to send, generates a new graph and sends it.
  • One that given the shuffled sent graph, recovers the original one The scoring is based on the number of extra vertices you sent. The final subtasks uses only 12 extra vertices to recover the graph.

This time I will not focus on the subtasks but only at two solutions : one that uses 13 vertices and the improvement of this solution that uses 12 vertices.

A question : add or remove edges ?

We are allowed to send any graph, so we must choose if we will add or remove edges (or do both). It turns out that it is very hard to recover information of edges that were not sent using few additional vertices. So it is better to send the original graph with some additional vertices that will help to recover the original numbering of the vertices.

First step : how to recover the numbers ?

The first challenge is to recover the original numbers with such few extra vertices. This requires an efficient way of sending information.

It turns that sending the binary representation of the numbers is a good idea. To do that, we create 10 extra vertices, one for each possible set bit of a number. Then, we connect the i-th extra vertex to the numbers whose i-th bit is set.

Connection to bits

The image above does not show all the bits, but represent the connections of vertex 6 : it has an edge to the vertices that represent 2^1 and 2^2, but not to the one that is 2^0.

Second step : identifying the bit vertices

The next challenge is to identify those special vertices. My idea to solve this was to create an additional vertex that is connected to all vertices of the graph, except to the bit ones.

Additional vertex connected to everything except bits

In addition to that, I also connected the bit vertices to form a chain , connecting the i-th one to the (i+1)-th one.

Bit chain

So now we only need to identify the 10th bit vertex and the special vertex that is connected to all the nodes. If we do that, we can always discover the next bit of the chain and we have solve our problem.

Third step : identifying the last bit vertex and the special vertex

We have already used 11 extra vertices, so we do not want to use a lot more in order to identify these two vertices. So we must find some special property or implicit information that can helps us decode without sending many vertices.

One important piece of information that we have not used so far is the degree of the vertices. If we analyze the degrees, we discover an important property : for N >= 3, all of the vertices but two will always have their degree bigger than or equal to 2. The only two vertices that sometimes can break the rule are the vertex 0, when it does not have any edge in the original graph,  and the 10th bit vertex when N <= 512.

A trick for the two cases

Therefore, it is a good idea to use two extra vertices with degree one to try identify the special and bit vertex, because they will be unique in our graph. With the extra one degree vertex, the 10th bit vertex no longer can have a degree one. The only remaining vertex that can mess up with our idea is 0.

However, it is easy to overcome the problem with 0 : we can simply make the graph 1-indexed instead of 0-indexed. That way, 0 will become one and have a set bit, thus having a degree of at least two.

In the end , we do as follows : we make the graph 1-indexed and add the two 1-degree vertices. Then , we identify the two vertices that are connected to the 1-degree vertices and distinguish the special vertex based on the degree (the special one will always have more edges). Then , we process our chain of bits, discover the original numbering for each vertex and report the original graph. This algorithm uses 13 additional vertices and scores 91 points, which is a very good result.

Fourth step : final optimization

In order to fully solve the problem, we must get rid of exactly one extra vertex. Therefore, we will need to use again implicit information that we did not use before.

It turns out that we do not need the 1-degree vertex to identify the last bit vertex. Of all the bit vertices (that can be easily identified because of the special vertex), it will always be the one that has the smallest degree.

Because of that, we drop the 1-degree vertex that connects to the last bit and tweak the algorithm we used before. Notice that by doing this, the 10th bit vertex can sometimes have only one edge, but with some thinking we can handle this case.

In my implementation of this idea, I hard coded small cases to avoid corner cases. The remaining part of the code implements exactly the same idea as described above.

Ivan Carvalho
Software Developer

Always looking for a challenge!