I continue my little excursion into network science. In the last post, I gave a little introduction to simulating and visualizing undirected networks with community structure in R. In this post I want to explore a method to infer the community structure of a network from its adjacency matrix. That is, given that I know which nodes are connected to each other, I want to infer which nodes belong to the same community. I will focus on the case of two communities, but the method can be extended to three or more.

This post requires basic knowledge of linear algebra, in particular eigendecomposition. Understanding how the k-means clustering algorithm works is also useful. The R code to reproduce all the figures is given at the bottom of the post.
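To give a flavour of how eigendecomposition and k-means can work together here, the following is a minimal sketch in base R. It is one common spectral approach and not necessarily the exact method of the post. The network size, the edge probabilities `p_in` and `p_out`, and the choice to cluster on the second eigenvector are all assumptions made for illustration.

```r
set.seed(1)

# Assumed toy network: 2 communities of 50 nodes each
n <- 100
community <- rep(1:2, each = n / 2)
p_in <- 0.5    # assumed edge probability within a community
p_out <- 0.05  # assumed edge probability between communities

# Edge probability for every pair of nodes
P <- ifelse(outer(community, community, "=="), p_in, p_out)

# Draw the upper triangle at random and mirror it:
# the result is symmetric and has no self-loops
A <- matrix(0, n, n)
upper <- upper.tri(A)
A[upper] <- rbinom(sum(upper), size = 1, prob = P[upper])
A <- A + t(A)

# Eigendecomposition of the symmetric adjacency matrix
# (eigen() returns eigenvalues sorted in decreasing order)
ev <- eigen(A, symmetric = TRUE)

# Cluster the nodes by their entries in the second eigenvector
inferred <- kmeans(ev$vectors[, 2], centers = 2, nstart = 10)$cluster

# Cross-tabulate true against inferred labels (label names are arbitrary)
print(table(true = community, inferred = inferred))
```

Because cluster labels are arbitrary, the inferred label 1 may correspond to the true community 2. What matters is that the table is close to diagonal or anti-diagonal.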

Network science is potentially useful for certain problems in data analysis, and I know close to nothing about it.

In this short post I present my first attempt at network analysis: a minimal example that constructs and visualizes an artificial undirected network with community structure in R. No network libraries are loaded; only base R functions are used.

The final product of this post is this plot:

I am going to walk you through the code to produce the above plot. The complete code is given at the bottom of the post.
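As a rough idea of what such a construction can look like, here is a minimal sketch using only base R. It is not the post's own code. The number of nodes, the edge probabilities inside `p_edge`, and the scattered layout around two centres are assumptions chosen for illustration.

```r
set.seed(42)

# Assumed setup: 30 nodes, the first 15 in community 1, the rest in community 2
n <- 30
community <- rep(1:2, each = n / 2)

# Assumed edge probabilities: dense within, sparse between communities
p_edge <- function(i, j) ifelse(community[i] == community[j], 0.4, 0.03)

# All unordered node pairs (i < j), one candidate edge per pair
pairs <- t(combn(n, 2))
keep <- runif(nrow(pairs)) < p_edge(pairs[, 1], pairs[, 2])
edges <- pairs[keep, , drop = FALSE]

# Symmetric adjacency matrix built from the edge list
A <- matrix(0, n, n)
A[edges] <- 1
A[edges[, 2:1, drop = FALSE]] <- 1

# Layout: scatter each community around its own centre
x <- c(-1, 1)[community] + rnorm(n, sd = 0.35)
y <- rnorm(n, sd = 0.35)

# Draw the edges first, then the nodes coloured by community on top
plot(x, y, type = "n", axes = FALSE, xlab = "", ylab = "", asp = 1)
segments(x[edges[, 1]], y[edges[, 1]], x[edges[, 2]], y[edges[, 2]], col = "grey60")
points(x, y, pch = 21, bg = c("orange", "steelblue")[community], cex = 1.5)
```

Keeping an explicit edge list next to the adjacency matrix makes the plotting step a single `segments()` call.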

As a command-line junkie, I find most bibliographic management software (such as Mendeley) too bloated. All I want such software to be capable of is

- to add a new entry to my bibliography (BibTeX), and
- to search through the articles.

Here’s my minimal approach to implementing these two features:

Autocorrelation of a time series can be useful for prediction because the most recent observation of the prediction target contains information about future values. At the same time, autocorrelation can play tricks on you because many standard statistical methods implicitly assume independence of measurements at different times.
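To make the second point concrete, here is a minimal sketch in base R, assuming an AR(1) process with coefficient `phi = 0.8` (an arbitrary choice for illustration). It compares the naive standard error of the mean, which assumes independent observations, with one based on the usual large-sample approximation of the effective sample size for AR(1) data.

```r
set.seed(123)

# AR(1) process x[t] = phi * x[t-1] + noise, with an assumed phi = 0.8
phi <- 0.8
n <- 500
x <- as.numeric(arima.sim(model = list(ar = phi), n = n))

# Sample autocorrelation at lag 1 (element 1 of $acf is lag 0)
r1 <- acf(x, lag.max = 1, plot = FALSE)$acf[2]

# Naive standard error of the mean: assumes independent observations
se_naive <- sd(x) / sqrt(n)

# For an AR(1) process the effective sample size is
# approximately n * (1 - r1) / (1 + r1)
n_eff <- n * (1 - r1) / (1 + r1)
se_adjusted <- sd(x) / sqrt(n_eff)

print(c(lag1 = r1, se_naive = se_naive, se_adjusted = se_adjusted))
```

With positive autocorrelation the adjusted standard error is larger than the naive one, so tests that assume independence will tend to be overconfident.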

Imagine you perform a statistical analysis on a time series of stock market data. After some transformation, averaging, and “renormalization” you find that the resulting quantity, let’s call it $x$, behaves as a function of time like $x(t) \sim t^{\alpha}$. Since you are a physicist you get excited because you have just discovered a power law. Physicists love power laws.

Now you analyze some more financial time series using the same technique and find similar behavior. Power laws all over the place. You get even more excited. In the paper about the analysis (which you submit to a physics journal) you may throw buzzwords like “scale-free”, “critical phase transition”, and “universality”. You can also add to your CV that you contributed to the understanding of market dynamics.

I was once told that the reason that such a shape was so commonly used for aeroplane wings was merely that then one could study it mathematically by just employing the Zhoukowski transformation. I hope that this is not true!

(R. Penrose, “The Road to Reality”, p.150)

Penrose here talks about a complex holomorphic mapping also known as the aerofoil transformation.
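For readers who want to see the shape, here is a minimal sketch in base R, assuming the standard form of the map, $w = z + 1/z$, applied to a circle that passes through $z = 1$. The offset of the circle's centre is an arbitrary choice for illustration; different offsets give different thickness and camber.

```r
# Joukowski (Zhoukowski) map in its standard form: w = z + 1/z
joukowski <- function(z) z + 1 / z

# Circle through z = 1 with an assumed, slightly offset centre
centre <- complex(real = -0.1, imaginary = 0.1)
radius <- Mod(1 - centre)
theta <- seq(0, 2 * pi, length.out = 400)
z <- centre + radius * exp(1i * theta)

# Image of the circle under the map: the aerofoil outline
w <- joukowski(z)

# Solid line: aerofoil; dashed line: the original circle
plot(Re(w), Im(w), type = "l", asp = 1, xlab = "Re(w)", ylab = "Im(w)")
lines(Re(z), Im(z), lty = 2)
```

The point $z = 1$ is mapped to $w = 2$, which becomes the sharp trailing edge. The unit circle itself would collapse onto the real segment from $-2$ to $2$.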

The Monty Hall problem goes like this: You are at a game show in front of 3 doors. There is a car behind one door and goats behind the other two doors. After you have made your choice, the showmaster opens one of the remaining two doors, namely one with a goat. You now have the chance to change your initial choice. Do you stick to your door? Do you change? Does it make any difference at all?

The thing is, it does make a difference. If you stick, you have a probability of 1/3 to win the car, but if you change, your chances go up to 2/3. The idea is that the showmaster conveys information about where the car is by opening a wrong door. There are mathematical arguments as to why you should change. If you find that unintuitive, you are in good company. Why should you be smarter after the showmaster has opened a wrong door than before? An intuitive explanation of why the showmaster conveys information is this:

Imagine there were not 3 but 1000 doors to begin with. There is still one car but now there are 999 goats. You get to choose a single door initially. Now the showmaster does not open 1 door but 998 doors, all with goats, and you are again left with your initial choice and one more door. Do you stick to your choice now or do you change to the door the showmaster did not open? He clearly gave you a hint by not opening that particular door.
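If the argument still feels slippery, a simulation can help. Below is a minimal sketch in base R. The function name `play` and its arguments are my own choices for illustration, and the code assumes the showmaster always opens all doors except your pick and one other door, never revealing the car.

```r
set.seed(2024)

# Play one game; returns TRUE if the player ends up with the car
play <- function(switch_door, n_doors = 3) {
  car <- sample.int(n_doors, 1)
  pick <- sample.int(n_doors, 1)
  if (pick == car) {
    # The showmaster leaves one randomly chosen goat door closed
    others <- setdiff(seq_len(n_doors), pick)
    remaining <- others[sample.int(length(others), 1)]
  } else {
    # The showmaster must leave the car door closed
    remaining <- car
  }
  final <- if (switch_door) remaining else pick
  final == car
}

# Estimate the winning frequencies over many games
n_games <- 10000
win_stick <- mean(replicate(n_games, play(switch_door = FALSE)))
win_switch <- mean(replicate(n_games, play(switch_door = TRUE)))
win_switch_1000 <- mean(replicate(n_games, play(switch_door = TRUE, n_doors = 1000)))

print(c(stick = win_stick, switch = win_switch, switch_1000_doors = win_switch_1000))
```

The estimated frequencies should land near 1/3 for sticking and 2/3 for changing with 3 doors, and changing should win almost always with 1000 doors.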