F/X Visualizations
May 14, 2010
The Network Structure of Baseball Blogs: Part 2
By Dave Allen

Two weeks ago I posted about the network structure of baseball blogs. In the framework of a network (or graph), each blog is a node, and two blogs are connected by an edge if one links to the other. The edges are directed (each link goes from one blog to another) and weighted: I looked over the course of 100 posts and counted the number of links, so if there was more than one, that edge was given a greater weight.
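To make that definition concrete, here is a minimal Python sketch of how a list of links could be turned into weighted, directed edges. The blog names and the `links` list are invented for the example, and skipping a blog's links to itself is my assumption. This is only an illustration, not the code behind the plots in these posts.

```python
from collections import Counter

# Hypothetical data: one (source_blog, target_blog) pair per link found
# in a blog's posts. The names are invented for illustration.
links = [
    ("blog_a", "blog_b"),
    ("blog_a", "blog_b"),
    ("blog_a", "blog_c"),
    ("blog_b", "blog_a"),
    ("blog_c", "blog_c"),  # a self-link, skipped below (an assumption)
]


def build_weighted_edges(link_pairs):
    """Count repeated links so that each directed edge gets a weight."""
    weights = Counter()
    for source, target in link_pairs:
        if source != target:  # ignore a blog linking to itself
            weights[(source, target)] += 1
    return dict(weights)


edges = build_weighted_edges(links)
for (source, target), weight in sorted(edges.items()):
    print(f"{source} -> {target} (weight {weight})")
```

Because the edges are directed, blog_a linking to blog_b and blog_b linking back to blog_a are counted as two separate edges, each with its own weight.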

In the quick look in my last post I first showed the structure of the overall network, with Baseball Analysts and a number of other sabermetric blogs clustering together at the center of the larger network of baseball blogs. Around the periphery were sub-clusters of team-specific blogs, which tended to be heavily connected with blogs covering the same team.

In that post, to keep the network fairly simple, I only connected two blogs if they were linked three or more times. This dropped many connections and blogs out of the network. That was a good way to look at the smaller set of central blogs, but it lost most of the structure. I was also interested in how the different sub-clusters of team-focused blogs are arranged in the network.
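The effect of that cutoff can be sketched in a few lines of Python. The edge weights below are invented, and applying the cutoff to each directed edge separately (rather than to the combined links between a pair of blogs) is an assumption made for the sketch.

```python
# Hypothetical weighted, directed edges: (source, target) -> number of links.
edges = {
    ("blog_a", "blog_b"): 5,
    ("blog_b", "blog_a"): 1,
    ("blog_a", "blog_c"): 2,
    ("blog_d", "blog_a"): 3,
}


def filter_edges(weighted_edges, min_links=3):
    """Keep edges with at least min_links links, plus the blogs they touch."""
    kept = {pair: w for pair, w in weighted_edges.items() if w >= min_links}
    # A blog stays in the network only if at least one of its edges survives.
    nodes = {blog for pair in kept for blog in pair}
    return kept, nodes


kept_edges, kept_nodes = filter_edges(edges)
print(sorted(kept_edges))
print(sorted(kept_nodes))
```

In this toy example blog_c falls out of the network entirely, because its only connection has fewer than three links. That is the kind of structure the cutoff throws away.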

To look at this I plotted all of the 150 or so team-specific blogs among the top 200 blogs. Here I included all links but weighted them by how many there were. The nodes are labeled with my code for each blog's name, and the labels are color-coded by team. The colors are not perfect, but together with the code in the name they should make each blog's team clear. Click on the image for a larger version.
all_team.png
There is a lot going on in this diagram and I couldn't begin to write about all of it, but I will note some of the things I find interesting. At the bottom of the network are the Yankees and Mets blogs, which are well connected (we saw this in the last post too). To the upper-right of the Mets is most of the NL East: the big constellation of Nats blogs, a couple of Florida blogs off that, and then, more centrally located, four Phillies blogs. To the left of the Yankees is a fairly large group of Red Sox blogs and not too far from that, but also more centrally located, the four Rays blogs. Both the Rays and Phillies have most of their blogs close to the center of the web. My guess is that this is because of their recent history in the World Series. Outside of the AL and NL East the structure is less clear. The NL Central clusters out fairly well in the upper right of the graph, but the other divisions do not.

This is a fairly qualitative analysis. It would be interesting to make it more quantitative by looking at the percentage of potential links that are filled within versus outside divisions, or based on the geographical location of the teams.
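One way such a comparison could be set up is sketched below in Python. Everything in it is hypothetical: the blog names, their division assignments, and the links are invented, and link weights are ignored so that a link is simply present or absent. It counts every ordered pair of blogs as a potential link and reports the share that is actually filled, separately for pairs in the same division and pairs in different divisions.

```python
from itertools import permutations

# Hypothetical blogs and the division of the team each one covers.
division = {
    "yankees_1": "AL East",
    "yankees_2": "AL East",
    "redsox_1": "AL East",
    "cubs_1": "NL Central",
    "cards_1": "NL Central",
}

# Hypothetical directed links (source, target); weights are ignored here.
edges = {
    ("yankees_1", "yankees_2"),
    ("yankees_2", "yankees_1"),
    ("yankees_1", "redsox_1"),
    ("cubs_1", "cards_1"),
    ("redsox_1", "cubs_1"),
}


def link_density(division_of, edge_set):
    """Return (within, across): the share of potential directed links filled."""
    possible = {"within": 0, "across": 0}
    filled = {"within": 0, "across": 0}
    # Every ordered pair of distinct blogs is one potential directed link.
    for source, target in permutations(division_of, 2):
        same = division_of[source] == division_of[target]
        kind = "within" if same else "across"
        possible[kind] += 1
        if (source, target) in edge_set:
            filled[kind] += 1
    # Guard against a group with no potential links at all.
    within = filled["within"] / possible["within"] if possible["within"] else 0.0
    across = filled["across"] / possible["across"] if possible["across"] else 0.0
    return within, across


within, across = link_density(division, edges)
print(f"within divisions: {within:.0%}, across divisions: {across:.0%}")
```

The same function would work for a geographical grouping by swapping the division labels for region labels.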

Comments

Just curious, what software do you use to draw these?

I used the statistics and graphics program R, with the network package.
