Over the last month or so, I’ve made a start reading through Mark Newman’s Networks: An Introduction, trying (though I’m not sure how successfully!) to bring an element of discipline to my otherwise osmotically acquired understanding of the techniques employed by various network analysis tools.
One distinction that made a lot of sense to me came from the domain of bibliometrics, specifically between the notions of bibliographic coupling and co-citation.
The idea of co-citation will be familiar to many – when one article cites a set of other articles, those other articles are “co-cited” by the first. When the same articles are co-cited by lots of other articles, we may have reason to believe that they are somehow related in a meaningful way.
In graph terms, we might also represent this as simpler graph within which edges between two articles indicate that they have been co-cited by documents within a particular corpus, with the weight of each edge representing the number of documents within that corpus that have co-cited them.
Bibliographic coupling is actually an earlier notion, describing the extent to which two works are related by virtue of them both referencing the same other work.
Again, in graph terms, we might think of a simpler undirected network in which edges between two articles act as an indicator that they have cited or referenced the same work, with the weight of the edge representing the number of documents that they have co-cited.
A comparison of co-citation and bibliographic coupling networks shows one to be “retrospective” and the other to be “forward looking”. The articles referenced in bibliographic coupling network can be generated directly from a corpus set of articles, and to this extent bibliographic coupling looks to the past. In a co-citation network, the edges that connect two articles can only be generated when a future published article cites them both.
Co-citation, Bibliographic Coupling and Company Director Networks
For some time I’ve been tinkering with the notion of co-director networks, using OpenCorporates data as a data source (eg Mapping Corporate Networks With OpenCorporates). What I’ve tended to focus on are networks built up from active companies and their current directors, looking to see which companies are currently connected by virtue of currently sharing the same directors. On the to do list are timelines showing the companies that a particular director has been associated with, and when, as well as directorial appointments and terminations within a particular company.
In both co-citation and bibliographic analyses, the nodes are the same type of thing (that is, works that are citated, such as articles). A work cites a work. (Note: does author co-citation analysis rely on mappings from works to cited authors, or citing authors to cited authors?). In company-director networks, we have bipartite representation, with directors and companies representing the two types of node and where edges connect companies and directors but not companies and companies or directors and directors; unless a company is a director, but we generally fudge the labelling there.
If we treat “companies that retain directors” as “articles that cite other articles”:
– under a “co-citation” style view, we generate links between companies that share common directors;
– under a “bibliographic coupling” style view, we generate links between directors of the same companies.
I’ve been doing this anyway, but the bibliographic coupling/co-citation distinction may help me tighten it up a little, as well as improving ways of calculating and analysing these networks by reusing analyses described by the bibliometricians?
Pondering the “future vs. past” distinction, the following also comes to mind:
– at the moment, I am generating networks based on current directors of active companies;
– could we construct a dynamic (temporal?) hypergraph from hyperedges that connect all the directors associated with a particular company at a particular time? If so, what could we do with this graph?! (As an aside, it’s probably worth noting that I know absolutely nothing about hypergraphs!)
I’ve also started wondering about ‘director pathways’ in which we define directors as nodes (where all we require was that a person was a director of a company at some time) and directed “citation” edges. These edges would go from one director to other director nodes under the condition that the “citing” director was appointed to a particular company within a particular time period t1..t2 before the appointment to the same company of a “cited” director. If one director follows another director into more than one company, we increase the weight of the edge accordingly. (We could maybe also explore modes in which edge weights represent the amount of time that two directors are in the same company together.)
The aim is… probably pointless and not that interesting. Unless it is… The sort of questions this approach would allow us to ask would be along the lines of: are there groups of directors whose directorial appointments follow similar trajectories through companies; or are there groups of directors who appear to move from one company to another along with each other?