tethne.networks.papers module
Methods for generating networks in which papers are vertices.
| author_coupling | Vertices are papers and edges indicate shared authorship. | 
| bibliographic_coupling | Generate a bibliographic coupling network. | 
| cocitation | Generate a cocitation network. | 
| direct_citation | Create a traditional directed citation network. | 
| topic_coupling | Two papers are coupled if they both contain a shared topic above threshold. | 
- Vertices are papers and edges indicate shared authorship.
| Element | Description |
| Node | Papers, represented by node_id. |
| Edge | (a,b) in E(G) if a and b share x authors and x >= threshold. |
| Edge Attributes | overlap: the value of x (above). |
- Parameters:
  - papers : list - A list of Paper instances.
  - threshold : int - Minimum number of shared authors required to draw an edge between two papers.
  - node_id : string - Field in Paper used to identify nodes.
  - node_attribs : list - List of fields in Paper to include as node attributes in graph.
- Returns:
  - acoupling : networkx.Graph - An author-coupling network.
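The edge rule above can be illustrated with a minimal, standard-library sketch. This is not tethne's implementation: plain dicts stand in for Paper objects, the `authors` field name is an assumption, and the helper `author_coupling_edges` is hypothetical. It returns a dict of edges rather than a networkx.Graph.

```python
from itertools import combinations


def author_coupling_edges(papers, threshold=1, node_id="ayjid"):
    """Return {(a, b): overlap} for paper pairs sharing >= threshold authors.

    `papers` is a list of dicts with an assumed "authors" field; this is a
    stand-in for Paper objects, not the tethne data model.
    """
    edges = {}
    for p, q in combinations(papers, 2):
        # overlap is x in the edge rule: the number of shared authors.
        overlap = len(set(p["authors"]) & set(q["authors"]))
        if overlap >= threshold:
            edges[(p[node_id], q[node_id])] = overlap
    return edges


papers = [
    {"ayjid": "P1", "authors": ["SMITH J", "DOE A"]},
    {"ayjid": "P2", "authors": ["SMITH J", "DOE A", "LEE K"]},
    {"ayjid": "P3", "authors": ["LEE K"]},
]
edges = author_coupling_edges(papers, threshold=1)
```

With this toy data, P1 and P2 share two authors and P2 and P3 share one, so raising `threshold` to 2 keeps only the P1/P2 edge.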
- tethne.networks.papers.bibliographic_coupling(papers, citation_id='ayjid', threshold=1, node_id='ayjid', node_attribs=['date'], weighted=False, **kwargs)
- Generate a bibliographic coupling network.
- Two papers are bibliographically coupled when they both cite the same third paper. You can generate a bibliographic coupling network using the networks.papers.bibliographic_coupling() method.
- >>> BC = nt.papers.bibliographic_coupling(papers)
  >>> BC
  <networkx.classes.graph.Graph object at 0x102eec710>
- Especially when working with large datasets, or with disciplinarily narrow literatures, it is usually helpful to set a minimum number of shared citations required for two papers to be coupled. You can do this by setting the `threshold` parameter.
- >>> BC = nt.papers.bibliographic_coupling(papers, threshold=1)
  >>> len(BC.edges())
  1216
  >>> BC = nt.papers.bibliographic_coupling(papers, threshold=2)
  >>> len(BC.edges())
  542
| Element | Description |
| Node | Papers, represented by node_id. |
| Node Attributes | node_attribs in Paper. |
| Edge | (a,b) in E(G) if a and b share x citations where x >= threshold. |
| Edge Attributes | overlap: the number of citations shared. |
- Parameters:
  - papers : list - A list of Paper objects.
  - citation_id : string - A key from Paper to identify the citation overlaps. Default is 'ayjid'.
  - threshold : int - Minimum number of shared citations to consider two papers "coupled".
  - node_id : string - Field in Paper used to identify the nodes. Default is 'ayjid'.
  - node_attribs : list - List of fields in Paper to include as node attributes in graph.
  - weighted : bool - If True, edge attribute overlap is a float in [0, 1] calculated as \cfrac{N_{ij}}{\sqrt{N_{i}N_{j}}} where N_{i} and N_{j} are the number of references in Paper i and j, respectively, and N_{ij} is the number of references shared by papers i and j.
- Returns:
  - bcoupling : networkx.Graph - A bibliographic coupling network.
- Raises:
  - KeyError : Raised when citation_id is not present in the meta_list.
- Notes
  - List-valued attributes appear to cause errors in both GEXF and GraphML output. Node identifiers also cannot be None.
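The coupling rule and the weighted overlap formula can be sketched with the standard library alone. This is an illustration under stated assumptions, not tethne's code: `references` (a mapping from paper id to cited-reference ids) and the helper `bibliographic_coupling_edges` are hypothetical, and the sketch assumes the threshold is applied to the raw shared count before any normalisation.

```python
from itertools import combinations
from math import sqrt


def bibliographic_coupling_edges(references, threshold=1, weighted=False):
    """Return {(i, j): overlap} for papers sharing >= threshold references.

    `references` maps a paper id to the ids of the works it cites
    (a hypothetical layout chosen for this sketch).
    """
    edges = {}
    for (i, refs_i), (j, refs_j) in combinations(references.items(), 2):
        refs_i, refs_j = set(refs_i), set(refs_j)
        shared = len(refs_i & refs_j)  # N_ij
        if shared == 0 or shared < threshold:
            continue
        if weighted:
            # N_ij / sqrt(N_i * N_j): 1.0 only for identical reference lists.
            edges[(i, j)] = shared / sqrt(len(refs_i) * len(refs_j))
        else:
            edges[(i, j)] = shared
    return edges


references = {
    "P1": ["R1", "R2", "R3", "R4"],
    "P2": ["R1", "R2", "R5", "R6"],
    "P3": ["R2", "R7"],
}
unweighted = bibliographic_coupling_edges(references, threshold=2)
normalised = bibliographic_coupling_edges(references, threshold=2, weighted=True)
```

Here P1 and P2 share two of their four references each, so the unweighted overlap is 2 and the weighted overlap is 2 / sqrt(4 * 4) = 0.5. The single-reference overlaps with P3 fall below the threshold of 2.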
- tethne.networks.papers.cocitation(papers, threshold=1, node_id='ayjid', topn=None, verbose=False, node_attribs=['date'], **kwargs)
- Generate a cocitation network.
- A cocitation network is a network in which vertices are papers, and edges indicate that two papers were cited by the same third paper. CiteSpace is a popular desktop application for co-citation analysis, and you can read about the theory behind it here. Co-citation analysis is generally performed with a temporal component, so building a GraphCollection from a Corpus sliced by date is recommended.
- You can generate a co-citation network using the networks.papers.cocitation() method:
- >>> CC = nt.papers.cocitation(papers)
  >>> CC
  <networkx.classes.graph.Graph object at 0x102eec790>
- For large datasets, you may wish to set a minimum number of co-citations required for an edge between two papers. Keep in mind that all of the references in a single paper are co-cited once, so a threshold of at least 2 is prudent. Note the dramatic decrease in the number of edges when the threshold is changed from 2 to 3.
- >>> CC = nt.papers.cocitation(papers, threshold=2)
  >>> len(CC.edges())
  8889
  >>> CC = nt.papers.cocitation(papers, threshold=3)
  >>> len(CC.edges())
  1493
| Element | Description |
| Node | Cited papers, represented by Paper ayjid. |
| Edge | (a, b) if a and b are cited by the same paper. |
| Edge Attributes | weight: number of times two papers are co-cited. |
- Parameters:
  - papers : list - A list of Paper objects.
  - threshold : int - Minimum number of co-citations required to create an edge.
  - topn : int or float, or None - If provided, only the topn (int) or topn percent (float) most cited papers will be included in the cocitation network. If None (default), the network will include all cited papers (NOTE: this can cause severe memory consumption for even moderately sized datasets).
  - verbose : bool - If True, prints status messages.
- Returns:
  - cocitation : networkx.Graph - A cocitation network.
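The reason a threshold of at least 2 is prudent can be seen in a small co-citation counter. This is a standard-library sketch, not tethne's implementation: `reference_lists` (one list of cited-reference ids per citing paper) and the helper `cocitation_edges` are hypothetical, and the `topn` filter is left out.

```python
from collections import Counter
from itertools import combinations


def cocitation_edges(reference_lists, threshold=1):
    """Return {(a, b): weight} for cited works co-cited >= threshold times."""
    counts = Counter()
    for refs in reference_lists:
        # Every pair of references in one citing paper is co-cited once.
        for a, b in combinations(sorted(set(refs)), 2):
            counts[(a, b)] += 1
    return {pair: n for pair, n in counts.items() if n >= threshold}


reference_lists = [
    ["R1", "R2", "R3"],
    ["R1", "R2"],
    ["R2", "R3", "R4"],
]
all_pairs = cocitation_edges(reference_lists, threshold=1)
repeated_pairs = cocitation_edges(reference_lists, threshold=2)
```

With `threshold=1`, every pair drawn from a single reference list becomes an edge (five edges here). With `threshold=2`, only the pairs co-cited by two different papers remain: (R1, R2) and (R2, R3).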
- tethne.networks.papers.direct_citation(papers, node_id='ayjid', node_attribs=['date'], **kwargs)
- Create a traditional directed citation network.
- Direct-citation graphs are directed acyclic graphs in which vertices are papers, and each (directed) edge represents a citation of the target paper by the source paper. The networks.papers.direct_citation() method generates both a global citation graph, which includes all cited and citing papers, and an internal citation graph that describes only citations among papers in the original dataset.
- To generate direct-citation graphs, use the networks.papers.direct_citation() method. Note the size difference between the global and internal citation graphs.
- >>> gDC, iDC = nt.papers.direct_citation(papers)
  >>> len(gDC)
  5998
  >>> len(iDC)
  163
| Element | Description |
| Node | Papers, represented by node_id. |
| Edge | From a paper to a cited reference. |
| Edge Attribute | Publication date of the citing paper. |
- Parameters:
  - papers : list - A list of Paper instances.
  - node_id : string - A key from Paper to identify the nodes. Default is 'ayjid'.
  - node_attribs : list - List of fields in Paper to include as node attributes in graph.
- Returns:
  - citation_network : networkx.DiGraph - Global citation network (all citations).
  - citation_network_internal : networkx.DiGraph - Internal citation network where only the papers in the list are nodes in the network.
- Raises:
  - KeyError : If node_id is not present in the meta_list.
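The difference between the global and internal graphs can be sketched without any dependencies. This is an illustration only, not tethne's code: dicts stand in for Paper objects, the `citations` field name is an assumption, and the helper `direct_citation_edges` is hypothetical. It returns edge lists rather than networkx.DiGraph objects.

```python
def direct_citation_edges(papers, node_id="ayjid"):
    """Return (global_edges, internal_edges) as lists of (source, target, attrs).

    An edge points from the citing paper to the cited work and carries the
    citing paper's publication date. "citations" is an assumed field name.
    """
    corpus_ids = {p[node_id] for p in papers}
    global_edges, internal_edges = [], []
    for p in papers:
        for cited in p["citations"]:
            edge = (p[node_id], cited, {"date": p["date"]})
            global_edges.append(edge)
            # Internal edges connect two papers that are both in the dataset.
            if cited in corpus_ids:
                internal_edges.append(edge)
    return global_edges, internal_edges


papers = [
    {"ayjid": "P1", "date": 2010, "citations": ["P2", "X1", "X2"]},
    {"ayjid": "P2", "date": 2005, "citations": ["X1"]},
]
global_edges, internal_edges = direct_citation_edges(papers)
```

In this toy dataset there are four citations in total, but only the P1 to P2 citation links two papers inside the dataset, which mirrors the size gap between the two graphs in the example above.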
- tethne.networks.papers.topic_coupling(papers, threshold=0.7, node_id='ayjid', **kwargs)
- Two papers are coupled if they both contain a shared topic above threshold.
| Element | Description |
| Node | Papers, represented by node_id. |
| Edge | (a,b) in E(G) if a and b share >= 1 topics with proportion >= threshold in both a and b. |
| Edge Attributes | weight: combined mean proportion of each shared topic. topics: list of shared topics. |
- Parameters:
  - papers : list - A list of Paper instances.
  - threshold : float - Minimum representation of a topic in each paper.
  - node_id : string - Field in Paper used to identify nodes.
- Returns:
  - tc : networkx.Graph - A topic-coupling network.
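The topic-coupling rule can be sketched as follows. Several things here are assumptions rather than tethne behaviour: `topic_proportions` (paper id mapped to a dict of topic id and proportion) is a hypothetical input layout, `topic_coupling_edges` is a hypothetical helper, and "combined mean proportion" is read as the average, over shared topics, of the two papers' mean proportion for that topic. The library may compute the weight differently.

```python
from itertools import combinations


def topic_coupling_edges(topic_proportions, threshold=0.7):
    """Return {(a, b): {"weight": ..., "topics": [...]}} for coupled papers."""
    edges = {}
    pairs = combinations(topic_proportions.items(), 2)
    for (a, topics_a), (b, topics_b) in pairs:
        # A topic is shared only if it meets the threshold in BOTH papers.
        shared = sorted(
            t for t in topics_a
            if t in topics_b
            and topics_a[t] >= threshold
            and topics_b[t] >= threshold
        )
        if shared:
            # One interpretation of "combined mean proportion" (assumption).
            means = [(topics_a[t] + topics_b[t]) / 2 for t in shared]
            edges[(a, b)] = {
                "weight": sum(means) / len(means),
                "topics": shared,
            }
    return edges


topic_proportions = {
    "P1": {0: 0.8, 1: 0.2},
    "P2": {0: 0.75, 1: 0.25},
    "P3": {0: 0.1, 1: 0.9},
}
edges = topic_coupling_edges(topic_proportions, threshold=0.7)
```

Only P1 and P2 are coupled here: topic 0 exceeds 0.7 in both. P3 is dominated by topic 1, which neither of the other papers carries above the threshold.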

