tethne.readers package

Module contents

Methods for parsing bibliographic datasets.

dfr Methods for parsing JSTOR Data-for-Research datasets.
wos Reader for Web of Science field-tagged bibliographic data.
scopus Reader for Scopus CSV data files.

Each file reader provides methods to parse bibliographic data from a scholarly database (e.g. Web of Science or PubMed), resulting in a list of Paper instances containing as many as possible of the following keys (missing values are set to None):

Field        Type  Description
aulast       list  Authors’ surnames, as a list.
auinit       list  Authors’ initials, as a list.
institution  dict  Institutions with which the authors are affiliated.
atitle       str   Article title.
jtitle       str   Journal title or abbreviated title.
volume       str   Journal volume number.
issue        str   Journal issue number.
spage        str   Starting page of article in journal.
epage        str   Ending page of article in journal.
date         int   Date of publication.
abstract     str   Article abstract.

These keys correspond to the metadata entries used in the databases of organizations such as the International DOI Foundation and its Registration Agencies, such as CrossRef and DataCite.

In addition, Paper instances contain keys with information relevant to the networks of interest for Tethne, including:

Field      Type  Description
citations  list  List of minimal Paper instances for cited references.
ayjid      str   First author’s name (last, fi), publication year, and journal.
doi        str   Digital Object Identifier.
pmid       str   PubMed ID.
wosid      str   Web of Science UT field tag.

As with the bibliographic fields, any of these keys with missing data is set to None.
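
The key layout described above can be illustrated with a plain dictionary. This is only a sketch: the field names come from the two tables, but `empty_record` is a hypothetical helper, the sample values are invented, and a dict stands in for a Paper instance, whose actual interface is not shown here.

```python
# Field names are taken from the two tables above; everything else is illustrative.
BIBLIOGRAPHIC_FIELDS = ['aulast', 'auinit', 'institution', 'atitle', 'jtitle',
                        'volume', 'issue', 'spage', 'epage', 'date', 'abstract']
NETWORK_FIELDS = ['citations', 'ayjid', 'doi', 'pmid', 'wosid']


def empty_record():
    """Return a dict with every documented key set to None (hypothetical helper)."""
    return {key: None for key in BIBLIOGRAPHIC_FIELDS + NETWORK_FIELDS}


# Fill in only the values a source happened to provide (invented sample data).
record = empty_record()
record.update({
    'aulast': ['DOE'],
    'auinit': ['J'],
    'atitle': 'An example title',
    'jtitle': 'Example Journal',
    'date': 1999,
})

# Keys the source did not supply remain None and can be listed explicitly.
missing = sorted(key for key, value in record.items() if value is None)
print(missing)
```

Checking for None in this way is one option for deciding which fields are safe to use before building a network from the parsed data.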

exception tethne.readers.DataError(value)

Bases: exceptions.Exception

tethne.readers.merge(P1, P2, fields=['ayjid'])

Combines two lists (P1 and P2) of Paper instances into a single list, and attempts to merge papers whose values for the given fields match. Where the merged papers have conflicting values, the value from the Paper in P1 is preferred.

Parameters:

P1 : list

A list of Paper instances.

P2 : list

A list of Paper instances.

fields : list

Fields used to identify matching Paper instances.

Returns:

combined : list

A list of Paper instances.

Examples

>>> import tethne.readers as rd
>>> P1 = rd.wos.read("/Path/to/data1.txt")
>>> P2 = rd.dfr.read("/Path/to/DfR")
>>> papers = rd.merge(P1, P2, ['ayjid'])
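
The matching and conflict rule described for merge can be sketched with plain dictionaries. This is not the library's implementation: `merge_records` is a hypothetical stand-in, the ayjid strings are invented, and two behaviours are assumptions rather than documented facts (records lacking a match field are never matched, and values missing in P1 are filled from P2).

```python
def merge_records(P1, P2, fields=('ayjid',)):
    """Hypothetical sketch of the merge described above, using plain dicts."""
    def key_of(record):
        values = tuple(record.get(f) for f in fields)
        # Assumption: a record missing any match field is never treated as a match.
        return None if any(v is None for v in values) else values

    # Copy the P1 records so the input lists are left unmodified.
    combined = [dict(r) for r in P1]
    index = {}
    for r in combined:
        k = key_of(r)
        if k is not None:
            index.setdefault(k, r)

    for r in P2:
        k = key_of(r)
        match = index.get(k) if k is not None else None
        if match is None:
            # No counterpart in P1: keep the P2 record as its own entry.
            combined.append(dict(r))
        else:
            # Values from P1 win; P2 only fills fields that P1 lacks (assumption).
            for field, value in r.items():
                if match.get(field) is None:
                    match[field] = value
    return combined


P1 = [{'ayjid': 'DOE J 1999 EXAMPLE JOURNAL', 'atitle': 'Title from P1', 'doi': None}]
P2 = [{'ayjid': 'DOE J 1999 EXAMPLE JOURNAL', 'atitle': 'Title from P2',
       'doi': '10.0000/example'},
      {'ayjid': 'ROE R 2001 OTHER JOURNAL', 'atitle': 'Another title', 'doi': None}]

papers = merge_records(P1, P2)
print(len(papers), papers[0]['atitle'], papers[0]['doi'])
```

In this sketch the first P2 record is folded into its P1 counterpart, which keeps the P1 title, while the unmatched second record is appended unchanged.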