
tethne.serialize package

Submodules

tethne.serialize.paper module

This module provides functionality to serialize a TETHNE corpus object so that it can be persisted in the database.

>>> from tethne.serialize import paper
class tethne.serialize.paper.Serialize(corpus, source)[source]

This class serializes a Corpus object and provides methods to create fixtures (JSON files) for the different models in the TETHNE database.
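For orientation, a Django JSON fixture is a list of records, each with a model label, a primary key, and a mapping of field values. The sketch below builds one such record with the standard library only. The model label "django_tethne.paper" and the field names "title" and "doi" are placeholders chosen for illustration; they are assumptions, not names confirmed by the TETHNE source.

```python
import json


def make_fixture_record(model_label, pk, fields):
    """Build one record in Django's JSON fixture layout."""
    return {"model": model_label, "pk": pk, "fields": dict(fields)}


# Hypothetical model label and field names, for illustration only.
records = [
    make_fixture_record(
        "django_tethne.paper",
        1,
        {"title": "An example paper", "doi": "10.0000/example"},
    ),
]

# A fixture file is simply this list serialized as JSON.
fixture_json = json.dumps(records, indent=2)
```

A file containing this JSON could then be loaded into a Django project with the loaddata management command, provided the model label and fields match the project's actual models.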

get_affiliation_details(value, affiliation_id, institute_literal)[source]

This method maps the affiliation between an author and an institution.

Parameters:

value - The author name

affiliation_id - Primary key of the affiliation table

institute_literal - The literal value of the institute

Returns:

Affiliation details (JSON fixture), which can be written to a file

get_details_from_inst_literal(institute_literal, institution_id, institution_instance_id, paper_key)[source]

This method parses the institute literal to extract the following:

1. Department name

2. Country

3. University name

4. ZIP, state, and city (only if the country is USA)

For other countries the address format varies, which makes these values very difficult to parse. However, the complete address can be found in the column “AddressLine1”.

Parameters:

institute_literal - The literal value of the institute

institution_id - The primary key value which is to be added in the fixture

institution_instance_id - The primary key value which is to be added in the fixture

paper_key - The paper key which is used for the Institution Instance
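The parsing described above can be sketched as follows. This is not the TETHNE implementation: it assumes a comma-separated address of the form "University, Department, City, STATE ZIP USA" (a layout common in Web of Science records), and the function name, dictionary keys, and sample addresses are all made up for illustration.

```python
import re

# Assumed shape of the last segment of a US address, e.g. "AZ 85287 USA".
_US_TAIL = re.compile(r"^([A-Z]{2})\s+(\d{5}(?:-\d{4})?)\s+USA$")


def parse_institute_literal(literal):
    """Split an address literal into parts (hypothetical helper)."""
    parts = [p.strip() for p in literal.split(",") if p.strip()]
    details = {
        "university": parts[0] if parts else None,
        # Only treat the second segment as a department when the
        # address is long enough to also hold a city and a tail.
        "department": parts[1] if len(parts) >= 4 else None,
        "country": None,
        "city": None,
        "state": None,
        "zip": None,
        # The full literal is always kept, as with "AddressLine1".
        "address_line_1": literal.strip(),
    }
    if len(parts) < 2:
        return details

    match = _US_TAIL.match(parts[-1])
    if match:
        # ZIP, state, and city are extracted only for US addresses.
        details["country"] = "USA"
        details["state"] = match.group(1)
        details["zip"] = match.group(2)
        if len(parts) >= 3:
            details["city"] = parts[-2]
    else:
        # Elsewhere, only the trailing segment is taken as the country.
        details["country"] = parts[-1]
    return details


us = parse_institute_literal(
    "Arizona State Univ, Sch Life Sci, Tempe, AZ 85287 USA"
)
non_us = parse_institute_literal(
    "Univ Oxford, Dept Zool, Oxford OX1 3PS, England"
)
```

Keeping the unparsed literal alongside the extracted parts mirrors the note above: when the country-specific fields cannot be recovered, the complete address is still available.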

paper_source_map = {1: 'doi', 2: 'url', 3: 'wosid'}
serializeAuthorInstances()[source]
This method creates a fixture for the “django-tethne_author” model.
Returns: Author instance details, which can be written to a file.
serializeAuthors()[source]
This method creates a fixture for the “django-tethne_author” model.
Returns: Author details in JSON format, which can be written to a file.
serializeCitation()[source]

This method creates a fixture for the “django-tethne_citation” model.

Returns: Citation details, which can be written to a file.
serializeCitationInstance()[source]

This method creates a fixture for the “django-tethne_citation_instance” model.

Returns: Citation instance details, which can be written to a file.
serializeCorpus()[source]

This method creates a fixture for the “django-tethne_corpus” model.

Returns: corpus_details in JSON format, which can be written to a file.

serializeInstitution()[source]

This method creates a fixture for the “django-tethne_citation_institution” model.

Returns: Institution details, which can be written to a file.
serializePaper()[source]

This method creates a fixture for the “django-tethne_paper” model.

Returns: paper_details in JSON format, which can be written to a file.
class tethne.serialize.paper.SerializeUtility[source]
static get_auth_inst(address)[source]
tethne.serialize.paper.serialize(dirPath, corpus, source)[source]
Parameters:

dirPath - A valid directory path where you want the JSON files to be written

corpus - The corpus object which is to be serialized.

source - The source of the corpus

The possible values are:

1 for JSTOR, 2 for ZOTERO, 3 for WOS, and 4 for Scopus

This method will raise an exception if:

1. dirPath is not a valid directory

2. The Corpus object is None or has no papers to serialize

3. The source is not a valid source

The following example shows how to use the serialize method:

>>> from tethne.readers import wos
>>> from tethne.serialize import paper
>>> wosCorpus = wos.read('/path/to/my/Corpus.txt')
>>> paper.serialize('/path/to/my/FixturesDir/', wosCorpus, 3)

Returns:

Writes the fixtures at the directory location.
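The three error conditions listed for serialize can be illustrated with a standalone check. This is a sketch under stated assumptions, not the library's code: the function name check_serialize_args is hypothetical, a plain list stands in for the Corpus object, and ValueError is an assumed exception type, since the documentation only says that an exception is raised.

```python
import os
import tempfile

# Source codes as listed in the parameter description.
VALID_SOURCES = {1: "JSTOR", 2: "ZOTERO", 3: "WOS", 4: "Scopus"}


def check_serialize_args(dir_path, papers, source):
    """Raise ValueError if a documented precondition fails.

    Hypothetical helper: `papers` is a plain list standing in for
    the papers held by a Corpus object.
    """
    if not os.path.isdir(dir_path):
        raise ValueError("dirPath is not a valid directory: %r" % (dir_path,))
    if papers is None or len(papers) == 0:
        raise ValueError("corpus is None or has no papers to serialize")
    if source not in VALID_SOURCES:
        raise ValueError(
            "source must be one of %s" % sorted(VALID_SOURCES)
        )


# Example: a real directory, one paper, and source 3 (WOS) pass silently.
with tempfile.TemporaryDirectory() as tmp:
    check_serialize_args(tmp, ["paper-1"], 3)
```

Running a check of this kind before calling serialize lets a script report a clear message for each failure case instead of relying on whatever exception the library raises.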

Module contents