tethne.serialize package
Submodules
tethne.serialize.paper module
This module provides functionality to serialize a TETHNE corpus object so that it can be persisted in the database.
>>> from tethne.serialize import paper
-
class tethne.serialize.paper.Serialize(corpus, source)
This class serializes the Corpus object and provides methods to create fixtures (JSON files) for the different models in the TETHNE database.
-
get_affiliation_details(value, affiliation_id, institute_literal)
This method maps the affiliation between an author and an institution.
Parameters: value - The author name
affiliation_id - Primary key of the affiliation table
institute_literal - The literal value of the institute
Returns: Affiliation details (JSON fixture) which can be written to a file
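As a rough illustration of what a single fixture record might look like, the sketch below builds one entry in Django's standard JSON fixture layout (`model`, `pk`, `fields`). It assumes the fixtures follow that layout. The helper `make_fixture_entry`, the model label and the field names are hypothetical and are not taken from the TETHNE source.

```python
import json


def make_fixture_entry(model, pk, fields):
    """Build one record in Django's JSON fixture layout (hypothetical helper)."""
    return {"model": model, "pk": pk, "fields": fields}


# Placeholder model label and field names, for illustration only.
entry = make_fixture_entry(
    "example_app.affiliation",
    1,
    {"author": "DOE J", "institution": "ARIZONA STATE UNIV"},
)

# A fixture file is a JSON list of such records.
fixture_text = json.dumps([entry], indent=2)
```

The resulting text is what would be written to a file and later loaded into the database.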
-
get_details_from_inst_literal(institute_literal, institution_id, institution_instance_id, paper_key)
This method parses the institute literal to get the following:
1. Department name
2. Country
3. University name
4. ZIP, state and city (only if the country is USA; for other countries the address standard may vary, which makes parsing these values very difficult)
In all cases, the complete address can be found in the column “AddressLine1”.
Parameters: institute_literal - The literal value of the institute
institution_id - Primary key value of the institution, which is to be added in the fixture
institution_instance_id - Primary key value of the institution instance, which is to be added in the fixture
paper_key - The paper key which is used for the Institution Instance
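The parsing described above can be sketched roughly as follows. This is not the library's implementation: it assumes a comma-separated, Web of Science style literal such as `Arizona State Univ, Sch Life Sci, Tempe, AZ 85287 USA`, and the function name, the regular expression and the dictionary keys are all hypothetical.

```python
import re

# Assumption: US addresses end with "<STATE> <ZIP> USA", e.g. "AZ 85287 USA".
US_TAIL = re.compile(r"^(?P<state>[A-Z]{2})\s+(?P<zip>\d{5})(?:-\d{4})?\s+USA$")


def parse_institute_literal(literal):
    """Split an institute literal into its parts (illustrative sketch only)."""
    parts = [p.strip() for p in literal.split(",") if p.strip()]
    details = {
        "university": parts[0] if parts else None,
        # Assumption: a department is present when there are four or more parts.
        "department": parts[1] if len(parts) >= 4 else None,
        "country": None,
        "city": None,
        "state": None,
        "zip": None,
        # The complete address is always kept, whatever the country.
        "address_line_1": literal.strip(),
    }
    if not parts:
        return details
    match = US_TAIL.match(parts[-1])
    if match:
        details["country"] = "USA"
        details["state"] = match.group("state")
        details["zip"] = match.group("zip")
        if len(parts) >= 3:
            details["city"] = parts[-2]
    else:
        # Outside the USA only the country is extracted.
        details["country"] = parts[-1]
    return details


us = parse_institute_literal("Arizona State Univ, Sch Life Sci, Tempe, AZ 85287 USA")
uk = parse_institute_literal("Univ Oxford, Dept Zool, Oxford OX1 3PS, England")
```

For the non-US literal, city, state and ZIP stay empty and the full string is still available under `address_line_1`, which mirrors the fallback described for “AddressLine1”.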
-
paper_source_map = {1: 'doi', 2: 'url', 3: 'wosid'}
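One plausible reading of this attribute is that it names the paper field used as an identifier for each source code. The sketch below looks a field name up under that assumption. The function `identifier_field` is hypothetical, and because the map as documented has no entry for code 4, the sketch raises an error for it; the library itself may behave differently.

```python
# Copied from the documented class attribute.
PAPER_SOURCE_MAP = {1: 'doi', 2: 'url', 3: 'wosid'}


def identifier_field(source):
    """Return the identifier field name for a source code (hypothetical helper)."""
    try:
        return PAPER_SOURCE_MAP[source]
    except KeyError:
        raise ValueError("no identifier field documented for source %r" % (source,))


wos_field = identifier_field(3)
```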
-
serializeAuthorInstances()
This method creates a fixture for the “django-tethne_author” model.
Returns: Author instance details which can be written to a file
-
serializeAuthors()
This method creates a fixture for the “django-tethne_author” model.
Returns: Author details in JSON format, which can be written to a file.
-
serializeCitation()
This method creates a fixture for the “django-tethne_citation” model.
Returns: Citation details which can be written to a file
-
serializeCitationInstance()
This method creates a fixture for the “django-tethne_citation_instance” model.
Returns: Citation instance details which can be written to a file
-
serializeCorpus()
This method creates a fixture for the “django-tethne_corpus” model.
Returns: Corpus details in JSON format, which can be written to a file.
-
-
tethne.serialize.paper.serialize(dirPath, corpus, source)
Parameters: dirPath - A valid directory path where you want the JSON files to be written
corpus - The corpus object which is to be serialized.
source - The source of the corpus
The possible values are 1 for JSTOR, 2 for ZOTERO, 3 for WOS and 4 for Scopus.
This method will raise an exception if:
1. dirPath is not a valid directory
2. The Corpus object is None or has no papers to serialize
3. The source is not a valid source
The following is an example of how to use the serialize method:
>>> from tethne.readers import wos
>>> from tethne.serialize import paper
>>> wosCorpus = wos.read('/path/to/my/Corpus.txt')
>>> paper.serialize('/path/to/my/FixturesDir/', wosCorpus, 3)
Returns: Writes the fixtures at the directory location.
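The three failure conditions listed for this function can be checked up front in calling code. The sketch below is a stand-alone pre-check and not part of TETHNE. The helper `check_serialize_args` is hypothetical, it assumes the corpus exposes its papers through a `papers` attribute, and it raises `ValueError` only because the documentation does not say which exception type the library uses.

```python
import os
import tempfile
from types import SimpleNamespace

# Source codes as listed in the documentation for serialize.
VALID_SOURCES = {1: "JSTOR", 2: "ZOTERO", 3: "WOS", 4: "Scopus"}


def check_serialize_args(dir_path, corpus, source):
    """Raise ValueError if the arguments would be rejected (hypothetical helper)."""
    if not os.path.isdir(dir_path):
        raise ValueError("dirPath is not a valid directory: %r" % (dir_path,))
    # Assumption: the corpus exposes its papers as a `papers` sequence.
    papers = getattr(corpus, "papers", None)
    if corpus is None or not papers:
        raise ValueError("corpus is None or has no papers to serialize")
    if source not in VALID_SOURCES:
        raise ValueError("source must be one of %s" % sorted(VALID_SOURCES))


# Usage with a stand-in corpus object (not a real tethne Corpus).
with tempfile.TemporaryDirectory() as fixtures_dir:
    stand_in = SimpleNamespace(papers=["paper-1"])
    check_serialize_args(fixtures_dir, stand_in, 3)  # passes silently
```

Running such a check before calling `paper.serialize` gives a clear error message early, without relying on the exact exception the library raises.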