API Reference
The main part of lsa-program is the set of parsers that convert the different
kinds of database documents into manageable dictionaries containing only the
metadata fields of interest. These parsers are implemented in the record
module.
The record API
The main class of the record API is the RecordParser class, which outlines an
API for parsing a raw string or raw data structure into a dictionary that
contains the fields of interest. The details of how to extract that
information belong in the RecordParser child classes.
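The split between the base class and its child classes can be illustrated with a minimal sketch. The method name `parse`, the `KeyValueParser` child class, its `fields` attribute and the `KEY: value` input format are all hypothetical and chosen only for illustration; the real RecordParser may look different.

```python
from abc import ABC, abstractmethod


class RecordParser(ABC):
    """Hypothetical outline: turn one raw record into a dict of fields."""

    @abstractmethod
    def parse(self, raw):
        """Return a dictionary holding only the fields of interest."""


class KeyValueParser(RecordParser):
    """Hypothetical child class for records made of 'KEY: value' lines."""

    fields = ('title', 'author')  # illustrative fields of interest

    def parse(self, raw):
        record = {}
        for line in raw.splitlines():
            key, sep, value = line.partition(':')
            key = key.strip().lower()
            # Keep the line only if it has a separator and a wanted key.
            if sep and key in self.fields:
                record[key] = value.strip()
        return record


raw_record = "Title: Latent semantics\nAuthor: Doe\nPages: 12"
parsed = KeyValueParser().parse(raw_record)
print(parsed)
```

In this sketch the base class fixes only the contract (raw input in, dictionary out), while everything specific to a document format stays in the child class.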
The RecordParser is complemented by the RecordIterator class, which outlines
an interface for iterating over a file that contains several records and
yielding each record in a memory-efficient fashion.
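The memory-efficient iteration described above can be sketched with a generator. This is an assumption-laden illustration, not the library's implementation: the constructor arguments, the `parse` callable and the convention that a blank line separates records are all hypothetical.

```python
import io


class RecordIterator:
    """Hypothetical outline: lazily yield records from a file-like object."""

    def __init__(self, handle, parse=None):
        self.handle = handle
        # Without a parser, wrap the raw text so the result is still a dict.
        self.parse = parse or (lambda raw: {'raw': raw})

    def __iter__(self):
        buffer = []
        for line in self.handle:  # the file is read one line at a time
            if line.strip():
                buffer.append(line.rstrip('\n'))
            elif buffer:  # assumed convention: a blank line ends a record
                yield self.parse('\n'.join(buffer))
                buffer = []
        if buffer:  # last record when the file has no trailing blank line
            yield self.parse('\n'.join(buffer))


sample = io.StringIO("a: 1\nb: 2\n\na: 3\n")
records = list(RecordIterator(sample))
print(records)
```

Because `__iter__` is a generator, only the lines of the record currently being assembled are held in memory, regardless of the size of the file.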
The utility module
The scripts package
Database manipulation
Utilities for working with a mongo database. The module holds a global connection to the database, so that a new connection is not created with every request, which would be a huge overhead. It also provides a tool for using a pymongo collection as a context manager.
lsa.scripts.dbutil.collection(name, dbname='program', delete=True)
Yields the mongo collection with the given name from the specified database. The advantage is that the collection does not have to be created everywhere in the program.
Parameters:
- name (str like) – name of the collection
- dbname (str like) – name of the database to get the collection from
- delete (bool) – whether to delete the content of the collection
Returns: collection as a context manager
Return type: ContextManager
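A context manager with this signature could be sketched as follows. This is not the actual source of lsa.scripts.dbutil: the helper `get_client`, the extra `client` parameter (added so the sketch can be exercised without a running server) and the choice to clear the collection on entry rather than on exit are all assumptions.

```python
from contextlib import contextmanager

_client = None  # shared connection, created once and then reused


def get_client():
    """Return the shared MongoClient, creating it on first use (hypothetical)."""
    global _client
    if _client is None:
        from pymongo import MongoClient  # imported lazily; needs pymongo
        _client = MongoClient()  # assumes a server on the default host/port
    return _client


@contextmanager
def collection(name, dbname='program', delete=True, client=None):
    """Yield the collection ``name`` from database ``dbname``."""
    client = client if client is not None else get_client()
    coll = client[dbname][name]
    if delete:
        # Assumption: "delete" empties the collection before it is used.
        coll.delete_many({})
    yield coll
```

Under these assumptions, a caller would write `with collection('records', delete=False) as coll:` and use `coll` as an ordinary pymongo collection inside the block, while every call shares the same underlying connection.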
lsa.scripts.dbutil.collection_name(name)
Prepends 'lsa-' to the given name, so that all collections for the lsa program have consistent names.
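The described behaviour amounts to a one-line helper. The sketch below only follows the description above; the real function may differ, for example in how it treats a name that already carries the prefix.

```python
def collection_name(name):
    """Prefix a collection name with 'lsa-' (sketch of the described behaviour)."""
    return 'lsa-' + name


print(collection_name('records'))  # prints "lsa-records"
```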
Script entry points
The entry points are organized in modules. This leads to some code
duplication, which can be reduced in the future.
The populate script, which provides the lsapopulate command, is located in the
populate module and contains the information described below.
The model script, which provides the lsamodel command, is located in the
model module and contains the information described below.
The query script, which provides the lsaquery command, is located in the
query module and contains the information described below.
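One common way to obtain such commands is the `console_scripts` entry point group of setuptools. The mapping below is a hypothetical illustration of how the three commands could be tied to their modules; the module paths under `lsa.scripts` and the `main` function names are assumptions, not taken from the project's packaging files.

```python
# Hypothetical mapping in the shape that setuptools accepts for its
# ``entry_points`` argument; paths and ``main`` names are assumptions.
ENTRY_POINTS = {
    'console_scripts': [
        'lsapopulate = lsa.scripts.populate:main',
        'lsamodel = lsa.scripts.model:main',
        'lsaquery = lsa.scripts.query:main',
    ],
}

# Resolve each command name to the module that would implement it.
commands = {}
for spec in ENTRY_POINTS['console_scripts']:
    command, target = (part.strip() for part in spec.split('='))
    module, _, function = target.partition(':')
    commands[command] = module

for command, module in sorted(commands.items()):
    print(command, '->', module)
```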