Class_EngineDVFileJson

Class EngineDVFileJson

This page describes the inner workings of the class EngineDVFileJson, which implements the engine for using external algorithms with dense vectors that are represented in JSON format and handled out-of-memory.

Protocol of use

Currently (as of 2018-04-16), the invocation protocol for engines is a bit complex. The required protocol depends on the situation the engine gets used in (training versus application).

When training:

- The engine class is selected in the PR based on the trainingAlgorithm runtime parameter.
- Engine.createEngine(trainingAlgorithm, algorithmParameters, featureInfo, TargetType, dataDirectory) is called. This:
  - executes the non-static initializeAlgorithm(algorithm,parms) method (overridden but empty for EngineDVFileJson)
  - then runs the method initWhenCreating(directory, algorithm, parms, featureInfo, targetType); for EngineDVFileJson, this essentially creates the instance of the appropriate corpus representation and sets the mode to "adding"
  - creates and initializes the Info instance
  - returns the Engine instance
- Document processing uses the corpus representation retrieved from the engine to add new instances.
- After all documents have been processed, the engine's info gets updated.
- Then engine.trainModel(dataDir, instanceAnnotationType, algoParms) gets called. This:
  - turns off adding for the corpus representation
  - updates the info
  - copies the whole wrapper software unless it is already there (based on WRAPPER_NAME)
  - creates the command to invoke the training script, also using the settings in the config file WRAPPER_NAME.yaml, which is treated as a key/value map
    - this optionally uses the settings shellcmd and shellparms for running the shell script
    - TODO: this should also allow configuring the Python path and Python location
  - before running the command, sets the environment variable WRAPPER_HOME to a subdirectory of the data directory
  - runs the command
  - updates the info and saves it
  - saves the featureInfo (NOTE: this is currently done again later in the saveEngine method)
- Finally, engine.saveEngine(dataDir) gets called (from the base class Engine), which:
  - saves the feature info using featureInfo.save(dir)
  - invokes the engine-specific saveModel(dir) method; in this case, this does nothing, since the model gets saved by the scripts we call
  - invokes the engine-specific saveCorpusRepresentation(dir) method, which in this case does nothing, since the corpus representation is already out-of-memory and stored in a file
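
The command-building part of trainModel described above could look roughly like the following. This is only a minimal sketch and not the actual EngineDVFileJson code: the class name WrapperCommandSketch, the method buildTrainingProcess, the script name, and the argument passed to the script are all hypothetical. It also assumes that WRAPPER_HOME is the subdirectory of the data directory named after WRAPPER_NAME, and that shellparms is a whitespace-separated string; the page states neither.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the Learning Framework API.
final class WrapperCommandSketch {

  /**
   * Prepare (but do not start) the process that runs a training script.
   *
   * @param dataDir     the data directory of the engine
   * @param wrapperName stands in for WRAPPER_NAME
   * @param scriptName  hypothetical name of the training script
   * @param config      the key/value map read from WRAPPER_NAME.yaml
   */
  public static ProcessBuilder buildTrainingProcess(
      File dataDir, String wrapperName, String scriptName,
      Map<String, String> config) {
    // Assumption: the wrapper lives in dataDir/<wrapperName>.
    File wrapperHome = new File(dataDir, wrapperName);

    List<String> command = new ArrayList<>();
    // Optional settings: prefix the script with a shell and its parameters.
    String shellcmd = config.get("shellcmd");
    if (shellcmd != null && !shellcmd.isEmpty()) {
      command.add(shellcmd);
      String shellparms = config.get("shellparms");
      if (shellparms != null && !shellparms.trim().isEmpty()) {
        // Assumption: parameters are separated by whitespace.
        command.addAll(Arrays.asList(shellparms.trim().split("\\s+")));
      }
    }
    command.add(new File(wrapperHome, scriptName).getPath());
    // Assumption: the script receives the data directory as its argument.
    command.add(dataDir.getPath());

    ProcessBuilder builder = new ProcessBuilder(command);
    // The environment variable must be set before the command is run.
    builder.environment().put("WRAPPER_HOME", wrapperHome.getPath());
    // Merge stderr into stdout so the caller reads a single stream.
    builder.redirectErrorStream(true);
    return builder;
  }
}
```

Calling start() on the returned ProcessBuilder would then run the command. Keeping construction separate from execution makes it easy to inspect the command and the environment before anything is launched.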

Brought to you by the GATE team
