There are a few more bits and pieces that are good to know and that may be necessary for a particular node setup.
The NodeSoftware tries to automatically find out the URL with which it is accessed and uses this to fill the URL-information in /tap/capabilities, among other things. However, this does not always work (e.g. if you deploy behind a proxy) so there is a manual override. Simply set DEPLOY_URL in settings.py, ending with /tap/ like this:
DEPLOY_URL = 'http://your.server/some/path/tap/'
As you know, XSAMS is a hierarchical structure where certain parts reference other parts. For example, each (molecular or atomic) state has an ID, which can be used by a radiative transition to point to its initial and final states. Similarly, all species, bibliographic sources etc. have an ID that other parts use to point to them.
Here is a list of the most important Returnable names for IDs:
NodeID is “special” in the sense that it is not formally part of the schema. The XML generator uses it to make all the other IDs unique within VAMDC. Say, for example, that you (in dictionaries.py) set your NodeID to “xyz” and fill the SourceID with numbers from your database. Then the XML output will look something like <Source sourceID="Bxyz-1"> for your first source. This means that the generator takes care of adding the prefix “B” as mandated for sourceIDs by the schema, plus it inserts the NodeID to prevent clashes with IDs from other VAMDC nodes.
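The naming scheme described above can be sketched in a few lines of plain Python. This is only an illustration of how the pieces fit together, not the generator's actual code: the helper name make_source_id is hypothetical, and the prefix, NodeID and hyphen separator are taken from the example output above.

```python
def make_source_id(node_id, raw_id):
    """Combine the schema prefix 'B', the NodeID and the database key."""
    # 'B' is the prefix that the schema mandates for source IDs.
    return 'B%s-%s' % (node_id, raw_id)


source_id = make_source_id('xyz', 1)
print('<Source sourceID="%s">' % source_id)
```

The node only supplies the raw key (here 1); prefix and NodeID are added on output.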
IDs are mandatory which means that you have to fill the Returnables from the list above, if you use the corresponding part of the schema.
Ideally, the node’s database layout roughly matches the XSAMS structure, which means for example that you have separate tables for the atoms/molecules and their states. The linking indexes between the tables (usually integers) are then directly suited to be used as the IDs above, because the generator formats them as described.
In order to do this, it is good to be aware of the following Djangoism: Consider the example data model from here and that s is an instance of the State model. Then s.energy gives the value of the energy column in the database, as you expect. s.species however is, contrary to non-ForeignKey fields, not the key value of the corresponding species, but the actual instance of the species model because Django tries to be smart and convenient. Now we could use s.species.id to get the key value, but this would be slow since we would unnecessarily traverse into the species table to get it. The better way is to use s.species_id which is provided automatically, i.e. for any ForeignKey field xyz there is a field xyz_id which holds the key value instead of the linked object.
Sometimes it is necessary to do something with your data before returning them, and then it is not possible to directly use the field name on the right-hand side of the Returnable. Now remember that the string there simply gets evaluated and that your models can have not only fields but also custom methods. Therefore the easiest solution is to write a small method in your class that returns what you want, and then call this method through the Returnable.
For example, assume you for some reason have two energies for your states and want them both returned into the Returnable AtomStateEnergy which can handle vectors as input. Then, in your models.py, you do:
class State(Model):
    energy1 = FloatField()
    energy2 = FloatField()

    def bothenergies(self):
        return [self.energy1, self.energy2]
And correspondingly in your RETURNABLES in dictionaries.py:
RETURNABLES = {\
...
'AtomStateEnergy':'AtomState.bothenergies()',
}
Note
Use this sparingly since it adds some overhead. For doing simple calculations like unit conversions it is usually better to do them once and for all in the database, instead of doing them for every query.
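To see why a method call works on the right-hand side of a Returnable, the following plain-Python sketch mimics the evaluation step without Django. The stand-in class and the direct use of eval are assumptions made for illustration; the generator may evaluate the string differently internally.

```python
class State(object):
    """Plain stand-in for the Django model; no database involved."""

    def __init__(self, energy1, energy2):
        self.energy1 = energy1
        self.energy2 = energy2

    def bothenergies(self):
        return [self.energy1, self.energy2]


RETURNABLES = {'AtomStateEnergy': 'AtomState.bothenergies()'}

# The loop variable is named AtomState, matching the name used in the string.
AtomState = State(10.5, 12.25)
energies = eval(RETURNABLES['AtomStateEnergy'])
print(energies)
```

Because the string is evaluated for every object in the loop, any work done inside the method is repeated for each state, which is the overhead the note above warns about.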
The XML generator is aware of the Requestables and it only returns the parts of the schema that are wanted. Therefore the nodes need in principle not care about this. However, there are two issues that can interfere:
The solution is to make the queryfunction aware of the Requestables. These are attached to the object sql that comes as input. For example, one can test if the setup of atomic states is needed like this:
needAtomStates = not sql.requestables or 'atomstates' in sql.requestables
and then use the boolean variable needAtomStates to skip parts of the QuerySet building. This test checks first, if we have requestables at all (otherwise “ALL” is default) and then whether ‘atomstates’ is one of them.
Note
The query parser tries to be smart and adds the Requestables that are implied by another one. For example it adds ‘atomstates’ and ‘moleculestates’ when the client asks for ‘states’. Therefore it is enough to test for the most explicit one in the query functions.
Note
The keywords in sql.requestables are all lower-case!
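The test above can be tried out without a running node. In this sketch, FakeQuery is a hypothetical stand-in for the parsed query object, the helper needs() is not part of the NodeSoftware, and the keywords other than 'atomstates' are made up for the example.

```python
class FakeQuery(object):
    """Hypothetical stand-in for the parsed query object called sql."""

    def __init__(self, requestables):
        self.requestables = requestables


def needs(sql, keyword):
    """True if no Requestables were given, or if keyword is among them."""
    return not sql.requestables or keyword in sql.requestables


everything = FakeQuery([])                        # no Requestables: return all
only_species = FakeQuery(['atoms', 'molecules'])  # made-up lower-case keywords
with_states = FakeQuery(['atomstates'])

for query in (everything, only_species, with_states):
    print(needs(query, 'atomstates'))
```

In a query function, the result of such a test would decide whether the QuerySet for the atomic states is built at all.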
Situations can arise where it is easier for a node to create a piece of XML itself than to fill the Returnables and let the generator handle it. This is allowed: every time the generator loops over an object, it checks whether the loop variable, e.g. AtomState, has an attribute called XML. If so, it returns AtomState.XML() instead of trying to extract the values from the Returnables for the current block of XSAMS. Note the execution of .XML(), which means that this needs to be coded as a function/method in your model, not as an attribute.
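The check described above can be pictured with the following sketch. It is not the generator's code: the classes are plain stand-ins for models, render() is a hypothetical helper, and the element name CustomFragment is a placeholder rather than valid XSAMS.

```python
class AtomState(object):
    """Stand-in for a model that builds its own XML fragment."""

    def __init__(self, state_id):
        self.state_id = state_id

    def XML(self):
        # Must be a method, because it gets called, not just read.
        return '<CustomFragment id="%s"/>' % self.state_id


class PlainState(object):
    """Stand-in for a model without an XML() method."""


def render(obj):
    """Use obj.XML() if present, otherwise fall back to the normal path."""
    if hasattr(obj, 'XML'):
        return obj.XML()
    return None  # placeholder for the regular handling via the Returnables


print(render(AtomState('S1')))
print(render(PlainState()))
```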
Sometimes it is necessary to go manually through the steps that happen when a query comes in, in order to find out where something goes wrong. A good tool for this is an interactive Python session, which you start from within your node directory with:
./manage.py shell
From within the Python shell, you can run:
# import the relevant part of the NodeSoftware
from vamdctap import views as V
# import your queryfunction
from node import queryfunc as Q
# set up a query
foo = {'LANG':'VSS2','FORMAT':'XSAMS',
'QUERY':'select all where radtranswavelength < 1000 and radtranswavelength > 900'}
# run the parser
foo = V.TAPQUERY(foo)
# check basic validity
print foo.isvalid
...
# look at the parsed where clause
print foo.where
# put it into your query function and see what happens
Q.setupResults(foo)
You can also manually run the first step from the queryfunction:
from vamdctap import sqlparse as S
q = S.sql2Q(foo)
print q
It is possible in dictionaries.py to apply a function to the values that come in the WHERE-clause of a query together with the Restrictables:
from vamdctap.unitconv import *
RESTRICTABLES = {\
'RadTransWavelength':'wave',
'RadTransWavenumber':('wave',invcm2Angstr),
...
Here we give a two-tuple as the right-hand-side of the Restrictable RadTransWavenumber where the first element is the name of the model field (as usual) and the second is the function that is to be applied.
Note
The second part of the tuple needs to be the function itself, not its name as a string. This allows you to write custom functions in the same file, just above where you use them.
Note
The common functions for unit conversion reside in vamdctap/unitconv.py. This set is far from complete and you are welcome to ask for additions that you need.
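As an illustration of what such a conversion has to take care of, here is a self-contained sketch for converting a restriction on wavenumber (1/cm) into one on wavelength (Ångström). The function name and its (operator, value) signature are assumptions made for this example and may not match what the NodeSoftware expects; check vamdctap/unitconv.py for the real conventions. The point is that the relation is reciprocal, so inequality operators have to be flipped along with the value.

```python
# Map each comparison operator to its mirror image.
FLIPPED = {'<': '>', '>': '<', '<=': '>=', '>=': '<=', '=': '=', '==': '=='}


def wavenumber_to_angstrom(operator, value):
    """Turn a restriction on wavenumber (1/cm) into one on wavelength (Angstrom)."""
    wavelength = 1.0e8 / float(value)  # 1 cm equals 1e8 Angstrom
    return FLIPPED[operator], wavelength


print(wavenumber_to_angstrom('<', '1000'))
```

A wavenumber below 1000 per centimetre corresponds to a wavelength above 100000 Ångström, hence the changed operator.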
Perhaps a unit conversion (see above) is not enough to handle a Restrictable, e.g. because you do not have the quantity available in your database but know it anyway. Suppose a database has information on one atom only, say iron. For the output one would simply hardcode the information on iron in the Returnables as constant strings. For the query, on the other hand, you would like to support AtomSymbol but have no field in your database to check against; after all, it would be wasteful to have a database column that is the same everywhere.
One way of handling this is to use a custom function as the value of the Restrictable in dictionaries.py:
'AtomSymbol':checkIron,
where checkIron would be a function, e.g. defined in the same file (before referencing it, of course) as:
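from django.db.models import Q, F

def checkIron(restrictable, operator, value):
    # strip the quotes that surround string values in the query
    value = value.strip('\'"')
    if value == 'Fe' and operator in ('=', '=='):
        return Q(pk=F('pk'))    # always true
    else:
        return ~Q(pk=F('pk'))   # always false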
Note
Q(pk=F('pk')) is a restriction that is always true and should be fast. The operator ~ negates it.
Note
This (and the alternative below) do not cover all possible query cases, for example the operators LIKE or IN. In practice, some more lines of code will therefore be needed to manually handle a Restrictable.
Note
If this topic is relevant for you, please also have a look into vamdctap/unitconv.py where there are some examples.
Note
For the easy example of comparing to a constant string, we have a ready solution: One can use 'SomeRestrictable':test_constant(['Fe','U']), where the function test_constant takes a single string or a list of strings that the value will be compared to.
Another solution is to manipulate the set of restrictions by hand instead of letting sql2Q() handle it automatically. sql2Q() is a shorthand function that does these steps after each other:
- ['r0', 'and', 'r1', 'and', '(', 'r2', 'or', 'r3', ')']
- {'1': [u'RadTranswavelength', '<', u'3100'], '0': [u'RadTranswavelength', '>', u'3000'], '3': [u'AtomSymbol', '=', u"'Mg'"], '2': [u'AtomSymbol', '=', u"'Fe'"]}
So, in summary, the call q=sql2Q(sql) at the start of the query function can be replaced by:
logic,restrictions,count = splitWhere(sql.where)
q_dict = {}
for i,restriction in restrictions.items():
    restriction = applyRestrictFu(restriction)
    q_dict[i] = restriction2Q(restriction)
q = mergeQwithLogic(q_dict, logic)
Now, depending on what you want to do, you can manipulate this process at any intermediate step. To continue the example with iron only, we could insert the following at the start of the loop over the restrictions:
if restriction[0].lower() == 'atomsymbol':
    if restriction[1] in ('=', '=='):
        # the value still carries its quotes, e.g. "'Fe'"
        if restriction[2].strip('\'"') == 'Fe':
            q_dict[i] = Q(pk=F('pk'))
            continue
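To make the two intermediate structures (the token list and the numbered restrictions) more tangible, the following toy example applies them to a plain dictionary instead of a database. It is a sketch under simplifying assumptions, not how the NodeSoftware works internally: the real functions build Django Q objects, whereas check(), evaluate() and matches() are hypothetical helpers that work with plain booleans and support only three operators.

```python
import operator

OPS = {'<': operator.lt, '>': operator.gt, '=': operator.eq}

logic = ['r0', 'and', 'r1', 'and', '(', 'r2', 'or', 'r3', ')']
restrictions = {
    '0': ['RadTransWavelength', '>', '3000'],
    '1': ['RadTransWavelength', '<', '3100'],
    '2': ['AtomSymbol', '=', "'Fe'"],
    '3': ['AtomSymbol', '=', "'Mg'"],
}


def check(restriction, record):
    """Evaluate one [name, operator, value] triple against a record."""
    name, op, value = restriction
    value = value.strip('\'"')
    actual = record[name]
    if isinstance(actual, float):
        value = float(value)
    return OPS[op](actual, value)


def evaluate(tokens, truth):
    """Combine per-restriction truth values following the token list."""
    pos = [0]  # current position in the token list

    def parse_or():
        result = parse_and()
        while pos[0] < len(tokens) and tokens[pos[0]] == 'or':
            pos[0] += 1
            right = parse_and()
            result = result or right
        return result

    def parse_and():
        result = parse_atom()
        while pos[0] < len(tokens) and tokens[pos[0]] == 'and':
            pos[0] += 1
            right = parse_atom()
            result = result and right
        return result

    def parse_atom():
        token = tokens[pos[0]]
        pos[0] += 1
        if token == '(':
            result = parse_or()
            pos[0] += 1  # skip the closing ')'
            return result
        return truth[token[1:]]  # 'r2' refers to the restriction with key '2'

    return parse_or()


def matches(record):
    """True if the record satisfies the whole WHERE clause."""
    truth = {key: check(r, record) for key, r in restrictions.items()}
    return evaluate(logic, truth)


print(matches({'RadTransWavelength': 3050.0, 'AtomSymbol': 'Fe'}))
```

Replacing one entry of the truth dictionary by hand before calling evaluate() corresponds to overriding a single q_dict[i] in the loop shown earlier.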
Currently, only queries with FORMAT=XSAMS are officially supported. Since some nodes wanted to be able to return other formats (that are only useful for their community, for example to include binary data like an image of a molecule), there is a mechanism to do this.
Whenever FORMAT is anything other than XSAMS, the NodeSoftware checks whether there is a function called returnResults() in a node’s queryfunc.py. If so, it completely hands the responsibility for assembling the output to this function.
Note
This means that you have to return an HttpResponse object from it and need to know a little more about Django views. In addition, you are on your own when it comes to assembling your custom data format.
Django offers a plethora of features that we do not use for the purpose of a bare VAMDC node but that might be useful for adding custom functionality. For example you could:
For more information on all this, have a look at Django’s excellent documentation at https://docs.djangoproject.com/
For extending your node beyond the VAMDC-TAP interface, you would normally add a second app to your node directory, besides the existing one called node. Then you simply tell your urls.py to serve the new app at a certain URL.