ConStruct and structWSF Interaction
From TechWiki
The conStruct Drupal modules rely on the structWSF Web service framework to manage and publish structured data content. Every action performed using any conStruct module ends up being a series of queries sent to various structWSF Web service endpoints. This article describes how a conStruct node interacts with one, or multiple, structWSF instances, and it shows all of the internal registries used to manage that interaction.
conStruct Is A Proxy
A conStruct Drupal node can be understood as being a proxy between a user and a structWSF instance. More precisely, a conStruct module should be understood as a User Interface Proxy; that is, a proxy that has a user interface that generates queries that are sent to one, or multiple, structWSF instances.
But in any case, all queries sent by conStruct are sent on behalf of the user. This means that all queries sent by a conStruct instance (with a few rare exceptions) are authenticated twice:
- One time to make sure that the conStruct node (as a proxy) has the rights to perform the requested action
- Another time to make sure that the user of the conStruct node has the rights to perform the requested action.
This means that even if a user has the permissions to perform a certain action, if that user goes through a proxy (in this case, a conStruct node) that doesn't have the rights to perform that action, then the structWSF instance will return an unauthorized error to the user.
This double authentication prevents security breaches where a user, or a proxy, pretends to be someone else.
conStruct Permissions on structWSF
A conStruct node normally has full rights on a structWSF instance. It is possible that it doesn't, but most of the time it does. These permissions are defined by the administrator of the structWSF instance (see below).
Datasets Registries in conStruct
A conStruct node has two kinds of datasets:
- The ones that it creates on a structWSF instance, and
- The ones that are already created on a structWSF instance, but that get linked to the conStruct node.
We refer to the first kind simply as datasets and to the second kind as linked datasets.
Linked datasets are just a way to aggregate datasets from different structWSF instances into the same conStruct Web site portal. That way, the data managed by different people and organizations can live in the same conStruct Web site. More information about linked datasets can be read here.
Now let's discuss how a conStruct instance manages all of the structWSF instances registered to it, and how it manages the datasets created in, or linked from, these structWSF instances.
structWSF Instances Registry
With respect to access, the first thing conStruct does is maintain a registry of all the structWSF instances registered to the conStruct instance. The module used to administer these instances is called structNetwork. This module lets conStruct node administrators subscribe remote structWSF instances to the node, or unsubscribe them from it.
The registry is saved in the WSF-Registry Drupal variable. That registry is a simple array of structWSF base URLs. The registry can easily be accessed by using this Drupal API call:
variable_get("WSF-Registry", array());
Datasets Registry
Then we have a registry of datasets that have been created from, or linked to, the conStruct instance. This registry of datasets is composed of multiple Drupal variables that share the following pattern.
First, each dataset has two variables:
- Dataset-GID-WSF
- Dataset-GID-ID
Where GID is the ID of the Organic Group that is attached to this dataset.
The first variable, Dataset-GID-WSF, holds the provenance of the dataset: that is, the URL of the structWSF instance where it is hosted.
The second variable, Dataset-GID-ID, holds the URI of the dataset: that is, its unique identifier.
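The naming pattern above can be illustrated with a small lookup. This is only a sketch in Python, not Drupal code: the dictionary stands in for Drupal's variable storage (which would be read with variable_get()), the group IDs and URLs are made up, and it assumes that GID in the variable name is replaced by the literal Organic Group ID. The function name dataset_for_group is hypothetical.

```python
# Stand-in for Drupal's variable storage. In Drupal these values would be
# read with variable_get(); every key and value here is made up.
drupal_variables = {
    "Dataset-12-WSF": "http://localhost/wsf/",
    "Dataset-12-ID": "http://localhost/wsf/datasets/12/",
    "Dataset-31-WSF": "http://example.org/wsf/",
    "Dataset-31-ID": "http://example.org/wsf/datasets/31/",
}


def dataset_for_group(gid, variables):
    """Return (structWSF base URL, dataset URI) for an Organic Group ID.

    Returns None when either of the two variables is missing.
    Assumes the GID placeholder is replaced by the literal group ID.
    """
    wsf_url = variables.get(f"Dataset-{gid}-WSF")
    dataset_uri = variables.get(f"Dataset-{gid}-ID")
    if wsf_url is None or dataset_uri is None:
        return None
    return wsf_url, dataset_uri


if __name__ == "__main__":
    print(dataset_for_group(12, drupal_variables))
    print(dataset_for_group(99, drupal_variables))
```

Reading both variables together keeps the provenance and the identifier of a dataset paired, so a caller always knows which structWSF instance to query for a given dataset URI.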
Datasets Permissions in structWSF
Each time a user interacts with conStruct, a series of queries is sent to one of the structWSF instances registered on the conStruct instance. These queries get validated by structWSF, and the outcome is displayed in the conStruct user interface (results of the queries, access error messages, processing errors, etc.).
The entire validation workflow on structWSF's side is described in the Datasets and Access Rights (structWSF) page.
conStruct Users on structWSF
This section describes how conStruct users are handled on a structWSF instance. The basic process is that conStruct users (proxied users) get authenticated, and then various conStruct modules interact with structWSF, including specific use rights authentication.
In the Datasets and Access Rights (structWSF) page, we discuss how structWSF authenticates queries sent directly by a user. In this section, we extend that basic behavior to show how it authenticates proxied queries.
structWSF Requester(s)
Most of the structWSF Web service endpoints accept a registered_ip parameter. Any proxy system (such as conStruct) uses this parameter to tell the structWSF endpoint the IP address of the user on whose behalf the query is being issued.
When a Web service endpoint receives such a query, it may authenticate the query based on two things (the two checks described earlier):
- The registered_ip, to make sure that the user has the permissions to perform that action on the structWSF instance
- The requester_ip, to make sure that the proxy system also has the permissions to perform that action on the structWSF instance.
If either of these two IP addresses doesn't have the permissions to perform that action, then the endpoint will return an unauthenticated message.
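The double check described above can be sketched as follows. This is a deliberately simplified Python illustration, not structWSF's actual validation code: real access rights are defined per dataset and per Web service, whereas this sketch uses a flat, hypothetical table mapping an IP address to a set of allowed actions. The table contents and the function names are assumptions.

```python
# Hypothetical permission table: IP address -> actions it may perform.
# Real structWSF permissions are scoped by dataset and Web service.
PERMISSIONS = {
    "10.0.0.5": {"create", "read", "update", "delete"},  # the conStruct proxy
    "192.168.0.20": {"read"},                             # a proxied user
}


def is_allowed(ip, action, permissions):
    """True when the given IP is known and may perform the action."""
    return action in permissions.get(ip, set())


def authorize(requester_ip, registered_ip, action, permissions=PERMISSIONS):
    """Accept a proxied query only if BOTH checks pass.

    requester_ip  -- IP of the proxy system that sent the query
    registered_ip -- IP of the user the proxy is acting for
    """
    proxy_ok = is_allowed(requester_ip, action, permissions)
    user_ok = is_allowed(registered_ip, action, permissions)
    return proxy_ok and user_ok


if __name__ == "__main__":
    print(authorize("10.0.0.5", "192.168.0.20", "read"))
    print(authorize("10.0.0.5", "192.168.0.20", "delete"))
```

In this sketch a fully privileged proxy cannot raise a user's rights: the delete request is refused because the user lacks that permission, and a request from an unknown proxy is refused even when the user would be allowed.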
Remember that conStruct instances normally have full rights on the structWSF instances registered to them. These rights are granted by the structWSF administrator(s). Granting them implies strong trust between the conStruct and the structWSF administrator(s) (if they are different persons or organizations).
Overloaded IPs
For structWSF, a user has a single IP address at any given time, although that address may vary between home, work, and other locations. A conStruct proxy gives its users access to one, or multiple, structWSF instances. However, all users of a given conStruct instance share the same IP address: that of the conStruct node's Web server. So, under default conditions, all users of the proxy have the same privileges.
This is why the concept of overloading IP addresses was introduced in structWSF. Overloading an IP address simply means appending a value to it. The overloaded IP is then used in the registered_ip parameter of any structWSF Web service. The appended value is normally the ID of a user managed by the requesting proxy service.
An IP address is overloaded by using the :: symbol. This symbol is appended to the end of the IP address, and what comes after it is the ID. An overloaded IP address looks like 192.168.0.1::456. The burden of managing overloaded IP addresses falls on the proxy service. It is the proxy (in this case conStruct) that has to maintain the link between the overloaded IP addresses defined on the different structWSF instances and its own internal user IDs.
Such an overloaded IP address could be used in an access description (see the Datasets and Access Rights (structWSF) page) like this:
wsf:registeredIP -> 192.168.0.1::456
wsf:create -> true
wsf:read -> true
wsf:update -> true
wsf:delete -> true
wsf:datasetAccess -> http://localhost/ws/datasets/test/
wsf:webServiceAccess -> http://localhost/wsf/ws/crud/create/
wsf:webServiceAccess -> http://localhost/wsf/ws/crud/read/
wsf:webServiceAccess -> http://localhost/wsf/ws/crud/update/
wsf:webServiceAccess -> http://localhost/wsf/ws/crud/delete/
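Building and splitting an overloaded IP address is straightforward. The following Python sketch shows one way a proxy might do it. The function names are hypothetical, and the sketch assumes IPv4 addresses only, because an IPv6 address can itself contain the :: sequence.

```python
OVERLOAD_SEPARATOR = "::"


def overload_ip(ip, user_id):
    """Append a proxy-side user ID to an IP address, e.g. 192.168.0.1::456."""
    return f"{ip}{OVERLOAD_SEPARATOR}{user_id}"


def split_overloaded_ip(value):
    """Split a registered_ip value into (ip, user_id).

    user_id is returned as a string, or None when the address is not
    overloaded. Assumes IPv4: IPv6 addresses may contain '::' themselves.
    """
    ip, separator, user_id = value.partition(OVERLOAD_SEPARATOR)
    if not separator:
        return value, None
    return ip, user_id


if __name__ == "__main__":
    registered_ip = overload_ip("192.168.0.1", 456)
    print(registered_ip)
    print(split_overloaded_ip(registered_ip))
```

Keeping both directions in one place lets the proxy map its internal user IDs to the registered_ip values it sends, and back again when it needs to relate an access description to one of its users.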
The Network Effect
A single conStruct instance can interact with multiple structWSF instances at the same time. If a conStruct node manages datasets hosted on, say, three different structWSF instances, then each time a Browse, Search, etc., page gets loaded, each of these structWSF instances will be queried to check what data is accessible to the requesting user. This is where the network effect of the structWSF design kicks in (see further the Distributed Networks with structWSF document).
This distributed ability is possible because all capabilities of the network are funneled through the various Web service endpoint queries. This characteristic is quite powerful. Via this design, for example, big datasets may be managed on their own structWSF instances, structWSF could host datasets specific to a certain domain, etc.
SID (server ID)
Each structWSF instance has its own SID (server ID). This server ID is used by conStruct to make sure that the same structWSF instance is not queried multiple times with the same query. Theoretically, the same structWSF instance could answer queries that are sent to different domains such as localhost, my-domain-1.com, my-domain-2.com, etc.
This use case can arise if a user creates new datasets, or links to existing datasets, through different domains that refer to the same structWSF instance.
SID on structWSF
The SID is created by the structWSF instance. The SID file is created by the root index.php structWSF file. A SID is simply a unique identifier string: the MD5 hash of the current microtime. The directory where the SID file is stored is specified by the $sidDirectory variable of the root index.php file.
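As a rough illustration of how such an identifier can be derived, the Python sketch below hashes the current time with MD5. It is not the code of index.php: the exact input string structWSF hashes is an assumption here, the function name make_sid is hypothetical, and how the SID file is written to the SID directory is not shown.

```python
import hashlib
import time


def make_sid():
    """Return a 32-character hexadecimal MD5 digest of the current time.

    Stand-in for hashing the current microtime; the exact string that
    structWSF hashes is an assumption in this sketch.
    """
    now = repr(time.time())
    return hashlib.md5(now.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    print(make_sid())
```

MD5 is used here only to produce an identifier, not for any security purpose, so its known cryptographic weaknesses do not matter for this use.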
SID on conStruct
The SID registry is saved in a Drupal variable. The variable is called SID-Registry. It can be accessed, within Drupal, by using this API call:
variable_get("SID-Registry", array());
This will return an array with all of the URLs from which you can access a given unique structWSF instance (SID).
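The following Python sketch shows how such a registry could be used to avoid querying the same structWSF instance twice. It is an illustration under stated assumptions, not conStruct's code: the registry is assumed to map each SID to the list of URLs that reach that instance, the SID values and URLs are made up, and unique_endpoints is a hypothetical helper.

```python
# Assumed shape of the SID registry: SID -> URLs that reach that instance.
# The SIDs and URLs below are made up for illustration.
sid_registry = {
    "sid-a": ["http://localhost/wsf/", "http://my-domain-1.com/wsf/"],
    "sid-b": ["http://other-host.example/wsf/"],
}


def unique_endpoints(wsf_urls, registry):
    """Keep one URL per structWSF instance, in order of first appearance.

    URLs that are not in the registry are treated as distinct instances.
    """
    url_to_sid = {
        url: sid for sid, urls in registry.items() for url in urls
    }
    seen = set()
    result = []
    for url in wsf_urls:
        instance_key = url_to_sid.get(url, url)
        if instance_key not in seen:
            seen.add(instance_key)
            result.append(url)
    return result


if __name__ == "__main__":
    urls = [
        "http://localhost/wsf/",
        "http://my-domain-1.com/wsf/",
        "http://other-host.example/wsf/",
    ]
    print(unique_endpoints(urls, sid_registry))
```

With the sample registry, the two URLs that share a SID collapse to the first one, so a Browse or Search page would send its query to two instances rather than three.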