SOS for CLIDB: Mid-term - we got data

Last week the customised data backend integration finally got pretty much aligned with the basic SOS operations: we can see what's in the database (GetCapabilities) and actually get data out of it (GetObservation).

(This adaptation work happens within a SoeR project. It is also a response/suggestion to the 52North teaser about what to do with this off-the-shelf SOS server product :-) )

1. Plain description of functionality and status


Based on the "vanilla" code from the 52North SVN repository, I basically added a CLIDBHelper class to the "52n-sos-hibernate-core" Maven module that encapsulates the CLIDB database connection, and re-directed most of the original Hibernate calls from the core DAOs to this helper. The Cache Feeder DAO in particular needed a lot of attention, as it fills the internal "capabilities" cache, which is the basic thing that needs to happen before the SOS instance is actually usable. The DescribeSensor DAO and GetObservation DAO are pretty straightforward, as they "just" run specific queries, like give me the sensor description for sensor station XY, or give me rainfall data for station XY in the time period ...
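As a rough illustration of that delegation pattern (all class, method and table names below are simplified placeholders, not the actual 52n-sos-hibernate-core code):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.ArrayList;
    import java.util.List;

    // Sketch only: class, method and table names are placeholders, not the
    // actual code in the 52n-sos-hibernate-core module.
    public class CLIDBHelper {

        private final Connection connection; // JDBC connection to the CLIDB Oracle instance

        public CLIDBHelper(Connection connection) {
            this.connection = connection;
        }

        /** Station ids that actually have observation data in the climate database. */
        public List<String> getStationIds() throws SQLException {
            List<String> ids = new ArrayList<String>();
            PreparedStatement stmt = connection.prepareStatement(
                    "SELECT DISTINCT station_id FROM station_summary"); // hypothetical view name
            try {
                ResultSet rs = stmt.executeQuery();
                while (rs.next()) {
                    ids.add(rs.getString(1));
                }
                rs.close();
            } finally {
                stmt.close();
            }
            return ids;
        }
    }

The core DAOs then call helper methods like this instead of their original Hibernate calls, so the SOS-internal request handling itself stays untouched.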

The build runs with "mvn -Pdevelop,pox,sosV100 package" and produces a .war file, which can simply be dropped into an Apache Tomcat servlet container. Secondly, I was quite happy to also implement a WFS backend for station information: I call the WFS to get the stations and their locations, verify each station against the climate database to see whether data is available, and build the necessary metadata for the cache. This process is still quite time-consuming (several minutes).
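Conceptually, this cache pre-filling boils down to the following loop (the two interfaces are hypothetical stand-ins, not the real WFS client or CLIDB helper classes from the project):

    import java.util.ArrayList;
    import java.util.List;

    // Rough outline of the station verification step described above.
    public class StationCacheBuilder {

        public interface WfsClient {
            /** Station ids (with locations) as delivered by the WFS. */
            List<String> getStationIds();
        }

        public interface ClimateDb {
            /** True if the climate database holds observation data for this station. */
            boolean hasData(String stationId);
        }

        /** Keeps only those WFS stations that are backed by data in the CLIDB. */
        public List<String> buildStationList(WfsClient wfs, ClimateDb clidb) {
            List<String> verified = new ArrayList<String>();
            for (String stationId : wfs.getStationIds()) { // roughly 6000 candidates
                if (clidb.hasData(stationId)) {            // one summary lookup per station
                    verified.add(stationId);               // about 2500 stations remain
                }
            }
            return verified;
        }
    }

The one-lookup-per-station pattern is presumably the main reason the initial fill takes several minutes.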

Nevertheless, the complex and rich data in the climate database comprise about 120 phenomena or observable properties and 2500+ sensor stations. Filling the SOS-internal maps for these constellations results in a huge XML response (somewhere around 30-70 MB) for the GetCapabilities operation. This is not really made for browser testing or human readability.

Additionally I am working on the "back porting" of SOS 1.0.0 operations. GetCapabilities, DescribeSensor and GetObservation already work partially (and are included in the 52North SVN repo). There are some XML namespace quirks in the SWE data array output, and by far not all necessary input filters have been implemented. I assume that, besides the core profile, at least GetFeatureOfInterest would be quite desirable. Any comments on that?

2. Overview of the 52N SOS 4.0.0 architecture

At this point I thought it would also be helpful to understand the 52North SOS 4.0.0 internal software architecture. I created two little diagrams that help to understand the flow within the server, especially if you have already had a first glance at the source. 52North presents a design overview on their website, which gives a good high-level perspective. I prepared two additional ones for the interested developer:

52North SOS 4.0.0 flow 1

And a more straightforward UML-like sequence diagram:

52North SOS 4.0.0 sequence diagram

This also pretty much matches the coarse Maven structure. There are different modules for the separation of concerns, e.g. 52n-sos-… api, coding (sos1/2, sensorml, swe, ows, gml), profiles (core, transactional, enhanced) and data backend (hibernate DAOs … per profile). The webapp and webapp-preconfigured modules (war files, with/without database connection details) are separate Maven modules as well.

I actually only exchanged hibernate-core with the clidb core and disabled the additional (enhanced and transactional) profiles and their data backends in the main pom.xml, as this service is considered read-only. The original SOS comes with a predefined postgres9.x/postgis2 data model, SQL files and test data. For the admin backend in particular, I had to keep the Postgres DB and connection, though.

3. Generic 52N-vanilla related issues

The 52n-sos-400 code base is still in heavy development. I have to fix/extend the SOS v1.0.0 encoders and decoders to align them with the SOS-internal, SOS v2.0-oriented design, as SOS v2.0 expects tight relations around its observation type. The SOS and OGC v1.0.0 decoders (spatial filters), SweCommonEncoderv100, the GML encoder and decoder v311, and OmEncoderv100 need some more work (SWE, O&M and feature encoding). Then GetFeatureOfInterest for SOS v1.0.0 could also be added for the SWE community to support some legacy clients.

SensorML creation for DescribeSensor was a bit quirky in 52n-sos-400, as a lot of XML generation happens before the actual encoding in the dedicated encoders. I also find it challenging that GetCapabilities for SOS v2.0 provides one contents section per offering-procedure combination, whereas in SOS v1.0.0 all procedures for one offering are listed inline. That multiplies the contents sections quite a lot with SOS v2.0 (an offering with ten procedures yields ten sections instead of one), although it means more specific information is already available in the Capabilities document.

Generally I find the OGC SOS v2.0 conceptual model quite complicated and clunky (less human-readable). But this is intended, as SOS and other geospatial OGC web services are supposed to be chainable, as in SOA (service-oriented architecture) orchestration.

4. Specific CLIDB connection bugs/issues

I am getting about 6000 stations from the WFS; verifying each one against the CLIDB station summary leaves about 2500 stations, but some are missing (e.g. I didn't see 5-digit stationIds), and this check is quite slow (8+ minutes) for the cache. And as described, the GetCapabilities response gets quite big with 2500 stations by 177 measures (about 50000 combinations), although the time spans are nicely visible per offering/procedure. I limited the query functionality to allow only queries for one station and one property (like rainfall), and a time range must be provided; otherwise the query would take very long and the result would be quite big XML (even as a DataArray block).
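That restriction is essentially just a pre-check on the incoming request before any CLIDB query is issued; a minimal sketch (the real check would use the SOS-internal request and exception types rather than the placeholder ones here):

    import java.util.List;

    // Minimal sketch of the GetObservation restriction described above.
    public final class GetObservationRestriction {

        private GetObservationRestriction() {
        }

        public static void check(List<String> stationIds, List<String> observedProperties,
                boolean hasTemporalFilter) {
            if (stationIds.size() != 1 || observedProperties.size() != 1 || !hasTemporalFilter) {
                throw new IllegalArgumentException(
                        "Against the CLIDB backend, GetObservation is limited to exactly one "
                        + "station, one observed property and a mandatory time range.");
            }
        }
    }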

DescribeSensor does not yet build nice SensorML from the StationsMetadata, and GetObservation doesn't fully encode the sampling feature in the observation response (only an anchor reference, but the anchor item itself is not included).

  • Mapping of offerings, procedures, observableProperties and features of interest (see the sketch below)
  • Station: identical procedure and feature (long stationId)
  • Measure: identical offering and observable property (String measure.code)
  • Doesn't look nice, but it works; there are no prefixes for stations or the like (e.g. for a URI resolver giving more details on items)
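In code, the mapping from the list above amounts to reusing the CLIDB identifiers directly, without any URI prefixes (again a placeholder sketch, not the actual mapping code):

    // Placeholder illustration of the identifier mapping listed above.
    public final class ClidbIdentifierMapping {

        private ClidbIdentifierMapping() {
        }

        /** The numeric station id doubles as procedure and feature-of-interest identifier. */
        public static String procedure(long stationId) {
            return String.valueOf(stationId);
        }

        public static String featureOfInterest(long stationId) {
            return String.valueOf(stationId);
        }

        /** The measure code doubles as offering and observable-property identifier. */
        public static String offering(String measureCode) {
            return measureCode;
        }

        public static String observableProperty(String measureCode) {
            return measureCode;
        }
    }

Prefixing these identifiers with resolvable URIs would be the obvious refinement, but it needs to stay reversible in the decoders (see the considerations further down).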

An interesting suggestion came up regarding the long cache-filling process and the still-connected original Postgres DB: we could persist the metadata in an integrated Postgres schema to speed up the start (otherwise the cache is first filled from the WFS and the CLIDB). And I didn't find the place to change the cache update frequency yet :-/ (it is about 300 seconds?), although the station metadata might not change that often, just the time ranges.

Also, my CLIDB Oracle connection handling is probably not optimal. Without extra care it causes a "too many open cursors" exception (also with the vanilla ojdbc14.jar); 300 is the limit. I use one instance of the connector and every 100 calls I "refresh" the connection by disposing of the old one and creating a new one – this workaround apparently works. Basically, I added a CLIDBHelper class and redirect most calls that would go to the Hibernate layer to the CLIDB instead. This means almost no changes to the original DAO classes, but I have to handle and translate some of the Hibernate and SOS-internal representations of offerings, procedures etc., which is a bit of unnecessary overhead.
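A minimal sketch of that connection-refresh workaround (class name, connection details and the threshold of 100 calls are illustrative only; closing every Statement and ResultSet right after use would probably be the cleaner long-term fix):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;

    // Illustrative sketch of the "refresh every N calls" workaround against the
    // "too many open cursors" error; names and values are placeholders.
    public class RefreshingConnection {

        private static final int CALLS_BEFORE_REFRESH = 100; // stay well below the 300-cursor limit

        private final String jdbcUrl;
        private final String user;
        private final String password;

        private Connection connection;
        private int callCount = 0;

        public RefreshingConnection(String jdbcUrl, String user, String password) {
            this.jdbcUrl = jdbcUrl;
            this.user = user;
            this.password = password;
        }

        /** Returns a live connection, recycling it after every 100th call. */
        public synchronized Connection get() throws SQLException {
            if (connection == null || connection.isClosed()
                    || ++callCount >= CALLS_BEFORE_REFRESH) {
                if (connection != null && !connection.isClosed()) {
                    connection.close(); // dispose the old connection and its open cursors
                }
                connection = DriverManager.getConnection(jdbcUrl, user, password);
                callCount = 0;
            }
            return connection;
        }
    }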


5. CLIDB-SOS project considerations and further time frame

I started to add some unit tests for the WFS and CLIDB integration, but I cannot verify the proper filling of all associative maps yet. Then re-naming/linking/mapping should or could be considered, in addition to prefixes/URIs for offerings, procedures etc.? Identifiers (00, 123SPEED..) vs. descriptive names? But I am not sure about that yet, because this also needs to be reversible in the decoders etc.
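Such a test could, for example, assert that every offering in a freshly filled cache maps to at least one procedure; a hypothetical sketch, with a plain map standing in for the real SOS capabilities cache:

    import static org.junit.Assert.assertFalse;

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Set;

    import org.junit.Test;

    // Hypothetical sketch of such a check; not the actual test code.
    public class StationCacheMappingTest {

        @Test
        public void everyOfferingHasAtLeastOneProcedure() {
            Map<String, Set<String>> offeringToProcedures = buildTestCache();
            assertFalse("cache should not be empty", offeringToProcedures.isEmpty());
            for (Map.Entry<String, Set<String>> entry : offeringToProcedures.entrySet()) {
                assertFalse("offering " + entry.getKey() + " has no procedures",
                        entry.getValue().isEmpty());
            }
        }

        private Map<String, Set<String>> buildTestCache() {
            // In the real test this would run the WFS/CLIDB cache feeder against a
            // small fixture instead of returning a hard-coded map.
            Map<String, Set<String>> cache = new HashMap<String, Set<String>>();
            cache.put("RAINFALL", Collections.singleton("12345"));
            return cache;
        }
    }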
Well, my official last day is the 1st of March, which leaves 4 weeks to go. Priorities and milestones will look something like the following:

  1. Talk about commercial / auth interface possibilities, maybe demo concept (some days)
  2. Finalise the SOS v1.0.0 work (as far as necessary/possible – 1 week)
  3. Tidy up the clidb connector integration and deployment/test stuff (only making it robust, no new features – 1 week with some internal support)
  4. 1 week buffer, discussion of “data modelling” in SOS, and testing SOS clients against it
  5. AND OF COURSE, MAKE IT ACCESSIBLE ONLINE :-)

I think that's pretty much all that can happen :-) Looking forward...

Submitted by AlexKmoch on