Advanced Topic Map Parsing Using TopicMapBuilder

The org.tm4j.topicmap.source.SerializedTopicMapSource class provides a set of constructors which take, in addition to a File, Reader or InputSource parameter, a TopicMapBuilder parameter which specifies the parsing strategy to be applied to the input source. These constructors allow you to override the default choice of parser, either to provide a different parser entirely or to provide a parser with settings different from the default configuration.

Configuring Parsing

The TopicMapBuilder interface provides a method which enables the programmer to configure the behaviour of the builder at run-time. This is done by setting properties on the TopicMapBuilder instance before using it in a call to addTopicMap(). Each implementation of the TopicMapBuilder interface supports a different set of properties, and these are documented in the Javadoc for the implementation classes. For convenience, the properties are also described below.

The current version of TM4J supports properties only on the org.tm4j.topicmap.utils.XTMBuilder class. (The other TopicMapBuilder implementation, org.tm4j.topicmap.utils.LTMBuilder, does not support any configuration properties.) The two properties supported are http://www.tm4j.org/tm4j/xtmbuilder/validation and http://www.tm4j.org/tm4j/xtmbuilder/failonveto. Both properties take a boolean as their value.

If the property http://www.tm4j.org/tm4j/xtmbuilder/validation is 'true' then the XTMBuilder will use a validating XML parser. Otherwise a non-validating parser will be used. Note that a validating parser will fail to validate any XTM file without a valid DOCTYPE declaration. The default for this property is 'false'.

If the property http://www.tm4j.org/tm4j/xtmbuilder/failonveto is 'true', then the parse will be aborted if the creation/update of a topic map object is vetoed during the parsing process. If the value is not 'true', then the builder will attempt to keep going even if some topic map objects were not created/updated properly. The default value for this property is 'true'.

Creating a Customised TopicMapSource

The TopicMapSource interface consists of a number of methods meant to be called by the TopicMapProvider that the TopicMapSource is added to. The normal usage is to add an instance of a TopicMapSource to a TopicMapProvider through one of its addTopicMap(..) methods. If no TopicMap is specified when calling addTopicMap(..), the provider first calls getBaseLocator() on the TopicMapSource. If the returned Locator is not null, it is used to create a new TopicMap. If it is null, the TopicMapProvider calls getBaseAddress() and uses the returned String value as the address for a Locator in order to create the new TopicMap. After that, the provider calls one of the populateTopicMap(..) methods, passing the TopicMap (either the specified one or the new one) as a parameter. Inside the populateTopicMap(..) method, the TopicMapSource can then create TopicMapObjects to populate the TopicMap.
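The order in which the provider consults the source can be modelled in a few lines. The following is only a sketch under stated assumptions: ProviderSketch, its nested Locator class, its Source interface and resolveBase are hypothetical stand-ins written for this illustration, not the TM4J types of similar name, and the real TopicMapProvider may do more than this.

```java
// Illustration only: stand-in types, NOT the TM4J interfaces.
class ProviderSketch {

    // Stand-in for a locator: it simply wraps an address string.
    static final class Locator {
        final String address;

        Locator(String address) {
            this.address = address;
        }
    }

    // Stand-in for the two base-related methods of a topic map source.
    interface Source {
        Locator getBaseLocator();

        String getBaseAddress();
    }

    // Mirrors the order described above: prefer the locator returned by the
    // source, and only fall back to the base address if that locator is null.
    static Locator resolveBase(Source source) {
        Locator locator = source.getBaseLocator();
        if (locator != null) {
            return locator;
        }
        return new Locator(source.getBaseAddress());
    }

    public static void main(String[] args) {
        // A source that, like the CSV example in this section, returns no locator.
        Source csvLike = new Source() {
            public Locator getBaseLocator() {
                return null;
            }

            public String getBaseAddress() {
                return "file:/data/topics.csv"; // made-up address
            }
        };
        System.out.println(resolveBase(csvLike).address);
    }
}
```

A source that returns a non-null locator never has its getBaseAddress() consulted in this model, which is why a custom source only needs to implement one of the two meaningfully.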

So in order to implement a custom TopicMapSource, one has to implement at least the getBaseAddress() method and the populateTopicMap(..) method with its 4 overloaded versions. As an example, we will now implement a TopicMapSource for simple comma-separated files in which each line specifies a Topic id and a Topic BaseName.

	    tm4j, TM4J
	    kal_ahmed, Kal Ahmed
	    

The first thing we will do is create a Java class (CSVTopicMapSource) that implements TopicMapSource, and give it a constructor taking a File as its only parameter. From this File object we also determine the baseAddress String.

	    
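	    // FileReader's constructor throws FileNotFoundException if the file cannot be opened
	    public CSVTopicMapSource(File file) throws FileNotFoundException {
	        fileReader = new FileReader(file);
	        // The file's URI serves as the base address of the topic map
	        baseAddress = file.toURI().toString();
	    }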
      
	    

Now we implement the getBaseAddress() and getBaseLocator() methods. getBaseLocator() returns null, and getBaseAddress() returns the baseAddress String determined in the constructor.

	    
	    public String getBaseAddress() {
	        return baseAddress;
	    }

	    public Locator getBaseLocator() {
	        return null;
	    }
      
	    

Finally we can start to implement the populateTopicMap method (we will ignore the other versions of populateTopicMap for the time being and simply have them call this one).

	    
	    public void populateTopicMap(TopicMap map) throws TopicMapProcessingException {
	        // We use a LineNumberReader because each line holds exactly one id, name pair
	        LineNumberReader reader = new LineNumberReader(fileReader);
	        String line = reader.readLine();
	        ...
	        while (line != null) {
	            // Split the line at the first comma into the topic id and the name;
	            // trim() removes the space that follows the comma in the sample file
	            int index = line.indexOf(',');
	            String id = line.substring(0, index).trim();
	            String name = line.substring(index + 1).trim();
	            // Use them to create a new Topic and a new BaseName
	            Topic newTopic = map.createTopic(id);
	            newTopic.createName(name.toLowerCase(), name);
	            // Move on to the next line
	            line = reader.readLine();
	        }
	        ....
	    }
      
	    

For the complete code of this example, see CSVTopicMapSource.java in the examples.advanced package.
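The line-splitting step used in populateTopicMap(..) can also be tried on its own, without TM4J on the classpath. The following is a minimal sketch, not the code from CSVTopicMapSource.java: CsvTopicLines and parse are hypothetical names, the id and name pairs are collected in a Map instead of being turned into Topics, and the handling of blank lines (skipped) and of lines without a comma (rejected) is an assumption that the example above does not specify.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of TM4J.
class CsvTopicLines {

    // Reads "id, name" lines and returns them as id -> name, in file order.
    static Map<String, String> parse(Reader source) {
        Map<String, String> topics = new LinkedHashMap<>();
        BufferedReader reader = new BufferedReader(source);
        int lineNumber = 0;
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                lineNumber++;
                if (line.trim().isEmpty()) {
                    continue; // assumption: blank lines are ignored
                }
                int index = line.indexOf(',');
                if (index < 0) {
                    // assumption: a line without a comma is an error
                    throw new IllegalArgumentException(
                            "Line " + lineNumber + ": expected 'id, name'");
                }
                // Everything before the first comma is the id, the rest is the name.
                String id = line.substring(0, index).trim();
                String name = line.substring(index + 1).trim();
                topics.put(id, name);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return topics;
    }

    public static void main(String[] args) {
        // The two sample lines shown earlier in this section.
        String sample = "tm4j, TM4J\nkal_ahmed, Kal Ahmed\n";
        for (Map.Entry<String, String> e : parse(new StringReader(sample)).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

Because only the first comma is treated as the separator, a name that itself contains a comma is kept intact. Inside a real TopicMapSource, each entry would be passed to the topic and name creation calls shown above instead of being stored in a Map.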