Lab 3 - Configure a metadata extractor

In this lab, you will define a metadata extractor that extracts metadata from troubleshooting topics.

To complete the lab, you will need the troubleshooting content model and share configuration files:
  • tsModel.xml
  • share-config-custom.xml

Configuring these files is not part of this lab.

Extract all metadata defined in the troubleshooting content model (tsModel.xml).
  1. Open the context.xml file in your favorite editor.
  2. Create a new XPath Extractor bean.
  3. Add the following properties to the XPath Extractor bean as required:
    Property         Type               Mandatory  Description
    xmlSchemaFilter  XmlSchemaFilter    Required   Filter that determines the documents to which the bean applies.
    xpathQueries     Map<QName,String>  Required   Map defining the XPath expression to use for each predicate. Use a fully qualified name in Clark notation ({namespace-URI}local-name) as the key of each entry.
    <bean id="samples.extractor.troubleshooting.xpath" parent="rdf.extractor.xpath.abstract">
      <property name="xmlSchemaFilter" ref="samples.dtd.troubleshooting" />
      <property name="xpathQueries">
        <!-- Keys use Clark notation: the namespace URI goes between the braces. -->
        <map>
          <entry key="{}causes" value="/tsTroubleshooting/tsBody/tsCauses" />
          <entry key="{}environment" value="/tsTroubleshooting/tsBody/tsEnvironment" />
          <entry key="{}shortdesc" value="/tsTroubleshooting/abstract/shortdesc" />
          <entry key="{}symptoms" value="/tsTroubleshooting/tsBody/tsSymptoms" />
          <entry key="{}tasks" value="/tsTroubleshooting/task/title" />
        </map>
      </property>
    </bean>
  4. Save your changes.
  5. Test your metadata extractor with the com.componize.samples.lab03.MetadateExtractorTest unit test.
  6. Deploy your files to the application server.
  7. Restart the application server.
  8. Once the application server has started, connect to the Alfresco Share interface.
  9. Remove the Metadata & Link Management aspect from the troubleshooting-sample.dita file and then add it back again.
    Note: Metadata and links are extracted from a file each time it is updated or when the Metadata & Link Management aspect is added to it. Removing the aspect and then adding it back forces the metadata and links to be re-extracted even though the content has not changed.
The troubleshooting-specific metadata (Causes, Environment, etc.) should be extracted automatically.
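
To see what the xpathQueries map expresses before deploying anything, the following standalone Java sketch evaluates two of the XPath expressions from step 3 against a small inline troubleshooting document. It uses only the JDK's built-in XML APIs and is not the extractor bean itself. The class name, the namespace URI, and the sample document are hypothetical placeholders; use the namespace declared in tsModel.xml for a real configuration.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class XPathExtractionSketch {

    // Hypothetical namespace URI; replace it with the one from tsModel.xml.
    static final String NS = "http://example.com/model/troubleshooting/1.0";

    // Evaluates each XPath query against the XML text and returns a map of
    // property name (parsed from Clark notation) to the extracted text.
    static Map<QName, String> extract(String xml, Map<String, String> queries)
            throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();

        Map<QName, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> query : queries.entrySet()) {
            // QName.valueOf parses the "{namespace-URI}local-name" form.
            QName property = QName.valueOf(query.getKey());
            // The string value of an element is the text of all its descendants.
            String value = xpath.evaluate(query.getValue(), doc);
            result.put(property, value.trim());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample topic that follows the element paths used in the lab.
        String xml =
              "<tsTroubleshooting id=\"sample\">"
            + "<title>Login fails</title>"
            + "<abstract><shortdesc>Users cannot log in.</shortdesc></abstract>"
            + "<tsBody>"
            + "<tsSymptoms><p>The login page shows an error.</p></tsSymptoms>"
            + "<tsCauses><p>The license has expired.</p></tsCauses>"
            + "</tsBody>"
            + "</tsTroubleshooting>";

        Map<String, String> queries = new LinkedHashMap<>();
        queries.put("{" + NS + "}causes", "/tsTroubleshooting/tsBody/tsCauses");
        queries.put("{" + NS + "}shortdesc", "/tsTroubleshooting/abstract/shortdesc");

        for (Map.Entry<QName, String> e : extract(xml, queries).entrySet()) {
            System.out.println(e.getKey().getLocalPart() + " = " + e.getValue());
        }
    }
}
```

Running an expression this way is a quick check that it selects the element you expect. An expression that matches nothing yields an empty string in this sketch, which usually points to a wrong element path.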