Quality Assurance of Semantic Annotations for Services
Acquiring Trusted Semantic Annotations
Web services offer a convenient mechanism for packaging specialist services and resources for use across organisational and technical boundaries. They can facilitate rapid application development and the sharing of computational know-how on a global scale. In order to realise these benefits, however, potential service consumers must be able to locate service implementations that offer the kind of functionality required, and to understand how two or more services will function when connected together.
The basic signature information offered by WSDL descriptions is a useful starting point but is insufficient by itself to support the discovery, composition and diagnosis of Web services. Service and operation names, even when mnemonic, convey very little information about the detailed function of the code behind them, and parameter types are not sufficiently discriminating (especially given the tendency in some domains to type parameters as plain strings, even when their values have some internal structure).
Semantic annotations have been proposed as a means of providing richer information about the behaviour of Web services to potential users. Ontologies of terms used in a particular application domain, or by a particular community, can be associated with Web service components (e.g. as task descriptions for specific operations, or as richer typing information for specific input or output messages). Users familiar with that ontology can then use the annotations to search for suitable service implementations, or to determine whether the outputs of one service are suitable for use as inputs to another. For example, in the biological domain, a user might wish to convert a protein sequence into its equivalent gene sequence, and might therefore ask a service discovery engine for information on services which take protein sequences as input and return gene sequences as output.
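The protein-to-gene example can be made concrete with a small sketch. Everything in it is an illustrative assumption rather than part of any real registry or ontology: the three-concept ontology, the operation names and the `discover` function are hypothetical. The sketch shows the matching rule a discovery engine might apply: an operation qualifies if it accepts what the user has and returns what the user wants, taking subclass relationships in the ontology into account.

```python
from dataclasses import dataclass

# Hypothetical toy ontology: each concept maps to its parent (None for the root).
PARENT = {
    "Sequence": None,
    "ProteinSequence": "Sequence",
    "GeneSequence": "Sequence",
}


def is_a(concept, ancestor):
    """Return True if `concept` equals `ancestor` or is one of its descendants."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = PARENT.get(concept)
    return False


@dataclass(frozen=True)
class Operation:
    name: str
    input_concept: str   # semantic annotation of the input message
    output_concept: str  # semantic annotation of the output message


# Hypothetical annotated operations, as a registry might hold them.
REGISTRY = [
    Operation("backTranslate", "ProteinSequence", "GeneSequence"),
    Operation("translate", "GeneSequence", "ProteinSequence"),
    Operation("reverse", "Sequence", "Sequence"),
]


def discover(registry, have, want):
    """Names of operations that accept `have` and are guaranteed to return `want`."""
    return [
        op.name
        for op in registry
        if is_a(have, op.input_concept) and is_a(op.output_concept, want)
    ]


matches = discover(REGISTRY, "ProteinSequence", "GeneSequence")
print(matches)
```

Note that the generic `reverse` operation is not returned: it accepts a protein sequence, but its output annotation is too general to guarantee a gene sequence. This is one reason why the precision of an annotation matters as much as its correctness.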
Such semantic annotations are only of value if:
- They are trustworthy; that is, they offer an accurate characterisation of the semantics of the Web service with which they are associated.
- They can be captured in a cost-effective manner. Given the rate at which new Web services are arising, and the number of potential user communities each with their own preferred ontologies for annotations, individual human attention for each individual annotation is not a practically realisable goal.
- They can evolve to keep track of changes in the ontology and the Web services they describe, in a cost-effective manner.
The QuASAR project aims to provide a toolkit to assist in the cost-effective creation and evolution of reliable semantic annotations for Web services. In particular, we have developed tools to assist human annotators in verifying the annotations they develop before they are deployed into public repositories, and to gain maximum value from manually created annotations, by using them as the starting point from which to infer new annotations.
Verification of Semantic Annotations
Our work with annotations produced for Web services in the bioinformatics domain has shown that manual creation of annotations is a much more challenging task than might at first be thought. Typically, the annotator is neither the person who wrote the code for the Web service being annotated, nor one of the team who developed the ontology from which the annotations are being taken. Under these circumstances, finding the correct annotation for a service parameter, for example, or (as is necessary in many cases) the closest match, is non-trivial. We have constructed a workbench for annotators that assists in verifying annotations using techniques borrowed from software testing.
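One way to picture testing-based verification is sketched below. It is an illustration under stated assumptions, not a description of the workbench itself: the pool of example values per concept, the stand-in `back_translate` operation and the `check_input_annotation` function are all hypothetical. The idea is to treat an input annotation as a testable claim: values drawn from the annotated concept should be accepted by the operation, and values drawn from other, disjoint concepts should be rejected.

```python
# Hypothetical pools of example values for each ontology concept.
INSTANCE_POOL = {
    "ProteinSequence": ["MKTAYIAKQR", "MEEPQSDPSV"],
    "DNASequence": ["ATGGCC", "TTGACA"],
}


def back_translate(sequence):
    """Stand-in for a remote Web service operation (an assumption for this sketch)."""
    if set(sequence) <= set("ACGT"):
        raise ValueError("expected a protein sequence")
    return "NNN" * len(sequence)


def accepts(operation, value):
    """Return True if the operation processes the value without raising an error."""
    try:
        operation(value)
        return True
    except ValueError:
        return False


def check_input_annotation(operation, annotated_concept, pool):
    """Collect the test values whose outcome contradicts the annotation."""
    report = {"rejected_positive": [], "accepted_negative": []}
    for concept, values in pool.items():
        for value in values:
            ok = accepts(operation, value)
            if concept == annotated_concept and not ok:
                report["rejected_positive"].append(value)
            elif concept != annotated_concept and ok:
                report["accepted_negative"].append(value)
    return report


good = check_input_annotation(back_translate, "ProteinSequence", INSTANCE_POOL)
bad = check_input_annotation(back_translate, "DNASequence", INSTANCE_POOL)
print(good)
print(bad)
```

As with any testing technique, an empty report does not prove that the annotation is correct; it only shows that none of the sampled values contradicted it. Accepted negative values are also only evidence of a problem when the concepts involved are genuinely disjoint.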
Inference of New Semantic Annotations
The number of public Web services is growing much more rapidly than the number of annotations that are registered in public repositories. Annotators need to have both technical and domain skills in order to be able to bridge the gap between the Web service implementation and the ontology that is to be used to annotate it. Such people are rare, and mechanisms to fund their time are even harder to come by.
Because of this, it is especially important that the maximum value be extracted from any annotation work done by a human annotator (especially when he or she has devoted additional time to verifying the correctness of the annotation). One approach, which we have explored, is to use these annotations as the basis to infer new annotations, based on the connections made between Web services in trusted workflows. As well as maximising the information that can be obtained from each manually created annotation, the technique can also flag up errors in annotations, when annotations for valid parameter connections are found to conflict.
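The following minimal sketch illustrates the general idea of inferring annotations from workflow connections. The parameter names, the toy ontology and the `propagate` function are hypothetical, and the rule used is a simplifying assumption: if an output is connected to an input in a trusted workflow, then the concept annotating the output should be the same as, or a subclass of, the concept annotating the input. When only one end of a connection is annotated, its concept is recorded as a candidate for the other end; when both ends are annotated but incompatible, the connection is flagged.

```python
# Hypothetical toy ontology: each concept maps to its parent (None for the root).
PARENT = {
    "Sequence": None,
    "ProteinSequence": "Sequence",
    "GeneSequence": "Sequence",
}


def is_a(concept, ancestor):
    """Return True if `concept` equals `ancestor` or is one of its descendants."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = PARENT.get(concept)
    return False


# Manually created annotations: parameter name -> ontology concept.
annotations = {
    "translate.out": "ProteinSequence",
    "blastp.in": "ProteinSequence",
    "fetchGene.out": "GeneSequence",
}

# Connections taken from trusted workflows: (output parameter, input parameter).
links = [
    ("translate.out", "digest.in"),  # digest.in has no annotation yet
    ("fetchGene.out", "blastp.in"),  # both annotated, but incompatible
    ("translate.out", "blastp.in"),  # both annotated and consistent
]


def propagate(annotations, links):
    """Return candidate annotations for unannotated parameters, plus conflicts."""
    inferred, conflicts = {}, []
    for src, dst in links:
        src_c, dst_c = annotations.get(src), annotations.get(dst)
        if src_c and dst_c:
            if not is_a(src_c, dst_c):
                conflicts.append((src, dst))
        elif src_c:
            inferred.setdefault(dst, set()).add(src_c)
        elif dst_c:
            inferred.setdefault(src, set()).add(dst_c)
    return inferred, conflicts


inferred, conflicts = propagate(annotations, links)
print(inferred)
print(conflicts)
```

A candidate obtained in this way is a constraint rather than a final answer: an input connected to a protein sequence output must accept protein sequences, but it may in fact accept something more general. A flagged conflict means that at least one of the two annotations, or the assumption that the workflow is valid, is wrong.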
Applications of Trusted Semantic Annotations
Once created, semantic annotations for Web services can be used for much more than just service discovery and orchestration. For example, they can be used to detect potential mismatches and incorrect compositions within workflows. This can allow workflow builders to work first at a more abstract level, concentrating on connecting the major services that make up the business process or computation that they wish to specify. Once these are in place, we can use semantic annotations to automatically search for any mismatches between the connected parameters that indicate where lower-level transformational services (e.g. shims) need to be inserted to allow the workflow as a whole to execute.
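As a rough illustration of this use, the sketch below scans the connections of an abstract workflow for parameters whose annotations differ and looks up a transformation service that could bridge each gap. All names are hypothetical (the annotations, the workflow, the shim table and the `suggest_shims` function), and exact equality of concepts is used as the compatibility test purely to keep the example short; a real checker would more likely reason over the ontology.

```python
# Hypothetical annotations: parameter name -> ontology concept.
ANNOTATIONS = {
    "fetchRecord.out": "GenBankRecord",
    "blastn.in": "FastaSequence",
    "blastn.out": "BlastReport",
    "summarise.in": "BlastReport",
}

# Abstract workflow: connections between the major services only.
WORKFLOW = [
    ("fetchRecord.out", "blastn.in"),
    ("blastn.out", "summarise.in"),
]

# Hypothetical catalogue of shims, keyed by (source concept, target concept).
SHIMS = {("GenBankRecord", "FastaSequence"): "genbankToFasta"}


def find_mismatches(workflow, annotations):
    """Connections whose two ends carry different annotations."""
    return [(src, dst) for src, dst in workflow if annotations[src] != annotations[dst]]


def suggest_shims(workflow, annotations, shims):
    """For each mismatch, name a known shim, or None if none is catalogued."""
    plan = []
    for src, dst in find_mismatches(workflow, annotations):
        key = (annotations[src], annotations[dst])
        plan.append((src, shims.get(key), dst))
    return plan


plan = suggest_shims(WORKFLOW, ANNOTATIONS, SHIMS)
print(plan)
```

A `None` in the middle position of a plan entry would indicate a mismatch for which no shim is known, which is itself useful feedback to the workflow builder.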