Ontology and Vocabulary Design
From TechWiki
This document summarizes specific steps, with tips, that should be taken in local ontology design to support an OSF installation. You are also encouraged to review the Ontology Best Practices document.
Labels and Definitions
- Name all concepts (classes) as single nouns. Use CamelCase notation for these classes (that is, class names should start with a capital letter and not contain any spaces, such as MyNewConcept)
- Name all properties (attributes) as verb senses, so that triples can actually be read as statements; e.g., hasProperty. Try to use mixedCase notation for naming these predicates (that is, begin with a lower-case letter, capitalize each subsequent word, and don't use spaces)
- Try to use common and descriptive prefixes and suffixes for related properties or classes. While these are just labels and the names have no inherent semantic meaning, they are still a useful way for humans to cluster and understand your vocabularies. For example, all properties about languages or tools might carry a suffix such as 'Language' or 'Tool'
- Provide a preferred label annotation property for human-readable purposes and for use in user interfaces. For this purpose, use the skos:prefLabel property
- Include explicit consideration for the idea of a "semset", which means a series of alternate labels and terms to describe the concept. These alternatives include true synonyms, but may also be more expansive and include jargon, slang, acronyms or alternative terms that usage suggests refer to the same concept. The semset construct is similar to the "synsets" in WordNet, but with a broader understanding of use. A semset has three parts: the prefLabel, the single (per language) preferred human-readable label for the concept; the altLabels, an embracing list of alternative phrases and terms for the concept (including acronyms, synonyms, and matching jargon); and the hiddenLabels, a list of prominent or common misspellings of the concept or its alternatives. The semset construct is an integral part of Structured Dynamics' approach to using ontologies for information extraction and tagging of unstructured text
- Give all concepts and properties a definition. The matching and alignment of things is done on the basis of concepts (not simply labels), which means each concept must be defined. Clear definitions, along with the coherency of the ontology's structure, give an ontology its semantics. Remember not to confuse the label for a concept with its meaning (keeping the two separate also aids multi-linguality). In its own ontologies, Structured Dynamics uses the skos:definition property, though others such as rdfs:comment or dc:description are also commonly used
- Enable multi-lingual capabilities in all definitions and labels. This is a rather complicated best practice in its own right. For the time being, it means being attentive to the language tag, such as xml:lang="en" for English, on the values of all annotation properties
- (If you disagree with these naming conventions, use your own, but in any event, be consistent!!).
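The conventions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of OSF: the helper names (is_class_name, is_property_name, Semset), the ex: prefix, and the sample concept are all hypothetical, and the output is a Turtle fragment that assumes the ex:, owl: and skos: prefixes are declared elsewhere. In Turtle, the @en suffix plays the role that xml:lang="en" plays in RDF/XML.

```python
import re
from dataclasses import dataclass, field
from typing import List

# Class names: CamelCase, starting with a capital letter, no spaces.
CLASS_NAME = re.compile(r"[A-Z][A-Za-z0-9]*")
# Property names: mixedCase, starting with a lower-case letter, no spaces.
PROPERTY_NAME = re.compile(r"[a-z][A-Za-z0-9]*")


def is_class_name(name: str) -> bool:
    """True if the name follows the CamelCase convention, e.g. MyNewConcept."""
    return CLASS_NAME.fullmatch(name) is not None


def is_property_name(name: str) -> bool:
    """True if the name follows the mixedCase convention, e.g. hasProperty."""
    return PROPERTY_NAME.fullmatch(name) is not None


@dataclass
class Semset:
    """Labels and definition for one concept, in one language (hypothetical helper)."""
    name: str                 # local class name
    pref_label: str           # the single preferred label for this language
    definition: str
    alt_labels: List[str] = field(default_factory=list)     # synonyms, acronyms, jargon
    hidden_labels: List[str] = field(default_factory=list)  # common misspellings
    lang: str = "en"

    def _literal(self, text: str) -> str:
        # Escape backslashes and quotes, then attach the language tag.
        escaped = text.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"@{self.lang}'

    def to_turtle(self, prefix: str = "ex") -> str:
        """Render the concept as a Turtle fragment (prefix declarations not included)."""
        if not is_class_name(self.name):
            raise ValueError(f"class name is not CamelCase: {self.name!r}")
        lines = [f"{prefix}:{self.name} a owl:Class ;"]
        lines.append(f"    skos:prefLabel {self._literal(self.pref_label)} ;")
        lines += [f"    skos:altLabel {self._literal(a)} ;" for a in self.alt_labels]
        lines += [f"    skos:hiddenLabel {self._literal(h)} ;" for h in self.hidden_labels]
        lines.append(f"    skos:definition {self._literal(self.definition)} .")
        return "\n".join(lines)


# Example concept, invented for illustration only.
concept = Semset(
    name="ProgrammingLanguage",
    pref_label="programming language",
    definition="A formal language used to write computer programs.",
    alt_labels=["coding language"],
    hidden_labels=["programing language"],
)
print(concept.to_turtle())
```

A second Semset with the same name and a different lang value would carry the labels for another language, which keeps the one-prefLabel-per-language rule easy to check.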
Other Tips
- Each property used to describe (annotate) other properties or classes has to be declared within the ontology as an owl:AnnotationProperty.
- If only a single class or property is needed from an external ontology, re-define it in the current ontology (using its original URI) instead of importing the whole ontology.
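As a minimal sketch of the two tips above, the following Python helper emits the Turtle declarations they call for. The function names and the choice of foaf:Person as the single borrowed term are assumptions made for illustration; the output assumes the owl:, skos: and foaf: prefixes are declared elsewhere in the file, so that each prefixed name expands to the term's original URI.

```python
from typing import Iterable, Tuple

# Kinds of term this sketch knows how to declare.
ALLOWED_KINDS = {
    "owl:Class",
    "owl:ObjectProperty",
    "owl:DatatypeProperty",
    "owl:AnnotationProperty",
}


def declare(term: str, kind: str) -> str:
    """Return one Turtle statement declaring `term` as `kind`."""
    if kind not in ALLOWED_KINDS:
        raise ValueError(f"unsupported kind: {kind}")
    return f"{term} a {kind} ."


def local_declarations(
    annotation_properties: Iterable[str],
    external_terms: Iterable[Tuple[str, str]],
) -> str:
    """Declare annotation properties and re-declare borrowed external terms.

    Re-declaring a term locally under its original URI stands in for an
    owl:imports of the whole external ontology.
    """
    lines = [declare(p, "owl:AnnotationProperty") for p in annotation_properties]
    lines += [declare(term, kind) for term, kind in external_terms]
    return "\n".join(lines)


# Hypothetical usage: four SKOS annotation properties plus one borrowed class.
declarations = local_declarations(
    ["skos:prefLabel", "skos:altLabel", "skos:hiddenLabel", "skos:definition"],
    [("foaf:Person", "owl:Class")],
)
print(declarations)
```

The trade-off in this design is that the local ontology stays small, but any axioms the external ontology attaches to the borrowed term are not pulled in.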