Help:Semantic search

Semantic MediaWiki includes an easy-to-use query language which enables users to access the wiki's knowledge. The syntax of this query language is very similar to the syntax of annotations in Semantic MediaWiki. This query language can be used on the special page Special:Ask or in inline queries.

Answering queries requires additional server resources. The administration will monitor the traffic load closely and take it into account when deciding whether to switch off or restrict most of the features described below, or to upgrade to a faster server and/or a higher-bandwidth service.

Introduction
Among other things, all queries must state some conditions that describe what is asked for. For example, the query:

[[Category:Bible character]]

is a query for all pages within the category "Bible character", i.e. for all characters in the Bible. If you enter this in Special:Ask and click "Find results", SMW executes the query and displays the results as a simple list of all requested pages. If there are many results, you can browse them via the navigation links at the top and bottom of the query results, as in a query for all articles on the Bible.

Much more complex queries are possible, but understanding these requires a basic knowledge of queries. A query is a request to find a number of pages that satisfy certain requirements. The query must answer two questions:


 * 1) Which pages are requested?
 * 2) What information should be displayed about those pages?

The first point is obvious. The second point matters whenever more than a plain list of page names is wanted. For example, one might be interested in all the Bible characters and their dates of flourishing (the "floruit" or "fl." notation in most genealogical records). This requires two steps: first find all the characters; second print out their names and dates of prominence, reign, etc. Both points are explained separately in the sections below.

Page selection
In the example above, we gave the single condition [[Category:Bible character]] to describe which pages we were interested in. The condition here is exactly what one would otherwise write to assert that some page is in the category "Bible character". The enclosing ask-tags invert its meaning to return all such pages (actually some more; but read on). This is a general scheme: the syntax for asking for pages that satisfy some condition is exactly the syntax for explicitly asserting that this condition holds.

The following queries show what this means:
 * 1) [[Category:Bible character]] gives all pages directly or indirectly (through a sub-, subsub-, etc. category) in the category.
 * 2) [[born in::Israel]] gives all pages annotated as being about someone born in the Land of Israel.
 * 3) [[height::180cm]] gives all pages annotated as being about someone having a height of 180cm.

By using other categories or properties than above, we can already ask for pages which have certain annotations. Next let us combine those requirements:

[[Category:Bible character]] [[born in::Israel]] [[height::180cm]]

asks for everybody who is a Bible character and was born in Israel and was 180cm tall. In other words: when many conditions are written into one query, the result is narrowed down to those pages that meet all the requirements. Thus we have a logical AND. By the way: queries can also include line-breaks in order to make them more readable. So we could as well write:

[[Category:Bible character]]
[[born in::Israel]]
[[height::180cm]]

to get the same result as above. Note that queries return only the articles that are positively known to satisfy the required properties: if there is no property for the height of a Bible character, that character will not be selected.
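The narrowing effect of several conditions can be modelled outside the wiki. The following Python sketch is only an illustration of the behaviour described above, not the way Semantic MediaWiki is implemented; the page names, the sample annotations, and the function select are all made up for this example.

```python
# Toy data: each page maps a property name to the set of values annotated on it.
# Page names and values are invented for illustration only.
pages = {
    "Person A": {"Category": {"Bible character"}, "born in": {"Israel"}, "height": {"180cm"}},
    "Person B": {"Category": {"Bible character"}, "born in": {"Egypt"}, "height": {"180cm"}},
    "Person C": {"Category": {"Bible character"}, "born in": {"Israel"}},  # no height annotated
}


def select(pages, conditions):
    """Return the names of pages that satisfy every (property, value) condition."""
    return sorted(
        name
        for name, annotations in pages.items()
        if all(value in annotations.get(prop, set()) for prop, value in conditions)
    )


# Three conditions in one query act as a logical AND.
query = [("Category", "Bible character"), ("born in", "Israel"), ("height", "180cm")]
result = select(pages, query)
print(result)
```

Only "Person A" is selected: "Person B" fails the birthplace condition, and "Person C" is dropped because no height is known for that page.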

Wildcards and disjunctions
In the examples above, we gave very concrete conditions, using "Bible character", "Israel", and "180cm" as fillers. It is possible to weaken these conditions in several ways.

Wildcards are written as "+" and allow any filler for a given condition. For example, [[born in::+]] returns all pages that have annotations for the relation "born in", and [[height::+]] returns all pages that have been assigned some height. For categories, this feature makes little sense: [[Category:+]] just returns everything that has some category.

Disjunctions are written as "||" and allow queries to require (at least) one out of several possible fillers. For example, a category condition that lists two categories separated by "||" retrieves every article on the subject of creationism or intelligent design. This also includes everything that is both, i.e. we really have a logical OR here. The same notation can specify a list of pages as a relation target, e.g. [[born in::Israel||Egypt]], or a list of property values.
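How "+" and "||" weaken a condition can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Semantic MediaWiki's own code: the functions matches and select and all page data are hypothetical.

```python
# Invented sample data: page name -> property -> set of annotated values.
pages = {
    "Person A": {"born in": {"Israel"}},
    "Person B": {"born in": {"Egypt"}},
    "Person C": {"born in": {"Babylon"}},
    "Person D": {},  # no "born in" annotation at all
}


def matches(values, filler):
    """Check annotated values against a filler.

    "+" is a wildcard (any value is accepted); "||" separates alternatives.
    """
    if filler == "+":
        return len(values) > 0
    alternatives = [alt.strip() for alt in filler.split("||")]
    return any(alt in values for alt in alternatives)


def select(prop, filler):
    """Return the sorted names of pages whose values for prop match the filler."""
    return sorted(
        name for name, annotations in pages.items()
        if matches(annotations.get(prop, set()), filler)
    )


anyone_with_birthplace = select("born in", "+")
israel_or_egypt = select("born in", "Israel||Egypt")
print(anyone_with_birthplace)
print(israel_or_egypt)
```

The wildcard still requires that some value exists, so "Person D" is never returned by either query.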

Subqueries
To ask for pages having a particular relation to any page in a more complex set, you can write the latter in the form of a query. In this case, instead of a concrete page name (or list of page names), one enters a new query enclosed in <q> and </q>. For instance, one can ask for all Bible characters that were born in an Israelite city by writing

[[Category:Bible character]] [[born in::<q>[[located in::Israel]]</q>]]

Arbitrary levels of nesting are possible, though nesting might be restricted for a particular site to improve performance.
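One way to picture a subquery is as a two-step evaluation: the inner query produces a set of pages, and the outer condition accepts any member of that set as its filler. The Python sketch below illustrates this idea with invented data; it makes no claim about how Semantic MediaWiki evaluates nested queries internally.

```python
# Invented annotations: which place each city is located in,
# and which city each person was born in.
located_in = {"City A": "Israel", "City B": "Israel", "City C": "Egypt"}
born_in = {"Person A": "City A", "Person B": "City C", "Person C": "City B"}

# Inner query: all pages that are located in Israel.
inner = {page for page, place in located_in.items() if place == "Israel"}

# Outer query: all pages born in any page returned by the inner query.
outer = sorted(person for person, city in born_in.items() if city in inner)
print(outer)
```

Each further level of nesting adds another such step, which is why deep nesting can be costly for a site.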

For another example, assume that we are interested in all cities of the Assyrian Empire (as far as specified within this wiki). This is done by the following query:

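[[located in::<q>[[member of::Assyrian Empire]]</q>]]

The same query with print statements for location and population added (print statements are described under "Data to be displayed"):

[[located in::<q>[[member of::Assyrian Empire]]</q>]] [[located in::*]] [[Population::*]]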

Asking for categories
Conditions with categories are generally simple, but they are more powerful than they might at first appear:

When searching for pages within a category, the result also involves all pages that are contained in subcategories of this category.

For example, in CreationWiki the category "Bible character" is a subcategory of "Bible". If "Hebrew king" is in turn a subcategory of "Bible character", then the query [[Category:Bible character]] will also return those "special" characters that are in the category "Hebrew king" only. This makes sense in many situations, but you can still view the pages that were put directly into the category "Bible character" by going to the page of that category (by following the link Category:Bible character).
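The inclusion of subcategories amounts to walking the category tree downwards and collecting the members of every category reached. The following Python sketch shows that walk on a small hypothetical hierarchy; the placement of "Hebrew king" under "Bible character", the member pages, and the function pages_in are assumptions made for the example.

```python
# Hypothetical category hierarchy: parent category -> direct subcategories.
subcategories = {
    "Bible": ["Bible character"],
    "Bible character": ["Hebrew king"],
}

# Hypothetical direct members of each category.
members = {
    "Bible": ["Page C"],
    "Bible character": ["Person A"],
    "Hebrew king": ["Person B"],
}


def pages_in(category):
    """Collect pages in the category and in all of its sub-, subsub-, etc. categories."""
    seen = set()
    result = set()
    stack = [category]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against cycles in the category graph
        seen.add(current)
        result.update(members.get(current, []))
        stack.extend(subcategories.get(current, []))
    return sorted(result)


print(pages_in("Bible character"))
print(pages_in("Hebrew king"))
```

A page that is only in "Hebrew king" is still found when asking for "Bible character", while asking for "Hebrew king" does not return pages placed directly in the parent category.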

Conditions with properties
With property values, we are usually not looking for exact results, but rather for entities that are included within a certain range. For example

[[Category:Bible character]] [[height::>6 ft]] [[height::<7 ft]]

asks for all Bible characters that are at least 6 feet and at most 7 feet tall. Here we take advantage of the automatic unit conversion: even if the height of the character involved was set with [[height::195cm]] (or the equivalent number of cubits), it would be recognized as a correct answer (provided that a suitable datatype was chosen for height, see Help:Semantic custom units).

Such range conditions on properties are mostly relevant for properties with values that can be ordered in a natural way. For example, it makes sense to ask [[start date::>May 6 2006]] but it is not really helpful to say [[homepage URL::>http://www.somewhere.org]].

If a datatype has no natural linear ordering, Semantic MediaWiki will just apply the alphabetical order to the normalized datavalues as they are used in the RDF export.
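The height example relies on values being compared in a common unit. As a rough sketch of that idea, the Python code below converts both the annotated value and the bounds to centimetres before comparing. The conversion table, the parsing rules, and the function names are assumptions for this illustration and do not reflect how custom units are configured in the wiki.

```python
# Conversion factors to centimetres (1 ft = 30.48 cm by definition).
CM_PER_UNIT = {"cm": 1.0, "ft": 30.48}


def to_cm(text):
    """Parse a value such as '195cm' or '6 ft' and return it in centimetres."""
    text = text.strip()
    for unit, factor in CM_PER_UNIT.items():
        if text.endswith(unit):
            return float(text[: -len(unit)]) * factor
    raise ValueError(f"unknown unit in {text!r}")


def in_range(value, low, high):
    """True if low <= value <= high after converting all three to centimetres."""
    return to_cm(low) <= to_cm(value) <= to_cm(high)


print(in_range("195cm", "6 ft", "7 ft"))
print(in_range("180cm", "6 ft", "7 ft"))
```

A height stored as 195cm falls inside the range of 6 ft to 7 ft (about 182.9 cm to 213.4 cm), while 180cm does not.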

Direct conditions on pages
So far, all conditions depended on some annotation given within a page. But there are also conditions that directly select some pages, or pages from a given namespace.

By directly giving some pagename (possibly including a namespace prefix), or a list of such pagenames separated by ||, the existing pages with those names are selected, e.g.

|Samaria||User:Ashcraft

Note that the result does not display any namespace prefixes; see the hover box or status bar of the browser, or follow the links to determine the namespace. By adding a condition on a property value to such a list of pages, one could ask, e.g., "Which of Goliath, David, Solomon, and Rehoboam was taller than 7 ft?". But direct selection of articles is most useful if further properties of those articles are asked for, as described below.

To select a category, you must put a ":" before the category name; this avoids confusing [[Category:Bible character]] (return all characters) and [[:Category:Bible character]] (return the category of "Bible character").

Restricting to a namespace
A less strict way of selecting given pages is via namespaces. The default is to only return pages in the main namespace.

To return pages in a particular namespace, add the namespace with a wildcard, e.g. write [[Help:+]]

to return every page in the "Help" namespace. Since the main namespace usually has no prefix, write [[:+]].

For example, to return pages in either the main or "User" namespace, write [[:+||User:+]].

To return pages in the "Category" namespace, again you need a ":" in front of the namespace label to prevent confusion.

Data to be displayed
Simple queries using conditions as above will merely return a list of pages. To display properties of these pages, one adds statements such as [[height::*]] to show the height (if any) of the selected pages. Using "*" as a filler indicates that this code does not specify a condition for the selection of pages, but specifies what should be displayed about the selected pages. Thus, we can also write


 * [[born in::*]] to show all pages to which the result page has a "born in" relation,
 * [[Category:*]] to show all categories that the result page has directly been stated to be in.

Even if there are no "born in" relations for a page, the page is still in the selection, and an empty field will be printed. Likewise, if some article has been assigned many different values for one property, all of them will be displayed.
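The difference between a condition and a print statement can be modelled as follows: printing never removes a page from the selection, it only fills a column. This Python sketch uses invented pages and a hypothetical printout function, and is meant as an illustration rather than a description of the actual result formatting.

```python
# Invented sample data: page name -> property -> list of annotated values.
pages = {
    "Person A": {"born in": ["Israel"]},
    "Person B": {},                              # no "born in" annotation
    "Person C": {"born in": ["Egypt", "Israel"]},  # two values for one property
}


def printout(selected, prop):
    """Build one (page, field) row per selected page; a missing property gives an empty field."""
    return [(name, ", ".join(pages[name].get(prop, []))) for name in selected]


rows = printout(["Person A", "Person B", "Person C"], "born in")
for name, field in rows:
    print(f"{name}: {field}")
```

"Person B" stays in the output with an empty field, and both values of "Person C" appear in the same field.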

For properties that support units, queries can also determine which unit should be used for the output. For example

[[height::*cm]]

returns the values of the property height converted into cm.

''Currently, every property is displayed once, even if several different units are asked for. The developers are aware of the inconvenience that this causes and have pledged to address it in future releases and release candidates of Semantic MediaWiki.''

For properties of Type:Boolean, the editor can customize the display of the true and false units, or even suppress the display of true or false units. To do this, specify the desired symbols or strings immediately following the property-resolution and star operators, like this:

[[satellites::*√,none]]

to show whether astronomical bodies have satellites or not.

Sorting results
Special:Ask has a special field for ordering results according to some property. This requires all selected pages to have a value for this property, and thus the query must impose this additional restriction. For example, in order to sort the cities in this wiki by population, the following is needed:


 * the query should include the condition [[population::+]]
 * in an ask statement, specify the column to be sorted by inside the ask-tag: <ask sort="population">
 * on the Special:Ask page, use the input field "sort by column" instead, for example with the query [[population::+]]
 * ascending or descending order can be chosen by specifying order="ascending" in the same way, or by clicking the little sorter icons in the header of a results table.

''At the moment, the condition [[population::+]] is crucial for this to work. This might change in future versions.''
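Why sorting needs every selected page to carry the sort property can be seen in a small model: a page without a value has no position in the order, so the wildcard condition removes it first. The Python sketch below uses invented cities and a hypothetical sorted_by function, and only mirrors the behaviour described above.

```python
# Invented sample data: page name -> annotated properties.
cities = {
    "City A": {"population": 50000},
    "City B": {"population": 12000},
    "City C": {},  # no population annotated
}


def sorted_by(prop, descending=False):
    """Keep only pages that have a value for prop (the '+' condition), then sort by it."""
    with_value = {name: ann[prop] for name, ann in cities.items() if prop in ann}
    return sorted(with_value, key=with_value.get, reverse=descending)


ascending = sorted_by("population")
descending = sorted_by("population", descending=True)
print(ascending)
print(descending)
```

"City C" appears in neither list, which corresponds to the extra restriction that the sort property imposes on the query.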

Using templates and variables
Within a query, arbitrary templates and variables can be used. This can be used to create a standard query that displays all future events (where "future" gets its meaning from the current date):

[[end date::>{{CURRENTYEAR}}-{{CURRENTMONTHNAME}}-{{CURRENTDAY}}]]

Many other uses are possible, especially when using queries inline. However, editors can never use template parameters (the things in {{{...}}}) within a query.

Another very useful variable for inline queries is {{PAGENAME}}, which allows you to customize a query that is used unchanged on many pages. Read about inline queries for more information.

Linking to Semantic Search Results
The easiest way to do this is to create a page with the semantic search in an inline query (see next help section). If you want to link to the results of a query in Special:Ask, you need to handle the ?, [, and ] characters in its URL. To hide the ? introducing the query parameters, use a template like Wikipedia's Querylink. To escape the brackets, use &#91; and &#93; to represent them.
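When many such links have to be prepared, the bracket replacement can be done mechanically. The short Python sketch below performs only the substitution described above; the function name escape_brackets is made up for this example, and building the rest of the Special:Ask URL is left out.

```python
def escape_brackets(query):
    """Replace square brackets with the numeric character references &#91; and &#93;."""
    return query.replace("[", "&#91;").replace("]", "&#93;")


escaped = escape_brackets("[[Category:Bible character]]")
print(escaped)
```

For the query shown, the result is &#91;&#91;Category:Bible character&#93;&#93;, which can then be placed in the link without being parsed as wiki markup.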