Microdata is a method of marking up elements with additional machine-readable data, so that crawlers, search engines, or browsers can extract information from the page. Unlike WAI-ARIA, it’s actually part of HTML5.
With Microdata, page authors can add specific labels to HTML elements to annotate them so that machines or bots can read them. This is done by means of a customized vocabulary.
It’s similar to RDFa (a W3C standard) and microformats (a popular set of conventions), and is already indexed by the Google search engine if used in markup.
A basic example of microdata is given below.
<p itemscope> <span itemprop="inventor">Tim Berners-Lee</span> created the <span itemprop="invention">World Wide Web</span> </p>
Use the itemscope and itemprop attributes, along with descriptive property names, to label your content.
Attribute | Description |
---|---|
itemscope | It is used to identify the scope of the microdata item—an item being a set of name-value pairs. |
itemprop | Defines the property names, and their associated values. |
This example yields the following name-value pairs:
inventor: Tim Berners-Lee
invention: World Wide Web
A name is a property defined with the help of the itemprop attribute. In the next example, the first property name happens to be one called name. There are two additional property names in this scope: photo and url.
<aside itemscope> <h1 itemprop="name">John Peter</h1> <p><img src="http://www.yourdomain.com/bio-photo.jpg" alt="John Peter" itemprop="photo"></p> <p><a href="http://www.yourdomain.com" itemprop="url">Author’s website</a></p> </aside>
For most elements, the value is taken from its text content. For instance, the name property in our example would get its value from the text content between the opening and closing <h1> tags. Other elements are treated differently. The photo property takes its value from the src attribute of the image, so the value consists of a URL pointing to the author’s photo. The url property, although defined on an element that has text content (namely, the phrase “Author’s website”), doesn’t use this text content to determine its value; instead, it gets its value from the href attribute.
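The value rules described above can be sketched with Python's standard html.parser module. This is a minimal illustration, not a complete Microdata processor: the class name FlatMicrodataParser and the helper extract_pairs are hypothetical, only img and a are treated as URL-valued elements, nested items are ignored, and the markup is assumed to close every non-void element explicitly.

```python
from html.parser import HTMLParser

# Simplified subset of the value rules: these elements take their property
# value from a URL attribute instead of their text content (assumption: only
# the two elements discussed in the text are covered).
URL_ATTRIBUTE = {"img": "src", "a": "href"}

# Void elements never receive a closing tag, so they are not tracked on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "source", "track", "wbr"}


class FlatMicrodataParser(HTMLParser):
    """Collect itemprop name-value pairs from a single, non-nested item."""

    def __init__(self):
        super().__init__()
        self.pairs = {}
        # One entry per open element: None, or (property name, text parts).
        self._stack = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("itemprop")
        collector = None
        if name is not None:
            if tag in URL_ATTRIBUTE:
                # URL-valued element: read the attribute, ignore any text.
                self.pairs[name] = attrs.get(URL_ATTRIBUTE[tag]) or ""
            else:
                # Text-valued element: gather text until the closing tag.
                collector = (name, [])
        if tag not in VOID:
            self._stack.append(collector)

    def handle_data(self, data):
        for collector in self._stack:
            if collector is not None:
                collector[1].append(data)

    def handle_endtag(self, tag):
        if tag in VOID or not self._stack:
            return
        collector = self._stack.pop()
        if collector is not None:
            name, parts = collector
            self.pairs[name] = "".join(parts).strip()


def extract_pairs(html):
    """Return a dict of property names to values for the given markup."""
    parser = FlatMicrodataParser()
    parser.feed(html)
    parser.close()
    return parser.pairs
```

Passing the aside example above to extract_pairs would give the name from the h1 text, the photo from the src attribute, and the url from the href attribute, while the link text is discarded.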
Third-party scripts and page authors can access the name-value pairs. This is the real power of Microdata.
To achieve this, define each item by means of the itemtype attribute. An item in the context of Microdata is the element that has the itemscope attribute set. Every element and name-value pair inside that element is part of that item. The value of the itemtype attribute, therefore, defines the namespace for that item’s vocabulary.
<aside itemscope itemtype="http://www.data-vocabulary.org/Person"> <h1 itemprop="name">John Peter</h1> <p><img src="http://www.yourdomain.com/bio-photo.jpg" alt="John Peter" itemprop="photo"></p> <p><a href="http://www.yourdomain.com" itemprop="url">Author’s website</a></p> </aside>
We’re using the URL http://www.data-vocabulary.org/, a domain owned by Google. It houses a number of Microdata vocabularies, including Organization, Person, Review, Breadcrumb, and more.
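To make the idea of itemtype as a vocabulary namespace concrete, here is a small sketch of how a consumer might read the itemtype of each item and split it into a vocabulary base and a type name. The names ItemTypeCollector, collect_item_types, and split_item_type are hypothetical, and splitting on the last slash is an assumption that happens to fit the URLs used in this article rather than a rule from any specification.

```python
from html.parser import HTMLParser


class ItemTypeCollector(HTMLParser):
    """Record the itemtype URL of every element that carries itemscope."""

    def __init__(self):
        super().__init__()
        self.item_types = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # itemscope is a boolean attribute, so its parsed value is None;
        # test for the key rather than the value.
        if "itemscope" in attrs and attrs.get("itemtype"):
            self.item_types.append(attrs["itemtype"])


def collect_item_types(html):
    """Return the itemtype URLs found in the markup, in document order."""
    parser = ItemTypeCollector()
    parser.feed(html)
    parser.close()
    return parser.item_types


def split_item_type(url):
    """Split an itemtype URL into (vocabulary base, type name).

    Assumption: the type name is whatever follows the last slash.
    """
    base, _, type_name = url.rpartition("/")
    return base + "/", type_name
```

For the aside example above, collect_item_types would return the single URL http://www.data-vocabulary.org/Person, which split_item_type separates into the vocabulary base and the type name Person. A script could use that pair to decide which property names to expect.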
To add further meaning to the content, for example to indicate that the content identifies a person, use a shared vocabulary such as Schema.org. By doing this, popular search engines can extract this data.
This can be achieved by specifying an itemtype and applying the appropriate property names from the Schema.org vocabulary, in addition to using the itemscope and itemprop attributes.
<section itemscope itemtype="http://schema.org/Person"> <h1 itemprop="name">Tim Berners-Lee</h1> <img itemprop="image" src="http://www.w3.org/Press/Stock/Berners-Lee/2001-europaeum-eighth.jpg"> <p> <span itemprop="jobTitle">Director</span>, <span itemprop="affiliation" itemscope itemtype="http://schema.org/Organization"><span itemprop="name">W3C</span></span> </p> <p itemprop="address" itemscope itemtype="http://schema.org/PostalAddress"> <span itemprop="addressLocality">Cambridge</span>, <span itemprop="addressRegion">MA</span> </p> <a itemprop="url" href="http://www.w3.org/People/Berners-Lee/">Web site at W3C</a> </section>
The start of this microdata item is again indicated by the use of itemscope on the section element, but also added to this element is the itemtype attribute. Use itemtype with a URL to identify the item's data type. In this case, we’re using the Schema.org structure to identify a person.
Just as with the previous section, the itemprop attribute is applied with property names to give meaning to the content in the markup. By looking at the properties and pairing them with the content, we know that “Tim Berners-Lee” is a person’s name and that his job title is “Director.”
The use of itemprop for both the image and url properties works a bit differently: the corresponding values in these cases are the src and href attribute values, respectively. If you’ve worked with microformats in the past, this concept won’t be new to you.
A final special case in this example can be seen with the affiliation and address itemprop attributes: here, new items are nested inside the main item. In both cases, the itemprop identifies the property that is directly related to the person item but, within the same tag, also establishes the property as an item itself with the itemscope attribute. Going one step further, itemtype is also applied to indicate the URL that describes the item data type.
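Nested items can be pictured as nested dictionaries. The sketch below, again built on Python's standard html.parser, is one possible way to read them and not a conforming Microdata parser: the names NestedMicrodataParser and extract_items are hypothetical, only img and a are treated as URL-valued, and every non-void element is assumed to have an explicit closing tag.

```python
from html.parser import HTMLParser

URL_ATTRIBUTE = {"img": "src", "a": "href"}  # simplified subset (assumption)
VOID = {"area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "source", "track", "wbr"}


class NestedMicrodataParser(HTMLParser):
    """Build nested dicts for items, following itemscope and itemprop."""

    def __init__(self):
        super().__init__()
        self.items = []    # top-level items
        self._stack = []   # one frame per open non-void element
        self._scopes = []  # currently open items, innermost last

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("itemprop")
        frame = {"item": None, "name": None, "text": None, "owner": None}
        if "itemscope" in attrs:
            item = {"type": attrs.get("itemtype"), "properties": {}}
            if name is not None and self._scopes:
                # The new item is the value of a property on the enclosing item.
                self._scopes[-1]["properties"][name] = item
            else:
                self.items.append(item)
            frame["item"] = item
        elif name is not None and self._scopes:
            owner = self._scopes[-1]
            if tag in URL_ATTRIBUTE:
                owner["properties"][name] = attrs.get(URL_ATTRIBUTE[tag]) or ""
            else:
                frame.update(name=name, text=[], owner=owner)
        if tag not in VOID:
            self._stack.append(frame)
            if frame["item"] is not None:
                self._scopes.append(frame["item"])

    def handle_data(self, data):
        for frame in self._stack:
            if frame["text"] is not None:
                frame["text"].append(data)

    def handle_endtag(self, tag):
        if tag in VOID or not self._stack:
            return
        frame = self._stack.pop()
        if frame["item"] is not None:
            self._scopes.pop()  # leaving the item's scope
        if frame["text"] is not None:
            value = "".join(frame["text"]).strip()
            frame["owner"]["properties"][frame["name"]] = value


def extract_items(html):
    """Return the list of top-level items found in the markup."""
    parser = NestedMicrodataParser()
    parser.feed(html)
    parser.close()
    return parser.items
```

Given a person item that contains an address element carrying itemprop, itemscope, and itemtype together, extract_items would return one Person dictionary whose address property is itself a PostalAddress dictionary with its own addressLocality and addressRegion values.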
While this might seem a bit complicated at first, realize that it’s not much different from combining multiple microformats (like hCard and hCalendar on a resume) or creating an XML object to represent nested data. Whether you’ve worked on projects like these before or not, there is an easy way to check that you’re making progress in applying the Schema.org vocabularies. You can use the Google Rich Snippets Testing Tool (available at http://www.google.com/webmasters/tools/richsnippets) to validate that your structured data markup can be parsed.