An Entity of Type: Thing, from Named Graph: http://dbpedia.org, within Data Space: dbpedia.org

The IOB format (short for inside, outside, beginning), also commonly referred to as the BIO format, is a common tagging format for tagging tokens in a chunking task in computational linguistics (ex. named-entity recognition). It was presented by Ramshaw and Marcus in their paper "Text Chunking using Transformation-Based Learning", 1995 The I- prefix before a tag indicates that the tag is inside a chunk. An O tag indicates that a token belongs to no chunk. The B- prefix before a tag indicates that the tag is the beginning of a chunk that immediately follows another chunk without O tags between them. It is used only in that case: when a chunk comes after an O tag, the first token of the chunk takes the I- prefix.

Property Value
dbo:abstract
  • The IOB format (short for inside, outside, beginning), also commonly referred to as the BIO format, is a common tagging format for tagging tokens in a chunking task in computational linguistics (ex. named-entity recognition). It was presented by Ramshaw and Marcus in their paper "Text Chunking using Transformation-Based Learning", 1995 The I- prefix before a tag indicates that the tag is inside a chunk. An O tag indicates that a token belongs to no chunk. The B- prefix before a tag indicates that the tag is the beginning of a chunk that immediately follows another chunk without O tags between them. It is used only in that case: when a chunk comes after an O tag, the first token of the chunk takes the I- prefix. Another similar format which is widely used is IOB2 format, which is the same as the IOB format except that the B- tag is used in the beginning of every chunk (i.e. all chunks start with the B- tag). A readable introduction to entity tagging is given in Bob Carpenter's blog post, "Coding Chunkers as Taggers". An example with IOB format: Alex I-PERis Ogoing Oto OLos I-LOCAngeles I-LOCin OCalifornia I-LOC Notice how "Alex", "Los" and "California", although first tokens of their chunk, have the "I-" prefix. The same example after filtering out stop words: Alex I-PERgoing OLos I-LOCAngeles I-LOCCalifornia B-LOC Notice how "California" now has the "B-" prefix, because it immediately follows another LOC chunk. The same example with IOB2 format (with tagging unaffected by stop word filtering): Alex B-PERis Ogoing Oto OLos B-LOCAngeles I-LOCin OCalifornia B-LOC Related tagging schemes sometimes include "START/END: This consists of the tags B, E, I, S or O where S is used to represent a chunk containing a single token. Chunks of length greater than or equal to two always start with the B tag and end with the E tag." Other Tagging Scheme's include BIOES/BILOU, where 'E' and 'L' denotes Last or Ending character is such a sequence and 'S' denotes Single element or 'U' Unit element. An Example with BIOES format: Alex S-PERis Ogoing Owith OMarty B-PERA. I-PERRick E-PERto OLos B-LOCAngeles E-LOC (en)
dbo:wikiPageID
  • 40324167 (xsd:integer)
dbo:wikiPageLength
  • 5829 (xsd:nonNegativeInteger)
dbo:wikiPageRevisionID
  • 1102105633 (xsd:integer)
dbo:wikiPageWikiLink
dbp:wikiPageUsesTemplate
dcterms:subject
rdfs:comment
  • The IOB format (short for inside, outside, beginning), also commonly referred to as the BIO format, is a common tagging format for tagging tokens in a chunking task in computational linguistics (ex. named-entity recognition). It was presented by Ramshaw and Marcus in their paper "Text Chunking using Transformation-Based Learning", 1995 The I- prefix before a tag indicates that the tag is inside a chunk. An O tag indicates that a token belongs to no chunk. The B- prefix before a tag indicates that the tag is the beginning of a chunk that immediately follows another chunk without O tags between them. It is used only in that case: when a chunk comes after an O tag, the first token of the chunk takes the I- prefix. (en)
rdfs:label
  • Inside–outside–beginning (tagging) (en)
owl:sameAs
prov:wasDerivedFrom
foaf:isPrimaryTopicOf
is dbo:wikiPageRedirects of
is dbo:wikiPageWikiLink of
is foaf:primaryTopic of
Powered by OpenLink Virtuoso    This material is Open Knowledge     W3C Semantic Web Technology     This material is Open Knowledge    Valid XHTML + RDFa
This content was extracted from Wikipedia and is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License