About: Inside–outside–beginning (tagging)

An Entity of Type : owl:Thing, within Data Space : dbpedia.org associated with source document(s)
http://dbpedia.org/describe/?url=http%3A%2F%2Fdbpedia.org%2Fresource%2FInside%E2%80%93outside%E2%80%93beginning_%28tagging%29

The IOB format (short for inside, outside, beginning), also commonly referred to as the BIO format, is a common format for tagging tokens in a chunking task in computational linguistics (e.g. named-entity recognition). It was presented by Ramshaw and Marcus in their 1995 paper "Text Chunking using Transformation-Based Learning". The I- prefix before a tag indicates that the token is inside a chunk. An O tag indicates that a token belongs to no chunk. The B- prefix before a tag indicates that the token begins a chunk that immediately follows another chunk with no O tags between them. It is used only in that case: when a chunk comes after an O tag, the first token of the chunk takes the I- prefix.
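The prefix rules above can be made concrete with a short decoder. The following is only a minimal Python sketch, not code from Ramshaw and Marcus: the function name `iob1_chunks` and the `(label, start, end)` span convention are assumptions chosen for illustration. It treats a chunk as closed by an O tag, by an explicit B- prefix, or by a change of label.

```python
def iob1_chunks(tags):
    """Return (label, start, end) spans, end exclusive, for a list of IOB tags.

    Hypothetical helper for illustration; assumes tags look like
    "O", "I-XXX" or "B-XXX".
    """
    chunks = []
    start, label = None, None
    for i, tag in enumerate(tags):
        prefix, _, kind = tag.partition("-")
        # An open chunk ends at an O tag, at an explicit B- prefix,
        # or when the label changes (e.g. I-PER followed by I-LOC).
        if start is not None and (tag == "O" or prefix == "B" or kind != label):
            chunks.append((label, start, i))
            start, label = None, None
        # Any non-O tag opens a chunk when none is currently open.
        if tag != "O" and start is None:
            start, label = i, kind
    if start is not None:
        chunks.append((label, start, len(tags)))
    return chunks


# B-LOC is needed here only because two LOC chunks are adjacent.
adjacent = ["I-PER", "O", "I-LOC", "I-LOC", "B-LOC"]
print(iob1_chunks(adjacent))
```

Without the B- prefix on the last tag, the three location tokens would be read as a single chunk, which is exactly the ambiguity the prefix exists to resolve.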

Attributes    Values
rdfs:label
  • Inside–outside–beginning (tagging) (en)
rdfs:comment
  • The IOB format (short for inside, outside, beginning), also commonly referred to as the BIO format, is a common format for tagging tokens in a chunking task in computational linguistics (e.g. named-entity recognition). It was presented by Ramshaw and Marcus in their 1995 paper "Text Chunking using Transformation-Based Learning". The I- prefix before a tag indicates that the token is inside a chunk. An O tag indicates that a token belongs to no chunk. The B- prefix before a tag indicates that the token begins a chunk that immediately follows another chunk with no O tags between them. It is used only in that case: when a chunk comes after an O tag, the first token of the chunk takes the I- prefix. (en)
has abstract
  • The IOB format (short for inside, outside, beginning), also commonly referred to as the BIO format, is a common format for tagging tokens in a chunking task in computational linguistics (e.g. named-entity recognition). It was presented by Ramshaw and Marcus in their 1995 paper "Text Chunking using Transformation-Based Learning". The I- prefix before a tag indicates that the token is inside a chunk. An O tag indicates that a token belongs to no chunk. The B- prefix before a tag indicates that the token begins a chunk that immediately follows another chunk with no O tags between them. It is used only in that case: when a chunk comes after an O tag, the first token of the chunk takes the I- prefix.

    Another similar, widely used format is IOB2, which is the same as the IOB format except that the B- prefix marks the beginning of every chunk (i.e. all chunks start with a B- tag). A readable introduction to entity tagging is given in Bob Carpenter's blog post, "Coding Chunkers as Taggers".

    An example in IOB format:

        Alex         I-PER
        is           O
        going        O
        to           O
        Los          I-LOC
        Angeles      I-LOC
        in           O
        California   I-LOC

    Notice how "Alex", "Los" and "California", although they are the first tokens of their chunks, have the "I-" prefix.

    The same example after filtering out stop words:

        Alex         I-PER
        going        O
        Los          I-LOC
        Angeles      I-LOC
        California   B-LOC

    Notice how "California" now has the "B-" prefix, because it immediately follows another LOC chunk.

    The same example in IOB2 format (where the tagging is unaffected by stop word filtering):

        Alex         B-PER
        is           O
        going        O
        to           O
        Los          B-LOC
        Angeles      I-LOC
        in           O
        California   B-LOC

    Related tagging schemes sometimes include "START/END: This consists of the tags B, E, I, S or O where S is used to represent a chunk containing a single token. Chunks of length greater than or equal to two always start with the B tag and end with the E tag."

    Other tagging schemes include BIOES/BILOU, where "E" (ending) or "L" (last) marks the final token of a multi-token chunk, and "S" (single) or "U" (unit) marks a chunk consisting of a single token. An example in BIOES format:

        Alex         S-PER
        is           O
        going        O
        with         O
        Marty        B-PER
        A.           I-PER
        Rick         E-PER
        to           O
        Los          B-LOC
        Angeles      E-LOC

    (en)
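The relationship between these schemes can be illustrated with two small converters. This is a hedged Python sketch rather than a reference implementation: the function names `iob1_to_iob2` and `iob2_to_bioes` are hypothetical, and the code assumes well-formed input in which every tag is "O" or has the form "B-XXX" / "I-XXX".

```python
def iob1_to_iob2(tags):
    """Rewrite IOB tags so that every chunk starts with B- (IOB2)."""
    out = []
    prev = "O"
    for tag in tags:
        if tag == "O":
            out.append(tag)
        else:
            prefix, kind = tag.split("-", 1)
            # A token starts a chunk if it is already marked B-, if it
            # follows an O tag, or if the previous token had another label.
            starts = (
                prefix == "B"
                or prev == "O"
                or prev.split("-", 1)[1] != kind
            )
            out.append(("B-" if starts else "I-") + kind)
        prev = tag
    return out


def iob2_to_bioes(tags):
    """Rewrite IOB2 tags as BIOES by looking one tag ahead."""
    out = []
    for i, tag in enumerate(tags):
        if tag == "O":
            out.append(tag)
            continue
        prefix, kind = tag.split("-", 1)
        nxt = tags[i + 1] if i + 1 < len(tags) else "O"
        # The chunk continues only if the next tag is I- with the same label.
        continues = nxt == "I-" + kind
        if prefix == "B":
            out.append(("B-" if continues else "S-") + kind)
        else:
            out.append(("I-" if continues else "E-") + kind)
    return out


# Tags for "Alex is going to Los Angeles in California" in IOB format.
iob = ["I-PER", "O", "O", "O", "I-LOC", "I-LOC", "O", "I-LOC"]
print(iob1_to_iob2(iob))
print(iob2_to_bioes(iob1_to_iob2(iob)))
```

Converting to IOB2 first keeps the BIOES step simple, since every chunk boundary is then explicit in the tags and a single token of lookahead is enough.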