An Entity of Type: software, from Named Graph: http://dbpedia.org, within Data Space: dbpedia.org

In machine learning, local case-control sampling is an algorithm used to reduce the complexity of training a logistic regression classifier. The algorithm reduces the training complexity by selecting a small subsample of the original dataset for training. It assumes the availability of a (unreliable) pilot estimation of the parameters. It then performs a single pass over the entire dataset using the pilot estimation to identify the most "surprising" samples. In practice, the pilot may come from prior knowledge or training using a subsample of the dataset. The algorithm is most effective when the underlying dataset is imbalanced. It exploits the structures of conditional imbalanced datasets more efficiently than alternative methods, such as case control sampling and weighted case control s

Property Value
dbo:abstract
  • In machine learning, local case-control sampling is an algorithm used to reduce the complexity of training a logistic regression classifier. The algorithm reduces the training complexity by selecting a small subsample of the original dataset for training. It assumes the availability of a (unreliable) pilot estimation of the parameters. It then performs a single pass over the entire dataset using the pilot estimation to identify the most "surprising" samples. In practice, the pilot may come from prior knowledge or training using a subsample of the dataset. The algorithm is most effective when the underlying dataset is imbalanced. It exploits the structures of conditional imbalanced datasets more efficiently than alternative methods, such as case control sampling and weighted case control sampling. (en)
dbo:wikiPageID
  • 46963137 (xsd:integer)
dbo:wikiPageLength
  • 5755 (xsd:nonNegativeInteger)
dbo:wikiPageRevisionID
  • 1105939445 (xsd:integer)
dbo:wikiPageWikiLink
dbp:wikiPageUsesTemplate
dcterms:subject
gold:hypernym
rdf:type
rdfs:comment
  • In machine learning, local case-control sampling is an algorithm used to reduce the complexity of training a logistic regression classifier. The algorithm reduces the training complexity by selecting a small subsample of the original dataset for training. It assumes the availability of a (unreliable) pilot estimation of the parameters. It then performs a single pass over the entire dataset using the pilot estimation to identify the most "surprising" samples. In practice, the pilot may come from prior knowledge or training using a subsample of the dataset. The algorithm is most effective when the underlying dataset is imbalanced. It exploits the structures of conditional imbalanced datasets more efficiently than alternative methods, such as case control sampling and weighted case control s (en)
rdfs:label
  • Local case-control sampling (en)
owl:sameAs
prov:wasDerivedFrom
foaf:isPrimaryTopicOf
is dbo:wikiPageRedirects of
is dbo:wikiPageWikiLink of
is foaf:primaryTopic of
Powered by OpenLink Virtuoso    This material is Open Knowledge     W3C Semantic Web Technology     This material is Open Knowledge    Valid XHTML + RDFa
This content was extracted from Wikipedia and is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License