About: Random indexing


An entity of type dbo:Software, within data space dbpedia.org, associated with source document(s)
http://dbpedia.org/describe/?url=http%3A%2F%2Fdbpedia.org%2Fresource%2FRandom_indexing

Random indexing is a dimensionality reduction method and computational framework for distributional semantics, based on the insight that very-high-dimensional vector space model implementations are impractical, that models need not grow in dimensionality when new items (e.g. new terminology) are encountered, and that a high-dimensional model can be projected into a space of lower dimensionality without compromising L2 distance metrics if the resulting dimensions are chosen appropriately.
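
The description above can be made concrete with a minimal sketch, assuming the common formulation in which every word receives a fixed-size sparse random "index vector" and each word's "context vector" is the running sum of the index vectors of its neighbours. The constants `DIM`, `NONZERO`, and `WINDOW`, the function names, and the toy corpus are illustrative assumptions, not values taken from this page.

```python
import random
from collections import defaultdict

DIM = 512      # fixed dimensionality of the reduced space (assumed value)
NONZERO = 8    # number of +1/-1 entries per index vector (assumed value)
WINDOW = 2     # neighbours considered on each side of a word (assumed value)


def index_vector(word, dim=DIM, nonzero=NONZERO):
    """Return a sparse ternary index vector as {position: +1 or -1}."""
    rng = random.Random(word)  # seeding by the word makes the vector reproducible
    positions = rng.sample(range(dim), nonzero)
    return {pos: rng.choice((-1, 1)) for pos in positions}


def train(sentences, dim=DIM, window=WINDOW):
    """Build context vectors by summing the index vectors of neighbouring words."""
    context = defaultdict(lambda: [0] * dim)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    for pos, sign in index_vector(tokens[j], dim).items():
                        context[word][pos] += sign
    return context


def cosine(u, v):
    """Cosine similarity between two dense vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = (sum(a * a for a in u) ** 0.5) * (sum(b * b for b in v) ** 0.5)
    return dot / norm if norm else 0.0


corpus = [
    "the cat drinks milk".split(),
    "the dog drinks water".split(),
    "the cat chases the dog".split(),
]
vectors = train(corpus)
```

Calling `cosine(vectors["cat"], vectors["dog"])` compares two words in the reduced space. A previously unseen word simply gets its own index vector and context vector of length `DIM`, so the model does not grow in dimensionality as the vocabulary grows.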

Attributes	Values
rdf:type	software
rdfs:label	Random indexing (en); Случайное индексирование (ru, "Random indexing")
rdfs:comment	Random indexing is a dimensionality reduction method and computational framework for distributional semantics, based on the insight that very-high-dimensional vector space model implementations are impractical, that models need not grow in dimensionality when new items (e.g. new terminology) are encountered, and that a high-dimensional model can be projected into a space of lower dimensionality without compromising L2 distance metrics if the resulting dimensions are chosen appropriately. (en) Random indexing is a dimensionality reduction method and one of the approaches of distributional semantics, based on the conviction that high-dimensional variants of the vector space model are of little practical use and that models should not grow in dimensionality when previously unseen objects (terms, documents, etc.) appear. It assumes that a high-dimensional model can be projected into a lower-dimensional space without harming L2 metrics, provided the resulting dimensions are chosen properly; this is the core idea of random projection as a dimensionality reduction method, formulated as the Johnson–Lindenstrauss lemma. (ru, translated)
dcterms:subject	Machine learning; Dimension reduction
Wikipage page ID	37697003 (xsd:integer)
Wikipage revision ID	1044010457 (xsd:integer)
Link from a Wikipage to another Wikipage	Information retrieval; Sparse distributed memory; Machine learning; Distributional semantics; Johnson–Lindenstrauss lemma; Locality-sensitive hashing; Dimensionality reduction; Hamming distance; Document clustering; Dimension reduction; Random projection; Manhattan distance; Vector space model; Pentti Kanerva; Bit vector
Link from a Wikipage to an external page	http://pars.ie/publications/papers/pre-prints/random-indexing-dr-explained.pdf
sameAs	Random indexing Random indexing Random indexing Random indexing
dbp:wikiPageUsesTemplate	dbt:Reflist
has abstract	Random indexing is a dimensionality reduction method and computational framework for distributional semantics, based on the insight that very-high-dimensional vector space model implementations are impractical, that models need not grow in dimensionality when new items (e.g. new terminology) are encountered, and that a high-dimensional model can be projected into a space of lower dimensionality without compromising L2 distance metrics if the resulting dimensions are chosen appropriately. This is the original idea behind the random projection approach to dimension reduction, first formulated as the Johnson–Lindenstrauss lemma; locality-sensitive hashing shares some of the same starting points. Random indexing, as used in the representation of language, originates from the work of Pentti Kanerva on sparse distributed memory, and can be described as an incremental formulation of a random projection. It can also be verified that random indexing is a random projection technique for the construction of Euclidean spaces, i.e. L2 normed vector spaces. In Euclidean spaces, random projections are elucidated using the Johnson–Lindenstrauss lemma. The TopSig technique extends the random indexing model to produce bit vectors for comparison with the Hamming distance similarity function. It is used to improve the performance of information retrieval and document clustering. In a similar line of research, Random Manhattan Integer Indexing (RMII) has been proposed to improve the performance of methods that employ the Manhattan distance between text units. Many random indexing methods primarily generate similarity from co-occurrence of items in a corpus. Reflexive Random Indexing (RRI) generates similarity from co-occurrence and from shared occurrence with other items. (en)
Random indexing is a dimensionality reduction method and one of the approaches of distributional semantics, based on the conviction that high-dimensional variants of the vector space model are of little practical use and that models should not grow in dimensionality when previously unseen objects (terms, documents, etc.) appear. It assumes that a high-dimensional model can be projected into a lower-dimensional space without harming L2 metrics, provided the resulting dimensions are chosen properly; this is the core idea of random projection as a dimensionality reduction method, formulated as the Johnson–Lindenstrauss lemma. Locality-sensitive hashing (LSH) works in a similar way. Random indexing as a representation of natural-language objects was first proposed in the work of Pentti Kanerva on sparse distributed memory and can be described as an incremental construction of random projections. It can also be shown that random indexing is a variant of random projection for constructing Euclidean spaces. (ru, translated)
gold:hypernym	Method
prov:wasDerivedFrom	wikipedia-en:Random_indexing?oldid=1044010457&ns=0
page length (characters) of wiki page	5320 (xsd:nonNegativeInteger)
foaf:isPrimaryTopicOf	wikipedia-en:Random_indexing
is Link from a Wikipage to another Wikipage of	Sparse distributed memory; Roger W. Schvaneveldt; Magnus Sahlgren; Distributional semantics; Locality-sensitive hashing; Random projection; Vector space model; Pentti Kanerva; Outline of machine learning; Word embedding
is foaf:primaryTopic of	wikipedia-en:Random_indexing
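
The abstract in the table above mentions bit vectors compared with the Hamming distance and methods built on the Manhattan distance. As a rough illustration only, the following sketch binarizes a vector by the sign of each component and computes both distances. It is not the published TopSig or RMII procedure; the function names and the toy vectors are hypothetical.

```python
def to_bits(vector):
    """Binarize a vector: 1 where the component is positive, else 0."""
    return [1 if x > 0 else 0 for x in vector]


def hamming(a, b):
    """Number of positions at which two equal-length bit vectors differ."""
    if len(a) != len(b):
        raise ValueError("bit vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def manhattan(u, v):
    """L1 (Manhattan) distance between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(u, v))


# Toy vectors standing in for reduced-dimension document vectors (made up).
doc_a = [3, -1, 0, 2, -4, 1]
doc_b = [2, -2, 1, 1, -3, -1]
doc_c = [-3, 1, 0, -2, 4, -1]

print(hamming(to_bits(doc_a), to_bits(doc_b)))  # 2
print(hamming(to_bits(doc_a), to_bits(doc_c)))  # 5
print(manhattan(doc_a, doc_b))                  # 7
print(manhattan(doc_a, doc_c))                  # 22
```

In this toy data both distances rank `doc_b` as closer to `doc_a` than `doc_c` is; the bit-vector comparison trades precision for compact storage and a very cheap distance function.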
