About: Attention (machine learning)

Property	Value
dbo:abstract	In artificial neural networks, attention is a technique meant to mimic cognitive attention. It enhances some parts of the input data while diminishing others, the motivation being that the network should devote more focus to the small but important parts of the data. Which parts of the data matter more than others depends on the context, and the network learns this by gradient descent. Attention-like mechanisms were introduced in the 1990s under names such as multiplicative modules, sigma pi units, and hypernetworks. The flexibility of attention comes from its role as "soft weights" that can change during runtime, in contrast to standard weights, which must remain fixed at runtime. Uses of attention include memory in neural Turing machines, reasoning tasks in differentiable neural computers, language processing in transformers, and multi-sensory data processing (sound, images, video, and text) in perceivers. (en) In the context of neural networks, attention is a technique that mimics cognitive attention. It amplifies the important parts of the input data and suppresses the rest, the idea being that the network should devote more computing power to that small but important part of the data. Which part of the data is more important than the others depends on the context, and this is learned from training data by gradient descent. Attention is used in a variety of machine learning models, including natural language processing and computer vision. Transformer networks make extensive use of attention mechanisms to achieve their expressive power. Computer vision systems based on convolutional neural networks can also benefit from attention mechanisms.[citation needed] The Perceiver model uses asymmetric attention to apply transformers directly to audiovisual and spatial data without convolutions, at a computational cost that is subquadratic in the dimensionality of the data. 
The two most common attention techniques are dot-product attention, which uses the dot product of vectors to determine attention, and multi-head attention, which combines several different attention mechanisms to direct the overall attention of a network or sub-network. (uk) Attention is a technique in artificial neural networks that mimics cognitive attention. The mechanism increases the weight of some parts of a network's input data while decreasing the weight of other parts, focusing the network on the small, most important part of the data. Which parts of the data are more important than others depends on the context. The attention mechanism can be trained by gradient descent. Architectures resembling attention were first proposed in the 1990s under names including multiplicative module, sigma pi unit, and hypernetwork. The flexibility of attention comes from its "soft weight" property: these weights can change at runtime, rather than having to stay fixed at runtime like ordinary weights. Uses of attention include memory in neural Turing machines, reasoning tasks in differentiable neural computers, language processing in Transformer models, and multimodal data processing (sound, images, video, and text) in Perceiver models. (zh)
dbo:thumbnail	wiki-commons:Special:FilePath/attention-animated.gif?width=300
dbo:wikiPageExternalLink	https://web.stanford.edu/~jurafsky/slp3/ https://www.youtube.com/watch?v=AIiwuClvH6k&vl=en-GB https://www.youtube.com/watch?v=yGTUuEx3GkA
dbo:wikiPageID	66001552 (xsd:integer)
dbo:wikiPageLength	14832 (xsd:nonNegativeInteger)
dbo:wikiPageRevisionID	1122638945 (xsd:integer)
dbo:wikiPageWikiLink	dbr:Neural_Turing_machine dbr:DeepMind dbr:Perceiver dbr:University_College_London dbr:Lexical_analysis dbr:GloVe dbr:Gradient_descent dbc:Machine_learning dbr:Thought_vector dbr:Dan_Jurafsky dbr:Explainable_artificial_intelligence dbr:Differentiable_neural_computer dbr:Recurrent_neural_network dbr:Attention dbr:Artificial_neural_network dbr:Alex_Graves_(computer_scientist) dbr:Transformer_(machine_learning_model) dbr:Softmax_function dbr:Word_embedding dbr:Teacher_forcing dbr:Word2Vec dbr:1-hot dbr:File:Attn-pytorch-tutorial.png dbr:File:Attn-xx-dot.png dbr:File:Attn-xx-qkv.png dbr:File:Attn-xy-dot.png dbr:File:Attn-xy-qkv.png
dbp:wikiPageUsesTemplate	dbt:Reflist dbt:Short_description dbt:Slink dbt:Plain_image_with_caption dbt:Machine_learning dbt:Differentiable_computing
dcterms:subject	dbc:Machine_learning
rdfs:comment	Attention is a technique in artificial neural networks that mimics cognitive attention. The mechanism increases the weight of some parts of a network's input data while decreasing the weight of other parts, focusing the network on the small, most important part of the data. Which parts of the data are more important than others depends on the context. The attention mechanism can be trained by gradient descent. Architectures resembling attention were first proposed in the 1990s under names including multiplicative module, sigma pi unit, and hypernetwork. The flexibility of attention comes from its "soft weight" property: these weights can change at runtime, rather than having to stay fixed at runtime like ordinary weights. Uses of attention include memory in neural Turing machines, reasoning tasks in differentiable neural computers, language processing in Transformer models, and multimodal data processing (sound, images, video, and text) in Perceiver models. (zh) In artificial neural networks, attention is a technique meant to mimic cognitive attention. It enhances some parts of the input data while diminishing others, the motivation being that the network should devote more focus to the small but important parts of the data. Which parts of the data matter more than others depends on the context, and the network learns this by gradient descent. (en) In the context of neural networks, attention is a technique that mimics cognitive attention. It amplifies the important parts of the input data and suppresses the rest, the idea being that the network should devote more computing power to that small but important part of the data. Which part of the data is more important than the others depends on the context, and this is learned from training data by gradient descent. Attention is used in a variety of machine learning models, including natural language processing and computer vision. (uk)
rdfs:label	Atenció (aprenentatge automàtic) (ca) Attention (machine learning) (en) 주의 기제 (ko) Увага (машинне навчання) (uk) 注意力机制 (zh)
owl:sameAs	wikidata:Attention (machine learning) dbpedia-ca:Attention (machine learning) dbpedia-fa:Attention (machine learning) dbpedia-ko:Attention (machine learning) dbpedia-uk:Attention (machine learning) dbpedia-zh:Attention (machine learning) https://global.dbpedia.org/id/FRXJB
prov:wasDerivedFrom	wikipedia-en:Attention_(machine_learning)?oldid=1122638945&ns=0
foaf:depiction	wiki-commons:Special:FilePath/Attention-1-sn.png wiki-commons:Special:FilePath/Attn-pytorch-tutorial.png wiki-commons:Special:FilePath/Attn-xx-dot.png wiki-commons:Special:FilePath/Attn-xx-qkv.png wiki-commons:Special:FilePath/Attn-xy-dot.png wiki-commons:Special:FilePath/Attn-xy-qkv.png wiki-commons:Special:FilePath/attention-animated.gif
foaf:isPrimaryTopicOf	wikipedia-en:Attention_(machine_learning)
is dbo:wikiPageDisambiguates of	dbr:Attention_(disambiguation)
is dbo:wikiPageRedirects of	dbr:Attention_mechanism dbr:Dot-product_attention dbr:Attention_unit dbr:Multi-head_attention
is dbo:wikiPageWikiLink of	dbr:Attention_(disambiguation) dbr:Attention_network dbr:Perceiver dbr:Saliency_map dbr:GPT-2 dbr:Convolutional_neural_network dbr:AlphaFold dbr:Graph_neural_network dbr:Random_forest dbr:Attention_mechanism dbr:Transformer_(machine_learning_model) dbr:Stable_Diffusion dbr:Syntactic_parsing_(computational_linguistics) dbr:Seq2seq dbr:Video_super-resolution dbr:Theaitre dbr:Vision_transformer dbr:Self-attention dbr:Dot-product_attention dbr:Attention_unit dbr:Multi-head_attention
is foaf:primaryTopic of	wikipedia-en:Attention_(machine_learning)
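
The abstract describes dot-product attention as using the dot product of vectors to decide how much weight each part of the input receives, with the resulting "soft weights" changing from input to input. A minimal sketch of that idea in plain Python follows. The function names (`softmax`, `dot_product_attention`) and the toy query, key, and value vectors are illustrative assumptions, not taken from the article, and the sketch leaves out the scaling factor and learned projections that transformer implementations typically add.

```python
import math


def softmax(scores):
    """Turn raw scores into non-negative weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_product_attention(query, keys, values):
    """Return (context, weights) for a single query vector.

    Hypothetical helper for illustration: unscaled, single-head,
    no learned parameters.
    """
    # Similarity of the query to each key: one dot product per input position.
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    # The "soft weights": recomputed for every input, unlike trained weights.
    weights = softmax(scores)
    # Weighted average of the value vectors, one component at a time.
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return context, weights


# Toy data (assumed for illustration): three input positions, 2-d vectors.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
context, weights = dot_product_attention([2.0, 0.0], keys, values)
```

With this query, the first and third keys have the same dot product with the query and so receive equal weight, while the second key, which is orthogonal to the query, receives less. Multi-head attention, as described in the abstract, would run several such mechanisms side by side and combine their outputs.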