An Entity of Type: Thing, from Named Graph: http://dbpedia.org, within Data Space: dbpedia.org

The swish function is a mathematical function defined as swish(x) = x · sigmoid(βx) = x / (1 + e^(−βx)), where β is either a constant or a trainable parameter, depending on the model. For β = 1, the function is equivalent to the Sigmoid Linear Unit (SiLU), first proposed alongside the GELU in 2016. The SiLU was rediscovered in 2017 as the Sigmoid-weighted Linear Unit (SiL), used in reinforcement learning, and was rediscovered again as the swish over a year after its initial discovery. The swish was originally proposed without the learnable parameter β, so that β implicitly equalled 1; the swish paper was later updated to include β, though researchers usually fix β = 1 rather than learn it. For β = 0, the function reduces to the scaled linear function f(x) = x/2.
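The behaviour at the different values of β can be checked numerically. The following is a minimal sketch, assuming the standard definition swish(x) = x · sigmoid(βx); the function names `sigmoid`, `swish`, and `relu` are illustrative and not taken from any particular library.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def swish(x: float, beta: float = 1.0) -> float:
    """swish(x) = x * sigmoid(beta * x); beta = 1 gives the SiLU."""
    return x * sigmoid(beta * x)


def relu(x: float) -> float:
    """Rectified linear unit, the limit of swish as beta grows."""
    return max(0.0, x)


# Compare beta = 0 (scaled linear), beta = 1 (SiLU), and a large beta
# (close to ReLU) at a few sample points.
for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(x, swish(x, 0.0), swish(x, 1.0), swish(x, 50.0), relu(x))
```

With β = 0 the sigmoid factor is exactly 1/2, so the output is x/2. With a large β the values are close to those of the ReLU. For β = 1 the value at x = −1 lies below the value at x = −2, which illustrates the non-monotonic dip on the negative axis.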

Property Value
dbo:abstract
  • The swish function is a mathematical function defined by the formula swish(x) = x · sigmoid(βx) = x / (1 + e^(−βx)), where β may be a constant or a trainable parameter, depending on the model. When β = 1, the function is equivalent to the Sigmoid-weighted Linear Unit (SiL) used in reinforcement learning, while for β = 0, swish becomes the linear function f(x) = x/2. As β → ∞, the sigmoid component approaches a unit step function, so swish tends to the ReLU function. It can thus be viewed as a nonlinear interpolation between a linear function and the ReLU. (es)
  • The swish function is a mathematical function defined as swish(x) = x · sigmoid(βx) = x / (1 + e^(−βx)), where β is either a constant or a trainable parameter, depending on the model. For β = 1, the function is equivalent to the Sigmoid Linear Unit (SiLU), first proposed alongside the GELU in 2016. The SiLU was rediscovered in 2017 as the Sigmoid-weighted Linear Unit (SiL), used in reinforcement learning, and was rediscovered again as the swish over a year after its initial discovery. The swish was originally proposed without the learnable parameter β, so that β implicitly equalled 1; the swish paper was later updated to include β, though researchers usually fix β = 1 rather than learn it. For β = 0, the function reduces to the scaled linear function f(x) = x/2. As β → ∞, the sigmoid component approaches a 0-1 step function, so swish approaches the ReLU function. Swish can thus be viewed as a smooth function that nonlinearly interpolates between a linear function and the ReLU. The function is non-monotonic, and may have influenced the proposal of other non-monotonic activation functions such as Mish. For positive values, swish is a particular case of the sigmoid shrinkage function (see the doubly parameterized sigmoid shrinkage form given by Equation (3) of the cited reference). (en)
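  • The swish function is a mathematical function described by the expression swish(x) = x · sigmoid(βx), where β is a constant or a parameter that depends on the type of model. The derivative of the function is swish′(x) = sigmoid(βx) + βx · sigmoid(βx) · (1 − sigmoid(βx)). (uk)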
dbo:wikiPageID
  • 63822450 (xsd:integer)
dbo:wikiPageLength
  • 4395 (xsd:nonNegativeInteger)
dbo:wikiPageRevisionID
  • 1124205518 (xsd:integer)
dbp:cs1Dates
  • y (en)
dbp:date
  • June 2020 (en)
rdfs:label
  • Función Swish (es)
  • Swish function (en)
  • Swish функція (uk)
This content was extracted from Wikipedia and is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License