About: Swish function

Property	Value
dbo:abstract	The swish function is a mathematical function defined by the following formula: $\operatorname{swish}_\beta(x) = x \cdot \operatorname{sigmoid}(\beta x) = \frac{x}{1 + e^{-\beta x}}$, where β may be a constant or a trainable parameter depending on the model. When β = 1, the function is equivalent to the sigmoid-weighted function used in reinforcement learning (Sigmoid-weighted Linear Unit, SiL), whereas for β = 0, swish becomes the linear function f(x) = x/2. As β → ∞, the sigmoid component approaches a unit step function, so swish tends to the ReLU function. Thus, it can be viewed as a nonlinear interpolation between a linear function and the ReLU. (es) The swish function is a mathematical function defined as $\operatorname{swish}_\beta(x) = x \cdot \operatorname{sigmoid}(\beta x) = \frac{x}{1 + e^{-\beta x}}$, where β is either a constant or a trainable parameter depending on the model. For β = 1, the function is equivalent to the Sigmoid Linear Unit (SiLU), first proposed alongside the GELU in 2016. The SiLU was rediscovered in 2017 as the Sigmoid-weighted Linear Unit (SiL) function used in reinforcement learning. The SiLU/SiL was then rediscovered as the swish more than a year after its initial discovery; it was originally proposed without the learnable parameter β, so that β implicitly equalled 1. The swish paper was later updated to propose the activation with the learnable parameter β, though researchers usually fix β = 1 rather than learn it. For β = 0, the function reduces to the scaled linear function f(x) = x/2. As β → ∞, the sigmoid component approaches a 0-1 step function, so swish approaches the ReLU function. Thus, it can be viewed as a smoothing function that nonlinearly interpolates between a linear function and the ReLU function. The function is non-monotonic, and may have influenced the proposal of other activation functions with this property, such as Mish. For positive values, Swish is a particular case of a sigmoid shrinkage function (see the doubly parameterized sigmoid shrinkage form given by Equation (3) of this reference). (en) The swish function is a mathematical function described by the expression $\operatorname{swish}_\beta(x) = x \cdot \operatorname{sigmoid}(\beta x) = \frac{x}{1 + e^{-\beta x}}$, where β is a constant or a parameter that depends on the type of model. Derivative of the function. (uk)
dbo:wikiPageID	63822450 (xsd:integer)
dbo:wikiPageLength	4395 (xsd:nonNegativeInteger)
dbo:wikiPageRevisionID	1124205518 (xsd:integer)
dbo:wikiPageWikiLink	dbr:Vanishing_gradient_problem dbr:Function_(mathematics) dbr:Google dbr:Trainable_parameter dbr:Backpropagation dbr:Activation_function dbc:Functions_and_mappings dbr:Rectifier_(neural_networks) dbr:Reinforcement_learning dbr:Interpolate dbr:Artificial_neural_network dbc:Artificial_neural_networks dbr:Sigmoid_function dbr:ImageNet dbr:Mish_(function) dbr:ReLU
dbp:cs1Dates	y (en)
dbp:date	June 2020 (en)
dbp:wikiPageUsesTemplate	dbt:Reflist dbt:Short_description dbt:Use_dmy_dates
dcterms:subject	dbc:Functions_and_mappings dbc:Artificial_neural_networks
rdfs:comment	The swish function is a mathematical function defined by the following formula: $\operatorname{swish}_\beta(x) = x \cdot \operatorname{sigmoid}(\beta x) = \frac{x}{1 + e^{-\beta x}}$, where β may be a constant or a trainable parameter depending on the model. When β = 1, the function is equivalent to the sigmoid-weighted function used in reinforcement learning (Sigmoid-weighted Linear Unit, SiL), whereas for β = 0, swish becomes the linear function f(x) = x/2. As β → ∞, the sigmoid component approaches a unit step function, so swish tends to the ReLU function. Thus, it can be viewed as a nonlinear interpolation between a linear function and the ReLU. (es) The swish function is a mathematical function described by the expression $\operatorname{swish}_\beta(x) = x \cdot \operatorname{sigmoid}(\beta x) = \frac{x}{1 + e^{-\beta x}}$, where β is a constant or a parameter that depends on the type of model. Derivative of the function. (uk) The swish function is a mathematical function defined as $\operatorname{swish}_\beta(x) = x \cdot \operatorname{sigmoid}(\beta x) = \frac{x}{1 + e^{-\beta x}}$, where β is either a constant or a trainable parameter depending on the model. For β = 1, the function is equivalent to the Sigmoid Linear Unit (SiLU), first proposed alongside the GELU in 2016. The SiLU was rediscovered in 2017 as the Sigmoid-weighted Linear Unit (SiL) function used in reinforcement learning. The SiLU/SiL was then rediscovered as the swish more than a year after its initial discovery; it was originally proposed without the learnable parameter β, so that β implicitly equalled 1. The swish paper was later updated to propose the activation with the learnable parameter β, though researchers usually fix β = 1 rather than learn it. For β = 0, the function reduces to the scaled linear function f(x) = x/2. (en)
rdfs:label	Función Swish (es) Swish function (en) Swish функція (uk)
owl:sameAs	wikidata:Swish function dbpedia-es:Swish function dbpedia-uk:Swish function https://global.dbpedia.org/id/CjAXA
prov:wasDerivedFrom	wikipedia-en:Swish_function?oldid=1124205518&ns=0
foaf:isPrimaryTopicOf	wikipedia-en:Swish_function
is dbo:wikiPageDisambiguates of	dbr:Swish
is dbo:wikiPageRedirects of	dbr:Sigmoid-weighted_Linear_Unit dbr:Sigmoid-weighted_linear_unit dbr:Swish-beta dbr:Swish-beta_function dbr:Swish-β dbr:Swish-β_function dbr:Swish_(function)
is dbo:wikiPageWikiLink of	dbr:List_of_mathematical_abbreviations dbr:Sigmoid-weighted_Linear_Unit dbr:Sigmoid-weighted_linear_unit dbr:Swish-beta dbr:Swish-beta_function dbr:Swish-β dbr:Swish-β_function dbr:Swish_(function) dbr:Swish dbr:Backpropagation dbr:Rectifier_(neural_networks) dbr:Sigmoid_function
is foaf:primaryTopic of	wikipedia-en:Swish_function
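
The limiting cases described in dbo:abstract can be checked numerically. The following is a minimal sketch in plain Python, not taken from the source: the helper names `sigmoid`, `swish`, and `relu` are illustrative assumptions, and the sigmoid is written in a branch form only to avoid floating-point overflow for large |βx|.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid, branched so math.exp never receives a large positive argument."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def swish(x: float, beta: float = 1.0) -> float:
    """swish_beta(x) = x * sigmoid(beta * x); beta = 1 gives the SiLU."""
    return x * sigmoid(beta * x)


def relu(x: float) -> float:
    """Rectified linear unit, used here only as the comparison target."""
    return max(0.0, x)


# Illustrative sample points (an assumption, chosen to cover both signs).
for x in (-5.0, -1.0, 0.0, 0.5, 3.0):
    print(
        f"x={x:5.1f}  "
        f"beta=0: {swish(x, 0.0):8.4f}  "    # should match x / 2
        f"beta=1: {swish(x, 1.0):8.4f}  "    # SiLU
        f"beta=1e3: {swish(x, 1e3):8.4f}  "  # should be close to ReLU
        f"relu: {relu(x):8.4f}"
    )
```

With β = 1, the value at x = -1 is lower than the values at both x = -5 and x = 0, which illustrates the non-monotonicity mentioned in the abstract.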