About: Reinforcement learning from human feedback
An Entity of Type: Thing, from Named Graph: http://dbpedia.org, within Data Space: dbpedia.org
Variant of reinforcement learning
Property / Value
dbo:description
  és una tècnica que entrena un "model de recompensa" directament a partir de la retroalimentació humana. (ca)
  Methode des maschinellen Lernens (de)
  stiimulõppe versioon (et)
  technique pour entraîner une IA (fr)
  variant of reinforcement learning (en)
  wariant uczenia przez wzmacnianie (pl)
  أسلوب من أساليب تعلم الآلة (ar)
  以回饋內容來訓練機器學習的技術 (zh)
dbo:thumbnail
  wiki-commons:Special:FilePath/RLHF_diagram.svg?width=300
dbp:wikiPageUsesTemplate
  dbt:Good_article
  dbt:!
  dbt:Cite_web
  dbt:Main
  dbt:Reflist
  dbt:Citation
  dbt:Machine_learning
  dbt:Pg
  dbt:Short_description
  dbt:Artificial_intelligence_navbox
dct:subject
  dbc:Language_modeling
  dbc:Reinforcement_learning
rdfs:label
  Reinforcement learning from human feedback (en)
prov:wasDerivedFrom
  wikipedia-en:Reinforcement_learning_from_human_feedback?oldid=1289935045&ns=0
foaf:depiction
  wiki-commons:Special:FilePath/RLHF_diagram.svg
foaf:homepage
  http://yiyangfeng.me
foaf:isPrimaryTopicOf
  wikipedia-en:Reinforcement_learning_from_human_feedback
is foaf:primaryTopic of
  wikipedia-en:Reinforcement_learning_from_human_feedback
This content was extracted from Wikipedia and is licensed under the Creative Commons Attribution-ShareAlike 4.0 International license.