weight update very slow. The opposite

Posted on 2024-3-6 16:39:00
The exploding gradient problem, the counterpart of the vanishing gradient problem, means that the gradient may become very large during training, which makes the weight updates too large for the network to converge. In an RNN, if the sequence is very long, the gradient in backpropagation has to pass through many multiplication steps, and those repeated multiplications can make it grow very large.

3. Optimized algorithms

Long short-term memory (LSTM) is a special kind of RNN. It mitigates the vanishing and exploding gradient problems by introducing a gating mechanism, which is a way of controlling the flow of information. In an LSTM, each unit has a memory cell and three types of gates. The forget gate determines which information should be forgotten or discarded. The input gate determines which new information should be stored in the cell state. The output gate determines which information in the cell state should be read and output.
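As a rough illustration of the memory cell and the three gates described above, here is a minimal sketch of a single LSTM time step in plain Python. It assumes the commonly used formulation in which each gate is a sigmoid layer applied to the concatenated input and previous hidden state, and the candidate values use tanh. The names `lstm_step`, `affine`, `init_params`, and the parameter dictionary keys are hypothetical, chosen only for this example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, b, v):
    """Return W @ v + b, where W is a list of rows."""
    return [sum(w * a for w, a in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step on plain lists.

    params maps a (hypothetical) gate name to a (W, b) pair.
    """
    v = x + h_prev  # concatenation [x; h_prev]
    f = [sigmoid(z) for z in affine(*params["forget"], v)]       # what to discard
    i = [sigmoid(z) for z in affine(*params["input"], v)]        # what to store
    o = [sigmoid(z) for z in affine(*params["output"], v)]       # what to read out
    g = [math.tanh(z) for z in affine(*params["candidate"], v)]  # candidate values
    # New cell state: keep part of the old state, add part of the candidate.
    c = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c_prev, i, g)]
    # New hidden state: gated read-out of the cell state.
    h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]
    return h, c


def init_params(input_size, hidden_size, seed=0):
    """Small random weights and zero biases (an arbitrary choice for the demo)."""
    rng = random.Random(seed)

    def layer():
        W = [[rng.uniform(-0.1, 0.1) for _ in range(input_size + hidden_size)]
             for _ in range(hidden_size)]
        return W, [0.0] * hidden_size

    return {name: layer() for name in ("forget", "input", "output", "candidate")}


# Run the cell over a short sequence of 3-dimensional inputs.
params = init_params(input_size=3, hidden_size=4)
h, c = [0.0] * 4, [0.0] * 4
for x in ([1.0, 0.0, -1.0], [0.5, 0.5, 0.5]):
    h, c = lstm_step(x, h, c, params)
```

The cell state is updated additively (`f * c_prev + i * g`) rather than through a repeated matrix multiplication, which is the usual explanation for why gradients survive better over long sequences.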





Each gate consists of a sigmoid neural network layer and a pointwise multiplication operation. The sigmoid layer outputs numbers between 0 and 1 that determine how much information should pass through: 0 means "let no information pass" and 1 means "let all information pass". Through its gating mechanism, LSTM largely avoids the vanishing and exploding gradient problems of traditional RNNs when processing long sequences, and so it can learn long-distance dependencies. The figure below shows the principle of LSTM. The details of the diagram are not covered here; interested readers can look them up themselves.

The gated recurrent unit (GRU) is another advanced RNN, with a simpler structure than LSTM. It has only two types of gates. The update gate determines how much of the old information is kept. The reset gate determines how much of the old hidden state should be ignored when generating a new hidden state. GRU's gating mechanism allows it to learn long-distance dependencies when processing long sequences. At the same time, because its structure is
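The two GRU gates can be sketched in the same way. This is a minimal plain-Python version of one GRU time step, assuming the common formulation in which the reset gate scales the old hidden state before the candidate is computed and the update gate blends old state and candidate. The names `gru_step`, `affine`, and the parameter dictionary keys are hypothetical, and some references swap the roles of `z` and `1 - z` in the final blend.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, b, v):
    """Return W @ v + b, where W is a list of rows."""
    return [sum(w * a for w, a in zip(row, v)) + bias for row, bias in zip(W, b)]


def gru_step(x, h_prev, params):
    """One GRU time step on plain lists.

    params maps a (hypothetical) name to a (W, b) pair.
    """
    v = x + h_prev  # concatenation [x; h_prev]
    z = [sigmoid(a) for a in affine(*params["update"], v)]  # update gate
    r = [sigmoid(a) for a in affine(*params["reset"], v)]   # reset gate
    # The reset gate decides how much of the old hidden state the candidate sees.
    v_reset = x + [r_k * h_k for r_k, h_k in zip(r, h_prev)]
    h_cand = [math.tanh(a) for a in affine(*params["candidate"], v_reset)]
    # The update gate blends the old hidden state with the candidate.
    # Convention assumed here: z close to 1 favors the new candidate.
    return [(1.0 - z_k) * h_k + z_k * hc_k
            for z_k, h_k, hc_k in zip(z, h_prev, h_cand)]


# With all-zero weights both gates output 0.5 and the candidate is 0,
# so the new hidden state is half of the old one.
zero_params = {name: ([[0.0, 0.0]], [0.0]) for name in ("update", "reset", "candidate")}
h_new = gru_step([1.0], [2.0], zero_params)
```

Compared with the LSTM sketch, there is no separate cell state and one gate fewer, so a GRU layer of the same hidden size has fewer parameters.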
