
2024/11/2-2024/11/4:

        Most of the ideas come from: PyTorch RNN的原理及其手写复现 (https://www.bilibili.com/video/BV13i4y1R7jB/?spm_id_from=333.880.my_history.page.click&vd_source=db0d5acc929b82408b1040d67f2b1dde). I modified part of the code so that the implementation follows the formulas and intuition more closely.

Parameter setup and verification against the official API:

        First, initialize some necessary parameters:

import torch
import torch.nn as nn

# batch size, sequence length
bs, T = 2, 3
input_size, hidden_size = 2, 3
input = torch.randn(bs, T, input_size)
# initial hidden state, fed in as the hidden state for the first time step
h_prev = torch.randn(bs, hidden_size)

        As you can see, the size of input is [batch_size, sequence_length, input_size]. This input shape follows the requirement the official API places on input when the RNN is instantiated with batch_first=True; personally, I also find this layout the most intuitive and the closest to common practice.
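        As a quick aside, here is a minimal standalone sketch contrasting the two input layouts (the variable names are my own and are not part of the implementation below):

import torch
import torch.nn as nn

# with batch_first=True the RNN expects [batch, seq, feature];
# with the default batch_first=False it expects [seq, batch, feature]
rnn_bf = nn.RNN(input_size=2, hidden_size=3, batch_first=True)
rnn_seq_first = nn.RNN(input_size=2, hidden_size=3)

x = torch.randn(2, 3, 2)                        # [bs=2, T=3, input_size=2]
out_bf, _ = rnn_bf(x)                           # batch dimension comes first
out_sf, _ = rnn_seq_first(x.transpose(0, 1))    # default layout: [T, bs, input_size]
print(out_bf.shape)   # torch.Size([2, 3, 3])
print(out_sf.shape)   # torch.Size([3, 2, 3])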

        As for h0, it corresponds to h_prev in our code. It has the same shape as any other hidden state h; the only difference is that computing the hidden state h_t of the current time step requires the hidden state h_{t-1} of the previous time step. When we compute h_1 for the first time step there is no earlier time step, so we have to supply an h_0 for that computation.

        The shape of h_0 (and of every h) depends on the number of RNN layers and on whether the bidirectional option is used; taking both into account it is [num_layers * num_directions, batch_size, hidden_size].

        Here is a quick check of how the shape of h relates to num_layers and bidirectional; as you can see, it matches the information given in the official documentation.

import torch
import torch.nn as nn

# unidirectional, single-layer RNN
single_rnn = nn.RNN(4, 3, 1, batch_first=True)
input = torch.randn(1, 2, 4)
output, h_n = single_rnn(input)
output
Out[8]: 
tensor([[[-0.8700, -0.8963, -0.9267],
         [ 0.0953, -0.4410, -0.9181]]], grad_fn=<TransposeBackward1>)
h_n
Out[9]: tensor([[[ 0.0953, -0.4410, -0.9181]]], grad_fn=<StackBackward0>)

# bidirectional, single-layer RNN
bi_rnn = nn.RNN(4, 3, 1, batch_first=True, bidirectional=True)
bi_output, bi_h_n = bi_rnn(input)
bi_output.shape
Out[13]: torch.Size([1, 2, 6])
bi_h_n.shape
Out[14]: torch.Size([2, 1, 3])
output.shape
Out[15]: torch.Size([1, 2, 3])
h_n.shape
Out[16]: torch.Size([1, 1, 3])

# unidirectional, multi-layer RNN
sm_rnn = nn.RNN(4, 3, 2, batch_first=True)
sm_output, sm_h = sm_rnn(input)
sm_output.shape
Out[21]: torch.Size([1, 2, 3])
sm_h.shape
Out[22]: torch.Size([2, 1, 3])

# bidirectional, multi-layer RNN
bm_rnn = nn.RNN(4, 3, 2, batch_first=True, bidirectional=True)
bm_output, bm_h = bm_rnn(input)
bm_output
Out[26]: 
tensor([[[ 0.6568, -0.0670, -0.7799,  0.2645, -0.9087,  0.9372],
         [ 0.7104,  0.3997, -0.6929, -0.3831, -0.7795,  0.7643]]],
       grad_fn=<TransposeBackward1>)
bm_output.shape
Out[27]: torch.Size([1, 2, 6])
bm_h.shape
Out[28]: torch.Size([4, 1, 3])

        The code block above also incidentally checked the shape of output, which again matches the official documentation: when the bidirectional option is used, the bidirectional RNN concatenates the forward and backward outputs along the last dimension, so the feature dimension of output becomes 2 * hidden_size, as the small check below also shows.
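        A small sketch reusing bi_output and bi_h_n from the block above (hidden_size = 3, single layer) to show which half of the feature dimension belongs to which direction; the expected True results reflect how PyTorch lays out bidirectional outputs:

# the forward half of the LAST time step equals the forward direction's final hidden state
print(torch.allclose(bi_output[:, -1, :3], bi_h_n[0]))   # expected: True
# the backward half of the FIRST time step equals the backward direction's final hidden state
print(torch.allclose(bi_output[:, 0, 3:], bi_h_n[1]))    # expected: True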

Code implementation:

        Based on the verification in the previous part, and following the RNN hidden-state update formula h_t = tanh(x_t · W_ih^T + b_ih + h_{t-1} · W_hh^T + b_hh), we arrive at the following code:

def rnn_forward(input, weight_ih, weight_hh, bias_ih, bias_hh, h_prev):
    # input: [bs, T, input_size]
    # weight_ih: [hidden_size, input_size]
    # weight_hh: [hidden_size, hidden_size]
    # bias_ih: [hidden_size]
    # bias_hh: [hidden_size]
    # h_prev: [bs, hidden_size]
    bs, T, input_size = input.shape
    h_dim = weight_ih.shape[0]
    h_out = torch.zeros(bs, T, h_dim)  # output tensor holding the hidden state of every time step
    for t in range(T):
        x = input[:, t, :]  # [bs, input_size]
        w_times_x = torch.matmul(x, weight_ih.T)  # [bs, h_dim]
        w_times_h = torch.matmul(h_prev, weight_hh.T)  # [bs, h_dim]
        h_prev = torch.tanh(w_times_x + w_times_h + bias_ih + bias_hh)  # [bs, h_dim]
        h_out[:, t, :] = h_prev
    return h_out, h_prev.unsqueeze(0)

        Testing this function with the test case below gives the following output:

rnn = nn.RNN(input_size=input_size, hidden_size=hidden_size, batch_first=True)
# unsqueeze(0) adds a dimension, turning h_prev into a 3-D tensor,
# which is passed in as the hidden state for the first time step
rnn_output, state_final = rnn(input, h_prev.unsqueeze(0))
print("pytorch RNN API:")
print(rnn_output)
print(state_final)

# verify the correctness of rnn_forward
for p, n in rnn.named_parameters():
    print(p, n.shape)

# reuse the parameters of rnn directly
custom_rnn_output, custom_state_final = rnn_forward(input, rnn.weight_ih_l0, rnn.weight_hh_l0,
                                                    rnn.bias_ih_l0, rnn.bias_hh_l0, h_prev)
print("custom rnn forward:")
print(custom_rnn_output)
print(custom_state_final)

pytorch RNN API:
tensor([[[ 0.3759, -0.7116, -0.8993],
         [-0.5924, -0.1507,  0.9623],
         [-0.2508, -0.2265,  0.3904]],

        [[-0.6226, -0.6587,  0.5304],
         [-0.4655, -0.1730,  0.8652],
         [-0.1209, -0.5013,  0.0275]]], grad_fn=<TransposeBackward1>)
tensor([[[-0.2508, -0.2265,  0.3904],
         [-0.1209, -0.5013,  0.0275]]], grad_fn=<StackBackward0>)
weight_ih_l0 torch.Size([3, 2])
weight_hh_l0 torch.Size([3, 3])
bias_ih_l0 torch.Size([3])
bias_hh_l0 torch.Size([3])
custom rnn forward:
tensor([[[ 0.3759, -0.7116, -0.8993],
         [-0.5924, -0.1507,  0.9623],
         [-0.2508, -0.2265,  0.3904]],

        [[-0.6226, -0.6587,  0.5304],
         [-0.4655, -0.1730,  0.8652],
         [-0.1209, -0.5013,  0.0275]]], grad_fn=<CopySlices>)
tensor([[[-0.2508, -0.2265,  0.3904],
         [-0.1209, -0.5013,  0.0275]]], grad_fn=<UnsqueezeBackward0>)

        As you can see, the official API and the function we wrote produce identical outputs.
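        Instead of comparing the printed tensors by eye, one can also add an explicit numerical check (a minimal sketch reusing the variables from the test above):

# the two results should agree up to floating-point tolerance
print(torch.allclose(rnn_output, custom_rnn_output))    # expected: True
print(torch.allclose(state_final, custom_state_final))  # expected: True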

        Reusing the rnn_forward function we just wrote, the bidirectional RNN function follows in the same way:

# define a bidirectional RNN
def bidirectional_rnn_forward(input, weight_ih, weight_hh, bias_ih, bias_hh, h_prev,
                              weight_ih_reverse, weight_hh_reverse, bias_ih_reverse,
                              bias_hh_reverse, h_prev_reverse):
    bs, T, input_size = input.shape
    h_dim = weight_ih.shape[0]
    # forward direction
    h_out, h_prev = rnn_forward(input, weight_ih, weight_hh, bias_ih, bias_hh, h_prev)
    # backward direction: feed the time-reversed input through the same forward function
    h_out_reverse, h_prev_reverse = rnn_forward(torch.flip(input, [1]), weight_ih_reverse,
                                                weight_hh_reverse, bias_ih_reverse,
                                                bias_hh_reverse, h_prev_reverse)
    # flip the backward outputs back to the original time order and concatenate along the feature dimension
    h_out_bidirectional = torch.cat([h_out, torch.flip(h_out_reverse, [1])], dim=-1)
    return h_out_bidirectional, torch.cat([h_prev, h_prev_reverse], dim=0)

        Together with the shape discussion above, this function is fairly easy to understand; the key step is reversing the input along the time dimension with torch.flip. Of course, instead of reusing the earlier function, one could also recompute by iterating over the time steps in reverse order directly, as sketched below.
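        For reference, a minimal sketch of that reverse-iteration alternative; the function name rnn_reverse_direction is my own, and it assumes the same argument shapes as rnn_forward:

def rnn_reverse_direction(input, weight_ih, weight_hh, bias_ih, bias_hh, h_prev):
    # same recurrence as rnn_forward, but the loop walks the time axis from T-1 down to 0
    bs, T, input_size = input.shape
    h_dim = weight_ih.shape[0]
    h_out = torch.zeros(bs, T, h_dim)
    for t in reversed(range(T)):
        x = input[:, t, :]  # [bs, input_size]
        h_prev = torch.tanh(x @ weight_ih.T + h_prev @ weight_hh.T + bias_ih + bias_hh)
        h_out[:, t, :] = h_prev  # stored in original time order, so no flip is needed afterwards
    return h_out, h_prev.unsqueeze(0)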

        The verification is as follows:

# verify the correctness of the bidirectional RNN
brnn = nn.RNN(input_size=input_size, hidden_size=hidden_size, batch_first=True, bidirectional=True)
h_prev = torch.zeros(2, bs, hidden_size)
brnn_output, brnn_state_final = brnn(input, h_prev)
print("pytorch bidirectional RNN API:")
print(brnn_output)
print(brnn_state_final)

# reuse the parameters of brnn directly
custom_brnn_output, custom_brnn_state_final = bidirectional_rnn_forward(
    input, brnn.weight_ih_l0, brnn.weight_hh_l0, brnn.bias_ih_l0, brnn.bias_hh_l0, h_prev[0],
    brnn.weight_ih_l0_reverse, brnn.weight_hh_l0_reverse, brnn.bias_ih_l0_reverse,
    brnn.bias_hh_l0_reverse, h_prev[1])
print("custom bidirectional rnn forward:")
print(custom_brnn_output)
print(custom_brnn_state_final)

pytorch bidirectional RNN API:
tensor([[[ 0.5844,  0.8434, -0.6575,  0.7862,  0.2991, -0.1547],
         [-0.6368,  0.8893, -0.1286,  0.1199,  0.0642, -0.1274],
         [ 0.6029,  0.9831, -0.1443,  0.5292, -0.3179,  0.3351]],

        [[-0.0248,  0.6121, -0.4242,  0.4264,  0.2619, -0.5996],
         [ 0.4378,  0.9871, -0.1419,  0.6944, -0.6108,  0.8240],
         [-0.6801,  0.1686, -0.5515,  0.0044,  0.9227, -0.6906]]],
       grad_fn=<TransposeBackward1>)
tensor([[[ 0.6029,  0.9831, -0.1443],
         [-0.6801,  0.1686, -0.5515]],

        [[ 0.7862,  0.2991, -0.1547],
         [ 0.4264,  0.2619, -0.5996]]], grad_fn=<StackBackward0>)
custom bidirectional rnn forward:
tensor([[[ 0.5844,  0.8434, -0.6575,  0.7862,  0.2991, -0.1547],
         [-0.6368,  0.8893, -0.1286,  0.1199,  0.0642, -0.1274],
         [ 0.6029,  0.9831, -0.1443,  0.5292, -0.3179,  0.3351]],

        [[-0.0248,  0.6121, -0.4242,  0.4264,  0.2619, -0.5996],
         [ 0.4378,  0.9871, -0.1419,  0.6944, -0.6108,  0.8240],
         [-0.6801,  0.1686, -0.5515,  0.0044,  0.9227, -0.6906]]],
       grad_fn=<CatBackward0>)
tensor([[[ 0.6029,  0.9831, -0.1443],
         [-0.6801,  0.1686, -0.5515]],

        [[ 0.7862,  0.2991, -0.1547],
         [ 0.4264,  0.2619, -0.5996]]], grad_fn=<CatBackward0>)

        The reproduction is successful.
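        As in the unidirectional case, the visual comparison can be backed by an explicit numerical check (a small sketch reusing the variables above); both lines should print True:

print(torch.allclose(brnn_output, custom_brnn_output))            # expected: True
print(torch.allclose(brnn_state_final, custom_brnn_state_final))  # expected: True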

