Terence888 posted on 2024-10-26 18:26:54

Interpreting a custom neural network model

class Net(torch.nn.Module):
    def __init__(self):
      super(Net,self).__init__()
      self.hidden_layer1 = nn.Linear(331,1024)
      self.hidden_layer2 = nn.Linear(1024,1024)
      self.hidden_layer3 = nn.Linear(1024,1024)
      self.hidden_layer4 = nn.Linear(1024,1024)
      self.output_layer = nn.Linear(1024,1)
      self.dropout = nn.Dropout(p=0.2)
      nn.init.xavier_uniform_(self.hidden_layer1.weight)
      nn.init.xavier_uniform_(self.hidden_layer2.weight)
      nn.init.xavier_uniform_(self.hidden_layer3.weight)
      nn.init.xavier_uniform_(self.hidden_layer4.weight)
      nn.init.xavier_uniform_(self.output_layer.weight)
    def forward(self,x):
      inputs = x
      layer1_out = torch.nn.functional.gelu(self.hidden_layer1(inputs))
      layer1_out = self.dropout(layer1_out)
      layer2_out = torch.nn.functional.gelu(self.hidden_layer2(layer1_out))
      layer2_out = self.dropout(layer2_out)
      layer3_out = torch.nn.functional.gelu(self.hidden_layer3(layer2_out))
      layer3_out = self.dropout(layer3_out)
      layer4_out = torch.nn.functional.gelu(self.hidden_layer4(layer3_out))
      layer4_out = self.dropout(layer4_out)
      output = torch.relu(self.output_layer(layer4_out))
      return output

What is the gelu function in this network? And what is the nn.init.xavier_uniform_ method?

FishC posted on 2024-10-26 18:33:08

This custom neural network model has a few key points:

1. **GELU**: GELU (Gaussian Error Linear Unit) is an activation function commonly used in neural networks. It behaves like a smooth version of ReLU, weighting each input by how likely it is under a standard normal distribution, which tends to improve training behavior. Its formula is:
   \[
   \text{GELU}(x) = x \cdot \Phi(x)
   \]
   where \(\Phi(x)\) is the cumulative distribution function of the standard normal distribution.
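   As a small sanity check (assuming PyTorch is installed), the exact formula above can be compared against the library's built-in GELU:

   ```python
   import math
   import torch
   import torch.nn.functional as F

   x = torch.tensor([-1.0, 0.0, 1.0])
   # Phi(x): CDF of the standard normal, written via the error function
   phi = 0.5 * (1.0 + torch.erf(x / math.sqrt(2.0)))
   manual = x * phi     # GELU(x) = x * Phi(x)
   builtin = F.gelu(x)  # PyTorch's default GELU uses the same exact (erf-based) form
   print(torch.allclose(manual, builtin, atol=1e-6))
   ```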

2. **Xavier initialization**: `nn.init.xavier_uniform_` is a method for initializing network weights. By keeping the variance of each layer's outputs roughly constant, it helps mitigate vanishing and exploding gradients. Xavier uniform initialization draws the weights from a uniform distribution whose bound is computed from the layer's number of input and output units.
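   As a sketch of what this does in practice (the bound below is the one documented for PyTorch's `xavier_uniform_` with its default gain of 1), applied to a layer shaped like `hidden_layer1` above:

   ```python
   import math
   import torch
   import torch.nn as nn

   layer = nn.Linear(331, 1024)  # same shape as hidden_layer1 in the model
   nn.init.xavier_uniform_(layer.weight)
   # Xavier uniform samples from U(-a, a) with a = sqrt(6 / (fan_in + fan_out))
   a = math.sqrt(6.0 / (331 + 1024))
   print(layer.weight.abs().max().item() <= a)  # every weight lies inside the bound
   ```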

This model stacks several linear layers with GELU activations and adds Dropout regularization to reduce overfitting. The final output is passed through a ReLU activation.

The above reply is from the FishC AI assistant. If it did not answer your question correctly, please continue to ask.

Terence888 posted on 2024-10-26 19:01:41

FishC posted on 2024-10-26 18:33
This custom neural network model has a few key points:

1. **GELU**: GELU (Gaussian Error Linear Unit) is an activation ...

Before normalization the loss was extremely high; after I normalized the features, the loss immediately dropped to a fraction of one. What is the reason?

FishC posted on 2024-10-26 19:14:02

Terence888 posted on 2024-10-26 19:01
Before normalization the loss was extremely high; after I normalized the features, the loss ...

Feature normalization helps the model converge faster, mainly for the following reasons:

1. **Numerical stability**: Normalized feature values are usually smaller and fall within similar ranges, which improves the stability of numerical computation.

2. **Gradient descent efficiency**: When feature values share a consistent range, gradient descent can adjust the weights more effectively, preventing a few features from dominating the gradient updates.

3. **Avoiding feature bias**: Unnormalized features can cause the model to favor certain features over others; normalization prevents this.

Therefore, with normalization the model trains and converges better, which can noticeably lower the loss.
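As an illustration with made-up numbers, z-score normalization puts each feature column on a comparable scale:

```python
import torch

# Hypothetical features on very different scales
X = torch.tensor([[1.0, 1000.0],
                  [2.0, 2000.0],
                  [3.0, 3000.0]])
# Z-score normalization: subtract the column mean, divide by the column std
X_norm = (X - X.mean(dim=0)) / X.std(dim=0)
print(X_norm)  # both columns now have zero mean and unit variance
```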

The above reply is from the FishC AI assistant. If it did not answer your question correctly, please continue to ask.

Terence888 posted on 2024-10-26 21:13:00

Terence888 posted on 2024-10-26 19:01
Before normalization the loss was extremely high; after I normalized the features, the loss immediately dropped to a fraction of one. What is the reason?


# Use a DNN model (k-fold cross-validation)
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
from torch.utils import data
from torch.utils.data import Dataset, DataLoader
from torch import optim

# Define the neural network model

class SimpleNN(nn.Module):
    def __init__(self):
      super(SimpleNN, self).__init__()  # super() must reference SimpleNN here
      self.hidden_layer1 = nn.Linear(154,1024)
      self.hidden_layer2 = nn.Linear(1024,1024)
      self.hidden_layer3 = nn.Linear(1024,1024)
      self.hidden_layer4 = nn.Linear(1024,1024)
      self.output_layer = nn.Linear(1024,1)
      self.dropout = nn.Dropout(0.2)
      nn.init.xavier_uniform_(self.hidden_layer1.weight)  # initialize each layer's weight parameters
      nn.init.xavier_uniform_(self.hidden_layer2.weight)
      nn.init.xavier_uniform_(self.hidden_layer3.weight)
      nn.init.xavier_uniform_(self.hidden_layer4.weight)
      nn.init.xavier_uniform_(self.output_layer.weight)
      
    def forward(self, X):
      inputs = X
      layer1_out = torch.nn.functional.gelu(self.hidden_layer1(inputs))
      layer1_out = self.dropout(layer1_out)
      layer2_out = torch.nn.functional.gelu(self.hidden_layer2(layer1_out))
      layer2_out = self.dropout(layer2_out)
      layer3_out = torch.nn.functional.gelu(self.hidden_layer3(layer2_out))
      layer3_out = self.dropout(layer3_out)
      layer4_out = torch.nn.functional.gelu(self.hidden_layer4(layer3_out))
      layer4_out = self.dropout(layer4_out)
      output = torch.relu(self.output_layer(layer4_out))
      return output
   
# Initialize the model and optimizer
dnn_model = SimpleNN()
loss = nn.MSELoss()  # define the loss function
optimizer = optim.Adam(dnn_model.parameters(), lr=0.0001, weight_decay=0)  # define the optimizer


# Select the training and validation sets via k-fold cross-validation
def get_k_fold_data(k, i, X, y):
    assert k > 1
    fold_size = len(X) // k
    X_train, y_train = None, None
    for j in range(k):
      start = j * fold_size
      end = (j + 1) * fold_size
      if j == i:
            X_valid, y_valid = X.iloc[start:end], y.iloc[start:end]
      elif X_train is None:
            X_train, y_train = X.iloc[start:end], y.iloc[start:end]
      else:
            X_train = pd.concat([X_train, X.iloc[start:end]], ignore_index=True)
            y_train = pd.concat([y_train, y.iloc[start:end]], ignore_index=True)
    return X_train, y_train, X_valid, y_valid


# Start training
k = 5
batch_size = 64
num_epochs = 100
#weight_decay = 0

# Initialize accumulated losses
train_l_sum, valid_l_sum = 0, 0

# Initialize lists
train_ls, valid_ls = [], []

for i in range(k):
    X_train, y_train, X_valid, y_valid = get_k_fold_data(k, i, X, y)
    print(f'FOLD {i}')
    print('--------------------------------')
   

    # Convert the DataFrame data to NumPy arrays, then to PyTorch tensors
    X_train = torch.tensor(X_train.astype(np.float32).values, dtype=torch.float32)
    y_train = torch.tensor(y_train.astype(np.float32).values, dtype=torch.float32)
    X_valid = torch.tensor(X_valid.astype(np.float32).values, dtype=torch.float32)
    y_valid = torch.tensor(y_valid.astype(np.float32).values, dtype=torch.float32)
   
    # Create the datasets
    train_dataset = data.TensorDataset(X_train, y_train)
    valid_dataset = data.TensorDataset(X_valid, y_valid)

    # Get data iterators
    train_iter = DataLoader(dataset=train_dataset,batch_size=batch_size,shuffle=True,num_workers=2)  # shuffle=True is equivalent to sampler=RandomSampler(dataset)
    valid_iter = DataLoader(dataset=valid_dataset,batch_size=batch_size,shuffle=True,num_workers=2)
   
    # Start iterating
    for epoch in range(num_epochs):
      train_loss = 0
      for tensor_x, tensor_y in train_iter:  # gradient updates on the training set
            tensor_x = tensor_x.float()
            tensor_y = tensor_y.float().reshape(-1, 1)
            optimizer.zero_grad()  # zero the gradients
            pre_train = dnn_model(tensor_x)
            train_l = loss(pre_train, tensor_y)  # the loss value should not shadow the global `loss` function
            train_l.backward()  # backpropagation
            optimizer.step()  # gradient descent step

            train_loss += train_l.item() * len(tensor_x)  # accumulate loss weighted by batch size

      train_loss /= len(train_dataset)  # average loss for this epoch
      
      if epoch % 20 == 0:
            print('Loss: {}Epoch:{}'.format(train_loss, epoch))
      
      with torch.no_grad():
            valid_loss = 0
            
            for tensor_x, tensor_y in valid_iter:
                tensor_x = tensor_x.float()
                tensor_y = tensor_y.float().reshape(-1, 1)
                pre_valid = dnn_model(tensor_x)
                valid_l = loss(pre_valid, tensor_y)
                valid_loss += valid_l.item() * len(tensor_x)
                  
            valid_loss /= len(valid_dataset)
                  
            if epoch % 20 == 0:
                print('Valid Loss: {}Epoch:{}'.format(valid_loss, epoch))
      
    if i == 0:
      plot(list(range(1, num_epochs + 1)), [train_ls, valid_ls], xlabel='epoch', ylabel='rmse', xlim=[1, num_epochs], legend=['train', 'valid'], yscale='log')
    # Append each fold's loss to the lists
    train_ls.append(train_loss)
    valid_ls.append(valid_loss)

      
print('Training Ended')
print('Train Average Loss: {} Valid Average Loss: {}'.format(np.mean(train_ls),np.mean(valid_ls)))
The custom neural network is shown above and the results are below. What problems does the code have?
FOLD 0
--------------------------------
Loss: 36344742366.57045Epoch:0
Valid Loss: 29782921659.381443Epoch:0
Loss: 3065413660.5910654Epoch:20
Valid Loss: 2382111456.329897Epoch:20
Loss: 2432799436.09622Epoch:40
Valid Loss: 2090321211.8213058Epoch:40
Loss: 2391088845.85567Epoch:60
Valid Loss: 1990942727.9175258Epoch:60
Loss: 2284166063.065292Epoch:80
Valid Loss: 2163259427.1890035Epoch:80
FOLD 1
--------------------------------
Loss: 2206484537.182131Epoch:0
Valid Loss: 2204314296.742268Epoch:0
Loss: 1998103985.2646048Epoch:20
Valid Loss: 2287907708.9209623Epoch:20
Loss: 2017798130.1443298Epoch:40
Valid Loss: 2118876600.3024056Epoch:40
Loss: 1974789904.274914Epoch:60
Valid Loss: 2002172580.5085912Epoch:60
Loss: 1853698483.4639175Epoch:80
Valid Loss: 1899964549.718213Epoch:80
FOLD 2
--------------------------------
Loss: 1769865234.5841925Epoch:0
Valid Loss: 1894314067.573883Epoch:0
Loss: 1821377592.742268Epoch:20
Valid Loss: 2032567193.95189Epoch:20
Loss: 1743828709.6082475Epoch:40
Valid Loss: 2015808563.024055Epoch:40
Loss: 1722286362.8316152Epoch:60
Valid Loss: 1898440543.0103092Epoch:60
Loss: 1575469064.7972507Epoch:80
Valid Loss: 1905074117.9381444Epoch:80
FOLD 3
--------------------------------
Loss: 1642522463.8900344Epoch:0
Valid Loss: 1701700081.4845362Epoch:0
Loss: 1600776652.975945Epoch:20
Valid Loss: 1750697497.5120275Epoch:20
Loss: 1523211206.1580756Epoch:40
Valid Loss: 1828443650.8591065Epoch:40
Loss: 1462216860.5910654Epoch:60
Valid Loss: 1648079266.3092782Epoch:60
Loss: 1537743156.5635738Epoch:80
Valid Loss: 1658678443.1065292Epoch:80
FOLD 4
--------------------------------
Loss: 1444227294.790378Epoch:0
Valid Loss: 1351599504.274914Epoch:0
Loss: 1354902019.079038Epoch:20
Valid Loss: 1279266340.7285223Epoch:20
Loss: 1205427271.257732Epoch:40
Valid Loss: 1247919564.5360825Epoch:40
Loss: 1265103737.8419244Epoch:60
Valid Loss: 1256292511.0103092Epoch:60
Loss: 1193805096.467354Epoch:80
Valid Loss: 1190893154.969072Epoch:80
Training Ended
Train Average Loss: 1595598623.2302403 Valid Average Loss: 1713117385.5450172