FishC Forum

Error when training a neural network on the GPU

Posted on 2024-10-26 23:11:05

import torch
import torch.nn as nn
import torch.optim as optim

# Add 1 before taking the log to avoid log(0)
def MSLE_loss(pred, target):
    log_pred = torch.log(pred + 1)
    log_target = torch.log(target + 1)
    loss = nn.MSELoss()(log_pred, log_target)
    return loss

model = Net().cuda()
optimizer = optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-3)
for epoch in range(3500):
    pred = model(X_train)
    pred = pred.squeeze()
    loss = MSLE_loss(pred, y_train)

    if epoch % 500 == 0:
        print(loss)
        print(pred)
        print(y_train)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()


The error message is as follows:
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[54], line 11
      9 optimizer = optim.AdamW(model.parameters(),lr=1e-3,weight_decay=1e-3)
     10 for epoch in range(3500):
---> 11     pred = model(X_train)
     12     pred = pred.squeeze()
     13     loss = MSLE_loss(pred,y_train)

File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1553, in Module._wrapped_call_impl(self, *args, **kwargs)
   1551     return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1552 else:
-> 1553     return self._call_impl(*args, **kwargs)

File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1562, in Module._call_impl(self, *args, **kwargs)
   1557 # If we don't have any hooks, we want to skip the rest of the logic in
   1558 # this function, and just call forward.
   1559 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1560         or _global_backward_pre_hooks or _global_backward_hooks
   1561         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1562     return forward_call(*args, **kwargs)
   1564 try:
   1565     result = None

Cell In[45], line 23, in Net.forward(self, x)
     21 def forward(self,x):
     22     inputs = x
---> 23     layer1_out = torch.nn.functional.gelu(self.hidden_layer1(inputs))
     24     layer1_out = self.dropout(layer1_out)
     25     layer2_out = torch.nn.functional.gelu(self.hidden_layer2(layer1_out))

File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1553, in Module._wrapped_call_impl(self, *args, **kwargs)
   1551     return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1552 else:
-> 1553     return self._call_impl(*args, **kwargs)

File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1562, in Module._call_impl(self, *args, **kwargs)
   1557 # If we don't have any hooks, we want to skip the rest of the logic in
   1558 # this function, and just call forward.
   1559 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1560         or _global_backward_pre_hooks or _global_backward_hooks
   1561         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1562     return forward_call(*args, **kwargs)
   1564 try:
   1565     result = None

File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/linear.py:117, in Linear.forward(self, input)
    116 def forward(self, input: Tensor) -> Tensor:
--> 117     return F.linear(input, self.weight, self.bias)

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)

Reply posted on 2024-10-26 23:11:10
The error shows that your tensors are on different devices: some are on the GPU (cuda:0) and some are on the CPU. Here is how to fix it:

1. **Move the data to the GPU**: make sure `X_train` and `y_train` are on the GPU when they are used. Move them with the `.cuda()` method, for example:

  
X_train = X_train.cuda()
y_train = y_train.cuda()
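
As a quick sanity check (a minimal sketch, using the variable names from the post above), you can print each tensor's device to confirm the mismatch before the move and verify the fix afterwards:

# Where do the model's weights and the data currently live?
print(next(model.parameters()).device)  # cuda:0 after model = Net().cuda()
print(X_train.device, y_train.device)   # cpu before the move, cuda:0 after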


2. **Check the model parameters and other tensors**: make sure all of the model's parameters and all input tensors live on the same device. When defining the model and loading the data, you can call `.to(device)` directly, where `device` is the compute device you have chosen, e.g.:

  
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = Net().to(device)
X_train = X_train.to(device)
y_train = y_train.to(device)


Making sure all computation happens on the same device will resolve this error. The `.to(device)` pattern has the added benefit that the same script still runs on a CPU-only machine.
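
For reference, here is a minimal sketch of the training loop from the original post with the fix applied; it assumes `Net`, `MSLE_loss`, `X_train`, and `y_train` are defined exactly as in that post:

import torch
import torch.nn as nn
import torch.optim as optim

# Pick one device and move the model and the data to it
# (falls back to the CPU when no GPU is available).
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = Net().to(device)      # model parameters on `device`
X_train = X_train.to(device)  # inputs on the same device
y_train = y_train.to(device)  # targets on the same device

optimizer = optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-3)
for epoch in range(3500):
    pred = model(X_train).squeeze()
    loss = MSLE_loss(pred, y_train)

    if epoch % 500 == 0:
        print(loss.item())  # .item() prints a plain scalar

    optimizer.zero_grad()   # clear old gradients (zeroing after step(), as in the post, also works)
    loss.backward()
    optimizer.step()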

The reply above comes from the FishC AI assistant. If it did not correctly answer your question, please follow up.