Hey, I have a probably very simple question, but I couldn't find answers to it when searching. I've been testing how PyTorch manages GPU memory and found what looks like a memory leak that I'm not sure how to resolve. Consider this piece of code:
a = torch.randn(1000000, 10)
a = a.to(device=device)  # device here is my GPU
b = torch.randn(1000000, 10)
b = b.to(device=device)

def sum2(c, d):
    return c + d
Now the GPU's memory usage, as measured both by nvidia-smi and by torch.cuda.memory_allocated(), increases by 80 MiB (40 MiB each for a and b).
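For context, this is roughly how I am reading the numbers (the report helper is just something I wrote for this post; I also print memory_reserved because the caching allocator can hold more than is currently allocated, which is closer to what nvidia-smi shows):

import torch

def report(tag, device):
    # memory_allocated: bytes currently occupied by live tensors
    # memory_reserved: bytes held by PyTorch's caching allocator (closer to nvidia-smi)
    alloc = torch.cuda.memory_allocated(device) / 2**20
    reserved = torch.cuda.memory_reserved(device) / 2**20
    print(f"{tag}: allocated={alloc:.1f} MiB, reserved={reserved:.1f} MiB")

report("after creating a and b", device)  # shows the ~80 MiB from a and b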
Now if I repeatedly call sum2(a, b) without storing the result (I am doing this in a Jupyter notebook, not sure if that is relevant), the GPU memory usage keeps increasing by 40 MiB with every call, which suggests the result is somehow being kept on the GPU even though the result tensor isn't referenced anywhere and is effectively "lost".
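Concretely, the calls look like this: a notebook cell containing nothing but the call, re-run several times (report is the measurement helper sketched above, and the per-call numbers are approximate):

# Cell A: baseline after creating a and b
report("baseline", device)              # ~80 MiB allocated

# Cell B: just the call, result not assigned to anything;
# re-running this cell keeps adding ~40 MiB per run
sum2(a, b)

# Cell C: check again after a few runs of Cell B
report("after repeated calls", device)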
This behavior persists even when I wrap the return c + d in a with torch.no_grad(): block.
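In other words, what I tried is essentially this (sum2_nograd is just a renamed copy for clarity), and the per-call growth is the same:

def sum2_nograd(c, d):
    # autograd tracking is disabled, but memory still grows by ~40 MiB per call
    with torch.no_grad():
        return c + d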
Any ideas what is going on here, and how can I make sure memory doesn't balloon up like this?
https://stackoverflow.com/questions/66947763/gpu-memory-usage-accumulating-when-calling-function-even-though-tensor-isnt-sto