Because CUDA calls in PyTorch are asynchronous, the code must be synchronized with torch.cuda.synchronize() before starting or stopping a timer.

 

import torch

# Example inputs: any CUDA tensors work here
x = torch.randn(4096, 4096, device="cuda")
y = torch.randn(4096, 4096, device="cuda")

# CUDA events record timestamps directly on the GPU stream
start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)

start.record()
z = x + y
end.record()

# Waits for everything to finish running
torch.cuda.synchronize()

print(start.elapsed_time(end))  # elapsed time in milliseconds
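
If you time with a host-side clock such as time.perf_counter() instead of CUDA events, the same rule from above applies: call torch.cuda.synchronize() right before reading the clock at both ends, so that no pending GPU work leaks into (or out of) the measured interval. Below is a minimal sketch of that pattern; the tensor shapes and the matmul workload are only illustrative.

import time
import torch

x = torch.randn(4096, 4096, device="cuda")
y = torch.randn(4096, 4096, device="cuda")

torch.cuda.synchronize()   # make sure no earlier GPU work is still running
t0 = time.perf_counter()

z = x @ y                  # the operation being timed

torch.cuda.synchronize()   # wait until the kernel actually finishes
t1 = time.perf_counter()

print((t1 - t0) * 1000)    # elapsed time in milliseconds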

 

 

Reference 1: Best way to measure timing? https://discuss.pytorch.org/t/best-way-to-measure-timing/39496

Reference 2: How to measure time in PyTorch https://discuss.pytorch.org/t/how-to-measure-time-in-pytorch/26964/2

 
