[TensorRT] Upsample scale_factor issue when converting ONNX to TensorRT
After exporting a PyTorch model to ONNX,
converting the ONNX model to a TensorRT model sometimes fails with the errors below.
[TensorRT] ERROR: Network must have at least one output
[TensorRT] ERROR: Network validation failed.
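This error appears when the network has been read through the ONNX parser
and an unsupported node is encountered on the way to a TensorRT engine.
The parser stops reading the network at that node and returns, so no output tensor is ever marked, and TensorRT complains that the network needs at least one output.
To find where parsing stopped, go through the TensorRT log and compare the nodes of the ONNX model with the nodes TensorRT actually read. This takes a little practice.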
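One way to do that comparison without reading the whole log by eye is sketched below. It assumes the verbose log format shown in this post (a `Parsing node: [Op]` line followed later by an `[Op] outputs:` line); `find_unconverted_nodes` is a hypothetical helper, not part of TensorRT, and other parser versions may log differently.

```python
import re

# Patterns assume the verbose ONNX parser log format shown in this post.
PARSE_RE = re.compile(r"Parsing node: \[(\w+)\]")
OUTPUT_RE = re.compile(r"\[(\w+)\] outputs:")


def find_unconverted_nodes(log_text):
    """Return (index, op_type) for parsed nodes that never logged outputs."""
    parsed = []        # op types in the order the parser visited them
    completed = set()  # indices of nodes whose "outputs:" line appeared
    for line in log_text.splitlines():
        match = PARSE_RE.search(line)
        if match:
            parsed.append(match.group(1))
            continue
        match = OUTPUT_RE.search(line)
        if match and parsed and match.group(1) == parsed[-1]:
            completed.add(len(parsed) - 1)
    return [(i, op) for i, op in enumerate(parsed) if i not in completed]


sample_log = """\
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Concat]
[TensorRT] VERBOSE: ModelImporter.cpp:179: [Concat] outputs: [26 -> (4)],
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Upsample]
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Upsample] inputs: [input -> (1, 5, 100, 100)], [26 -> (4)],
Completed parsing of ONNX file
"""
print(find_unconverted_nodes(sample_log))
```

On this excerpt the helper reports the Upsample node, the one that was parsed but never produced an output line.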
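In my case the conversion was failing at the upsample layer. Looking through the related issues,
I found that TensorRT does not support it because the scales derived from scale_factor are not constant.
This applies to upsampling with scale_factor in general, whether the mode is nearest, bilinear, or another.
As in other cases, TensorRT seems to prefer things that are constant, frozen, and explicit.
I tested the two models below.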
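import torch
import torch.nn as nn

class upsample_test(nn.Module):
    def forward(self, x):
        # scale_factor: the output size is derived from the input shape
        return torch.nn.functional.interpolate(x, mode='nearest', scale_factor=2)

class upsample_test_NoScaleFactor(nn.Module):
    def forward(self, x):
        # size: the output spatial size is given explicitly
        return torch.nn.functional.interpolate(x, mode='nearest', size=200)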
1. upsample_test (with scale_factor)
The ONNX model structure of the upsample with scale_factor is as follows.
graph(%input : Float(1, 5, 100, 100)):
%1 : Tensor = onnx::Shape(%input)
%2 : Tensor = onnx::Constant[value={2}]()
%3 : Long() = onnx::Gather[axis=0](%1, %2) # /home/seohee/.local/lib/python3.5/site-packages/torch/nn/functional.py:2493:0
%4 : Float() = onnx::Cast[to=1](%3) # /home/seohee/.local/lib/python3.5/site-packages/torch/nn/functional.py:2493:0
%5 : Float() = onnx::Constant[value={2}]()
%6 : Float() = onnx::Mul(%4, %5)
%7 : Float() = onnx::Cast[to=1](%6) # /home/seohee/.local/lib/python3.5/site-packages/torch/nn/functional.py:2493:0
%8 : Float() = onnx::Floor(%7) # /home/seohee/.local/lib/python3.5/site-packages/torch/nn/functional.py:2493:0
%9 : Tensor = onnx::Shape(%input)
%10 : Tensor = onnx::Constant[value={3}]()
%11 : Long() = onnx::Gather[axis=0](%9, %10) # /home/seohee/.local/lib/python3.5/site-packages/torch/nn/functional.py:2493:0
%12 : Float() = onnx::Cast[to=1](%11) # /home/seohee/.local/lib/python3.5/site-packages/torch/nn/functional.py:2493:0
%13 : Float() = onnx::Constant[value={2}]()
%14 : Float() = onnx::Mul(%12, %13)
%15 : Float() = onnx::Cast[to=1](%14) # /home/seohee/.local/lib/python3.5/site-packages/torch/nn/functional.py:2493:0
%16 : Float() = onnx::Floor(%15) # /home/seohee/.local/lib/python3.5/site-packages/torch/nn/functional.py:2493:0
%17 : Tensor = onnx::Unsqueeze[axes=[0]](%8)
%18 : Tensor = onnx::Unsqueeze[axes=[0]](%16)
%19 : Tensor = onnx::Concat[axis=0](%17, %18)
%20 : Tensor = onnx::Constant[value= 1 1 [ CPUFloatType{2} ]]()
%21 : Tensor = onnx::Cast[to=1](%19)
%22 : Tensor = onnx::Shape(%input)
%23 : Tensor = onnx::Slice[axes=[0], ends=[9223372036854775807], starts=[2]](%22)
%24 : Tensor = onnx::Cast[to=1](%23)
%25 : Tensor = onnx::Div(%21, %24)
%26 : Tensor = onnx::Concat[axis=0](%20, %25)
%27 : Float(1, 5, 200, 200) = onnx::Upsample[mode="nearest"](%input, %26) # /home/seohee/.local/lib/python3.5/site-packages/torch/nn/functional.py:2512:0
return (%27)
Done.
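Read as plain arithmetic, the graph above derives the Upsample scales from the input shape at run time instead of storing them as a constant. The sketch below mirrors that chain in plain Python for illustration only; `upsample_scales_from_graph` is a hypothetical name, and the mapping of each step to a graph node is one reading of the dump, not something TensorRT or PyTorch provides.

```python
import math


def upsample_scales_from_graph(input_shape, scale_factor):
    """Mirror the Shape/Gather/Mul/Floor/Div/Concat chain in plain Python."""
    height, width = input_shape[2], input_shape[3]       # Shape + Gather (index 2, 3)
    out_h = math.floor(float(height) * scale_factor)     # Cast -> Mul -> Floor
    out_w = math.floor(float(width) * scale_factor)
    spatial = [float(out_h) / height, float(out_w) / width]  # Div by Slice(shape, starts=2)
    return [1.0, 1.0] + spatial                          # Concat with the constant [1, 1]


print(upsample_scales_from_graph((1, 5, 100, 100), 2))
```

For the (1, 5, 100, 100) input this yields the same values, 1 1 2 2, that the size-based export stores directly as a constant; the difference is only that here they come out of a chain of shape-dependent nodes.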
Parsing it with the ONNX parser in TensorRT gives the following.
2020-09-02 16:58:12 - __main__ - INFO - TRT_LOGGER Verbosity: Severity.VERBOSE
Beginning ONNX file parsing
[TensorRT] VERBOSE: Registered plugin creator - ::GridAnchor_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::NMS_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Reorg_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Region_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Clip_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::LReLU_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::PriorBox_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Normalize_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::RPROI_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::BatchedNMS_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::FlattenConcat_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::CropAndResize version 1
[TensorRT] VERBOSE: Registered plugin creator - ::DetectionLayer_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Proposal version 1
[TensorRT] VERBOSE: Registered plugin creator - ::ProposalLayer_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::PyramidROIAlign_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::ResizeNearest_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Split version 1
[TensorRT] VERBOSE: Registered plugin creator - ::SpecialSlice_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::InstanceNormalization_TRT version 1
[TensorRT] VERBOSE: ModelImporter.cpp:202: Adding network input: input with dtype: float32, dimensions: (1, 5, 100, 100)
[TensorRT] VERBOSE: ImporterContext.hpp:116: Registering tensor: input for ONNX tensor: input
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Shape]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: input
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Shape] inputs: [input -> (1, 5, 100, 100)],
[TensorRT] VERBOSE: ImporterContext.hpp:141: Registering layer: (Unnamed Layer* 0) [Shape] for ONNX node:
[TensorRT] VERBOSE: ImporterContext.hpp:116: Registering tensor: 1 for ONNX tensor: 1
[TensorRT] VERBOSE: ModelImporter.cpp:179: [Shape] outputs: [1 -> (4)],
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Constant]
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Constant] inputs:
[TensorRT] WARNING: onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[TensorRT] VERBOSE: ModelImporter.cpp:179: [Constant] outputs: [2 -> ()],
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Gather]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 1
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 2
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Gather] inputs: [1 -> (4)], [2 -> ()],
[TensorRT] VERBOSE: builtin_op_importers.cpp:986: Using Gather axis: 0
[TensorRT] VERBOSE: ImporterContext.hpp:141: Registering layer: (Unnamed Layer* 2) [Gather] for ONNX node:
[TensorRT] VERBOSE: ImporterContext.hpp:116: Registering tensor: 3 for ONNX tensor: 3
[TensorRT] VERBOSE: ModelImporter.cpp:179: [Gather] outputs: [3 -> ()],
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Cast]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 3
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Cast] inputs: [3 -> ()],
[TensorRT] VERBOSE: builtin_op_importers.cpp:320: Casting to type: float32
[TensorRT] VERBOSE: ImporterContext.hpp:141: Registering layer: (Unnamed Layer* 3) [Identity] for ONNX node:
[TensorRT] VERBOSE: ImporterContext.hpp:116: Registering tensor: 4 for ONNX tensor: 4
[TensorRT] VERBOSE: ModelImporter.cpp:179: [Cast] outputs: [4 -> ()],
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Constant]
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Constant] inputs:
[TensorRT] VERBOSE: ModelImporter.cpp:179: [Constant] outputs: [5 -> ()],
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Mul]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 4
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 5
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Mul] inputs: [4 -> ()], [5 -> ()],
[TensorRT] VERBOSE: ImporterContext.hpp:141: Registering layer: (Unnamed Layer* 5) [ElementWise] for ONNX node:
[TensorRT] VERBOSE: ImporterContext.hpp:116: Registering tensor: 6 for ONNX tensor: 6
[TensorRT] VERBOSE: ModelImporter.cpp:179: [Mul] outputs: [6 -> ()],
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Cast]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 6
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Cast] inputs: [6 -> ()],
[TensorRT] VERBOSE: builtin_op_importers.cpp:320: Casting to type: float32
[TensorRT] VERBOSE: ImporterContext.hpp:141: Registering layer: (Unnamed Layer* 6) [Identity] for ONNX node:
[TensorRT] VERBOSE: ImporterContext.hpp:116: Registering tensor: 7 for ONNX tensor: 7
[TensorRT] VERBOSE: ModelImporter.cpp:179: [Cast] outputs: [7 -> ()],
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Floor]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 7
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Floor] inputs: [7 -> ()],
[TensorRT] VERBOSE: onnx2trt_utils.cpp:1793: Original shape: (), unsqueezing to: (1,)
[TensorRT] VERBOSE: ImporterContext.hpp:141: Registering layer: (Unnamed Layer* 8) [Unary] for ONNX node:
[TensorRT] VERBOSE: onnx2trt_utils.cpp:1641: Original shape: (1,), squeezing to: ()
[TensorRT] VERBOSE: ImporterContext.hpp:116: Registering tensor: 8 for ONNX tensor: 8
[TensorRT] VERBOSE: ModelImporter.cpp:179: [Floor] outputs: [8 -> ()],
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Shape]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: input
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Shape] inputs: [input -> (1, 5, 100, 100)],
[TensorRT] VERBOSE: ImporterContext.hpp:141: Registering layer: (Unnamed Layer* 10) [Shape] for ONNX node:
[TensorRT] VERBOSE: ImporterContext.hpp:116: Registering tensor: 9 for ONNX tensor: 9
[TensorRT] VERBOSE: ModelImporter.cpp:179: [Shape] outputs: [9 -> (4)],
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Constant]
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Constant] inputs:
[TensorRT] VERBOSE: ModelImporter.cpp:179: [Constant] outputs: [10 -> ()],
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Gather]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 9
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 10
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Gather] inputs: [9 -> (4)], [10 -> ()],
[TensorRT] VERBOSE: builtin_op_importers.cpp:986: Using Gather axis: 0
[TensorRT] VERBOSE: ImporterContext.hpp:141: Registering layer: (Unnamed Layer* 12) [Gather] for ONNX node:
[TensorRT] VERBOSE: ImporterContext.hpp:116: Registering tensor: 11 for ONNX tensor: 11
[TensorRT] VERBOSE: ModelImporter.cpp:179: [Gather] outputs: [11 -> ()],
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Cast]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 11
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Cast] inputs: [11 -> ()],
[TensorRT] VERBOSE: builtin_op_importers.cpp:320: Casting to type: float32
[TensorRT] VERBOSE: ImporterContext.hpp:141: Registering layer: (Unnamed Layer* 13) [Identity] for ONNX node:
[TensorRT] VERBOSE: ImporterContext.hpp:116: Registering tensor: 12 for ONNX tensor: 12
[TensorRT] VERBOSE: ModelImporter.cpp:179: [Cast] outputs: [12 -> ()],
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Constant]
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Constant] inputs:
[TensorRT] VERBOSE: ModelImporter.cpp:179: [Constant] outputs: [13 -> ()],
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Mul]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 12
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 13
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Mul] inputs: [12 -> ()], [13 -> ()],
[TensorRT] VERBOSE: ImporterContext.hpp:141: Registering layer: (Unnamed Layer* 15) [ElementWise] for ONNX node:
[TensorRT] VERBOSE: ImporterContext.hpp:116: Registering tensor: 14 for ONNX tensor: 14
[TensorRT] VERBOSE: ModelImporter.cpp:179: [Mul] outputs: [14 -> ()],
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Cast]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 14
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Cast] inputs: [14 -> ()],
[TensorRT] VERBOSE: builtin_op_importers.cpp:320: Casting to type: float32
[TensorRT] VERBOSE: ImporterContext.hpp:141: Registering layer: (Unnamed Layer* 16) [Identity] for ONNX node:
[TensorRT] VERBOSE: ImporterContext.hpp:116: Registering tensor: 15 for ONNX tensor: 15
[TensorRT] VERBOSE: ModelImporter.cpp:179: [Cast] outputs: [15 -> ()],
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Floor]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 15
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Floor] inputs: [15 -> ()],
[TensorRT] VERBOSE: onnx2trt_utils.cpp:1793: Original shape: (), unsqueezing to: (1,)
[TensorRT] VERBOSE: ImporterContext.hpp:141: Registering layer: (Unnamed Layer* 18) [Unary] for ONNX node:
[TensorRT] VERBOSE: onnx2trt_utils.cpp:1641: Original shape: (1,), squeezing to: ()
[TensorRT] VERBOSE: ImporterContext.hpp:116: Registering tensor: 16 for ONNX tensor: 16
[TensorRT] VERBOSE: ModelImporter.cpp:179: [Floor] outputs: [16 -> ()],
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Unsqueeze]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 8
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Unsqueeze] inputs: [8 -> ()],
[TensorRT] VERBOSE: onnx2trt_utils.cpp:1793: Original shape: (), unsqueezing to: (1,)
[TensorRT] VERBOSE: ImporterContext.hpp:141: Registering layer: (Unnamed Layer* 20) [Shuffle] for ONNX node:
[TensorRT] VERBOSE: ImporterContext.hpp:116: Registering tensor: 17 for ONNX tensor: 17
[TensorRT] VERBOSE: ModelImporter.cpp:179: [Unsqueeze] outputs: [17 -> (1)],
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Unsqueeze]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 16
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Unsqueeze] inputs: [16 -> ()],
[TensorRT] VERBOSE: onnx2trt_utils.cpp:1793: Original shape: (), unsqueezing to: (1,)
[TensorRT] VERBOSE: ImporterContext.hpp:141: Registering layer: (Unnamed Layer* 21) [Shuffle] for ONNX node:
[TensorRT] VERBOSE: ImporterContext.hpp:116: Registering tensor: 18 for ONNX tensor: 18
[TensorRT] VERBOSE: ModelImporter.cpp:179: [Unsqueeze] outputs: [18 -> (1)],
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Concat]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 17
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 18
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Concat] inputs: [17 -> (1)], [18 -> (1)],
[TensorRT] VERBOSE: ImporterContext.hpp:141: Registering layer: (Unnamed Layer* 22) [Concatenation] for ONNX node:
[TensorRT] VERBOSE: ImporterContext.hpp:116: Registering tensor: 19 for ONNX tensor: 19
[TensorRT] VERBOSE: ModelImporter.cpp:179: [Concat] outputs: [19 -> (2)],
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Constant]
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Constant] inputs:
[TensorRT] VERBOSE: ModelImporter.cpp:179: [Constant] outputs: [20 -> (2)],
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Cast]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 19
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Cast] inputs: [19 -> (2)],
[TensorRT] VERBOSE: builtin_op_importers.cpp:320: Casting to type: float32
[TensorRT] VERBOSE: ImporterContext.hpp:141: Registering layer: (Unnamed Layer* 23) [Identity] for ONNX node:
[TensorRT] VERBOSE: ImporterContext.hpp:116: Registering tensor: 21 for ONNX tensor: 21
[TensorRT] VERBOSE: ModelImporter.cpp:179: [Cast] outputs: [21 -> (2)],
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Shape]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: input
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Shape] inputs: [input -> (1, 5, 100, 100)],
[TensorRT] VERBOSE: ImporterContext.hpp:141: Registering layer: (Unnamed Layer* 24) [Shape] for ONNX node:
[TensorRT] VERBOSE: ImporterContext.hpp:116: Registering tensor: 22 for ONNX tensor: 22
[TensorRT] VERBOSE: ModelImporter.cpp:179: [Shape] outputs: [22 -> (4)],
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Slice]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 22
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Slice] inputs: [22 -> (4)],
[TensorRT] VERBOSE: ImporterContext.hpp:141: Registering layer: (Unnamed Layer* 25) [Slice] for ONNX node:
[TensorRT] VERBOSE: ImporterContext.hpp:116: Registering tensor: 23 for ONNX tensor: 23
[TensorRT] VERBOSE: ModelImporter.cpp:179: [Slice] outputs: [23 -> (2)],
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Cast]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 23
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Cast] inputs: [23 -> (2)],
[TensorRT] VERBOSE: builtin_op_importers.cpp:320: Casting to type: float32
[TensorRT] VERBOSE: ImporterContext.hpp:141: Registering layer: (Unnamed Layer* 26) [Identity] for ONNX node:
[TensorRT] VERBOSE: ImporterContext.hpp:116: Registering tensor: 24 for ONNX tensor: 24
[TensorRT] VERBOSE: ModelImporter.cpp:179: [Cast] outputs: [24 -> (2)],
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Div]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 21
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 24
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Div] inputs: [21 -> (2)], [24 -> (2)],
[TensorRT] VERBOSE: ImporterContext.hpp:141: Registering layer: (Unnamed Layer* 27) [ElementWise] for ONNX node:
[TensorRT] VERBOSE: ImporterContext.hpp:116: Registering tensor: 25 for ONNX tensor: 25
[TensorRT] VERBOSE: ModelImporter.cpp:179: [Div] outputs: [25 -> (2)],
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Concat]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 20
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 25
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Concat] inputs: [20 -> (2)], [25 -> (2)],
[TensorRT] VERBOSE: ImporterContext.hpp:141: Registering layer: (Unnamed Layer* 29) [Concatenation] for ONNX node:
[TensorRT] VERBOSE: ImporterContext.hpp:116: Registering tensor: 26 for ONNX tensor: 26
[TensorRT] VERBOSE: ModelImporter.cpp:179: [Concat] outputs: [26 -> (4)],
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Upsample]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: input
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 26
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Upsample] inputs: [input -> (1, 5, 100, 100)], [26 -> (4)],
Completed parsing of ONNX file
The visualization of this ONNX model is as follows.
The error raised when building an engine from the parsed result is as follows.
Nothing past the input is read into the network.
Building an engine from file ./mini_onnx_model/upsample_test1.onnx; this may take a while...
2020-09-02 16:58:13 - __main__ - WARNING - No output nodes found, marking last layer's outputs as network outputs. Correct this if wrong.
2020-09-02 16:58:13 - __main__ - DEBUG - === Network Description ===
2020-09-02 16:58:13 - __main__ - DEBUG - Input 0 | Name: input | Shape: (1, 5, 100, 100)
[TensorRT] ERROR: Network must have at least one output
[TensorRT] ERROR: Network validation failed.
Traceback (most recent call last):
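The "at least one output" error matches the parse log above: it never contains the `Marking ... as output` line that a successful parse logs. A small sketch for checking this before building the engine, assuming that log wording; `marked_outputs` is a hypothetical helper, not a TensorRT API.

```python
import re

# Assumes the parser logs "Marking <trt_name> as output: <onnx_name>" on success.
MARK_RE = re.compile(r"Marking (\S+) as output: (\S+)")


def marked_outputs(log_text):
    """Return the ONNX output names the parser marked as network outputs."""
    return [match.group(2) for match in MARK_RE.finditer(log_text)]


failing_log = (
    "[TensorRT] VERBOSE: ModelImporter.cpp:125: [Upsample] inputs: "
    "[input -> (1, 5, 100, 100)], [26 -> (4)],\n"
    "Completed parsing of ONNX file\n"
)
working_log = (
    "[TensorRT] VERBOSE: ModelImporter.cpp:507: Marking 2_1 as output: 2\n"
    "Completed parsing of ONNX file\n"
)
print(marked_outputs(failing_log), marked_outputs(working_log))
```

An empty list means the build will fail with the error shown above, so the parse log is worth checking first.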
2. upsample_test_NoScaleFactor (without scale_factor)
The ONNX model structure without scale_factor is as follows.
graph(%x : Float(1, 5, 100, 100)):
%1 : Tensor = onnx::Constant[value= 1 1 2 2 [ CPUFloatType{4} ]]()
%2 : Float(1, 5, 200, 200) = onnx::Upsample[mode="nearest"](%x, %1) # /home/seohee/.local/lib/python3.5/site-packages/torch/nn/functional.py:2512:0
return (%2)
Parsing it with the ONNX parser in TensorRT gives the following.
2020-09-02 16:58:34 - __main__ - INFO - TRT_LOGGER Verbosity: Severity.VERBOSE
Beginning ONNX file parsing
[TensorRT] VERBOSE: Registered plugin creator - ::GridAnchor_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::NMS_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Reorg_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Region_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Clip_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::LReLU_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::PriorBox_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Normalize_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::RPROI_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::BatchedNMS_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::FlattenConcat_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::CropAndResize version 1
[TensorRT] VERBOSE: Registered plugin creator - ::DetectionLayer_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Proposal version 1
[TensorRT] VERBOSE: Registered plugin creator - ::ProposalLayer_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::PyramidROIAlign_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::ResizeNearest_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Split version 1
[TensorRT] VERBOSE: Registered plugin creator - ::SpecialSlice_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::InstanceNormalization_TRT version 1
[TensorRT] VERBOSE: ModelImporter.cpp:202: Adding network input: x with dtype: float32, dimensions: (1, 5, 100, 100)
[TensorRT] VERBOSE: ImporterContext.hpp:116: Registering tensor: x for ONNX tensor: x
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Constant]
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Constant] inputs:
[TensorRT] VERBOSE: ModelImporter.cpp:179: [Constant] outputs: [1 -> (4)],
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Upsample]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: x
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 1
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Upsample] inputs: [x -> (1, 5, 100, 100)], [1 -> (4)],
[TensorRT] VERBOSE: ImporterContext.hpp:141: Registering layer: (Unnamed Layer* 0) [Resize] for ONNX node:
[TensorRT] VERBOSE: ImporterContext.hpp:116: Registering tensor: 2_1 for ONNX tensor: 2
[TensorRT] VERBOSE: ModelImporter.cpp:179: [Upsample] outputs: [2 -> (1, 5, 200, 200)],
[TensorRT] VERBOSE: ModelImporter.cpp:507: Marking 2_1 as output: 2
Completed parsing of ONNX file
The visualization of this ONNX model is as follows.
Building this model into a TensorRT engine gives the following. The conversion succeeded.
2020-09-02 16:58:34 - __main__ - DEBUG - === Network Description ===
2020-09-02 16:58:34 - __main__ - DEBUG - Input 0 | Name: x | Shape: (1, 5, 100, 100)
2020-09-02 16:58:34 - __main__ - DEBUG - Output 0 | Name: 2 | Shape: (1, 5, 200, 200)
[TensorRT] VERBOSE: Applying generic optimizations to the graph for inference.
[TensorRT] VERBOSE: Original: 1 layers
[TensorRT] VERBOSE: After dead-layer removal: 1 layers
[TensorRT] VERBOSE: After Myelin optimization: 1 layers
[TensorRT] VERBOSE: After scale fusion: 1 layers
[TensorRT] VERBOSE: After vertical fusions: 1 layers
[TensorRT] VERBOSE: After final dead-layer removal: 1 layers
[TensorRT] VERBOSE: After tensor merging: 1 layers
[TensorRT] VERBOSE: After concat removal: 1 layers
[TensorRT] VERBOSE: Graph construction and optimization completed in 0.0015973 seconds.
[TensorRT] VERBOSE: Constructing optimization profile number 0 [1/1].
[TensorRT] VERBOSE: *************** Autotuning format combination: Float(1,100,10000,50000) -> Float(1,200,40000,200000) ***************
[TensorRT] VERBOSE: --------------- Timing Runner: (Unnamed Layer* 0) [Resize] (Resize)
[TensorRT] VERBOSE: Tactic: 0 is the only option, timing skipped
[TensorRT] VERBOSE: Fastest Tactic: 0 Time: 0
[TensorRT] VERBOSE: Formats and tactics selection completed in 0.004643 seconds.
[TensorRT] VERBOSE: After reformat layers: 1 layers
[TensorRT] VERBOSE: Block size 1073741824
[TensorRT] VERBOSE: Total Activation Memory: 1073741824
[TensorRT] INFO: Detected 1 inputs and 1 output network tensors.
[TensorRT] VERBOSE: Layer: (Unnamed Layer* 0) [Resize] Weights: 0 HostPersistent: 0 DevicePersistent: 0
[TensorRT] VERBOSE: Total Host Persistent Memory: 0
[TensorRT] VERBOSE: Total Device Persistent Memory: 0
[TensorRT] VERBOSE: Total Weight Memory: 0
[TensorRT] VERBOSE: Builder timing cache: created 0 entries, 0 hit(s)
[TensorRT] VERBOSE: Engine generation completed in 0.69022 seconds.
[TensorRT] VERBOSE: Engine Layer Information:
[TensorRT] VERBOSE: Layer(Resize): (Unnamed Layer* 0) [Resize], Tactic: 0, x[Float(5,100,100)] -> 2[Float(5,200,200)]
Done. serialize !
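So for TensorRT conversion, upsample has to be implemented without scale_factor, and there seems to be no need to retrain the model in PyTorch. I confirmed that the model with only this implementation change produced the same results after conversion as the original model structure.
Therefore scale_factor can be used during training, and outside of training (test or deploy) it can be replaced with an implementation that avoids scale_factor.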
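The equivalence can be illustrated on a tiny example. The sketch below is a plain-Python toy model of nearest-neighbor upsampling with an integer factor, not PyTorch's implementation, and both function names are hypothetical. It shows that giving the factor and giving the doubled size pick the same source pixels; since the operation has no learned weights, swapping one form for the other does not call for retraining.

```python
def upsample_by_scale(img, scale):
    """Nearest-neighbor upsampling driven by an integer scale factor."""
    out_h, out_w = len(img) * scale, len(img[0]) * scale
    return [[img[i // scale][j // scale] for j in range(out_w)]
            for i in range(out_h)]


def upsample_by_size(img, out_h, out_w):
    """Nearest-neighbor upsampling driven by an explicit output size."""
    in_h, in_w = len(img), len(img[0])
    return [[img[i * in_h // out_h][j * in_w // out_w] for j in range(out_w)]
            for i in range(out_h)]


img = [[1, 2],
       [3, 4]]
by_scale = upsample_by_scale(img, 2)
by_size = upsample_by_size(img, 4, 4)
print(by_scale == by_size)
```

This only covers integer factors; with non-integer factors the two forms can round differently, which is outside what this post tested.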
In practice, though, the upsample size in a model cannot be fixed as in size=200. The tensor has to be upscaled progressively, 100 -> 200 -> 400 and so on, so hard-coding the output spatial size is not an option. An implementation like the following is needed.
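import torch
import torch.nn as nn
import torch.nn.functional as F

class upsample_test_without_scale_factor(nn.Module):
    def forward(self, x):
        # torch.tensor(x.shape) is recorded as a constant when the model is traced
        sh = torch.tensor(x.shape)
        return F.interpolate(x, size=(sh[2] * 2, sh[3] * 2), mode='nearest')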
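To make the progressive case concrete: deriving `size` from the input shape means each stage doubles whatever it receives, so no stage needs a hard-coded number. A minimal plain-Python sketch of the sizes such a chain would request (the helper `progressive_sizes` is hypothetical and only does the arithmetic; it does not call PyTorch):

```python
def progressive_sizes(height, width, num_stages, factor=2):
    """Return the (height, width) requested by each successive upsample stage."""
    sizes = []
    for _ in range(num_stages):
        # Each stage multiplies the size it receives from the previous stage.
        height, width = height * factor, width * factor
        sizes.append((height, width))
    return sizes


print(progressive_sizes(100, 100, 2))
```

Starting from 100 x 100, two stages request 200 x 200 and then 400 x 400, matching the 100 -> 200 -> 400 progression described above.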
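Note that exporting upsample_test_without_scale_factor this way produces the following warning. It should be enough to keep to this constant approach only when converting to TensorRT.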
TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
The ONNX model structure and the TensorRT engine build result are as follows. The visualization is the same as for the second implementation.
graph(%x : Float(1, 5, 100, 100)):
%1 : Tensor = onnx::Constant[value= 1 1 2 2 [ CPUFloatType{4} ]]()
%2 : Float(1, 5, 200, 200) = onnx::Upsample[mode="nearest"](%x, %1) # /home/seohee/.local/lib/python3.5/site-packages/torch/nn/functional.py:2512:0
return (%2)
2020-09-02 17:28:15 - __main__ - INFO - TRT_LOGGER Verbosity: Severity.VERBOSE
Beginning ONNX file parsing
[TensorRT] VERBOSE: Registered plugin creator - ::GridAnchor_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::NMS_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Reorg_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Region_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Clip_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::LReLU_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::PriorBox_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Normalize_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::RPROI_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::BatchedNMS_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::FlattenConcat_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::CropAndResize version 1
[TensorRT] VERBOSE: Registered plugin creator - ::DetectionLayer_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Proposal version 1
[TensorRT] VERBOSE: Registered plugin creator - ::ProposalLayer_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::PyramidROIAlign_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::ResizeNearest_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::Split version 1
[TensorRT] VERBOSE: Registered plugin creator - ::SpecialSlice_TRT version 1
[TensorRT] VERBOSE: Registered plugin creator - ::InstanceNormalization_TRT version 1
[TensorRT] VERBOSE: ModelImporter.cpp:202: Adding network input: x with dtype: float32, dimensions: (1, 5, 100, 100)
[TensorRT] VERBOSE: ImporterContext.hpp:116: Registering tensor: x for ONNX tensor: x
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Constant]
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Constant] inputs:
[TensorRT] VERBOSE: ModelImporter.cpp:179: [Constant] outputs: [1 -> (4)],
[TensorRT] VERBOSE: ModelImporter.cpp:103: Parsing node: [Upsample]
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: x
[TensorRT] VERBOSE: ModelImporter.cpp:119: Searching for input: 1
[TensorRT] VERBOSE: ModelImporter.cpp:125: [Upsample] inputs: [x -> (1, 5, 100, 100)], [1 -> (4)],
[TensorRT] VERBOSE: ImporterContext.hpp:141: Registering layer: (Unnamed Layer* 0) [Resize] for ONNX node:
[TensorRT] VERBOSE: ImporterContext.hpp:116: Registering tensor: 2_1 for ONNX tensor: 2
[TensorRT] VERBOSE: ModelImporter.cpp:179: [Upsample] outputs: [2 -> (1, 5, 200, 200)],
[TensorRT] VERBOSE: ModelImporter.cpp:507: Marking 2_1 as output: 2
Completed parsing of ONNX file
Building an engine from file ./mini_onnx_model/upsample_test4.onnx; this may take a while...
2020-09-02 17:28:16 - __main__ - DEBUG - === Network Description ===
2020-09-02 17:28:16 - __main__ - DEBUG - Input 0 | Name: x | Shape: (1, 5, 100, 100)
2020-09-02 17:28:16 - __main__ - DEBUG - Output 0 | Name: 2 | Shape: (1, 5, 200, 200)
[TensorRT] VERBOSE: Applying generic optimizations to the graph for inference.
[TensorRT] VERBOSE: Original: 1 layers
[TensorRT] VERBOSE: After dead-layer removal: 1 layers
[TensorRT] VERBOSE: After Myelin optimization: 1 layers
[TensorRT] VERBOSE: After scale fusion: 1 layers
[TensorRT] VERBOSE: After vertical fusions: 1 layers
[TensorRT] VERBOSE: After final dead-layer removal: 1 layers
[TensorRT] VERBOSE: After tensor merging: 1 layers
[TensorRT] VERBOSE: After concat removal: 1 layers
[TensorRT] VERBOSE: Graph construction and optimization completed in 0.000148775 seconds.
[TensorRT] VERBOSE: Constructing optimization profile number 0 [1/1].
[TensorRT] VERBOSE: *************** Autotuning format combination: Float(1,100,10000,50000) -> Float(1,200,40000,200000) ***************
[TensorRT] VERBOSE: --------------- Timing Runner: (Unnamed Layer* 0) [Resize] (Resize)
[TensorRT] VERBOSE: Tactic: 0 is the only option, timing skipped
[TensorRT] VERBOSE: Fastest Tactic: 0 Time: 0
[TensorRT] VERBOSE: Formats and tactics selection completed in 0.00421987 seconds.
[TensorRT] VERBOSE: After reformat layers: 1 layers
[TensorRT] VERBOSE: Block size 1073741824
[TensorRT] VERBOSE: Total Activation Memory: 1073741824
[TensorRT] INFO: Detected 1 inputs and 1 output network tensors.
[TensorRT] VERBOSE: Layer: (Unnamed Layer* 0) [Resize] Weights: 0 HostPersistent: 0 DevicePersistent: 0
[TensorRT] VERBOSE: Total Host Persistent Memory: 0
[TensorRT] VERBOSE: Total Device Persistent Memory: 0
[TensorRT] VERBOSE: Total Weight Memory: 0
[TensorRT] VERBOSE: Builder timing cache: created 0 entries, 0 hit(s)
[TensorRT] VERBOSE: Engine generation completed in 0.376964 seconds.
[TensorRT] VERBOSE: Engine Layer Information:
[TensorRT] VERBOSE: Layer(Resize): (Unnamed Layer* 0) [Resize], Tactic: 0, x[Float(5,100,100)] -> 2[Float(5,200,200)]
Done. serialize !
Reference 1: https://pytorch.org/docs/stable/nn.functional.html
Reference 2: https://github.com/pytorch/pytorch/issues/27376
Reference 3: https://github.com/NVIDIA/TensorRT/issues/284
Reference 4: https://github.com/onnx/onnx-tensorrt/issues/361