TY - GEN
T1 - Towards Resource-aware DNN Partitioning for Edge Devices with Heterogeneous Resources
AU - Zawish, Muhammad
AU - Abraham, Lizy
AU - Dev, Kapal
AU - Davy, Steven
N1 - Funding Information:
This research was supported by Science Foundation Ireland and the Department of Agriculture, Food and Marine on behalf of the Government of Ireland VistaMilk research centre under the grant 16/RC/3835.
Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Collaborative deep neural network (DNN) inference across edge and cloud is emerging as an effective approach for enabling many Internet of Things (IoT) applications. Edge devices are typically resource-constrained and hence cannot afford the computational complexity of DNNs. Researchers have therefore turned to a collaborative computing approach in which a DNN is partitioned between edge and cloud. Recent work on DNN partitioning has either focused on bandwidth-specific partitioning or relied on offline benchmarking of DNN layers. However, edge devices are inherently heterogeneous and possess inconsistent levels and types of resources. Therefore, in this work, we propose a resource-aware partitioning of DNNs for accelerating collaborative inference over edge-cloud. The proposed approach provides the flexibility to partition a DNN with respect to the nature and scale of resources available on a given edge device. Unlike the state of the art, we exploit different types of DNN complexity to partition DNNs across heterogeneous edge devices. For example, in a bandwidth-constrained scenario, our approach achieved 40% higher efficiency than the offline benchmarking approach. Therefore, given the differing computational, storage, and energy requirements of edge devices, this approach provides a suitable configuration for edge-cloud synergetic inference.
AB - Collaborative deep neural network (DNN) inference across edge and cloud is emerging as an effective approach for enabling many Internet of Things (IoT) applications. Edge devices are typically resource-constrained and hence cannot afford the computational complexity of DNNs. Researchers have therefore turned to a collaborative computing approach in which a DNN is partitioned between edge and cloud. Recent work on DNN partitioning has either focused on bandwidth-specific partitioning or relied on offline benchmarking of DNN layers. However, edge devices are inherently heterogeneous and possess inconsistent levels and types of resources. Therefore, in this work, we propose a resource-aware partitioning of DNNs for accelerating collaborative inference over edge-cloud. The proposed approach provides the flexibility to partition a DNN with respect to the nature and scale of resources available on a given edge device. Unlike the state of the art, we exploit different types of DNN complexity to partition DNNs across heterogeneous edge devices. For example, in a bandwidth-constrained scenario, our approach achieved 40% higher efficiency than the offline benchmarking approach. Therefore, given the differing computational, storage, and energy requirements of edge devices, this approach provides a suitable configuration for edge-cloud synergetic inference.
KW - Collaborative DNN inference
KW - Edge intelligence
KW - Internet of Things
KW - Resource efficiency
UR - http://www.scopus.com/inward/record.url?scp=85146941503&partnerID=8YFLogxK
U2 - 10.1109/GLOBECOM48099.2022.10000839
DO - 10.1109/GLOBECOM48099.2022.10000839
M3 - Conference contribution
AN - SCOPUS:85146941503
T3 - 2022 IEEE Global Communications Conference, GLOBECOM 2022 - Proceedings
SP - 5649
EP - 5655
BT - 2022 IEEE Global Communications Conference, GLOBECOM 2022 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2022 IEEE Global Communications Conference, GLOBECOM 2022
Y2 - 4 December 2022 through 8 December 2022
ER -