Optimizing Slimmable Networks for Multiple Target Platforms
DOI: https://doi.org/10.7557/18.6288
Keywords: neural architecture search, computer vision
Abstract
In this work, we extend platform-aware adaptive training to optimize a weighted average over multiple target platforms, where the weights are determined, e.g., by each platform's market share. To simulate different market regimes, we generate weight settings with a Chinese restaurant process and use them to benchmark optimization strategies. We use a neural architecture search framework based on Markov Random Fields to efficiently find the optimal channel configuration for each platform, and we investigate different sampling strategies for training a single slimmable network that can be deployed to multiple platforms at once. Empirical results on CIFAR-100 demonstrate improved performance over the original slimmable network across the different weight settings, while maintaining efficient training.
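As a rough illustration of the setup described in the abstract, the sketch below shows one way a weight setting over platforms could be drawn from a Chinese restaurant process, and how a training loop could then sample platforms in proportion to those (market-share) weights. This is a minimal sketch, not the authors' implementation; platform_configs and train_step are hypothetical placeholders for the per-platform channel configuration and one optimizer update.

import random

def crp_weights(num_customers, alpha, seed=0):
    """Draw one weight setting over platforms via a Chinese restaurant
    process: customer n joins an existing table k with probability
    n_k / (n + alpha) and opens a new table with probability
    alpha / (n + alpha); normalized table sizes act as platform weights."""
    rng = random.Random(seed)
    tables = []  # tables[k] = number of customers seated at table k
    for n in range(num_customers):
        if rng.random() < alpha / (n + alpha):
            tables.append(1)  # open a new table (a new "platform")
        else:
            # join an existing table in proportion to its occupancy
            k = rng.choices(range(len(tables)), weights=tables)[0]
            tables[k] += 1
    total = sum(tables)
    return [size / total for size in tables]

# Hypothetical training loop: platform_configs[k] (the channel widths found
# for platform k) and train_step (one forward/backward pass at a given
# width) are illustrative placeholders, not the paper's API.
def train_weighted(weights, platform_configs, train_step, num_steps, seed=0):
    rng = random.Random(seed)
    for _ in range(num_steps):
        # sample a platform in proportion to its weight
        k = rng.choices(range(len(weights)), weights=weights)[0]
        train_step(platform_configs[k])

if __name__ == "__main__":
    w = crp_weights(num_customers=100, alpha=1.0)
    print(len(w), "platforms with weights", [round(x, 3) for x in w])

Larger alpha tends to produce more platforms with flatter weights, smaller alpha fewer platforms dominated by a handful of large weights, which is what makes the process a convenient generator of diverse market regimes.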
License
Copyright (c) 2022 Zifu Wang, Matthew B. Blaschko
This work is licensed under a Creative Commons Attribution 4.0 International License.