About Me
TL;DR
- Seasoned Engineer and Technical Leader with a strong background and track record of success at Google, ByteDance, Alibaba and Intel.
- Extensive technical expertise in AI/ML solutions, software/hardware co-design, AI/ML & cloud infrastructure, accelerators, embedded systems, memory/storage technologies and computer architecture.
- Strong experience in cross-functional (XFN) collaboration, communication and working with partners & system integrators.
- Ph.D. in Computer Engineering, Certified Google Cloud Architect & TensorFlow Developer.
See my LinkedIn profile for more details.
Things I’ve Worked On
ByteDance (Present)
Applied Research/Infrastructure Lab
- Lead the SW/HW co-design projects for accelerating storage and data analytics workloads. Drive XFN effort E2E, including prototyping, ROI estimation, project proposal, architectural design, planning and execution.
- Lead the research and exploration of emerging memory/storage technologies, including CXL, disaggregated memory architecture, persistent memory/NVRAM, in-memory computing and neuromorphic computing.
- Next-gen infrastructure for AI/ML: As one of the few core members of the task force, I drive the XFN effort to explore and plan next-gen infrastructure for AI/ML, provide recommendations to C-level management, and help incubate new projects that optimize data pipelines for AI/ML workloads.
Google
Hardware Partner Service
- Led the XFN project for enabling next-gen streaming platform for ecosystem partners.
- Worked with Google Product Areas and BD teams with focus on SoC integration, hardware ecosystem scaling, program development and management.
- Onboarded partner devices with Google technologies and helped partners scale.
- Managed internal and external stakeholders and product lifecycle from end to end.
- Tool development and automation, solution design, prototyping and technical troubleshooting.
- Successfully launched Google Assistant/Chromecast on embedded devices from top partners.
- Successfully onboarded top SoC partners with Project Matter and Google Ecosystem.
Edge TPU (Collaboration with Google Brain)
- “Moonshot” project of using Machine Learning to improve TPU design and model architecture.
- Developed Learned Performance Model, a key component for model architecture exploration.
- Developed the end-to-end pipeline for automated performance modeling on TPU.
- Work was featured by Google AI and published in CVPR-22 (ECV22) with patents pending.
Alibaba
Storage Innovation with Hardware/Software Co-optimization
- Created and led the AliFlash team for storage innovation.
- Designed new storage architecture with software/hardware co-design and optimization.
- Launched the first production Open Channel SSD in industry (AliFlash V3).
- Project management with internal/external collaborations.
Storage Innovation with Intel 3D XPoint Technology
- Led the XFN effort of deploying 3D XPoint technology in Alibaba’s infrastructure, making Alibaba one of the earliest adopters of this disruptive technology.
Research and Pathfinding of Innovative Technologies and Solutions
- Evaluation of emerging technologies, strategic analysis & recommendations for executives.
Intel
3D XPoint Storage Technology (Optane)
- As core developer of this disruptive product, I was responsible for media management, NVMe, Error Recovery/Injection, test automation, media characterization and performance analysis.
NVMe SSD Products
- Core developer of P3700 (Intel’s first NVMe SSD) and P4500 (Intel’s first 3D NAND SSD).
Streaming Media
- Core developer of streaming driver, firmware and platform software for Intel media SoC.
- Bootloader & kernel enablement, performance optimization of the embedded device.
Academic
University of Pittsburgh (PhD, Computer Engineering)
- New memory/storage technologies (Phase Change Memory, STT-RAM, Memristor)
- Computer architecture
- Memory hierarchy, modeling, performance, power
My Publications
- Exploring CXL-based KV Cache Storage for LLM Serving, Yupeng Tang, Runxiang Cheng, Ping Zhou, Tongping Liu, Fei Liu, Wei Tang, Kyoungryun Bae, Jianjun Chen, Wu Xiang, Rui Shi, to appear in NeurIPS 2024 Workshop MLforSys, Dec. 2024.
- Exploring Performance and Cost Optimization with ASIC-Based CXL Memory, Yupeng Tang, Ping Zhou, Wenhui Zhang, Henry Hu, Qirui Yang, Hao Xiang, Tongping Liu, Jiaxin Shan, Ruoyun Huang, Cheng Zhao, Cheng Chen, Hui Zhang, Fei Liu, Shuai Zhang, Xiaoning Ding, Jianjun Chen, EuroSys 2024 (Best Paper Award Runner-Up), Apr. 2024.
- Dynamic storage for adaptive mapping for data compression on a storage device, Ping Zhou, et al., US Patent US20230273727A1, Aug. 2023.
- Space manager for transparent block device compression, Ping Zhou, et al., US Patent US20230229324A1, Jul. 2023.
- Multi-dimensional solid state drive block access, Ping Zhou, et al., US Patent US20230195345A1, Jun. 2023.
- Adaptive mapping for transparent block device level compression, Ping Zhou, et al., US Patent US20230176734A1, Jun. 2023.
- System and method for allocating memory space, Ping Zhou, et al., US Patent US20230122533A1, Apr. 2023.
- Searching for Efficient Neural Architectures for On-Device ML on Edge TPUs, Ping Zhou, et al., 2022 Conference on Computer Vision and Pattern Recognition (CVPR-2022) ECV22 Workshop
- Universal and automatic end-to-end testing of smart TVs, Ping Zhou, et al., Technical Disclosure Commons (Invention Disclosure), Dec. 2019
- Alibaba Open Channel SSD for Next-Generation Data Centers, Ping Zhou, et al., Flash Memory Summit, Aug. 2018
- Throughput Enhancement for Phase Change Memories, Ping Zhou, Bo Zhao, Youtao Zhang, Jun Yang, IEEE Transactions on Computers (TC), DOI: 10.1109/TC.2013.76, Mar. 2013
- The Design of Sustainable Wireless Sensor Network Node using Solar Energy and Phase Change Memory, Ping Zhou, Youtao Zhang, Jun Yang, Design, Automation & Test in Europe (DATE), March 1, 2013
- Towards Successful Application of Phase Change Memories: Address Challenges from Write Operations, Ping Zhou, PhD Dissertation, 2012
- MRAC: A Memristor-based Reconfigurable Framework for Adaptive Cache Replacement, Ping Zhou, Bo Zhao, Youtao Zhang, Jun Yang, Yiran Chen, The 20th International Conference on Parallel Architectures and Compilation Techniques (PACT), Oct. 2011
- Fine-Grained QoS Scheduling for DRAM/PCM Hybrid Memory Systems, Ping Zhou, Yu Du, Youtao Zhang, Jun Yang, Non-Volatile Memories Workshop (NVMW), March 2011
- Fine-Grained QoS Scheduling for PCM-based Main Memory Systems, Ping Zhou, Yu Du, Youtao Zhang, Jun Yang, The 24th IEEE International Parallel & Distributed Processing Symposium (IPDPS-2010), April 2010
- Phase Change Technology and the Future of Main Memory, Benjamin Lee, Ping Zhou, Jun Yang, Youtao Zhang, Bo Zhao, Engin Ipek, Onur Mutlu, Doug Burger, IEEE Micro Top Picks, vol. 30, no. 1, pp. 143-143, February 2010
- Energy Reduction for STT-RAM Using Early Write Termination, Ping Zhou, Bo Zhao, Jun Yang, Youtao Zhang, IEEE/ACM 2009 International Conference on Computer-Aided Design (ICCAD-2009), pp. 264-268, November, 2009
- A Durable and Energy Efficient Main Memory Using Phase Change Memory Technology, Ping Zhou, Bo Zhao, Jun Yang, Youtao Zhang, The 36th International Symposium on Computer Architecture (ISCA-2009), pp. 14-23, June, 2009. Among the 15 most cited papers in the history of ISCA.
- Frequent Value Compression in Packet-based NoC Architectures, Ping Zhou, Bo Zhao, Yu Du, Yi Xu, Youtao Zhang, Jun Yang, Li Zhao, The 14th Asia and South Pacific Design Automation Conference (ASP-DAC 2009), pp. 13-18, January 2009
My Inventions/Patents
- Dynamic storage for adaptive mapping for data compression on a storage device (US20230273727A1)
- Space manager for transparent block device compression (US20230229324A1)
- Multi-dimensional solid state drive block access (US20230195345A1)
- Adaptive mapping for transparent block device level compression (US20230176734A1)
- System and method for allocating memory space (US20230122533A1)
- NEURAL NETWORK ARCHITECTURE FOR IMPLEMENTING GROUP CONVOLUTIONS (WO2023059336A1)
- HARDWARE ACCELERATOR OPTIMIZED NEURAL NETWORK MODELS USING GROUP CONVOLUTIONS (WO2023059335A1)
- Universal and automatic end-to-end testing of smart TVs (defensive publication)
- SYSTEM AND METHOD FOR FLASH STORAGE MANAGEMENT USING MULTIPLE OPEN PAGE STRIPES (US 20210034301 A1)
- SYSTEM AND METHOD FOR OPTIMIZATION OF GLOBAL DATA PLACEMENT TO MITIGATE WEAR-OUT OF WRITE CACHE AND NAND FLASH (US 20200159419 A1)
- COLLABORATIVE COMPRESSION IN A DISTRIBUTED STORAGE SYSTEM (US 20200042500 A1)
- METHOD AND SYSTEM FOR FACILITATING ATOMICITY ASSURANCE ON METADATA AND DATA BUNDLED STORAGE (US 20200034079 A1)
- RAPID SIDE-CHANNEL ACCESS TO STORAGE DEVICES (US 20190347204 A1)
- METHOD AND SYSTEM FOR DATA DESTRUCTION IN A PHASE CHANGE MEMORY-BASED STORAGE DEVICE (US 20190087587 A1)
- METHOD AND SYSTEM FOR ACTIVE PERSISTENT STORAGE VIA A MEMORY BUS (US 20190073132 A1)
- METHOD AND SYSTEM FOR MITIGATING WRITE AMPLIFICATION IN A PHASE CHANGE MEMORY-BASED STORAGE DEVICE (US 20190012111 A1)
- SYSTEM AND METHOD FOR FINE-GRAINED POWER CONTROL MANAGEMENT IN A HIGH CAPACITY COMPUTER CLUSTER (US 20180364795 A1)
- METHOD AND SYSTEM FOR IMPLEMENTING BYTE-ALTERABLE WRITE CACHE (US 20180349041 A1)
- HIGH-VOLUME, LOW-LATENCY DATA PROCESSING IN FLEXIBLY CONFIGURED LOCAL HETEROGENEOUS COMPUTING ENVIRONMENTS (US 20180329632 A1)
- PERSISTENT MEMORY FOR KEY-VALUE STORAGE (US 20180307620 A1)
Things I’m Interested In
- AI/ML stuff
- Quantum computing
- Cloud computing & infrastructure
- Linux, Emacs, Lisp, Python, C/C++, Go & other programming stuff
- Astronomy, physics, math
- Embedded systems, smart devices, IoT
- Emerging memory & storage technologies (PCM, STT-RAM, ReRAM, …)
- In-memory computing & neuromorphic computing
- (and many more…)