Zipeng (Mark) Fu

I am a master's student in the Machine Learning Department at CMU.

Previously, I received my Bachelor of Science in Computer Science & Engineering and Applied Math at UCLA. I've been doing research on multi-agent reinforcement learning with Prof. Song-Chun Zhu and on imitation learning with Prof. Weinan Zhang.

Email  /  CV  /  Google Scholar  /  GitHub

Research Interests

I'm interested in intelligent agents and robots that can ultimately generalize in perception and control like humans. Perceiving the world with little human-labelled supervision, adapting quickly to new environments, and acquiring new skills without catastrophic forgetting are crucial for achieving this goal.

Keywords: Deep Reinforcement Learning, Meta-Learning, Unsupervised Learning

Selected Publications
Multi-Modal Imitation Learning in Partially Observable Environments
Zipeng Fu, Minghuan Liu, Ming Zhou, Weinan Zhang
Preprint, 2019
GitHub

We propose to take advantage of InfoGAIL with RNN-based belief state representations for multi-modal imitation learning in partially observable environments. We confirm the effectiveness of multi-expert learning of our method in a 2-dimensional environment, in which expert trajectories consist of two human-distinguishable behaviors. Further experimental results in continuous-control locomotion tasks reveal that our method can also disentangle interpretable latent factors in unlabeled multi-task demonstrations.

Emergence of Pragmatics from Referential Game between Theory of Mind Agents
Luyao Yuan, Zipeng Fu, Jingyue Shen, Lu Xu, Junhong Shen, Song-Chun Zhu
Presented at Emergent Communication Workshop at NeurIPS, 2019

We integrate theory of mind (ToM) in a cooperative multi-agent pedagogical situation and propose an adaptive reinforcement learning (RL) algorithm to develop a communication protocol. With this ability, agents consider language as not only messages but also rational acts reflecting others' hidden states. Our experiments demonstrate the advantage of pragmatic protocols over non-pragmatic protocols. We also show that the teaching complexity following the pragmatic protocol empirically approximates the recursive teaching dimension (RTD).

Reducing Overestimation of Value Mixing in Cooperative Deep Multi-Agent Reinforcement Learning
Zipeng Fu, Qingqing Zhao, Weinan Zhang
Preprint, 2019

We provide a theoretical analysis of why traditional DQN training methods lead to significant value overestimation in multi-agent settings. We propose double QMIX, an end-to-end multi-agent Q-learning method that reduces value overestimation and trains decentralized agents' policies in a centralized setting.

Emergence of Theory of Mind Collaboration in Multiagent Systems
Luyao Yuan, Zipeng Fu, Linqu Zhou, Kexin Yang, Song-Chun Zhu
Emergent Communication Workshop at NeurIPS, 2019

We incorporate Theory of Mind (ToM) into multiagent partially observable Markov decision processes (POMDPs) and propose an adaptive training algorithm to develop effective collaboration between agents with ToM. We evaluate our algorithm in two games, where it surpasses all previous decentralized execution algorithms that do not model ToM.

Unsupervised Incremental Structure Learning of Stochastic And-Or Grammars with Monte Carlo Tree Search
Luyao Yuan, Jingyue Shen, Zipeng Fu, Song-Chun Zhu
Preprint, 2019

We propose an unsupervised And-Or grammar learning approach that iteratively searches for better grammar structure and parameters to optimize the grammar compactness and data likelihood. To handle the complexity of grammar learning, we develop an algorithm based on Monte Carlo Tree Search to effectively explore the search space. Our method also enables incremental grammar learning.

Adversarial Attack Against Scene Recognition System for Unmanned Vehicles
Xuankai Wang, Mi Wen, Jinguo Li, Zipeng Fu, Rongxing Lu, Kefei Chen
ACM TURC, 2019   (Best Paper Runner-up)

We generate adversarial examples against the scene recognition classification model of unmanned vehicles. We also try to improve the model's robustness through adversarial training. Extensive experiments show that adversarial examples are effective at attacking the neural network used for scene recognition.

Energy Theft Detection With Energy Privacy Preservation in the Smart Grid
Donghuan Yao, Mi Wen, Xiaohui Liang, Zipeng Fu, Kai Zhang, Baojia Yang
IEEE IoT Journal, 2019

We propose an energy theft detection scheme with energy privacy preservation in the smart grid. Specifically, we use combined convolutional neural networks (CNNs) to detect abnormal behavior in the metering data from long-period pattern observation. In addition, we employ the Paillier algorithm to protect energy privacy, so that users' energy data are securely protected in transmission and data disclosure is minimized. Our security analysis demonstrates that our scheme achieves both data privacy and authentication. Experimental results illustrate that our modified CNN model can effectively detect abnormal behaviors with an accuracy of up to 92.67%.

Machine Learning for Glass Science and Engineering: A Review
Han Liu, Zipeng Fu, Kai Yang, Xinyi Xu, Mathieu Bauchy
Journal of Non-Crystalline Solids, 2019

The design of new glasses is often plagued by inefficient Edisonian "trial-and-error" discovery approaches. As an alternative route, the Materials Genome Initiative has largely popularized new approaches relying on artificial intelligence and machine learning for accelerating the discovery and optimization of novel, advanced materials. Here, we review some recent progress in adopting machine learning to accelerate the design of new glasses with tailored properties.


