Welcome.
This is the project website for Thomas Crosley and Karanbir Singh's UW CSE 571 class project. On this page we will share updates on our progress. To view our open-source code, visit the repo here.
Project Summary
We are working on action-conditional video prediction as described in the 2015 paper by Oh et al., Action-Conditional Video Prediction using Deep Networks in Atari Games. The goal of the project is to implement the baseline algorithm on simple datasets, and then scale up the complexity of the data and adjust the algorithm accordingly.
Artifacts
Here is a link to our presentation.
Here is our final paper.
Milestones
Below, we have set goals for the project so we can track progress through the end of the quarter.
Milestone 1 (Midpoint - November 21st)
- Collect video datasets that include frame transitions that are stochastic given only the previous frame but deterministic given the frame and the action taken by the agent.
- Set up development tools for training deep neural networks. We will be using the Keras framework with Theano as a back-end for developing the neural networks.
- Implement the feed-forward version of the algorithm described in the paper in Keras (see the sketch after this list).
- Train the network on the basic datasets we have acquired.
- Artifact: Datasets that are deterministic given an action. An implementation of the feed-forward system described in the paper that runs on basic datasets.
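As a rough illustration of the feed-forward architecture we are implementing, here is a minimal sketch written against the modern Keras API (our project uses Keras with a Theano back-end). The frame size, action count, and layer widths are illustrative assumptions, not our actual configuration.

```python
# Minimal sketch of the feed-forward action-conditional architecture from
# Oh et al. (2015). Sizes below are illustrative assumptions.
import keras
from keras import layers

FRAME_SHAPE = (32, 32, 1)   # assumed grid-world frame size
NUM_ACTIONS = 5             # left, right, up, down, stay
FACTOR_DIM = 256            # size of the multiplicative interaction

frame_in = keras.Input(shape=FRAME_SHAPE)
action_in = keras.Input(shape=(NUM_ACTIONS,))   # one-hot action

# Encode the current frame into a vector.
x = layers.Conv2D(32, 4, strides=2, activation="relu")(frame_in)
x = layers.Conv2D(64, 4, strides=2, activation="relu")(x)
h = layers.Dense(FACTOR_DIM)(layers.Flatten()(x))

# Multiplicative action conditioning: element-wise product of the
# encoded frame and a linear embedding of the action.
a = layers.Dense(FACTOR_DIM)(action_in)
merged = layers.Multiply()([h, a])

# Decode back to a predicted next frame.
d = layers.Dense(8 * 8 * 64, activation="relu")(merged)
d = layers.Reshape((8, 8, 64))(d)
d = layers.Conv2DTranspose(32, 4, strides=2, padding="same", activation="relu")(d)
next_frame = layers.Conv2DTranspose(1, 4, strides=2, padding="same")(d)

model = keras.Model([frame_in, action_in], next_frame)
model.compile(optimizer="adam", loss="mse")  # mean-squared error, as in the paper
```

The element-wise product between the encoded frame and the action embedding follows the multiplicative interaction described in the paper.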
Results: (November 20th)
- Our presentation is here.
- We generate a Pacman-style dataset, where the next frame is random given only the current frame, but deterministic given a [left, right, up, down, stay] action (see the sketch after this list).
- We train two feed-forward networks - one with actions and one without.
- We evaluate our model using mean-squared error loss.
- We write visualization code for predicted vs. true frames.
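For concreteness, here is a hedged sketch of how a dataset like ours could be generated. The grid size, single-pixel rendering, and episode API are illustrative assumptions rather than our exact code; the key property is that the next frame is fully determined by the current frame plus the action.

```python
# Hedged sketch of a Pacman-style dataset: the agent's next position looks
# random from the current frame alone, but is deterministic given the action.
import numpy as np

GRID = 32
MOVES = {0: (0, -1), 1: (0, 1), 2: (-1, 0), 3: (1, 0), 4: (0, 0)}  # L, R, U, D, stay

def render(pos):
    frame = np.zeros((GRID, GRID, 1), dtype=np.float32)
    frame[pos[0], pos[1], 0] = 1.0  # single bright pixel for the agent
    return frame

def generate_episode(length, rng):
    pos = rng.integers(0, GRID, size=2)
    frames, actions, next_frames = [], [], []
    for _ in range(length):
        a = int(rng.integers(0, 5))                 # uniformly random action
        new_pos = np.clip(pos + MOVES[a], 0, GRID - 1)
        frames.append(render(pos))
        actions.append(np.eye(5, dtype=np.float32)[a])  # one-hot action
        next_frames.append(render(new_pos))
        pos = new_pos
    return np.stack(frames), np.stack(actions), np.stack(next_frames)

X, A, Y = generate_episode(1000, np.random.default_rng(0))
```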
Milestone 2 (November 28th)
- Create a new dataset where the camera moves with the agent but is still a top-down view; the agent will be in the same place every time but the environment landmarks will move.
- Concatenate the predictions as the agent moves to make a pseudo-map of the environment (sketched after this list).
- Adjust the architecture of the network/algorithm to address issues that arise for this new dataset and task.
- Artifact: New datasets with the agent centered in all frames. A map of the environment from consecutive predictions.
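The pseudo-map idea in the second bullet could be implemented along these lines. The max-compositing rule and the assumption that we know the agent's map coordinates at each step are illustrative choices, and boundary clipping is omitted for brevity.

```python
# Hedged sketch of stitching consecutive agent-centred predictions
# into a pseudo-map of the environment.
import numpy as np

def stitch(predictions, positions, map_size, window):
    """predictions: (T, window, window) frames centred on the agent;
    positions: (T, 2) agent (row, col) in map coordinates."""
    world = np.zeros((map_size, map_size), dtype=np.float32)
    half = window // 2
    for pred, (r, c) in zip(predictions, positions):
        r0, c0 = r - half, c - half
        patch = world[r0:r0 + window, c0:c0 + window]
        # Composite overlapping predictions by taking the element-wise max.
        world[r0:r0 + window, c0:c0 + window] = np.maximum(patch, pred)
    return world
```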
Results: (November 28th)
- We generated datasets with smaller windows as the agent moved through a larger map, but preliminary results were hard to interpret, so we changed direction.
- We extended our dataset-generation code to produce grids of arbitrary size, with walls and dots, although these larger inputs are much harder to train on.
- We made progress on informed exploration: the agent uses the network to steer its movement toward states that differ from its history (sketched after this list).
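A minimal sketch of that informed-exploration loop, assuming a model with the (frame, one-hot action) to next-frame interface from the earlier sketch. The novelty score here (distance to the closest remembered frame) is one plausible choice, not necessarily the one we settled on.

```python
# Hedged sketch of informed exploration: query the trained predictor for each
# candidate action and pick the action whose predicted frame is farthest
# (in mean-squared distance) from the frames seen so far.
import numpy as np

def informed_action(model, frame, history, num_actions=5):
    best_a, best_score = 0, -np.inf
    for a in range(num_actions):
        one_hot = np.eye(num_actions, dtype=np.float32)[a][None]
        pred = model.predict([frame[None], one_hot], verbose=0)[0]
        # Novelty score: distance to the closest remembered frame.
        score = min(np.mean((pred - h) ** 2) for h in history)
        if score > best_score:
            best_a, best_score = a, score
    return best_a
```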
Milestone 3 (Final - December 12th)
- Create even more complex datasets, including those with noisy motion models, noisy observation models, and videos that are stochastic conditioned on both the previous frame(s) and the action.
- Extend the architecture to address problems that arise with the more complex datasets; for example, try stochastic networks, sampled predictions, or recurrent networks as time allows.
- Artifact: an implementation of one more complex architecture - a recurrent network, a stochastic network, etc. (a recurrent sketch follows this list). The advanced approach we choose to implement will be based on the shortcomings we see from training on the more complex datasets.
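As one example of what the recurrent variant could look like, here is a hedged Keras sketch. The sequence length, layer sizes, and the choice of an LSTM over per-frame encodings are illustrative assumptions, not a description of our final model.

```python
# Hedged sketch of a recurrent action-conditional predictor: encode each frame
# in a short window, run an LSTM over the sequence, condition on the action,
# and decode a predicted next frame.
import keras
from keras import layers

SEQ_LEN, FRAME_SHAPE, NUM_ACTIONS, HIDDEN = 4, (32, 32, 1), 5, 256

frames_in = keras.Input(shape=(SEQ_LEN, *FRAME_SHAPE))
action_in = keras.Input(shape=(NUM_ACTIONS,))

# Shared convolutional encoder applied to every frame in the window.
encoder = keras.Sequential([
    layers.Conv2D(32, 4, strides=2, activation="relu"),
    layers.Flatten(),
    layers.Dense(HIDDEN, activation="relu"),
])
encoded = layers.TimeDistributed(encoder)(frames_in)

h = layers.LSTM(HIDDEN)(encoded)                  # temporal memory
a = layers.Dense(HIDDEN)(action_in)
merged = layers.Multiply()([h, a])                # multiplicative action conditioning

d = layers.Dense(16 * 16 * 32, activation="relu")(merged)
d = layers.Reshape((16, 16, 32))(d)
next_frame = layers.Conv2DTranspose(1, 4, strides=2, padding="same")(d)

model = keras.Model([frames_in, action_in], next_frame)
model.compile(optimizer="adam", loss="mse")
```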
Results: (December 12th)
- We present an informed exploration strategy using the network to move towards new states.
- We simulate a noisy sensor model and evaluate our network on these frames (sketched after this list). We create a recurrent model that performs better at this task.
- We examine the effect of the hidden encoded vector size on training loss.
- We investigate the action transformation, in particular feeding an all-ones action vector to inspect the state of the encoded vector (also sketched after this list).
- We present our findings in class and in our paper (links at the top of this page).
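Two of the probes above are simple enough to sketch, again assuming the (frame, one-hot action) to next-frame model interface from the earlier sketches. The noise level, clipping range, and helper names are illustrative assumptions.

```python
# Hedged sketches of the noisy-sensor simulation and the all-ones action probe.
import numpy as np

def add_sensor_noise(frames, rng, sigma=0.1):
    # Simulated noisy sensor: additive Gaussian noise, clipped back to [0, 1].
    return np.clip(frames + rng.normal(0.0, sigma, frames.shape), 0.0, 1.0)

def probe_encoding(model, frame, num_actions=5):
    # Feed an all-ones "action" instead of a one-hot vector; the multiplicative
    # interaction then scales the encoded state by the summed action embeddings,
    # giving a rough view of what the encoding alone carries.
    all_ones = np.ones((1, num_actions), dtype=np.float32)
    return model.predict([frame[None], all_ones], verbose=0)[0]
```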
Authors and Contributors
This is the project website for @crosleythomas and @karanbirsingh.