Commit c31175c

Add readme
1 parent 00a8b74 commit c31175c

3 files changed

+140
-0
lines changed

README.md

Lines changed: 1 addition & 0 deletions
@@ -24,6 +24,7 @@ We have a list of candidate papers to implement: https://github.com/chainer/mode
2424
- Neural Relational Inference for Interacting Systems [[paper](https://arxiv.org/abs/1802.04687)] [[code](https://github.com/chainer/models/tree/master/nri)]
2525
- SiamRPN and SiamMask [[paper](https://arxiv.org/abs/1812.05050)] [[code](https://github.com/STVIR/pysot)]
2626
- Learning to learn by gradient descent by gradient descent [[paper](https://arxiv.org/abs/1606.04474)] [[code](https://github.com/chainer/models/tree/master/learning_to_learn)]
27+
- Learning to Simplify: Fully Convolutional Networks for Rough Sketch Cleanup [[paper](http://www.f.waseda.jp/hfs/SimoSerraSIGGRAPH2016.pdf)] [[code](https://github.com/chainer/models/tree/master/simplifying_rough_sketches)]
2728

2829
## License
2930
MIT License (see `LICENSE` file).
Lines changed: 133 additions & 0 deletions
@@ -0,0 +1,133 @@
1+
# Rough Sketch Simplification using FCNN in PyTorch
2+
3+
This repository contains code for the paper [Learning to Simplify: Fully Convolutional Networks for Rough Sketch Cleanup](http://www.f.waseda.jp/hfs/SimoSerraSIGGRAPH2016.pdf), trained and tested on a custom dataset. The implementation is based on Chainer.
4+
5+
## Overview
6+
7+
The paper presents a novel technique for simplifying sketch drawings by learning a series of convolution operators. An image of any dimension can be fed into the network, which outputs an image with the same dimensions as the input.
8+
9+
![model](images/model.png)
10+
11+
The architecture consists of three parts. The first part acts as an encoder and spatially compresses the image. The second part processes the compressed representation and extracts the essential lines from the image. The third part acts as a decoder that converts the small, simpler representation into a grayscale image of the same resolution as the input. All of this is done using convolutions.
12+
The down- and up-convolution architecture may seem similar to simple filter banks. However, it is important to realize that the number of channels is much larger where the resolution is lower, e.g., 1024 where the size is 1/8. This ensures that the information that leads to clean lines is carried through the low-resolution part; the network is trained to choose which information to carry through the encoder-decoder architecture. Padding is used to compensate for the kernel size and to ensure that the output is the same size as the input when a stride of 1 is used. Pooling layers are replaced by convolutional layers with increased strides, which lower the resolution relative to the previous layer.
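The effect of padding and stride on resolution can be checked with the standard output-size formulas for convolution and transposed convolution. The sketch below is a minimal illustration in plain Python; the kernel sizes, strides, and paddings are illustrative assumptions and are not taken from this repository's `model.py`.

```python
def conv_out(size, kernel, stride, pad):
    """Spatial output size of a standard convolution."""
    return (size + 2 * pad - kernel) // stride + 1


def deconv_out(size, kernel, stride, pad):
    """Spatial output size of a transposed (up-) convolution."""
    return (size - 1) * stride - 2 * pad + kernel


# Stride 1 with pad = (kernel - 1) // 2 keeps the resolution unchanged.
assert conv_out(64, kernel=3, stride=1, pad=1) == 64

# Three stride-2 convolutions (assumed settings) reduce a 64-pixel side to 1/8.
size = 64
size = conv_out(size, kernel=5, stride=2, pad=2)  # 32
size = conv_out(size, kernel=3, stride=2, pad=1)  # 16
size = conv_out(size, kernel=3, stride=2, pad=1)  # 8
bottleneck = size

# Three stride-2 up-convolutions (assumed settings) restore the resolution.
for _ in range(3):
    size = deconv_out(size, kernel=4, stride=2, pad=1)
restored = size

print(bottleneck, restored)  # 8 64
```

Under these assumed settings the round trip is exact only when each side is a multiple of 8; other sizes would need padding or cropping for the output to match the input.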
13+
14+
15+
16+
## Contents
17+
- [Rough Sketch Simplification using FCNN in PyTorch](#rough-sketch-simplification-using-fcnn-in-pytorch)
18+
- [Overview](#overview)
19+
- [Contents](#contents)
20+
- [1. Setup Instructions and Dependencies](#1-setup-instructions-and-dependencies)
21+
- [2. Dataset](#2-dataset)
22+
- [3. Training the model](#3-training-the-model)
23+
- [5. Model Architecture](#5-model-architecture)
24+
- [6. Observations](#6-observations)
25+
- [Training](#training)
26+
- [Predictions](#predicitons)
27+
- [Loss](#loss)
28+
- [7. Repository overview](#7-repository-overview)
29+
30+
31+
## 1. Setup Instructions and Dependencies
32+
33+
Clone the repository to your local machine.
34+
35+
36+
Create a virtual environment using Python 3:
37+
``` Batchfile
virtualenv env
```
40+
41+
42+
Install the dependencies:
43+
``` Batchfile
pip install -r requirements.txt
```
46+
47+
You can also use a Google Colab notebook.
48+
49+
50+
## 2. Dataset
51+
52+
The authors have not provided a dataset for the paper, so I created my own. I have uploaded it to Google Drive; the link can be found [here](https://drive.google.com/open?id=14NQTqITAiw8o-JgdnumQ-K0asLRwJy7q). Feel free to use it.
53+
54+
Create two folders inside the root directory of the dataset, `Input` and `Target`, and place the images in the corresponding folder. Each input image and its target image must have the same file name.
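The name-matching requirement can be illustrated with a short sketch that pairs files from the two folders by file name. This is not the code in `dataset.py` or `read_data.py`; the function `pair_images` and its parameters are hypothetical, and the folder names are passed in as arguments so they can be adjusted to match your layout.

```python
from pathlib import Path


def pair_images(root, input_dir="Input", target_dir="Target"):
    """Return sorted (input_path, target_path) pairs that share a file name.

    `pair_images` is a hypothetical helper; adjust the folder names to
    match the directories you created under the dataset root.
    """
    input_root = Path(root) / input_dir
    target_root = Path(root) / target_dir
    pairs = []
    for input_path in sorted(input_root.iterdir()):
        target_path = target_root / input_path.name
        # Skip inputs that have no target with the same file name.
        if input_path.is_file() and target_path.is_file():
            pairs.append((input_path, target_path))
    return pairs
```

Calling `pair_images(".")` would list the pairs found under the current directory; inputs without a same-named target are skipped, which makes mismatched names easy to spot by comparing counts.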
55+
56+
## 3. Training the model
57+
58+
To train the model, run
59+
60+
```Batchfile
python main.py --train=True
```
63+
64+
Optional arguments:
65+
66+
| argument | default | description |
| --- | --- | --- |
| -h, --help | None | Show help message and exit |
| --gpu_id GPU_ID, -g GPU_ID | -1 | GPU ID (negative value indicates CPU) |
| --out OUT, -o OUT | result | Directory to output the result |
| --batch_size BATCH_SIZE, -b BATCH_SIZE | 8 | Batch size |
| --height HEIGHT, -ht HEIGHT | 64 | Height to resize the image to |
| --width WIDTH, -wd WIDTH | 64 | Width to resize the image to |
| --samples SAMPLES | False | See sample training images |
| --num_epochs NUM_EPOCHS | 75 | Number of epochs to train for |
| --train TRAIN | True | Train the model |
| --root ROOT | . | Root directory for Input and Target images |
| --n_folds N_FOLDS | 7 | Number of folds in k-fold cross-validation |
| --save_model SAVE_MODEL | True | Save the model after training |
| --load_model LOAD_MODEL | None | Path to an existing model |
| --predict PREDICT | None | Path of the rough sketch to simplify using an existing model |
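Flags such as `--train=True` take a value, and with `argparse` a plain `type=bool` would treat the string `"False"` as true. The sketch below shows one way a subset of the flags in the table could be parsed; it is an assumption about how such parsing might look, not the parser in `main.py`, and `str2bool` and `build_parser` are hypothetical helper names.

```python
import argparse


def str2bool(value):
    """Parse 'True'/'False' style strings; bool('False') would be True."""
    if value.lower() in ("true", "1", "yes"):
        return True
    if value.lower() in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError("expected a boolean, got %r" % value)


def build_parser():
    """Build a parser for a subset of the documented flags (hypothetical)."""
    parser = argparse.ArgumentParser(description="Rough sketch simplification")
    parser.add_argument("--gpu_id", "-g", type=int, default=-1,
                        help="GPU ID (negative value indicates CPU)")
    parser.add_argument("--batch_size", "-b", type=int, default=8,
                        help="Batch size")
    parser.add_argument("--num_epochs", type=int, default=75,
                        help="Number of epochs to train for")
    parser.add_argument("--train", type=str2bool, default=True,
                        help="Train the model")
    parser.add_argument("--load_model", default=None,
                        help="Path to an existing model")
    parser.add_argument("--predict", default=None,
                        help="Path of the rough sketch to simplify")
    return parser


args = build_parser().parse_args(["--train=False", "-b", "16"])
print(args.train, args.batch_size, args.gpu_id)  # False 16 -1
```

Defaults in the sketch mirror the table above; flags that are not passed keep those defaults.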
82+
83+
## 5. Model Architecture
84+
85+
![archi](images/archi.png)
86+
87+
## 6. Observations
88+
89+
90+
### Training
91+
92+
| Epoch | Prediction |
| --- | --- |
| 2 | ![epoch2](pred/2.png) |
| 60 | ![epoch60](pred/60.png) |
| 100 | ![epoch100](pred/100.png) |
| 140 | ![epoch140](pred/140.png) |
98+
99+
### Predictions
100+
101+
![pred3](pred/pred3.png)
102+
![pred2](pred/pred8.png)
103+
![pred1](pred/pred1.png)
104+
105+
106+
### Loss
107+
108+
![loss](images/loss.png)
109+
110+
## 7. Repository overview
111+
112+
This repository contains the following files and folders:
113+
114+
1. **images**: Contains resource images.
115+
116+
2. **pred**: Contains prediction images.
117+
118+
3. `dataset.py`: code for dataset generation.
119+
120+
4. `model.py`: code for model as described in the paper.
121+
122+
5. `predict.py`: function to simplify image using model.
123+
124+
6. `read_data.py`: code to read images.
125+
126+
7. `utils.py`: Contains helper functions.
127+
128+
8. `train_val.py`: function to train and validate models.
129+
130+
9. `main.py`: contains main code to run the model.
131+
132+
10. `requirements.txt`: Lists dependencies for easy setup in virtual environments.
133+
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
1+
cupy-cuda100>=6.4.0,<7.0.0
2+
chainer==6.5.0
3+
numpy==1.17.3
4+
matplotlib==3.1.1
5+
opencv-python==3.4.7.28
6+
Pillow==4.3.0
