Commit f5ef60c

committed
add readme
1 parent d9a311b commit f5ef60c

17 files changed

+138834
-4
lines changed

.gitignore

Lines changed: 0 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -189,7 +189,6 @@ install_manifest.txt
189189
*.slo
190190
*.lo
191191
*.o
192-
*.obj
193192

194193
# Precompiled Headers
195194
*.gch
@@ -276,7 +275,6 @@ artifacts/
276275
*_i.h
277276
*.ilk
278277
*.meta
279-
*.obj
280278
*.pch
281279
*.pdb
282280
*.pgc
@@ -558,5 +556,3 @@ xcuserdata
558556
*.xccheckout
559557
*.moved-aside
560558
*.xcuserstate
561-
562-
obj_files

README.md

Lines changed: 121 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -1,3 +1,5 @@
1+
![Cover](img/cover.png)
2+
13
CUDA Path Tracer
24
================
35

@@ -70,6 +72,125 @@ A single choice between three options
7072
```
7173
## 4. Performance benchmark
7274

75+
#### 4.1 Stream Compaction on open and closed scenes
76+
77+
78+
Sample output showing the number of rays remaining after stream compaction at each depth of an iteration. Tested on scenes/cornell_refraction.json.
79+
80+
Results 1:
81+
Tested with cornell_refraction.json, which has a total of 7 materials and 9 geometries, all of them either boxes or spheres.
82+
83+
![SC1](img/sc_bar.png)
84+
```
85+
num_paths: 588297 at depth 1 of iter18
86+
num_paths: 481950 at depth 2 of iter18
87+
num_paths: 398397 at depth 3 of iter18
88+
num_paths: 326704 at depth 4 of iter18
89+
num_paths: 266102 at depth 5 of iter18
90+
num_paths: 217403 at depth 6 of iter18
91+
num_paths: 178307 at depth 7 of iter18
92+
num_paths: 0 at depth 8 of iter18
93+
num_paths: 588302 at depth 1 of iter19
94+
num_paths: 482180 at depth 2 of iter19
95+
num_paths: 397826 at depth 3 of iter19
96+
num_paths: 326109 at depth 4 of iter19
97+
num_paths: 265074 at depth 5 of iter19
98+
num_paths: 216725 at depth 6 of iter19
99+
num_paths: 177920 at depth 7 of iter19
100+
num_paths: 0 at depth 8 of iter19
101+
num_paths: 588302 at depth 1 of iter20
102+
num_paths: 482791 at depth 2 of iter20
103+
num_paths: 398975 at depth 3 of iter20
104+
num_paths: 327015 at depth 4 of iter20
105+
num_paths: 266191 at depth 5 of iter20
106+
num_paths: 217700 at depth 6 of iter20
107+
num_paths: 178610 at depth 7 of iter20
108+
num_paths: 0 at depth 8 of iter20
109+
```
110+
111+
Results 2:
112+
Tested with cornell_refraction_close.json. Compared to cornell_refraction.json, it has one more cube behind the camera, so light can't escape the room.
113+
114+
![SC1](img/sc_bar_close.png)
115+
```
116+
num_paths: 588103 at depth 1 of iter18
117+
num_paths: 552057 at depth 2 of iter18
118+
num_paths: 522357 at depth 3 of iter18
119+
num_paths: 493064 at depth 4 of iter18
120+
num_paths: 464186 at depth 5 of iter18
121+
num_paths: 437812 at depth 6 of iter18
122+
num_paths: 413273 at depth 7 of iter18
123+
num_paths: 0 at depth 8 of iter18
124+
num_paths: 588112 at depth 1 of iter19
125+
num_paths: 551931 at depth 2 of iter19
126+
num_paths: 522363 at depth 3 of iter19
127+
num_paths: 492947 at depth 4 of iter19
128+
num_paths: 464198 at depth 5 of iter19
129+
num_paths: 437899 at depth 6 of iter19
130+
num_paths: 413139 at depth 7 of iter19
131+
num_paths: 0 at depth 8 of iter19
132+
num_paths: 588110 at depth 1 of iter20
133+
num_paths: 552150 at depth 2 of iter20
134+
num_paths: 522766 at depth 3 of iter20
135+
num_paths: 493317 at depth 4 of iter20
136+
num_paths: 464677 at depth 5 of iter20
137+
num_paths: 438364 at depth 6 of iter20
138+
num_paths: 413396 at depth 7 of iter20
139+
num_paths: 0 at depth 8 of iter20
140+
```
141+
142+
The average render time is higher with the closed-room scene. Far fewer paths terminate at each depth (about 413,000 are still alive at depth 7, versus about 178,000 in the open scene), so the kernels launched at each depth have more paths to process, which raises render time.
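
The per-depth `num_paths` counts above come from compacting the path array after each bounce. As a rough host-side illustration only, the sketch below uses `std::partition` as a stand-in for a GPU primitive such as `thrust::partition`. The `PathSegment` fields and the `compactPaths` helper are assumptions made for this example, not the project's actual code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-path state; the real path tracer's struct is not shown here.
struct PathSegment {
    int pixelIndex;
    int remainingBounces;  // 0 means the path has terminated
};

// Moves live paths to the front of the first numPaths entries and
// returns how many of them are still alive.
std::size_t compactPaths(std::vector<PathSegment>& paths, std::size_t numPaths) {
    assert(numPaths <= paths.size());
    auto firstDead = std::partition(
        paths.begin(), paths.begin() + static_cast<std::ptrdiff_t>(numPaths),
        [](const PathSegment& p) { return p.remainingBounces > 0; });
    return static_cast<std::size_t>(firstDead - paths.begin());
}
```

Calling `compactPaths` once per depth and passing the returned count as the next `numPaths` mirrors the shrinking counts in the logs: an open scene drops paths quickly, a closed one keeps most of them.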
143+
144+
### BVH Results
145+
146+
The BVH is constructed on the CPU; the ray-scene intersection kernel then iteratively searches the BVH tree for the closest hit. BVH construction accepts three arguments: the number of bins to split per axis, the maximum acceptable leaf size, and the maximum acceptable depth.
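
To make the iterative closest-hit search concrete, here is a minimal host-side sketch of a stack-based BVH traversal. The flat `BVHNode` layout, the slab test in `hitAABB`, and the `closestHit` function are all assumptions for illustration. The leaves here stand in for primitives by using their box entry distance, whereas a real tracer would test the triangles stored in the leaf, and a GPU kernel would likely use a fixed-size stack rather than `std::vector`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

struct Vec3 { float x, y, z; };
struct Ray { Vec3 origin, dir; };
struct AABB { Vec3 lo, hi; };

// Hypothetical flat node layout, not the project's actual one.
struct BVHNode {
    AABB box;
    int left;       // index of left child, or -1 for a leaf
    int right;      // index of right child, or -1 for a leaf
    int primitive;  // primitive id stored in a leaf, -1 otherwise
};

// Slab test: returns true and the entry distance if the ray hits the box.
bool hitAABB(const Ray& r, const AABB& b, float& tEnter) {
    const float o[3] = {r.origin.x, r.origin.y, r.origin.z};
    const float d[3] = {r.dir.x, r.dir.y, r.dir.z};
    const float lo[3] = {b.lo.x, b.lo.y, b.lo.z};
    const float hi[3] = {b.hi.x, b.hi.y, b.hi.z};
    float t0 = 0.0f;
    float t1 = std::numeric_limits<float>::max();
    for (int a = 0; a < 3; ++a) {
        if (d[a] == 0.0f) {  // ray parallel to this slab
            if (o[a] < lo[a] || o[a] > hi[a]) return false;
            continue;
        }
        float tn = (lo[a] - o[a]) / d[a];
        float tf = (hi[a] - o[a]) / d[a];
        if (tn > tf) std::swap(tn, tf);
        t0 = std::max(t0, tn);
        t1 = std::min(t1, tf);
        if (t0 > t1) return false;
    }
    tEnter = t0;
    return true;
}

// Returns the primitive id of the closest leaf hit, or -1 on a miss.
int closestHit(const std::vector<BVHNode>& nodes, const Ray& ray) {
    int bestPrim = -1;
    float best = std::numeric_limits<float>::max();
    std::vector<int> stack;
    if (!nodes.empty()) stack.push_back(0);  // node 0 is the root
    while (!stack.empty()) {
        const BVHNode& n = nodes[static_cast<std::size_t>(stack.back())];
        stack.pop_back();
        float t = 0.0f;
        // Skip subtrees the ray misses or that start behind the best hit.
        if (!hitAABB(ray, n.box, t) || t >= best) continue;
        if (n.left < 0) {  // leaf: record the closer hit
            best = t;
            bestPrim = n.primitive;
        } else {
            assert(n.right >= 0);
            stack.push_back(n.left);
            stack.push_back(n.right);
        }
    }
    return bestPrim;
}
```

The `t >= best` check is what lets deeper trees pay off: once a near hit is found, farther subtrees are skipped without visiting their leaves.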
147+
148+
*The test scene is cow.obj, which contains 2903 vertices, 5804 faces, and 1 diffuse material.*
149+
150+
*Config 0 is using a single volume containing the whole mesh.*
151+
152+
#### Using max leaf count as BVH constraint
153+
154+
```
155+
Max leaf count is set, max depth is generated on the fly.
156+
config 0: Single volume
157+
config 1: Bins per axis: 32, Max depth: 33, largest leaf size: 10
158+
config 2: Bins per axis: 32, Max depth: 22, largest leaf size: 27
159+
config 3: Bins per axis: 32, Max depth: 11, largest leaf size: 63
160+
config 4: Bins per axis: 32, Max depth: 10, largest leaf size: 120
161+
config 5: Bins per axis: 32, Max depth: 8, largest leaf size: 229
162+
```
163+
![maxleaf](img/max_leaf.png)
164+
165+
#### Using max depth as BVH constraint
166+
```
167+
Max depth is set, max leaf count is generated on the fly.
168+
config 0: Single volume, 2995
169+
config 1: Bins per axis: 32, Max depth: 4, largest leaf size: 1253
170+
config 2: Bins per axis: 32, Max depth: 10, largest leaf size: 102
171+
config 3: Bins per axis: 32, Max depth: 20, largest leaf size: 30
172+
config 4: Bins per axis: 32, Max depth: 38, largest leaf size: 10
173+
config 5: Bins per axis: 32, Max depth: 100, largest leaf size: 162
174+
```
175+
![maxdepth](img/maxdepth.png)
176+
177+
#### BVH Conclusion
178+
BVH performance varies a lot with the tree construction configuration. Overall, the
179+
best BVH result is slightly worse than simple bounding-volume culling. I think the BVH will show an improvement over a plain bounding volume on a much denser mesh with 10k or more vertices.
180+
181+
### Sorting by material type
182+
183+
Another option is to sort paths by material type during each render pass so that the threads in each block do similar work. The results below were measured on the cornell_refraction scene.
184+
```
185+
No sort: Not sorting by material type.
186+
One partition: Only group one type of material. Used for performance comparison
187+
Full Partition: All materials are grouped by type. Using thrust::partition
188+
Stable Partition: All materials are grouped by type. Using thrust::stable_partition
189+
Full sort: All material types are sorted using thrust::stable_sort
190+
```
191+
![matsort](img/matsort.png)
192+
193+
Turning on material sorting hinders performance on this scene. I expect that scenes with many more material types will see a noticeable benefit from this option.
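
For readers unfamiliar with the grouping strategies compared above, the following host-side sketch shows two of them with standard-library algorithms standing in for `thrust::stable_sort` and `thrust::stable_partition`. The `ShadeItem` record and both helper functions are hypothetical names chosen for this example, and the mapping to the "Full sort" and "One partition" configurations is an interpretation of the descriptions above, not the project's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record pairing a path with the material it hit.
struct ShadeItem {
    int pathIndex;
    int materialId;
};

// "Full sort" (as interpreted here): order all items by material id,
// keeping the original path order within each material.
void sortByMaterial(std::vector<ShadeItem>& items) {
    auto byMaterial = [](const ShadeItem& a, const ShadeItem& b) {
        return a.materialId < b.materialId;
    };
    std::stable_sort(items.begin(), items.end(), byMaterial);
    assert(std::is_sorted(items.begin(), items.end(), byMaterial));
}

// "One partition" (as interpreted here): move one material to the front
// and return how many items use it.
std::size_t partitionOneMaterial(std::vector<ShadeItem>& items, int materialId) {
    auto mid = std::stable_partition(
        items.begin(), items.end(),
        [materialId](const ShadeItem& s) { return s.materialId == materialId; });
    return static_cast<std::size_t>(mid - items.begin());
}
```

Either call adds a full pass over the path array per bounce, which is a plausible reason the option costs more than it saves when a scene has only a handful of materials.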
73194
74195
### 3rd-party code used
75196

img/cornell_cow.png

1.55 MB

img/cover.png

905 KB

img/matsort.png

85.1 KB

img/max_leaf.png

87.7 KB

img/maxdepth.png

89.3 KB

img/sc_bar.png

70.6 KB

img/sc_bar_close.png

72 KB

obj_files/box/issue-177.mtl

Lines changed: 24 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -0,0 +1,24 @@
1+
newmtl white
2+
Ka 0 0 0
3+
Kd 1 1 1
4+
Ks 0 0 0
5+
6+
newmtl red
7+
Ka 0 0 0
8+
Kd 1 0 0
9+
Ks 0 0 0
10+
11+
newmtl green
12+
Ka 0 0 0
13+
Kd 0 1 0
14+
Ks 0 0 0
15+
16+
newmtl blue
17+
Ka 0 0 0
18+
Kd 0 0 1
19+
Ks 0 0 0
20+
21+
newmtl light
22+
Ka 20 20 20
23+
Kd 1 1 1
24+
Ks 0 0 0
