Commit 8970205

2 parents e1f43da + e02f758 commit 8970205

1 file changed

+47
-48
lines changed
docs/project_report.rst

Lines changed: 47 additions & 48 deletions
@@ -17,47 +17,47 @@ Project: Efficient Python routines for analysis on massively multi-threaded plat
 Submitted by- Deepanshu Thakur
 ******************************
 
-I spend my last 3 months working on `GSoC project`_. My GSoC project was
-related with writing the bindings of the Hydra C++ library. Hydra is a header
-only C++ library designed and used to run on Linux platforms. Hydra is a
+I spent my last 3 months working on a `GSoC project`_. My GSoC project was
+related with writing the bindings of the Hydra C++ library. Hydra is a header-only
+C++ library designed and used to run on Linux platforms. Hydra is a
 templated C++11 library designed to perform common High Energy Physics data
-analyses on massively parallel platforms. The idea of this GSoC project is to
-provide the bindings of the Hydra library, so that the python support for
-Hydra library can be added and python can be used for the prototyping or
+analysis on massively parallel platforms. The idea of this GSoC project was to
+provide the Python bindings for the Hydra library, so that the Python support
+can be added to the overall Hydra project and Python can be used for the prototyping or
 development.
 
 
 .. _GSoC project: https://summerofcode.withgoogle.com/projects/#6669304945704960
 
-My original proposal deliverables and my final output looks a little bit
-different and there are some very good reasons for it. The change of
+My original proposal deliverables and final output ended up looking a little bit
+different, and there are some very good reasons for it. The change of
 deliverables will become evident in the discussion of the design challenges
 and choices later in the report. In the beginning the goal was to write the
 bindings for the ``Data Fitting``, ``Random Number Generation``,
 ``Phase-Space Monte Carlo Simulation``, ``Functor Arithmetic`` and
 ``Numerical integration``, but we ended up having the bindings for
 ``Random Number Generation`` and ``Phase-Space Monte Carlo Simulation`` only.
-(Though remaining classes can be binded with some extra efforts but we do
+(The remaining classes can be binded with some extra effort but we do
 not have time left under the current scope of GSoC, so I have decided to
-continue with the project outside the scope of GSoC.)
+continue with the project outside the scope of GSoC given my interest in the project.)
 
 
-Choosing proper tools
-*********************
+Choosing the proper tools
+*************************
 
-Let me take you to my 3 months journey. First step was to find a tool or
-package to write the bindings. Several options were in principle available to
-write the bindings for example in the beginning we tried to evaluate the
-`SWIG`_.
+Let me take you though my three-month journey. First step was to find a tool or
+package to write the bindings with. Several options were in principle available to
+write the bindings. For example, at the beginning we tried to evaluate the
+`SWIG`_ project.
 But the problem with SWIG is, it is very complicated to use and second it
 does not support the ``variadic templates`` while Hydra underlying
 `Thrust library`_ depends heavily on variadic templates. After trying hands
 with SWIG and realizing it cannot fulfill our requirements, we turned our
-attention to `Boost.Python`_ which looks quite promising and a very large
-project but this large and complex suite project have so many tweaks and
-hacks so that it can work on almost any compiler but with added so many
-complexities and cost. Finally we turned our attention to use `pybind11`_.
-A quote taken from pybind11 documentation,
+attention to `Boost.Python`_, which looked quite promising. It is a very large
+project; but this large and complex suite project has so many tweaks and
+hacks so that it can work on almost any compiler. It does add much
+complexity and cost. Finally, we turned our attention to the newer `pybind11`_ project.
+A quote taken from the pybind11 documentation,
 
 Boost is an enormously large and complex suite of utility libraries
 that works with almost every C++ compiler in existence. This compatibility
@@ -80,31 +80,30 @@ to go ahead with pybind11. Next step was to `familiarize myself`_ with pybind11.
 The Basic design problem
 ************************
 
-Now we needed to solve the basic design problem which is the `CRTP idiom`_.
-Hydra library relies on the CRTP idiom to avoid runtime overhead. I
+The basic design problem is the `CRTP idiom`_.
+The Hydra library relies on the CRTP idiom to avoid runtime overhead. I
 investigated a lot about CRTP and it took a little while to finally come up
-with a solution that can work with any number N. It means our class can accept
-any number of particles at final states. (denoted by N) If you know about
-CRTP, it is a type of static polymorphism or compile time polymorphism. The
-idea that I implemented was to take a parameter from python and based on that
+with a solution that can work with any number of final-state particles (denoted N) often used in Hydra applications.
+If you know about CRTP, it is a type of static polymorphism, or compile-time polymorphism. The
+idea that I implemented was to take a parameter from Python and, based on that
 parameter, I was writing the bindings in a new file, compiling and generating
-them on runtime with system calls. Unfortunately generating bindings at
+them on runtime with system calls. Unfortunately, generating bindings at
 runtime and compiling them would take a lot of time and so, it is not
-feasible for user to each time wait for few minutes before actually be
-able to use the generated package. We decided to go ahead with fixed number
-of values. Means we generate bindings for a limited number of particles.
-Currently python bindings for classes supports up to 10 (N = 10) number of
-particles at final state. We can make that to work with any number we want,
+feasible for a user to each time wait for a few minutes before actually being
+able to use the generated package from Python. We decided to go ahead with a fixed number
+of values of N. It means we generate the bindings for a limited number of particles.
+Currently the Python bindings for the Hydra classes support up to 10 (N = 10) number of
+particles in the final state. Note that we can make that to work with any number we want,
 as our binding code is written within a macro, so it is just a matter of
-writing additional 1 extra call to make it use with extra value of N.
+writing additional and trivial-to-add extra calls to make the bindings work for extra values of N.
 
 .. _CRTP idiom: https://en.wikipedia.org/wiki/Curiously_recurring_template_pattern
 
 
-The Hydra Binding
-*****************
+The Hydra bindings
+******************
 
-Now that the approach was decided, we jump into the bindings of Hydra.
+Now that the approach was decided, we jumped into the bindings of Hydra.
 (Finally after so many complications but unfortunately this was not the
 end of them.) We decided to bind the most important classes first,
 ``Random Number Generation`` and ``Phase-Space Monte Carlo Simulation``.
@@ -121,20 +120,20 @@ to generate the phase space monte carlo simulation.
 [F. James, Monte Carlo Phase Space, CERN 68-15 (1968)]
 (https://cds.cern/ch/record/275743).
 
-The Momentum and Energy units are GeV/C, GeV/C^2. The PhaseSpace monte
-carlo class depends on the ``Vector3R``, ``Vector4R`` and ``Events`` classes.
+The momentum and energy units are GeV/c and GeV/c^2, respectively. The PhaseSpace Monte
+Carlo class depends on the ``Vector3R``, ``Vector4R`` and ``Events`` classes.
 Thus PhaseSpace class cannot be binded before without any of the above classes.
 
 The ``Vector3R`` and ``Vector4R`` classes were binded. There were some problems
-like generating ``__eq__`` and ``__nq__`` methods for python side but I solved
-them by creating ``lambda function`` and iterating over values and checking
+like generating ``__eq__`` and ``__nq__`` methods for the Python side but I solved
+them by creating ``lambda functions`` and iterating over values and checking
 if they satisfy the conditions or not. The ``Vector4R`` or four-vector class
-represents a particle. The idea is I first bind the particles class
+represents a particle. The idea is I first bound the particles class
 (the four-vector class) than I had to bind the ``Events`` class that will
-hold the Phase Space generated by the ``PhaseSpace`` class, and then bind the
+hold the Phase Space events generated by the ``PhaseSpace`` class, and then bind the
 actual ``PhaseSpace`` class. The ``Events`` class were not so easy to bind
 because they were dependent on the ``hydra::multiarray`` and without their
-bindings, the ``Events`` class was impossible to bind. Thanks to my mentor
+bindings, the ``Events`` class was impossible to bind. Thanks to my mentors
 who had already binded these bindings for ``Random`` class with some tweaks on
 the pybind11’s bind_container itself. We even faced some design issues of
 Events class in Hydra itself. But eventually after solving these problems,
@@ -165,7 +164,7 @@ After completing the PhaseSpace code, I quickly converted the code into macro
 for supporting up-to 10 particles.
 
 Now the PhaseSpace class was working perfectly! Next step was to create a
-series of test cases and documentation and of-course the example of
+series of test cases, documentation, and of-course the example of
 PhaseSpace class in action. The remaining algorithms that I named at the
 start of the article are left to implement.
 
@@ -178,17 +177,17 @@ things not only related with programming but related with high energy physics.
 I learned about *Monte Carlo Simulations*, and how they can be used to solve
 challenging real life problems. I read and studied a research paper
 ( https://cds.cern.ch/record/275743/files/CERN-68-15.pdf ), learned about
-particle decays, learned the insights of C++ varidiac templates,
+particle decays, learned the insights of C++ variadic templates,
 wrote a blog about `CRTP`_, learned how to compile a
-python function and why simple python functions cannot be used in
+Python function and why simple Python functions cannot be used in
 multithreaded environments. Most importantly I learned how to structure
 a project from scratch, how important documentation and test cases are.
 
 
 .. _CRTP: https://medium.com/@deepanshu2017/a-curiously-recurring-python-d3a441a58174
 
 
-Special Thanks
+Special thanks
 **************
 
 Shoutout to my amazing mentors. I would like to thank
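
Note on the hunk at ``@@ -80,31 +80,30 @@``: the report describes CRTP as compile-time polymorphism and a macro that emits one set of bindings per supported value of N. The sketch below illustrates that pattern in plain C++ only. It is not taken from the Hydra or pybind11 sources: ``GeneratorBase``, ``PhaseSpaceSketch``, ``REGISTER_PHASESPACE`` and ``make_registry`` are hypothetical names, and a ``std::map`` stands in for the Python module that the real bindings would populate.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// CRTP base: forwards to the derived class at compile time,
// so no virtual dispatch is involved.
template <typename Derived>
struct GeneratorBase {
    std::size_t particles() const {
        return static_cast<const Derived&>(*this).particles_impl();
    }
};

// Hypothetical stand-in for a class templated on the number N
// of final-state particles.
template <std::size_t N>
struct PhaseSpaceSketch : GeneratorBase<PhaseSpaceSketch<N>> {
    std::size_t particles_impl() const { return N; }
};

// Stand-in for a Python module: maps an exported name to its N.
using Registry = std::map<std::string, std::size_t>;

// One macro call per supported N. "#N" stringizes the argument, so
// REGISTER_PHASESPACE(3, r) registers the key "PhaseSpace3".
#define REGISTER_PHASESPACE(N, registry) \
    (registry)["PhaseSpace" #N] = PhaseSpaceSketch<N>{}.particles()

// Each template instantiation must be spelled out at compile time,
// which is why only a fixed set of N values can be offered.
inline Registry make_registry() {
    Registry r;
    REGISTER_PHASESPACE(2, r);
    REGISTER_PHASESPACE(3, r);
    REGISTER_PHASESPACE(4, r);
    assert(r.size() == 3);
    return r;
}
```

Under these assumptions, supporting another value of N is one more ``REGISTER_PHASESPACE`` line, which mirrors the report's remark that extending the bindings beyond N = 10 only needs extra macro calls.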
