Commit a8940cf

added some info from main
Merge branch 'gh-pages' of github.com:carpentries-incubator/managing-computational-projects into JC-reorganisation
# Please enter a commit message to explain why this merge is necessary,
# especially if it merges an updated upstream into a topic branch.
#
# Lines starting with '#' will be ignored, and an empty message aborts
# the commit.
2 parents 5fca74e + c2f0835 commit a8940cf

3 files changed: +57 -33 lines changed

README.md

Lines changed: 7 additions & 5 deletions
@@ -12,13 +12,15 @@ Materials developed through this project will enable (1) a foundational understa
 For details about the project and track management related information, please the [Project Management Repository](https://github.com/alan-turing-institute/data-training-for-bioscience/).

-## Maintainer(s)
+## Developers and Maintainers

-Current developers and maintainers of this lesson are
-
-* Lydia France
 * Malvika Sharan
-* Federico Nanni
+* Julien Colomb
+
+### Previous developers
+
+* Lydia France was allocated as a developer on this project for six months in 0.5 FTE capacity.
+* Federico Nanni provided supervision for Lydia and contributed to the project planning

 ------

episodes/02-motivation.md

Lines changed: 49 additions & 27 deletions
@@ -21,21 +21,28 @@ You are also likely to work on your project with other members of the lab, and t
 <img src="../fig/skill-spectrum.jpg" alt="Researchers represented in a map indicating their journey to understand and apply computational approaches. Some may have just started their journey, some may have come far in the learning and some may have gained proficiency based on their research requirements." width="500"/>

-_We want to acknowledge the data science knowledge will vary. The Turing Way project illustration by Scriberia for The Turing Way Community Shared under CC-BY 4.0 License. Zenodo. http://doi.org/10.5281/zenodo.3332807
+*We all may have dfferent research and data science expertise. The Turing Way project illustration by Scriberia for The Turing Way Community Shared under CC-BY 4.0 License. Zenodo. http://doi.org/10.5281/zenodo.3332807*

-Contents of training introduces methods and concepts to manage individuals and teams working on any computational project, which in the current era is literally all research projects.
+> ## Why are you here
+> Discuss why you/learners are taking this course, what are the expectations.
+> Does the expectations align with the relevance of data science and content of this course?
+{: .discussion}
+
+Contents of this training material introduces methods and concepts to manage individuals and teams working on any computational project, which in the current era is literally all research projects.
 It is *not* about learning how to write code, but building a foundational understanding for computational methods that could be applied to your research.
 Furthermore, this training will provide guidance for facilitating collaboration and data analysis using tools like research data management, version control or code review.

-We believe that the data science skills you will learn in this training will make your research process better. In the following sections, we will detail what we mean by "better".
+We acknowledge the data science knowledge will vary.
+Nonetheless, we believe that the data science skills you will learn in this training will make your research process better. In the following sections, we will detail what we mean by "better".

 ## How data science will improve your research ?

 <img src="../fig/healthy-research-tree.jpg" alt="Researchers pour water on a tree, the water represents data science, the tree is the research." width="500"/>

-_Data science makes research flourish. The Turing Way project illustration by Scriberia for The Turing Way Community Shared under CC-BY 4.0 License. Zenodo. http://doi.org/10.5281/zenodo.3332807
-
+*Data science makes research flourish. The Turing Way project illustration by Scriberia for The Turing Way Community Shared under CC-BY 4.0 License. Zenodo. http://doi.org/10.5281/zenodo.3332807*

+> ## It is mostly about being efficient
+>
 > Data science brings some structure in how data is collected, processed and analysed, making it easier to collaborate on a project, to publish extra research outputs and leveraging some extra potential your data may have.
 In the past, it helped me drive new hypotheses, detect problems with the research design early, and reduce the sample size needed to drive a solid conclusion.
 Eventually, it made my research more robust and trustworthy.
@@ -45,7 +52,7 @@ But in the end, my real motivation is efficiency: very soon, the time I invested
 There are different ways to organise the different foreseen improvement, we decided here to start with improvement in the final result, improvement in the research process, and finally aspects of community building.

-### Nicer paper
+### Using code for nicer paper

 #### Powerful statistics

@@ -72,7 +79,11 @@ One can also automate the figure design choice, so that all figures look similar
 Similarly, the production of several version of the same figure is very easy.
 For example, one can use different color pallette, one using the palette usually used in the field (the one your supervisor wants to see), and one for color-blind readers.

-**Example of single flights from different bees shown in supplemnentary data:** Menzel, R., Greggers, U., Smith, A., Berger, S., Brandt, R., Brunke, S., ...Watzl, S. (2005). Honey bees navigate according to a map-like spatial memory. Proceedings of the National Academy of Sciences of the United States of America, 102(8), 3040. doi: [10.1073/pnas.0408550102](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC549458/)
+> ## Single flights from different bees.
+>
+> See a good example of data representation in differen format single flights from different bees shown in supplemnentary data: *Menzel, R., Greggers, U., Smith, A., Berger, S., Brandt, R., Brunke, S., ...Watzl, S. (2005). Honey bees navigate according to a map-like spatial memory. Proceedings of the National Academy of Sciences of the United States of America, 102(8), 3040. doi: [10.1073/pnas.0408550102](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC549458/)*
+>
+{: .callout}

 #### Reproducible analysis

@@ -81,7 +92,7 @@ As a researcher, assuring computational reproducibility of your results is a rel
 <img src="../fig/ReproducibleJourney.jpg" alt="Shows a landscape with different checkpoints fpr data, code, tools and result each of which require reproducible practices. There is a woman explaining her reproducibility journey to help new people start their journey" width="500"/>

-_What to expect in your reproducibility journey. The Turing Way project illustration by Scriberia for The Turing Way Community Shared under CC-BY 4.0 License. Zenodo. http://doi.org/10.5281/zenodo.3332807
+*What to expect in your reproducibility journey. The Turing Way project illustration by Scriberia for The Turing Way Community Shared under CC-BY 4.0 License. Zenodo. http://doi.org/10.5281/zenodo.3332807*

 The reproducibility of an experiment not only requires a detailed description of the methods and reagents used, but also a detailed description of the analysis performed.
 The ultimate description of the analysis is to provide all elements necessary for reproducing the analysis (computational reproducibility).
@@ -103,7 +114,6 @@ While the main recognition currency in academia is still (first) authorship in p
 In particular, datasets and software publication are officially reviewed in the evaluation of certain grant, for example for the Marie-curie european program.
 Data science principles will make it easier to publish datasets, software, reagents or hardware you are anyway producing during the research process.

-
 > By publishing datasets and code, you will not only help other researchers, but gain extra recognition for your work.
 However, open data and open code requires a specific documentation, which we will touch upon in this training.
 >
@@ -134,9 +144,6 @@ It makes also certain that difference in the figures are due to difference in th
 #### Collaborative working

-> Facilitating communication and sharing, will make it easier for your colleagues to help you.
-> {: .callout}
-
 Within science teams, group work is critical for experimental design and implementation.
 In addition, there are rapid developments in how scientific results and methods are shared, and collaborations have never been more global or rapid.
 This means that several people will likely be working with the same data files.
@@ -145,12 +152,26 @@ Data science allows for the management of
 how one or multiple people work on the same project (as well as the same code).
 It requires different skillsets than those taught in traditional science courses *or* a typical coding class.

+> ## Who can add to your research?
+>
+> Facilitating communication and sharing will make it easier for your colleagues to help you.
+> Can you think of people who can help you in your research, directly in your lab or at your institution ?
+> Would it help for them to have access to your data? How could they participate,
+> and how can you give them credit?
+>
+>> ## Needs from the future you
+>> It is very interesting to consider your future self as one collaborator in your project.
+>> Anything you may forget in the next three to five years should be documented,
+>> if you want your future self to be able to (re-)analyse the data you are collecting.
+>> Indeed, the advantage of working collaboratively in a project can indeed be translated directly in a project you drive mostly alone.
+>{: .solution}
+{: .discussion}
+

 #### Efficiency

 > The time invested in your data and code will be paid multiple times by the efficiency improvement in your workflow, if that investment is done early in the project.
-Because one can consider your past self as one of your collaborator,
-the advantage of working collaboratively in a project can indeed be translated directly in a project you drive mostly alone.
+Because one can consider your past self as one of your collaborator, the advantage of working collaboratively in a project can indeed be translated directly in a project you drive mostly alone.
 >
 {: .callout}

@@ -162,36 +183,39 @@ This applies directly to the example of working on article revisions - will you
 For instance, if a colleague cannot find what data goes with which figures, there are high chances that you will also be unable to find it three years from now.
 In addition, itt is not uncommon to modify the design of the figures multiple times (sometimes back and forth), often modifying all figures at once.

+> ## Redoing all figures in minutes
 > Once a reviewer ask me to overlay individual data points onto all our 5 boxplots figures.
 The project was an old one, and I had not touched the data for years.
-Finding the right data and redo the all 5 figures would usually take ages using SPSS or excel.
-> But since I used code, I had all figures 15 minutes later.
-(Note, after seeing the new figures, the reviewer agreed that the original version was better).
-> {: .testimonial}
+>Finding the right data and redo the all 5 figures would usually take ages using SPSS or excel.
+>But since I used code, I had all figures 15 minutes later.
+>(Note, after seeing the new figures, the reviewer agreed that the original version was better).
+{: .testimonial}

 Later on in the project, community advantages are coming in.
 Data and code reusability is not only a mark of research transparency and robustness, it also means you can reuse your own code and data.
 It also means you can reuse code and data produced by other researchers.
-The snow ball effect may be huge, and the objective of this lesson is to allow you to do **better science in less time** ( https://www.nature.com/articles/s41559-017-0160:)

-> As an example it was estimated that research data management takes about 5% of your time, on the other hand, time lost due to poor data management is estimated to be 15%.
+The snow ball effect may be huge, and the objective of this course is to allow you to do **better science in less time**

+> ## Invest in data science
+>As an example it was estimated that research data management takes about 5% of your time, on the other hand, time lost due to poor data management is estimated to be 15%.
+> See reference: *Lowndes, J. S. S., Best, B. D., Scarborough, C., Afflerbach, J. C., Frazier, M. R., O’Hara, C. C., Halpern, B. S. (2017). Our path to better science in less time using open data science tools. Nature Ecology & Evolution, 1(0160), 1–7. doi: 10.1038/s41559-017-0160*
+>
+{: .callout}

 ### Team and community building

 <img src="../fig/research-foundation.jpg" alt="A house representing machine learing and AI is set upon bricks that one person is sliding below the house. On the bricks, we can read data science principles like open science, backups, reproducibiliy, and FAIR principles." width="500"/>

-_Data science foundations. The Turing Way project illustration by Scriberia for The Turing Way Community Shared under CC-BY 4.0 License. Zenodo. http://doi.org/10.5281/zenodo.3332807_
+*Data science foundations. The Turing Way project illustration by Scriberia for The Turing Way Community Shared under CC-BY 4.0 License. Zenodo. http://doi.org/10.5281/zenodo.3332807*

 Data science tools will make it easier not only to collaborate with researchers in your lab, but also with researchers outside of your lab, or even with non-researchers (citizen science or software professionals).
 These may bring valuable expertise in the project.
-Being part of a collaborative community will also create
-impact beyond citations and papers, something which starts to be valued by funding agencies, and which make research more fun, valued and interesting.
+Being part of a collaborative community will also create impact beyond citations and papers, something which starts to be valued by funding agencies, and which make research more fun, valued and interesting.

 We may also add to the pot that creating a network around your research is a critical aspect of building a career in academia.
 Being known as a good and skilled collaborator can open doors to many opportunities.

-
 ## A journey starts

 > You step into the Road, and if you don't keep your feet, there is no knowing where you might be swept off to.
@@ -209,12 +233,10 @@ For instance, *The Turing Way* guide for data science and research provides seve
 <img src="../fig/ttw-welcome.jpg" alt="drawing" width="400"/>

-_ The Turing Way project illustration by Scriberia for The Turing Way Community Shared under CC-BY 4.0 License. Zenodo. http://doi.org/10.5281/zenodo.3332807
-
+*The Turing Way project illustration by Scriberia for The Turing Way Community Shared under CC-BY 4.0 License. Zenodo. http://doi.org/10.5281/zenodo.3332807*

 ## References

-
 * A Quick Guide to Organizing Computational Biology Projects
 Noble WS (2009) A Quick Guide to Organizing Computational Biology Projects. PLOS Computational Biology 5(7): e1000424. https://doi.org/10.1371/journal.pcbi.1000424
 * Seddighi, M, Allanson, D, Rothwell, G, Takrouri, K. Study on the use of a combination of IPython Notebook and an industry-standard package in educating a CFD course. Comput Appl Eng Educ. 2020; 28: 952– 964. https://doi.org/10.1002/cae.22273

episodes/09-rdm.md

Lines changed: 1 addition & 1 deletion
@@ -142,7 +142,7 @@ You can find a more detailed [overview of the FAIR principles by GO FAIR](https:
 ### Summary of "FAIR - How To"

-> We have provided an additional lesson to discuss the How-Tos of FAIR principles in the context of data and software. Please see details in [](../../_extra/-4-FAIRHowTo.md).
+> We have provided an additional lesson to discuss the How-Tos of FAIR principles in the context of data and software. See [FAIR How-To for data and software](../../_extra/-4-FAIRHowTo.md) for detail.
 > - Reference: E. L.-Gebali, S. (2022). BOSSConf_2022_Research_Data_Management. Zenodo. doi: [10.5281/zenodo.6490583](https://doi.org/10.5281/zenodo.6490583)
 >
 {: .calllout}
