Skip to content

Conversation

@danjong99
Copy link

BK!

@danjong99
Copy link
Author

Hi, I'm struggling to make packages. Can you please see the outline of my project and give me some feedback?

@leej3
Copy link

leej3 commented May 3, 2020

Well done. This is excellent. Keep up the good work. You had a package previously but you moved the setup.py down a directory. Licence, setup.py, readme etc should be in the root directory (not in a subdirectory). I've altered the circleci configuration file so that the bk_packages subdirectory is used instead of ".". Everything works with that small change. You can merge the pull request I submitted for this if you wish. You can instead choose to move things up one directory. Whatever you'd prefer.

Other general comments:

  • The visualization notebook is great. Nice usage of a notebook to quickly display what the repo/package is about.
  • You should remove reduceCells notebook now. Deleting it (and committing the deletion) will not remove the gradual progress you demonstrate with it in your git history. Large amounts of code should be part of the package (and tested as much as possible).
  • The data directory should likely go inside the tests directory. This isn't a big deal and there's no need to change it but in general data is needed for tests and so should be hidden away in that directory. It is somewhat a matter of preference though. Also note that git is pretty terrible for storing data. You can store small example datasets but anything over a few hundred kilobytes is generally a bad idea. Every time you change it slightly, if it is a binary file, it now has both versions of the file in history. Not good in the long run. The size of the repository quickly becomes unmanageable. No need to worry about it now but in the future consider using tools like git-annex/datalad/git-lfs/urls for downloading the data on the fly.
  • If you have time look into entry points or the scripts keyword for setuptools to make the installation provide a command line tool to users. I would rename main.py to be something more meaningful. And even rename the main function, to more usefully describe what the pipeline does.
  • have a look at the pep8 style guide. It is good to slowly learn the standards ways in which style can impart information. For example your functions should not be camel case, all lower case with words separated by "_" would be best. For strings a triple quote is multiline (so no need to use them on every line. When you want to add a comment, unless there is a good reason, it is likely best to put it on its own line. Going beyond 80 characters on a line is discouraged.

@danjong99
Copy link
Author

hi John, I have changed file location upon your comments. Thanks, Now, I thought that class would be a good choice to implement what I think. So, I have changed my codes a little bit, putting functions inside two classes, parent and child. The thing is I made basics methods as a seperate .py and generated editable package and call modules and methods ( you can see my test_example.ipynb under tests folder ), somehow, these basic methods are not callable in classes. Can you please help me? Thanks,

@leej3
Copy link

leej3 commented May 4, 2020

You will need to import all the functionality you require at the top of your file that defines the class. You should do all of this at the top of the file. So pandas, numpy,etc. and then all of your own code. So from bk_packages.basic_methods import func_a, func_b,...

@leej3
Copy link

leej3 commented May 4, 2020

It might be worth considering making the functions methods of the class. It depends what encapsulation you are going for with the class but often if you have lots of functions that are all dependent on the same settings/data then there is little use in having them independent of the class that you have defined to encapsulate their usage. I haven't looked at the code enough to know whether they are useful independently (or if using a class is appropriate).

@danjong99
Copy link
Author

Hi John, this is my final version. Please take this last pull request version for final score. I know we still have more time. But, I need to go back to lab works (actually, I poured all my last 7 days to finish this and it's ironically good time for this type of works). I used to use R for seq analysis and now python seems to be very nice option for future works. Plus, I will definitely take the mini-course you recommended. Thank you for ur lecture in this semester.

@leej3
Copy link

leej3 commented May 6, 2020

Will do. Good job. Looks like you learned a lot.

@leesup
Copy link

leesup commented May 11, 2020

You did an amazing job on your project! You provided a really impressive project description utilizing the beauty of markdown. Just to comment on what you said about avoiding well-built packages as much as possible as this is a learning course. I actually strongly recommend using them even for learning. I would suggest adding more comments especially to docstrings in the future. Overall, the project was fantastic.

-Paul

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants