Note
If you just want to use the paceship package, install it from GitHub into your environment!
pip install git+https://github.com/PACEHackWeek/paceship
And use it!
import paceship
?paceship.simple_bbox
Welcome to our open science project, serving as a demonstration and practice repository for the PACE Data Hackweek.
- DEMONSTRATION: The stages below offer an opinionated process for authoring an open science project that uses Python. Click on the version "tag" noted in the subsections below to peek into the history of this repository at that stage.
- PRACTICE: The repository is for real! Not only can you trace the stages of development, you can use the repository to practice sharing your code with fellow hackweek participants.
You can explore the state of this repository at any stage described below on GitHub (i.e. without needing a local clone) simply by following the v0.x links.
Navigate back to the main branch to see the current state of the repository.
Note that you should not clone this repository as a way of starting your own project, nor is it a template repository.
Instead, try following these stages from scratch for your own project!
Stage 1 (v0.1)
Create a code repository on GitHub [1] and start composing a README in your browser.
- open https://github.com/new (sign in if prompted)
- enter your project's "code name" in "Repository name"
- select the option to "Add a README file"
- click the "Create repository" button
A README is a special document for anyone who writes code: it is the first place to look for information and is the "landing page" for any repository on GitHub. Use your README to welcome readers and introduce the project. You can edit this GitHub Flavored Markdown file and create a commit right on the GitHub platform.
Stage 2 (v0.2)
Start authoring a Jupyter Notebook on the CryoCloud.
- clone your repository
- create a notebooks folder with a new notebook
- add a .gitignore to prevent "data" from ending up in your code repository
Nearly all projects will have notebooks, and some projects may only have notebooks! Keeping your notebooks in a subfolder makes your project ready for future complexification, and provides a dedicated area for "data" that are not a good fit for code repositories.
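The layout described above can be sketched in a few lines of Python. This is only an illustration: the function name scaffold, the notebook name analysis.ipynb, and the ignore rule notebooks/data/ are assumptions, not the contents of this repository.

```python
import json
from pathlib import Path


def scaffold(root):
    """Create notebooks/ with an empty notebook, plus a .gitignore, under root."""
    root = Path(root)
    notebooks = root / "notebooks"
    notebooks.mkdir(parents=True, exist_ok=True)

    # Minimal empty notebook in the nbformat 4 JSON layout.
    empty = {"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 5}
    (notebooks / "analysis.ipynb").write_text(json.dumps(empty, indent=1))

    # Assumed ignore rule: keep anything under notebooks/data/ out of git.
    # An existing .gitignore is left untouched.
    gitignore = root / ".gitignore"
    if not gitignore.exists():
        gitignore.write_text("notebooks/data/\n")
    return notebooks
```

Calling scaffold(".") from the top of a fresh clone would produce that layout; creating the folder and files by hand in JupyterLab works just as well.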
Stage 3 (v0.3)
Change some of your code into functions or classes to allow re-use and improve modularity. Learn more about Python functions in the Real Python lesson "Defining Your Own Python Function". Open science projects often don't need classes, unless your project evolves into a package that needs "The Power of Object-Oriented Programming".
- choose part of your notebook that can be "modularized"
- sort "inputs" into an argument list, providing default values for optional arguments
- define the function in your "Setup" section, and use it!
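As a minimal sketch of these steps, suppose a notebook repeatedly rescales a list of values to a fixed range. The function name rescale and its parameters are hypothetical and not part of the paceship package; the required input comes first, and the optional inputs carry default values.

```python
def rescale(values, lower=0.0, upper=1.0):
    """Linearly rescale values to the range [lower, upper].

    Hypothetical example for illustration only.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # All inputs are identical, so map everything to the lower bound.
        return [lower for _ in values]
    scale = (upper - lower) / (hi - lo)
    return [lower + (v - lo) * scale for v in values]


# Use the defaults ...
unit = rescale([2, 4, 6])
# ... or override an optional argument by name.
percent = rescale([2, 4, 6], upper=100.0)
```

Defining the function once in the "Setup" section means later cells only need the one-line call.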
Stage 4 (v0.4)
While Jupyter Notebooks are text-based (they are JSON files), they can have large binary outputs encoded as text and have other drawbacks when it comes to collaboration with git. Jupyter Notebooks also make it possible to execute workflows "out of order", leading to problems with reproducibility. For both of these reasons, you may want to work with Shell or Python scripts for part of your project.
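A Python script for this stage might look like the sketch below, which reports the size of each notebook in a folder (handy for spotting notebooks with large embedded outputs). The task, the --folder option, and the function names are assumptions chosen for illustration; the point is the shape: the logic lives in a function, and the file always runs top to bottom.

```python
"""Hypothetical script: report the size of each notebook in a folder."""
import argparse
from pathlib import Path


def notebook_sizes(folder):
    """Return sorted (file name, size in bytes) pairs for notebooks in folder."""
    folder = Path(folder)
    if not folder.is_dir():
        return []
    return sorted((p.name, p.stat().st_size) for p in folder.glob("*.ipynb"))


def main(argv=None):
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument("--folder", default="notebooks", help="folder to scan")
    args = parser.parse_args(argv)
    for name, size in notebook_sizes(args.folder):
        print(f"{size:>12d}  {name}")


if __name__ == "__main__":
    main()
```

Keeping the work in notebook_sizes rather than in main means the same function can later be imported from a notebook or moved into a package.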
TODO:
- scripts/png-to-s3 move figure to s3 temporary storage
Stage 5 (v0.5)
So now you want to use a function or class in multiple notebooks?
The most robust way forward—this seems like a big deal but is not—is to migrate your functions and classes to a Python package in the same project.
A package is something you can import in your notebooks, once it is made visible to the Python interpreter (i.e. "installed").
- use the uv command line tool for initializing the package
uv init --package
- tell git to ignore the ".python-version" file
- move your functions and classes to ".py" files inside the src tree
- add your dependencies to your "pyproject.toml" file
uv add numpy pillow
- use pip to perform an "editable" install into your Python interpreter
pip install -e .
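To see what "visible to the Python interpreter" means, the sketch below builds a throwaway src tree and makes it importable by hand. The package name mypkg_demo and the function greet are hypothetical, and editing sys.path is only a stand-in for the effect of an editable install (which makes your source tree importable persistently); it is not something to do in a real project.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway "src" tree holding a hypothetical package.
src = Path(tempfile.mkdtemp()) / "src"
package = src / "mypkg_demo"
package.mkdir(parents=True)
(package / "__init__.py").write_text(
    "def greet(name='hackweek'):\n"
    "    return f'hello, {name}'\n"
)

# Before the src tree is visible, the import fails.
try:
    importlib.import_module("mypkg_demo")
    visible_before = True
except ModuleNotFoundError:
    visible_before = False

# Make the src tree visible to the interpreter, then import as usual.
sys.path.insert(0, str(src))
importlib.invalidate_caches()
mypkg_demo = importlib.import_module("mypkg_demo")
message = mypkg_demo.greet()
```

After a real editable install, every notebook in the project can simply import the package, and edits to the ".py" files take effect the next time the kernel imports them.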
Stage 6 (v0.6)
Prepare for users and contributors!
TODO:
- CONTRIBUTING.md
- LICENSE
- citation.cff
- zenodo github integration
You should fork this repository for practice contributing to an open science project.
Consider adding a notebook or a script by dropping it into the 🛰️PACESHIP repository through a pull request.
You could also add functions to the src/paceship folder containing the Python package.
Like many projects, we include a CONTRIBUTING.md guide to support first-time contributors, so take a look and ask a hackweek mentor for friendly supervision if desired!
When working on the package from a local clone, install it from your clone in "editable" mode.
First, make sure you are in the paceship directory, then:
pip install -e .
We would like to thank the US-OCB office for sponsoring the PACE Data Hackweek, and to acknowledge all hackweek participants who improve this repository through feedback or contributions.
Footnotes
[1] Or use whatever website works for you and your collaborators! It may be another cloud platform (e.g. https://bitbucket.org/) or a platform hosted by your workplace (e.g. https://git.smce.nasa.gov).