
Add add_vector_layer function #445


Open · wants to merge 4 commits into base: main

Conversation

giswqs
Contributor

@giswqs giswqs commented Feb 5, 2025

Fix #437

This PR adds:

  • add_vector_layer function for loading any geopandas-supported vector format, including Shapefile, GeoPackage, and KML
  • vector_to_geojson function for converting any geopandas-supported vector format to GeoJSON
  • An updated notebook example that includes add_vector_layer examples

📚 Documentation preview: https://jupytergis--445.org.readthedocs.build/en/445/
💡 JupyterLite preview: https://jupytergis--445.org.readthedocs.build/en/445/lite

Contributor

github-actions bot commented Feb 5, 2025

Binder 👈 Launch a Binder on branch giswqs/jupytergis/vector

@mfisher87 added the enhancement (New feature or request) label Feb 5, 2025
Contributor

github-actions bot commented Feb 5, 2025

Integration tests report: appsharing.space

**kwargs,
)
else:
df = gpd.read_file(
Member


I guess this will download the GeoJSON if I do something like:

url = "https://github.com/opengeos/datasets/releases/download/world/countries.geojson"
doc.add_vector_layer(path=url, name="GeoJSON")

This is unfortunate since it will embed the GeoJSON data into the JGIS file, which may be unwanted. Keeping the GeoJSON source referenced by URL may be better.

So I guess we'd want to consider two approaches in this function:

  • if it's a URL, we keep the URL as-is and don't embed the data
  • if it's a local file or some in-memory data in Python, we have no choice but to embed the data, so we keep this logic here
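
The two branches above could be sketched roughly as follows. This is only an illustration: `is_remote_url` and `build_source_parameters` are hypothetical names, `convert` stands in for the geopandas-based conversion, and the `"path"` / `"data"` keys are assumptions rather than the actual JupyterGIS schema.

```python
from typing import Any, Callable
from urllib.parse import urlparse


def is_remote_url(path: str) -> bool:
    """Return True for http(s) URLs, False for local paths."""
    parsed = urlparse(path)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def build_source_parameters(
    path: str, convert: Callable[[str], dict[str, Any]]
) -> dict[str, Any]:
    """Keep remote URLs by reference; embed converted data for local files."""
    if is_remote_url(path):
        # Remote source: store the URL as-is and embed nothing.
        return {"path": path}
    # Local file: no choice but to convert and embed the data.
    return {"data": convert(path)}
```

A URL that points to a format the map client cannot read (a zipped Shapefile, for example) would still need conversion; this sketch does not handle that case.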

@mfisher87
Member

@martinRenou @giswqs how can we get this PR moving forward?

On Martin's comment above, I prefer not embedding the data in the project file. GeoJSON files can get quite large, and I think if we're ever going to be embedding data in the project file we should consider a string-encoded binary format.
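
One possible reading of "string-encoded binary format" is to compress the GeoJSON and Base64-encode the result so it can still live inside a JSON project file. The thread does not settle on a format, so the sketch below (zlib plus Base64, with hypothetical function names) is only one assumption about what that could mean.

```python
import base64
import json
import zlib


def encode_geojson(geojson: dict) -> str:
    """Compress a GeoJSON dict and return it as an ASCII-safe Base64 string."""
    raw = json.dumps(geojson, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(zlib.compress(raw)).decode("ascii")


def decode_geojson(encoded: str) -> dict:
    """Reverse encode_geojson: Base64-decode, decompress, and parse."""
    raw = zlib.decompress(base64.b64decode(encoded))
    return json.loads(raw.decode("utf-8"))
```

Base64 adds roughly a third to the compressed size, so the benefit depends on how well the data compresses; repetitive feature collections tend to shrink considerably.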

@martinRenou
Member

we should consider a string-encoded binary format

There are cases where we already embed JSON data (e.g. when passing data from Python's memory directly instead of through a file URL/path). We should definitely improve things and consider another approach 👍🏽

@mfisher87
Member

Yeah. I think it makes sense to punt on a better format until later :)

@arjxn-py arjxn-py requested a review from martinRenou June 2, 2025 11:10
@mfisher87
Member

We're looking to get this PR over the finish line. We need to rebase to start with.

Then, we discussed a plan which we're not sure how to implement, and I'd like to revisit the minimal version of this that we can complete tomorrow.

  • When adding a layer in a format OpenLayers doesn't support with the Python API, convert the data to a local GeoJSON file (aka "the cached file") in a temp location instead of embedding in the project file JSON. Use geopandas as the mechanism to convert the data as this PR already does.
  • Store the path to the cached file as a new schema element.
  • When loading the project, check if the cached file exists, and if not, re-hydrate the cache. BUT: If we're using geopandas as the conversion mechanism, how do we trigger that when opening a project file by double-clicking on it, as opposed to calling add_vector_layer from a Notebook with a running Python kernel?

I think we've bitten off a big task in this plan and that we can break it into two concerns:

  1. Provide a convenience method to add a vector layer "smartly" -- less API surface area for users to keep in their brain. Fairly achievable in a small amount of time.
  2. Work towards eliminating the process of embedding GeoJSON in the project file. Requires some big architectural thinking and broad code changes. We all feel this is architecturally important, but don't have an ideal solution.

@arjxn-py
Member

arjxn-py commented Aug 14, 2025

Thanks all for looking into this. I agree with most of the things mentioned. The rebase should be easy since there are no conflicts.

When adding a layer in a format OpenLayers doesn't support with the Python API, convert the data to a local GeoJSON file (aka "the cached file") in a temp location instead of embedding in the project file JSON. Use geopandas as the mechanism to convert the data as this PR already does.

I like that we would not be embedding the GeoJSON in the .jGIS file, but can we also give users the option to save the data to a defined path instead of a temp location? We also need to consider the portability/shareability of the .jGIS file.
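
A rough sketch of the cache-writing step with an optional user-chosen location might look like this. Everything here is hypothetical: the function name, the `cache_dir` parameter, the `jgis-cache` folder name, and the `"path"` / `"cachedPath"` keys are assumptions for illustration, and `convert` stands in for the geopandas conversion that returns GeoJSON text.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable, Optional


def cache_as_geojson(
    source_path: str,
    convert: Callable[[str], str],
    cache_dir: Optional[str] = None,
) -> dict:
    """Write converted GeoJSON to a cache file and return a source entry."""
    # Default to a temp location; let the caller pick a shareable directory.
    directory = (
        Path(cache_dir) if cache_dir else Path(tempfile.gettempdir()) / "jgis-cache"
    )
    directory.mkdir(parents=True, exist_ok=True)

    # Derive a stable file name from the source path so repeated calls reuse it.
    digest = hashlib.sha256(source_path.encode("utf-8")).hexdigest()[:16]
    cached = directory / f"{digest}.geojson"
    cached.write_text(convert(source_path), encoding="utf-8")

    # Store both paths so the cache can be rebuilt from the original later.
    return {"path": source_path, "cachedPath": str(cached)}
```

Passing a `cache_dir` next to the project file would keep the cached data alongside it when the project is shared, whereas the temp default is local to one machine.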

When loading the project, check if the cached file exists, and if not, re-hydrate the cache.

In fact, I don't think our TypeScript logic supports something like this yet. In theory, if we want to do this, we'd have to store the paths of both the cached source and the actual source in the schema and then do what you're suggesting.

BUT: If we're using geopandas as the conversion mechanism, how do we trigger that when opening a project file by double-clicking on it, as opposed to calling add_vector_layer from a Notebook with a running Python kernel?

That's the core blocker: we simply can't use geopandas without a Python server. Either we find a similar tool to do that conversion on the TypeScript side, or we raise an informative error that acknowledges this as a current limitation when no Python server is running.
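
The load-time check and the "informative error" fallback could be sketched like this. It is written in Python for illustration only; in practice the check would likely live on the TypeScript side. The class and function names are hypothetical, and `convert=None` models the case where no Python server is available.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Optional


@dataclass
class CachedVectorSource:
    """Hypothetical schema entry holding both paths."""

    source_path: str  # the original file, e.g. a Shapefile
    cached_path: str  # the converted GeoJSON the map client reads


class ConversionUnavailableError(RuntimeError):
    """Raised when the cache is missing and no converter is available."""


def resolve_layer_file(
    source: CachedVectorSource,
    convert: Optional[Callable[[str], str]] = None,
) -> Path:
    """Return the cached file, re-hydrating it when a converter exists."""
    cache = Path(source.cached_path)
    if cache.exists():
        return cache
    if convert is None:
        # No Python server: explain the limitation instead of failing silently.
        raise ConversionUnavailableError(
            f"Cached file {source.cached_path!r} is missing and "
            f"{source.source_path!r} cannot be converted without a Python "
            "server. This is a current limitation."
        )
    cache.parent.mkdir(parents=True, exist_ok=True)
    cache.write_text(convert(source.source_path), encoding="utf-8")
    return cache
```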

I think we've bitten off a big task in this plan and that we can break it into two concerns:

  1. Provide a convenience method to add a vector layer "smartly" -- less API surface area for users to keep in their brain. Fairly achievable in a small amount of time.
  2. Work towards eliminating the process of embedding GeoJSON in the project file. Requires some big architectural thinking and broad code changes. We all feel this is architecturally important, but don't have an ideal solution.

Agreed - this split makes sense. Thanks again, everyone. I hope each of you gets to contribute your piece, and that you’re enjoying the hack!

Labels
enhancement New feature or request
Development

Successfully merging this pull request may close these issues.

Support Adding Shapefile Layers via Python API
4 participants