Skip to content

Conversation

ericholscher
Copy link
Member

@ericholscher ericholscher commented Aug 14, 2025

This is similar to #12386,
but trying to do a simpler version first.

Keep in mind that this will, whenever possible, be executed purely in SQL, and so the delete() methods of individual object instances will not necessarily be called during the process. If you’ve provided a custom delete() method on a model class and want to ensure that it is called, you will need to “manually” delete instances of that model (e.g., by iterating over a QuerySet and calling delete() on each object individually) rather than using the bulk delete() method of a QuerySet.

https://docs.djangoproject.com/en/5.2/topics/db/queries/#deleting-objects

This is similar to #12386,
but trying to do a simpler version first.
@ericholscher ericholscher requested a review from a team as a code owner August 14, 2025 17:28
@ericholscher ericholscher requested a review from stsewd August 14, 2025 17:28
# Remove imported files and pageviews faster
if version:
version.imported_files.all().delete()
version.page_views.all().delete()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also called when a version is deactivated, do we want to remove pageviews in that case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure, tbh. I guess it's pretty aggressive, and might cause people to lose data?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may want to call these on Project.delete and Version.delete only

version.imported_files.all().delete()
else:
project.imported_files.all().delete()
project.page_views.all().delete()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't necessary called when a project is deleted, it's also called from the admin. I'd put this under the delete method only (most of this data can be recovered, but pageview can't).

stsewd added a commit that referenced this pull request Sep 4, 2025
For some related fields, Django first does a select, then it probably
loads each object into memory? But after that it does the deletion using
a `IN` statement with the ID of the objects...

Using raw_delete avoids that, but it can also cause integrity problems,
as Django does its own implementation of on_delete, so I chose models
that don't have additional relations and also shouldn't have problems as
they are always deleted when the project is deleted. Another source of
problems is when models have foreign fields with on_delete=SET_NULL, as
django will do an UPDATE over the model with all the related fields...
for example for latest_build, django will run an update over all
projects with an `IN (... all builds IDs..)`, to set the latest_build ID
to null..., which is silly since the whole project model is being
deleted, but django doesn't know that lates_build can only contain a
build from the project. A raw delete could help, but builds and versions
have a lot of related fields in other models, so I'm a little hesitant.

Closes #12410
Ref #10040
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants