klaernie commented Oct 9, 2024

Because the list of emails is generated before the emails are iterated, there is always a chance that another program accesses the Maildir and deletes a file in the meantime.

In my use case, we iterate over a directory looking for the most recently received mail, and a daily cleanup job removes older mails. As expected, our script fails to complete whenever the cleanup job runs: the script has to scan and parse 30k mails, while the cleanup job only has to delete files, so it always overtakes the scanning script.

The solution is to ignore only "No such file or directory" (ENOENT) errors when opening a mail; if this error occurs, we go directly to the next message. All other errors are reported back to the user as before.
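
To illustrate the idea independently of the module, here is a minimal sketch in Python (not the Perl change in this PR). The names `iter_messages` and `demo` are hypothetical and exist only for this example; the sketch assumes a flat directory with one file per message.

```python
import os
import tempfile
from pathlib import Path


def iter_messages(maildir):
    """Yield (name, raw_bytes) for every message file that still exists."""
    # Snapshot of the directory: entries may vanish before they are opened.
    names = sorted(os.listdir(maildir))
    for name in names:
        path = os.path.join(maildir, name)
        try:
            with open(path, "rb") as fh:
                data = fh.read()
        except FileNotFoundError:
            # ENOENT only: the file was deleted after the listing was taken,
            # so skip straight to the next message.
            continue
        # Any other OSError (EACCES, EIO, ...) still propagates to the caller.
        yield name, data


def demo():
    """Simulate a cleanup job deleting a message while iteration is running."""
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(3):
            Path(tmp, f"msg{i}").write_bytes(b"Subject: test %d\n\n" % i)
        seen = []
        for name, _ in iter_messages(tmp):
            seen.append(name)
            # Stand-in for the cleanup job: remove a not-yet-read message.
            victim = Path(tmp, "msg1")
            if victim.exists():
                victim.unlink()
        return seen
```

Catching only the "file not found" case keeps permission problems and I/O errors visible, which matches the intent of reporting all other errors as before.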

I've considered the following alternatives:

  • Read all emails into memory up front, before parsing any of them => possibly too much mail to sensibly keep in memory.
  • Make only one readdir() call per next_message() call => the POSIX.1-2024 descriptions of readdir and opendir already state that there is no guarantee whether files created or deleted since the opendir or the last rewinddir are returned at all. They also remark that applications have conventionally requested buffers holding more than one directory entry, so no matter what we do in Perl code, the libraries will already have a cache filled with possibly outdated entries.

Hence I deemed both alternatives infeasible.
