When monitoring /var/log/journal/remote events are missed when two remote hosts send messages at the same time (updates)

**Context**
- 1 x log server (VM on Apple M2)
- 2 x log client (VM on Apple M2)
- Using systemd-journal-remote on server (http)
- Using systemd-journal-upload on clients (http)

**Testing Method**
We first cleanup the server with:

```
rm /var/log/journal/remote/*
```

On each client we run (side by side / simultaneously:
```shell
for x in {1..10000}; do echo "EVENT $x" | logger ; done
```

So we expect 20,000 records on the server, which we validate:

```
root@log-server:/home/ansible# journalctl -D /var/log/journal/remote/ --no-pager | grep -v Boot | wc -l
20000
```

I've tested this many times to confirm that the systemd-journal-remote/upload mechanism is OK.

**The failing code**

```python
import select
from systemd import journal

 j = journal.Reader(path="/var/log/journal/remote")
 j.seek_tail()
 j.get_previous() # prevent wrap around

 p = select.poll()
 p.register(j, j.get_events())

 print("Starting....")
 event_counter = 0
 while p.poll():
      for x in j:
          event_counter += 1
      if j.process() != journal.APPEND:
          print("...WAITING....")
      else:
          print(f"Event counter: {event_counter}")
```
First of all, I'm not sure if this code is right. If I've made a glaring mistake, I apologise in advance for this report.

That said, I get correct results if I run my '10K log generator command' on one client at a time.

```
Event counter: 9915
Event counter: 9933
Event counter: 9945
Event counter: 9950
Event counter: 9987
Event counter: 10000
```

If I run the command at the same time on each client at once, I would expect 20K records, but it gets nowhere near that:

```
Event counter: 10228
Event counter: 10228
Event counter: 10228
Event counter: 10228
Event counter: 10228
Event counter: 10228
Event counter: 10228
```

In this example I'm missing almost half of all expected events. Yet I can confirm on the log server with the 'journalctl' command that there are actually 20K log records stored on the server.

Even if I test with just 1K events per client and run that simultaneously it's also clearly visible:

```
Event counter: 1096
Event counter: 1141
Event counter: 1215
Event counter: 1430
```

Happy to run any additional tests if that helps. 
If I made a huge mistake, please let me know.

**Update 1**
I've been testing this issue on virtual machines on my mac. 
I've also a few physical x86 micro pc's and the issue is the same.
Rate-limiting the logging with sleep .01 doesn't impact the result.

Question, is python-systemd built to handle remote logging from multiple clients through polling?


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

When monitoring /var/log/journal/remote events are missed when two remote hosts send messages at the same time (updates) #134

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

When monitoring /var/log/journal/remote events are missed when two remote hosts send messages at the same time (updates) #134

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions